SIMD-accelerated operations for training hot loops.
Provides vectorized versions of weight updates, reward shaping, and
priority computation. Gated behind the simd feature flag.
Strategy: use chunks_exact with manual unrolling to give LLVM strong
auto-vectorization hints. On x86_64, the compiler typically emits SSE
instructions for these patterns, and AVX2 when that target feature is
enabled. No nightly features or std::simd are required.
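As an illustration of the chunks_exact strategy described above, here is a minimal sketch of a Polyak update written in that style. The function name polyak_update_sketch, its signature, and the LANES width of 8 are assumptions made for this example; the crate's actual implementation may differ.

```rust
/// Chunk width used for the fixed-size inner loop (an assumed value).
const LANES: usize = 8;

/// Hypothetical sketch: target[i] = tau * source[i] + (1 - tau) * target[i].
/// Not the crate's actual code; it only illustrates the chunks_exact pattern.
pub fn polyak_update_sketch(target: &mut [f32], source: &[f32], tau: f32) {
    assert_eq!(target.len(), source.len(), "slices must have equal length");
    let one_minus_tau = 1.0 - tau;

    let mut t_chunks = target.chunks_exact_mut(LANES);
    let mut s_chunks = source.chunks_exact(LANES);

    // Every chunk has exactly LANES elements, so the inner loop has a
    // compile-time trip count that LLVM can unroll and vectorize.
    for (t, s) in (&mut t_chunks).zip(&mut s_chunks) {
        for i in 0..LANES {
            t[i] = tau * s[i] + one_minus_tau * t[i];
        }
    }

    // Scalar tail for the elements that did not fill a whole chunk.
    let t_rem = t_chunks.into_remainder();
    let s_rem = s_chunks.remainder();
    for (t, s) in t_rem.iter_mut().zip(s_rem) {
        *t = tau * *s + one_minus_tau * *t;
    }
}
```

Whether the loop is actually vectorized depends on the optimization level and target features, so it is worth inspecting the generated assembly for the build configuration in use.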
Functions
- average_weights_simd - SIMD-friendly weight vector averaging: result[i] = sum(vectors[j][i]) / n
- compute_priorities_simd - SIMD-friendly LAP priority: priority[i] = |loss[i]| + epsilon
- copy_pixel_row - Copy a contiguous row of pixels using copy_from_slice (auto-vectorizes to SIMD memcpy on all targets).
- pbrs_simd - SIMD-friendly PBRS: result[i] = rewards[i] + gamma * phi_next[i] - phi_current[i]
- polyak_update_simd - SIMD-friendly Polyak update: target[i] = tau * source[i] + (1 - tau) * target[i]
- reptile_update_simd - SIMD-friendly Reptile update: target[i] += lr * (source[i] - target[i])
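The per-element formulas listed above can also be written as plain scalar reference functions, which may be useful for checking a vectorized version in tests. The names pbrs_reference, priorities_reference, and average_weights_reference, along with their signatures, are hypothetical and are not part of this module's documented API.

```rust
/// Hypothetical reference: result[i] = rewards[i] + gamma * phi_next[i] - phi_current[i].
pub fn pbrs_reference(
    rewards: &[f32],
    phi_current: &[f32],
    phi_next: &[f32],
    gamma: f32,
) -> Vec<f32> {
    assert_eq!(rewards.len(), phi_current.len());
    assert_eq!(rewards.len(), phi_next.len());
    rewards
        .iter()
        .zip(phi_next)
        .zip(phi_current)
        .map(|((&r, &next), &cur)| r + gamma * next - cur)
        .collect()
}

/// Hypothetical reference: priority[i] = |loss[i]| + epsilon.
pub fn priorities_reference(losses: &[f32], epsilon: f32) -> Vec<f32> {
    losses.iter().map(|&l| l.abs() + epsilon).collect()
}

/// Hypothetical reference: result[i] = sum(vectors[j][i]) / n.
/// Returns an empty vector when no input vectors are given (an assumed convention).
pub fn average_weights_reference(vectors: &[Vec<f32>]) -> Vec<f32> {
    let n = vectors.len();
    if n == 0 {
        return Vec::new();
    }
    let dim = vectors[0].len();
    let mut result = vec![0.0f32; dim];
    for v in vectors {
        assert_eq!(v.len(), dim, "all vectors must have the same length");
        // Accumulate the element-wise sum.
        for (acc, &x) in result.iter_mut().zip(v) {
            *acc += x;
        }
    }
    // Divide each accumulated sum by the number of vectors.
    for acc in &mut result {
        *acc /= n as f32;
    }
    result
}
```

Comparing a vectorized routine against references like these with a small floating-point tolerance catches indexing mistakes in the chunked loop and its remainder handling.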