Module simd_ops


SIMD-accelerated operations for training hot loops.

Provides vectorized versions of weight updates, reward shaping, and priority computation. Gated behind the simd feature flag.

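Strategy: use chunks_exact with manual unrolling to give LLVM strong auto-vectorization hints. On x86_64, the compiler typically emits SSE2 instructions for these patterns by default, and AVX2 instructions when that target feature is enabled (for example with -C target-cpu=native). No nightly features or std::simd required.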
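
The strategy above can be sketched roughly as follows, using the Polyak update formula listed under Functions. This is a minimal illustration, not the module's actual source: the function name polyak_update_sketch, its signature, and the chunk width LANES = 8 are all assumptions made for the example.

```rust
/// Hypothetical sketch of a Polyak (soft) update shaped for auto-vectorization:
/// target[i] = tau * source[i] + (1 - tau) * target[i]
/// The name, signature, and chunk width are assumptions, not the crate's API.
pub fn polyak_update_sketch(target: &mut [f32], source: &[f32], tau: f32) {
    assert_eq!(target.len(), source.len(), "slices must have equal length");

    // Assumed unroll width: 8 x f32 fills one 256-bit vector register.
    const LANES: usize = 8;
    let one_minus_tau = 1.0 - tau;

    let mut t_chunks = target.chunks_exact_mut(LANES);
    let mut s_chunks = source.chunks_exact(LANES);

    // Main loop: every chunk has exactly LANES elements, so the inner loop
    // has a constant trip count that the optimizer can unroll and vectorize.
    for (t, s) in t_chunks.by_ref().zip(s_chunks.by_ref()) {
        for i in 0..LANES {
            t[i] = tau * s[i] + one_minus_tau * t[i];
        }
    }

    // Scalar tail: the fewer-than-LANES elements left over at the end.
    let t_rem = t_chunks.into_remainder();
    let s_rem = s_chunks.remainder();
    for (t, s) in t_rem.iter_mut().zip(s_rem) {
        *t = tau * *s + one_minus_tau * *t;
    }
}
```

Whether the main loop actually compiles to vector instructions depends on the optimization level and enabled target features, so inspecting the generated assembly is the only reliable check.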

Functions

average_weights_simd
SIMD-friendly weight vector averaging: result[i] = sum(vectors[j][i]) / n
compute_priorities_simd
SIMD-friendly LAP priority: priority[i] = |loss[i]| + epsilon
copy_pixel_row
Copy a contiguous row of pixels using copy_from_slice, which lowers to a memcpy that is typically SIMD-optimized on the target platform.
pbrs_simd
SIMD-friendly PBRS: result[i] = rewards[i] + gamma * phi_next[i] - phi_current[i]
polyak_update_simd
SIMD-friendly Polyak update: target[i] = tau * source[i] + (1 - tau) * target[i]
reptile_update_simd
SIMD-friendly Reptile update: target[i] += lr * (source[i] - target[i])
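
As an illustration of the element-wise formulas listed above, the following scalar reference sketches compute the PBRS and priority expressions directly. The function names, parameter order, and return types here are hypothetical and chosen only for the example; the real functions in this module may differ.

```rust
/// Hypothetical scalar reference for the PBRS formula:
/// result[i] = rewards[i] + gamma * phi_next[i] - phi_current[i]
/// Name and parameter order are assumptions, not the crate's API.
pub fn pbrs_sketch(
    rewards: &[f32],
    phi_current: &[f32],
    phi_next: &[f32],
    gamma: f32,
    result: &mut [f32],
) {
    let n = result.len();
    assert!(
        rewards.len() == n && phi_current.len() == n && phi_next.len() == n,
        "all slices must have equal length"
    );

    // Re-slicing every input to the same length helps the compiler
    // elide per-element bounds checks inside the loop.
    let (rewards, phi_current, phi_next) =
        (&rewards[..n], &phi_current[..n], &phi_next[..n]);

    for i in 0..n {
        result[i] = rewards[i] + gamma * phi_next[i] - phi_current[i];
    }
}

/// Hypothetical scalar reference for the priority formula:
/// priority[i] = |loss[i]| + epsilon
pub fn compute_priorities_sketch(losses: &[f32], epsilon: f32) -> Vec<f32> {
    losses.iter().map(|loss| loss.abs() + epsilon).collect()
}
```

Simple references like these are useful as a baseline when testing a chunked or unrolled variant, since both should agree on the same inputs up to floating-point rounding.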