Crate rlox_core

rlox-core: Rust-accelerated reinforcement learning primitives.

This crate provides the high-performance data plane for rlox:

  • Buffers (buffer): Ring buffer, prioritized replay (SumTree), memory-mapped replay, offline dataset buffer, columnar storage.
  • Environments (env): Native CartPole, vectorized stepping (Rayon), Gymnasium wrapper via PyO3.
  • Training (training): GAE (single + batched), V-trace, expectile loss.
  • LLM ops (llm): Token-level KL divergence, GRPO group advantages, sequence packing, DPO pairs, all with f32/f64 variants and Rayon parallelism.
  • Pipeline (pipeline): Async rollout collector with crossbeam channels, backpressure, and flat RolloutBatch format.
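
To make the training bullet concrete, here is a minimal sketch of single-trajectory GAE (Generalized Advantage Estimation) in plain Rust. It is an illustration of the technique only, not this crate's API: the function name `compute_gae`, its parameter order, and the `(advantages, returns)` return shape are all assumptions made for the example.

```rust
/// Hypothetical helper (not taken from rlox_core): GAE over one trajectory.
///
/// * `rewards[t]`, `values[t]`, `dones[t]` describe step `t`.
/// * `last_value` is the bootstrap value for the state after the final step.
/// * Returns `(advantages, returns)` where `returns[t] = advantages[t] + values[t]`.
pub fn compute_gae(
    rewards: &[f64],
    values: &[f64],
    dones: &[bool],
    last_value: f64,
    gamma: f64,
    lambda: f64,
) -> (Vec<f64>, Vec<f64>) {
    let n = rewards.len();
    assert_eq!(values.len(), n, "values must match rewards in length");
    assert_eq!(dones.len(), n, "dones must match rewards in length");

    let mut advantages = vec![0.0; n];
    let mut gae = 0.0;

    // Walk the trajectory backwards so each step can reuse the running sum.
    for t in (0..n).rev() {
        let next_value = if t + 1 < n { values[t + 1] } else { last_value };
        // A terminal step cuts both the bootstrap and the accumulated advantage.
        let not_done = if dones[t] { 0.0 } else { 1.0 };
        // TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        let delta = rewards[t] + gamma * next_value * not_done - values[t];
        gae = delta + gamma * lambda * not_done * gae;
        advantages[t] = gae;
    }

    let returns = advantages.iter().zip(values).map(|(a, v)| a + v).collect();
    (advantages, returns)
}
```

With `gamma = 1.0`, `lambda = 1.0`, and all value estimates set to zero, the advantages reduce to the undiscounted reward-to-go, which is a convenient sanity check. A batched variant would typically apply the same backward scan to each trajectory independently, which is why it parallelizes well.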

All operations release the GIL when called from Python via PyO3.

Modules

buffer
Experience replay buffers for reinforcement learning.
env
error
llm
pipeline
seed
training