Candle-powered rollout collector.
Connects CandleActorCritic directly to AsyncCollector so that
policy inference runs in pure Rust — no Python calls during collection.
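The collection pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the crate's implementation: `Batch`, `spawn_collector`, and the placeholder arithmetic standing in for the environment step, policy, and value function are all hypothetical, and `std::sync::mpsc::sync_channel` is used here as a stand-in for the crossbeam channel.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical batch type; the real batch layout is not shown on this page.
#[derive(Debug)]
pub struct Batch {
    pub observations: Vec<f32>,
    pub actions: Vec<f32>,
    pub values: Vec<f32>,
}

/// Spawns a background thread that produces `num_batches` batches of
/// `steps` transitions each and sends them over a bounded channel.
pub fn spawn_collector(
    num_batches: usize,
    steps: usize,
) -> (Receiver<Batch>, thread::JoinHandle<()>) {
    // A small bound applies backpressure if the consumer falls behind.
    let (tx, rx) = sync_channel::<Batch>(2);
    let handle = thread::spawn(move || {
        let mut obs = 0.0_f32;
        for _ in 0..num_batches {
            let mut batch = Batch {
                observations: Vec::with_capacity(steps),
                actions: Vec::with_capacity(steps),
                values: Vec::with_capacity(steps),
            };
            for _ in 0..steps {
                let act = -0.5 * obs; // placeholder for policy inference
                let val = obs * obs; // placeholder for the value estimate
                batch.observations.push(obs);
                batch.actions.push(act);
                batch.values.push(val);
                obs += 1.0; // placeholder for the environment step
            }
            // Stop collecting once the receiver has been dropped.
            if tx.send(batch).is_err() {
                break;
            }
        }
    });
    (rx, handle)
}
```

The consumer side only needs to iterate the receiver; iteration ends when the collection thread finishes and the sender is dropped.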
§Architecture
┌─────────────────────────────────────────┐
│ Background Thread (pure Rust) │
│ │
│ loop { │
│ obs = VecEnv.step_all() ~96μs │
│ act = Candle.act(obs) ~15μs │ ← no Python dispatch
│ val = Candle.value(obs) ~10μs │
│ buffer.push(obs, act, val) │
│ GAE = compute_gae_batched() ~5μs │
│ channel.send(batch) │
│ } │
└─────────────────────────────────────────┘
↕ crossbeam channel
┌─────────────────────────────────────────┐
│ Python Main Thread │
│ │
│ batch = collector.recv() │
│ loss = ppo_loss(batch) (PyTorch) │
│ loss.backward() │
│ optimizer.step() │
│ collector.sync_weights(flat_params) │
└─────────────────────────────────────────┘
Structs§
- SharedPolicy - Shared policy wrapper that allows the collection thread to read the policy while the main thread updates weights.
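One common way to let one thread read a policy while another replaces its weights is an `Arc<RwLock<...>>`. The sketch below assumes that approach; the actual internals of SharedPolicy are not shown on this page. `LinearPolicy` and `SharedPolicySketch` are hypothetical names, and a plain dot product stands in for Candle inference.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the policy: a single linear layer.
pub struct LinearPolicy {
    pub weights: Vec<f32>,
}

impl LinearPolicy {
    /// Dot product of the weights with the observation.
    pub fn act(&self, obs: &[f32]) -> f32 {
        self.weights.iter().zip(obs).map(|(w, o)| w * o).sum()
    }
}

/// Cloneable handle; every clone points at the same policy.
#[derive(Clone)]
pub struct SharedPolicySketch {
    inner: Arc<RwLock<LinearPolicy>>,
}

impl SharedPolicySketch {
    pub fn new(weights: Vec<f32>) -> Self {
        Self {
            inner: Arc::new(RwLock::new(LinearPolicy { weights })),
        }
    }

    /// Read path used by the collection thread (shared lock).
    pub fn act(&self, obs: &[f32]) -> f32 {
        self.inner.read().expect("policy lock poisoned").act(obs)
    }

    /// Write path used by the training thread (exclusive lock).
    /// Rejects parameter vectors whose length does not match.
    pub fn sync_weights(&self, flat_params: &[f32]) -> Result<(), String> {
        let mut policy = self.inner.write().expect("policy lock poisoned");
        if flat_params.len() != policy.weights.len() {
            return Err(format!(
                "expected {} parameters, got {}",
                policy.weights.len(),
                flat_params.len()
            ));
        }
        policy.weights.copy_from_slice(flat_params);
        Ok(())
    }
}
```

A read-write lock suits this access pattern because reads happen on every collection step while writes happen only once per optimizer update.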
Functions§
- make_candle_callbacks - Build action_fn and value_fn closures that read from a shared policy.
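As a rough illustration of closures that read from a shared policy, the sketch below builds two boxed closures over the same `Arc<RwLock<...>>`. The signature of make_candle_callbacks is not shown on this page, so `Params`, `ObsFn`, and `make_callbacks_sketch` are hypothetical, and dot products stand in for the actor and critic heads.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical parameter container with separate actor and critic weights.
pub struct Params {
    pub policy_w: Vec<f32>,
    pub value_w: Vec<f32>,
}

/// Boxed closure type that can be moved into a collection thread.
pub type ObsFn = Box<dyn Fn(&[f32]) -> f32 + Send + Sync>;

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Returns `(action_fn, value_fn)`. Both closures take a read lock on
/// every call, so they observe weight updates made through `shared`.
pub fn make_callbacks_sketch(shared: Arc<RwLock<Params>>) -> (ObsFn, ObsFn) {
    let for_action = Arc::clone(&shared);
    let action_fn: ObsFn = Box::new(move |obs: &[f32]| {
        let p = for_action.read().expect("params lock poisoned");
        dot(&p.policy_w, obs)
    });
    let value_fn: ObsFn = Box::new(move |obs: &[f32]| {
        let p = shared.read().expect("params lock poisoned");
        dot(&p.value_w, obs)
    });
    (action_fn, value_fn)
}
```

Because each closure owns its own `Arc` clone, the caller can keep a third handle for writing new weights without invalidating the closures.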