Structs
- ActionOutput - Action output from a policy.
- DQNStepConfig - DQN step configuration.
- EvalOutput - Output from evaluating a policy on (obs, actions) pairs.
- MLPConfig - Configuration for building an MLP.
- PPOStepConfig - PPO step configuration.
- SACStepConfig - SAC step configuration.
- TD3StepConfig - TD3 step configuration.
- TrainMetrics - Training metrics dictionary.
Enums
- Activation - Activation function.
Traits
- ActorCritic - Actor-Critic policy for on-policy algorithms (PPO, A2C).
- ContinuousQFunction - Continuous Q-function for SAC/TD3 (takes obs + action as input).
- DeterministicPolicy - Deterministic policy for TD3.
- EntropyTuner - Entropy tuning for SAC.
- QFunction - Q-value network for off-policy algorithms (DQN).
- StochasticPolicy - Continuous stochastic policy for SAC.
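
The two Q-function traits listed above differ mainly in their inputs: a DQN-style Q-function maps an observation to one value per discrete action, while the SAC/TD3-style Q-function scores a single (obs, action) pair. The sketch below illustrates that difference with toy linear models. Every name in it (`ToyQFunction`, `ToyContinuousQFunction`, `LinearDiscreteQ`, `LinearContinuousQ`, `greedy_action`) and every signature is an assumption made for illustration, not this crate's actual API.

```rust
// Hypothetical trait shapes for illustration only; the crate's real
// QFunction / ContinuousQFunction signatures may differ.

/// Discrete-action Q-function (DQN style): one Q-value per action.
trait ToyQFunction {
    fn q_values(&self, obs: &[f32]) -> Vec<f32>;
}

/// Continuous Q-function (SAC/TD3 style): scores one (obs, action) pair.
trait ToyContinuousQFunction {
    fn q_value(&self, obs: &[f32], action: &[f32]) -> f32;
}

/// Toy linear model with one weight row per discrete action.
struct LinearDiscreteQ {
    weights: Vec<Vec<f32>>,
}

impl ToyQFunction for LinearDiscreteQ {
    fn q_values(&self, obs: &[f32]) -> Vec<f32> {
        // Dot product of each action's weight row with the observation.
        self.weights
            .iter()
            .map(|row| row.iter().zip(obs).map(|(w, o)| w * o).sum::<f32>())
            .collect()
    }
}

/// Toy linear model over the concatenated [obs, action] vector.
struct LinearContinuousQ {
    weights: Vec<f32>,
}

impl ToyContinuousQFunction for LinearContinuousQ {
    fn q_value(&self, obs: &[f32], action: &[f32]) -> f32 {
        // Chain obs and action so the action is part of the input.
        obs.iter()
            .chain(action.iter())
            .zip(&self.weights)
            .map(|(x, w)| x * w)
            .sum::<f32>()
    }
}

/// Greedy action selection: argmax over the discrete Q-values.
fn greedy_action(q: &impl ToyQFunction, obs: &[f32]) -> usize {
    let values = q.q_values(obs);
    let mut best = 0;
    for (i, v) in values.iter().enumerate() {
        if *v > values[best] {
            best = i;
        }
    }
    best
}
```

With discrete actions the greedy policy can be read directly off the Q-values, as `greedy_action` does. With continuous actions there is no finite set to take an argmax over, which is why SAC and TD3 pair the Q-function with a separate policy.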