pub struct CandleTwinQ { /* private fields */ }
Implementations
impl CandleTwinQ
pub fn new( obs_dim: usize, act_dim: usize, hidden: usize, lr: f64, device: Device, ) -> Result<Self, NNError>
pub fn q1_forward(&self, obs: &Tensor, actions: &Tensor) -> Result<Tensor>
Forward through Q1 with autograd preserved (for actor loss in SAC/TD3).
pub fn twin_q_forward( &self, obs: &Tensor, actions: &Tensor, ) -> Result<(Tensor, Tensor)>
Forward through both Q-networks with autograd preserved (for SAC actor loss).
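As a rough illustration of why both Q outputs are needed with gradients intact, the usual SAC actor objective takes the element-wise minimum of the two critics. The sketch below uses plain `f32` slices instead of Candle tensors, so it shows only the arithmetic, not autograd; `sac_actor_loss`, `log_pi`, and `alpha` are hypothetical names that do not appear in this crate's documentation.

```rust
/// Sketch of the standard SAC actor loss on plain slices:
/// mean_i( alpha * log_pi_i - min(q1_i, q2_i) ).
/// All names here are illustrative assumptions, not part of CandleTwinQ.
pub fn sac_actor_loss(q1: &[f32], q2: &[f32], log_pi: &[f32], alpha: f32) -> f32 {
    assert!(!q1.is_empty(), "batch must not be empty");
    assert_eq!(q1.len(), q2.len(), "q1/q2 batch size mismatch");
    assert_eq!(q1.len(), log_pi.len(), "q/log_pi batch size mismatch");

    let sum: f32 = q1
        .iter()
        .zip(q2.iter())
        .zip(log_pi.iter())
        // Pessimistic value estimate: take the smaller of the two critics.
        .map(|((a, b), lp)| alpha * lp - a.min(*b))
        .sum();

    sum / q1.len() as f32
}
```

In a real training loop the same expression would presumably be built from the tensors returned by `twin_q_forward`, so that the loss can be backpropagated into the policy.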
Trait Implementations
impl ContinuousQFunction for CandleTwinQ
fn q_value( &self, obs: &TensorData, actions: &TensorData, ) -> Result<TensorData, NNError>
Compute the Q-value for (obs, action). Returns a tensor of shape [batch_size].
fn twin_q_values( &self, obs: &TensorData, actions: &TensorData, ) -> Result<(TensorData, TensorData), NNError>
Compute twin Q-values for (obs, action). Returns (q1, q2), each of shape [batch_size].
fn target_twin_q_values( &self, obs: &TensorData, actions: &TensorData, ) -> Result<(TensorData, TensorData), NNError>
Compute target twin Q-values (from target networks).
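Target twin Q-values are typically combined into a TD target by taking their element-wise minimum (the clipped double-Q trick used by TD3; SAC additionally subtracts an entropy term, omitted here). The following is a minimal sketch on plain slices, assuming the caller has already extracted the target values into `f32` data; `clipped_double_q_targets`, `rewards`, `dones`, and `gamma` are hypothetical names, not part of this crate.

```rust
/// Sketch of a clipped double-Q TD target:
/// y_i = r_i + gamma * (1 - done_i) * min(q1_next_i, q2_next_i).
/// Names and signature are illustrative assumptions.
pub fn clipped_double_q_targets(
    rewards: &[f32],
    dones: &[bool],
    q1_next: &[f32],
    q2_next: &[f32],
    gamma: f32,
) -> Vec<f32> {
    let n = rewards.len();
    assert_eq!(dones.len(), n, "dones batch size mismatch");
    assert_eq!(q1_next.len(), n, "q1_next batch size mismatch");
    assert_eq!(q2_next.len(), n, "q2_next batch size mismatch");

    (0..n)
        .map(|i| {
            // Terminal transitions do not bootstrap from the next state.
            let not_done = if dones[i] { 0.0 } else { 1.0 };
            rewards[i] + gamma * not_done * q1_next[i].min(q2_next[i])
        })
        .collect()
}
```

The resulting vector plays the role of the `targets` argument that `critic_step` expects, once converted to `TensorData`.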
fn critic_step( &mut self, obs: &TensorData, actions: &TensorData, targets: &TensorData, ) -> Result<TrainMetrics, NNError>
Perform one TD gradient step on both critics.
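The documentation does not state which loss `critic_step` minimizes. A common choice for twin critics is the sum of the two mean squared errors against the shared TD targets, sketched below on plain slices under that assumption; `twin_critic_loss` is a hypothetical helper, not part of this crate.

```rust
/// Sketch of a twin-critic TD loss, assuming a sum of per-critic MSE terms:
/// mean((q1 - y)^2) + mean((q2 - y)^2).
/// This is an assumed loss, not confirmed to be what `critic_step` uses.
pub fn twin_critic_loss(q1: &[f32], q2: &[f32], targets: &[f32]) -> f32 {
    assert!(!targets.is_empty(), "batch must not be empty");
    assert_eq!(q1.len(), targets.len(), "q1/targets batch size mismatch");
    assert_eq!(q2.len(), targets.len(), "q2/targets batch size mismatch");

    // Mean squared error of one critic's predictions against the targets.
    let mse = |q: &[f32]| -> f32 {
        q.iter()
            .zip(targets.iter())
            .map(|(p, y)| (p - y) * (p - y))
            .sum::<f32>()
            / targets.len() as f32
    };

    mse(q1) + mse(q2)
}
```

Both critics regress toward the same targets, so a single scalar loss is enough to drive one optimizer step over both networks.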
fn soft_update_targets(&mut self, tau: f32)
Polyak soft update of target networks.
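A Polyak update moves each target parameter a fraction `tau` of the way toward the corresponding online parameter. The sketch below applies that rule to flat `f32` slices standing in for network weights; `polyak_update` is a hypothetical helper, and how `CandleTwinQ` stores its parameters internally is not shown in this documentation.

```rust
/// Sketch of a Polyak (soft) update on flat parameter slices:
/// target <- tau * online + (1 - tau) * target.
/// The slices are illustrative stand-ins for network weights.
pub fn polyak_update(target: &mut [f32], online: &[f32], tau: f32) {
    assert_eq!(target.len(), online.len(), "parameter count mismatch");
    for (t, o) in target.iter_mut().zip(online.iter()) {
        *t = tau * *o + (1.0 - tau) * *t;
    }
}
```

With `tau = 1.0` the target becomes an exact copy of the online parameters, and with `tau = 0.0` it is left unchanged; small values such as 0.005 are common in practice.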
impl Send for CandleTwinQ
impl Sync for CandleTwinQ
Auto Trait Implementations
impl Freeze for CandleTwinQ
impl !RefUnwindSafe for CandleTwinQ
impl Unpin for CandleTwinQ
impl !UnwindSafe for CandleTwinQ
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.