pub struct HERBuffer { /* private fields */ }
Hindsight Experience Replay buffer.
Stores transitions with goal information and performs goal relabeling
during sampling. The obs vector layout is:
[obs_core | achieved_goal | desired_goal | ...]
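As a rough illustration of the layout above, the goal segments can be read out of a flat observation with plain slice arithmetic. The `GoalLayout` type and its methods are hypothetical helpers written for this sketch and are not part of the crate; only the field names mirror the constructor arguments documented on this page.

```rust
/// Hypothetical helper (not part of the crate): records where the goal
/// segments live inside a flat observation vector.
#[derive(Debug, Clone, Copy)]
struct GoalLayout {
    goal_dim: usize,
    achieved_goal_start: usize,
    desired_goal_start: usize,
}

impl GoalLayout {
    /// Borrow the achieved-goal segment of `obs`.
    fn achieved<'a>(&self, obs: &'a [f32]) -> &'a [f32] {
        &obs[self.achieved_goal_start..self.achieved_goal_start + self.goal_dim]
    }

    /// Borrow the desired-goal segment of `obs`.
    fn desired<'a>(&self, obs: &'a [f32]) -> &'a [f32] {
        &obs[self.desired_goal_start..self.desired_goal_start + self.goal_dim]
    }
}
```

With three core observation values followed by two 2-dimensional goals, `achieved_goal_start` would be 3 and `desired_goal_start` would be 5.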
Implementations
impl HERBuffer
pub fn new(
    capacity: usize,
    obs_dim: usize,
    act_dim: usize,
    goal_dim: usize,
    achieved_goal_start: usize,
    desired_goal_start: usize,
    strategy: HERStrategy,
    goal_tolerance: f32,
) -> Self
Create a new HER buffer.
§Arguments
capacity - maximum transitions
obs_dim - full observation dimension (includes goal components)
act_dim - action dimension
goal_dim - goal vector dimension
achieved_goal_start - index within obs where achieved goal starts
desired_goal_start - index within obs where desired goal starts
strategy - relabeling strategy
goal_tolerance - tolerance for sparse reward computation
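To make the role of `goal_tolerance` concrete, here is a minimal sketch of a sparse goal-conditioned reward. It assumes the convention common in the HER literature (0 when the achieved goal is within tolerance of the desired goal, -1 otherwise) and a Euclidean distance; the crate's actual reward function is not shown on this page and may differ. `sparse_reward` is a hypothetical name.

```rust
/// Hypothetical sparse reward (not the crate's implementation).
/// Returns 0.0 when the Euclidean distance between the achieved and
/// desired goal is at most `goal_tolerance`, and -1.0 otherwise.
fn sparse_reward(achieved: &[f32], desired: &[f32], goal_tolerance: f32) -> f32 {
    assert_eq!(achieved.len(), desired.len(), "goal dimensions must match");
    // Sum of squared per-component differences.
    let dist_sq: f32 = achieved
        .iter()
        .zip(desired)
        .map(|(a, d)| (a - d) * (a - d))
        .sum();
    if dist_sq.sqrt() <= goal_tolerance {
        0.0
    } else {
        -1.0
    }
}
```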
pub fn push_slices(
    &mut self,
    obs: &[f32],
    next_obs: &[f32],
    action: &[f32],
    reward: f32,
    terminated: bool,
    truncated: bool,
) -> Result<(), RloxError>
Push a single transition, notifying the episode tracker.
pub fn sample_with_relabeling(
    &self,
    batch_size: usize,
    her_ratio: f32,
    seed: u64,
) -> Result<SampledBatch, RloxError>
Sample a batch with HER relabeling.
her_ratio controls the fraction of samples that get relabeled goals.
The remaining samples use their original goals.
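The effect of `her_ratio` can be sketched in two steps: decide how many samples in the batch receive a substitute goal, then overwrite the desired-goal segment of each of those observations. Both functions below are hypothetical and written for illustration. In particular, applying the ratio as a deterministic, rounded-down count is an assumption; the crate may instead decide per sample at random.

```rust
/// Hypothetical: number of samples in a batch that receive a substitute
/// goal, assuming the ratio is applied as a rounded-down count.
fn num_relabeled(batch_size: usize, her_ratio: f32) -> usize {
    // Keep the ratio inside [0, 1] so the count never exceeds the batch.
    let ratio = her_ratio.clamp(0.0, 1.0);
    ((batch_size as f32) * ratio).floor() as usize
}

/// Hypothetical: overwrite the desired-goal segment of `obs` with `new_goal`.
/// Panics if the segment does not fit inside `obs`.
fn relabel_desired_goal(obs: &mut [f32], desired_goal_start: usize, new_goal: &[f32]) {
    let end = desired_goal_start + new_goal.len();
    obs[desired_goal_start..end].copy_from_slice(new_goal);
}
```

After relabeling, the reward for the transition would normally be recomputed against the new goal, since a transition that failed to reach its original goal may succeed with respect to the substituted one.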
pub fn compute_relabel_indices(
    &self,
    episode: &EpisodeMeta,
    transition_offset: usize,
    seed: u64,
) -> Vec<usize>
Compute relabeling indices for a given episode and transition.
Returns indices (offsets within the episode) to use as substitute goals.
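One way such indices could be chosen is the "future" strategy from the HER literature, which draws substitute goals from transitions later in the same episode. The sketch below is not the crate's implementation: the variants of `HERStrategy` and the fields of `EpisodeMeta` are not shown on this page, so it takes a plain episode length and a count `k` instead, and it uses a small SplitMix64 generator so that it needs no external crates. All names here are hypothetical.

```rust
/// Minimal SplitMix64 generator, used only to keep the sketch dependency-free.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical "future" strategy: pick `k` offsets strictly after
/// `transition_offset` within an episode of `episode_len` transitions.
/// The same seed always yields the same offsets.
fn future_relabel_indices(
    episode_len: usize,
    transition_offset: usize,
    k: usize,
    seed: u64,
) -> Vec<usize> {
    // The last transition (or an out-of-range offset) has no future steps.
    if transition_offset + 1 >= episode_len {
        return Vec::new();
    }
    let remaining = (episode_len - transition_offset - 1) as u64;
    let mut rng = SplitMix64(seed);
    (0..k)
        // The modulo introduces a slight bias, which is acceptable for a sketch.
        .map(|_| transition_offset + 1 + (rng.next_u64() % remaining) as usize)
        .collect()
}
```

Passing the seed explicitly, as the documented signature does, keeps sampling reproducible across runs.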
pub fn num_complete_episodes(&self) -> usize
Number of complete episodes currently tracked.
Trait Implementations
Auto Trait Implementations
impl Freeze for HERBuffer
impl RefUnwindSafe for HERBuffer
impl Send for HERBuffer
impl Sync for HERBuffer
impl Unpin for HERBuffer
impl UnwindSafe for HERBuffer
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.