API 1.0.2.2. Indexer - Reinforcement-Learning-TU-Vienna/dice_rl_TU_Vienna GitHub Wiki
The `Indexer` class provides helper functionality to convert state-action pairs $(s, a)$ into flattened indices $i$.
def __init__(self, n_obs, n_act):
Args:
- `n_obs` (int): Number of observations (states) $|S|$.
- `n_act` (int): Number of actions $|A|$.
@property
def dimension(self):
Returns the size of the state-action space, $|S| \cdot |A|$.
def get_index(self, obs, act):
Returns a flattened index $i$ for the state-action pair $(s, a)$.
Args:
- `obs` (int or tensor): Observation (state) $s$.
- `act` (int or tensor): Action $a$.

Returns:
- `index` (int or tensor): Scalar or tensor index $i$ corresponding to the state-action pair $(s, a)$.
All tabular arrays for value functions or densities are flattened over the state-action space.
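To make the flattening concrete, here is a minimal sketch of how a tabular Q-function could be laid out over the state-action space. `SketchIndexer` is a hypothetical stand-in written for this example, not the library's class, and the formula $i = s \cdot |A| + a$ (actions as the fast-changing dimension) is an assumption; the page does not state which layout `get_index` uses.

```python
class SketchIndexer:
    """Hypothetical stand-in for Indexer; assumes actions vary fastest."""

    def __init__(self, n_obs, n_act):
        self.n_obs = n_obs
        self.n_act = n_act

    @property
    def dimension(self):
        # Size of the state-action space, |S| * |A|.
        return self.n_obs * self.n_act

    def get_index(self, obs, act):
        # Assumed layout: i = s * |A| + a.
        return obs * self.n_act + act


indexer = SketchIndexer(n_obs=3, n_act=2)

# A tabular Q-function stored as nested lists, Q[s][a].
q_table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]

# Flatten over the state-action space so that q_flat[i] == Q[s][a].
q_flat = [0.0] * indexer.dimension
for s in range(indexer.n_obs):
    for a in range(indexer.n_act):
        q_flat[indexer.get_index(s, a)] = q_table[s][a]

print(q_flat)  # [0.0, 0.1, 1.0, 1.1, 2.0, 2.1]
```

Under this assumed layout, all actions of one state sit next to each other in the flat array.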
def obs_act_to_index(
obs, act,
n_obs=None, n_act=None,
neighbours="act"):
Returns a flattened index $i$ for the state-action pair $(s, a)$.
Args:
- `obs` (int or tensor): Observation (state) $s$.
- `act` (int or tensor): Action $a$.
- `n_obs` (int, optional): Number of observations (states) $|S|$.
- `n_act` (int, optional): Number of actions $|A|$.
- `neighbours` (str, optional): Whether to place actions (`"act"`) or observations (`"obs"`) as the fast-changing dimension.
Depending on whether `neighbours` is `"act"` or `"obs"`, `n_act` or `n_obs` must be provided, respectively.
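The dependence on `neighbours` can be illustrated with a small sketch. The function name `obs_act_to_index_sketch` is hypothetical, and both formulas are assumptions inferred from the description of the fast-changing dimension; the library's actual implementation may differ, for instance in how it reports a missing argument.

```python
def obs_act_to_index_sketch(obs, act, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical re-implementation; the formulas below are assumptions."""
    if neighbours == "act":
        # Actions vary fastest, so only the number of actions is needed.
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        return obs * n_act + act
    if neighbours == "obs":
        # Observations vary fastest, so only the number of observations is needed.
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        return act * n_obs + obs
    raise ValueError('neighbours must be "act" or "obs"')


# Same pair (s, a) = (1, 1) in a space with |S| = 3 and |A| = 2.
i_act = obs_act_to_index_sketch(1, 1, n_act=2, neighbours="act")
i_obs = obs_act_to_index_sketch(1, 1, n_obs=3, neighbours="obs")
print(i_act, i_obs)  # 3 4
```

Because the arithmetic uses only `*` and `+`, the same sketch should also work elementwise on array-like inputs that overload these operators.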
def index_to_obs_act(
index,
n_obs=None, n_act=None,
neighbours="act"):
Inverse of `obs_act_to_index`. Recovers the state-action pair $(s, a)$ from the flattened index $i$.
Args:
- `index` (int or tensor): Index $i$ to be converted into the state-action pair $(s, a)$.
- `n_obs` (int, optional): Number of observations (states) $|S|$.
- `n_act` (int, optional): Number of actions $|A|$.
- `neighbours` (str, optional): Whether to place actions (`"act"`) or observations (`"obs"`) as the fast-changing dimension.
Depending on whether `neighbours` is `"act"` or `"obs"`, `n_act` or `n_obs` must be provided, respectively.
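A round-trip check shows what "inverse" means here. Both helper names below are hypothetical, the index formulas are assumptions (fast-changing dimension taken as the remainder, slow one as the quotient), and returning an `(obs, act)` tuple is likewise assumed rather than taken from the library.

```python
def to_index_sketch(obs, act, n_obs=None, n_act=None, neighbours="act"):
    # Assumed forward mapping, used only to test the inverse below.
    return obs * n_act + act if neighbours == "act" else act * n_obs + obs


def index_to_obs_act_sketch(index, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical inverse; returns an (obs, act) pair."""
    if neighbours == "act":
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        # Quotient is the slow dimension (obs), remainder the fast one (act).
        return index // n_act, index % n_act
    if neighbours == "obs":
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        # Quotient is the slow dimension (act), remainder the fast one (obs).
        return index % n_obs, index // n_obs
    raise ValueError('neighbours must be "act" or "obs"')


n_obs, n_act = 3, 2
for neighbours in ("act", "obs"):
    for s in range(n_obs):
        for a in range(n_act):
            i = to_index_sketch(s, a, n_obs=n_obs, n_act=n_act, neighbours=neighbours)
            # Converting back must recover the original pair.
            assert index_to_obs_act_sketch(
                i, n_obs=n_obs, n_act=n_act, neighbours=neighbours
            ) == (s, a)

print(index_to_obs_act_sketch(4, n_act=2, neighbours="act"))  # (2, 0)
print(index_to_obs_act_sketch(4, n_obs=3, neighbours="obs"))  # (1, 1)
```

The same index therefore maps to different pairs under the two layouts, so the `neighbours` setting used for the inverse has to match the one used to build the index.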