API 1.0.2.2. Indexer - Reinforcement-Learning-TU-Vienna/dice_rl_TU_Vienna GitHub Wiki
Indexer (Utility Class)
The Indexer class provides helper functionality to convert state-action pairs $(s, a) \in S \times A$ into flattened indices $i$ over the tabular space $\{ 0, \dots, |S \times A| - 1 \}$.
🏗️ Constructor
def __init__(self, n_obs, n_act):
Args:
n_obs (int): Number of observations (states) $|S|$.
n_act (int): Number of actions $|A|$.
📦 Properties
@property
def dimension(self):
Returns the size of the state-action space, $|S \times A| = |S| \cdot |A|$.
📐 Methods
def get_index(self, obs, act):
Returns a flattened index $i$ of a state-action pair $(s, a)$ via
$$ i = s \times |A| + a. $$
Args:
obs (int or tensor): Observation (state) $s$.
act (int or tensor): Action $a$.
Returns:
index (int or tensor): Scalar or tensor index $i$ corresponding to the state-action pair $(s, a)$.
All tabular arrays for value functions or densities are flattened over the state-action space.
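The index formula above can be illustrated with a minimal sketch. The class name IndexerSketch is hypothetical and this is not the library's source; it only mirrors the documented constructor, the dimension property, and get_index, and it assumes NumPy arrays in place of tensors.

```python
import numpy as np


class IndexerSketch:
    """Hypothetical stand-in mirroring the documented Indexer interface."""

    def __init__(self, n_obs, n_act):
        self.n_obs = n_obs  # |S|
        self.n_act = n_act  # |A|

    @property
    def dimension(self):
        # Size of the flattened state-action space, |S| * |A|.
        return self.n_obs * self.n_act

    def get_index(self, obs, act):
        # Actions vary fastest: i = s * |A| + a.
        # Works for Python ints and, element-wise, for NumPy arrays.
        return obs * self.n_act + act


indexer = IndexerSketch(n_obs=4, n_act=3)

# Scalar pair (s, a) = (2, 1): 2 * 3 + 1
single = indexer.get_index(2, 1)

# Batch of pairs (0, 2) and (3, 2), indexed element-wise.
batch = indexer.get_index(np.array([0, 3]), np.array([2, 2]))
```

With 4 observations and 3 actions, the largest pair (3, 2) maps to index 11, which is dimension - 1, so a flattened table of length dimension covers every pair exactly once.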
🔧 External Supporting Functions
def obs_act_to_index(
obs, act,
n_obs=None, n_act=None,
neighbours="act"):
Returns a flattened index $i$ for each state-action pair $(s, a)$.
Args:
obs (int or tensor): Observation (state) $s$.
act (int or tensor): Action $a$.
n_obs (int, optional): Number of observations (states) $|S|$.
n_act (int, optional): Number of actions $|A|$.
neighbours (str, optional): Whether to place actions ("act") or observations ("obs") as the fast-changing dimension.
If neighbours is "act", n_act must be provided; if neighbours is "obs", n_obs must be provided.
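A minimal sketch of how the two orderings could work is shown here. The function name obs_act_to_index_sketch is hypothetical and this is not the library's code. The "act" branch follows the formula documented for Indexer.get_index; the "obs" branch, $i = a \times |S| + s$, is an assumption inferred from the description of observations as the fast-changing dimension.

```python
def obs_act_to_index_sketch(obs, act, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical illustration of the two flattening orders."""
    if neighbours == "act":
        # Actions vary fastest, so only |A| is needed: i = s * |A| + a.
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        return obs * n_act + act
    if neighbours == "obs":
        # Assumed layout: observations vary fastest, so only |S| is needed:
        # i = a * |S| + s.
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        return act * n_obs + obs
    raise ValueError('neighbours must be "act" or "obs"')


# The same pair (s, a) = (2, 1) with |S| = 4 and |A| = 3 under both layouts.
i_act = obs_act_to_index_sketch(2, 1, n_act=3)                      # 2 * 3 + 1
i_obs = obs_act_to_index_sketch(2, 1, n_obs=4, neighbours="obs")    # 1 * 4 + 2
```

Only the size of the fast-changing dimension enters each formula, which is why just one of n_obs and n_act is required per setting.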
def index_to_obs_act(
index,
n_obs=None, n_act=None,
neighbours="act"):
Inverse of obs_act_to_index. Recovers the state-action pair $(s, a)$ from a flattened index $i$.
Args:
index (int or tensor): Index $i$ to be converted into state-action pair $(s, a)$.
n_obs (int, optional): Number of observations (states) $|S|$.
n_act (int, optional): Number of actions $|A|$.
neighbours (str, optional): Whether to place actions ("act") or observations ("obs") as the fast-changing dimension.
If neighbours is "act", n_act must be provided; if neighbours is "obs", n_obs must be provided.
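The inversion can be sketched with integer division and remainder. The function name index_to_obs_act_sketch is hypothetical, the return order (obs, act) is an assumption based on the pair notation $(s, a)$, and the "obs" layout $i = a \times |S| + s$ is likewise assumed rather than taken from the library's source. NumPy arrays stand in for tensors.

```python
import numpy as np


def index_to_obs_act_sketch(index, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical inverse of the flattening; returns (obs, act)."""
    if neighbours == "act":
        # i = s * |A| + a  =>  s = i // |A|,  a = i % |A|
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        return index // n_act, index % n_act
    if neighbours == "obs":
        # Assumed layout i = a * |S| + s  =>  a = i // |S|,  s = i % |S|
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        return index % n_obs, index // n_obs
    raise ValueError('neighbours must be "act" or "obs"')


# Scalar indices under both layouts, with |S| = 4 and |A| = 3.
pair_act = index_to_obs_act_sketch(7, n_act=3)
pair_obs = index_to_obs_act_sketch(6, n_obs=4, neighbours="obs")

# Element-wise inversion of a batch of indices.
obs_batch, act_batch = index_to_obs_act_sketch(np.array([2, 11]), n_act=3)
```

Because the remainder is always smaller than the divisor, each index decodes to exactly one pair, so flattening a pair and then decoding the result returns the original pair under either layout.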