API 1.0.2.2. Indexer - Reinforcement-Learning-TU-Vienna/dice_rl_TU_Vienna GitHub Wiki

Indexer (Utility Class)

The Indexer class is a small helper that maps state-action pairs $(s, a) \in S \times A$ to flattened indices $i \in \{ 0, \dots, |S \times A| - 1 \}$ over the tabular state-action space.

🏗️ Constructor

def __init__(self, n_obs, n_act):

Args:

  • n_obs (int): Number of observations (states) $|S|$.
  • n_act (int): Number of actions $|A|$.

📦 Properties

@property
def dimension(self):

Returns the size of the state-action space $|S \times A|$.

📐 Methods

def get_index(self, obs, act):

Returns a flattened index $i$ of a state-action pair $(s, a)$ via

$$ i = s \times |A| + a. $$

Args:

  • obs (int or tensor): Observation (state) $s$.
  • act (int or tensor): Action $a$.

Returns:

  • index (int or tensor): Scalar or tensor index $i$ corresponding to the state-action pair $(s, a)$.

All tabular arrays for value functions or densities are flattened over the state-action space.
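
The following is a minimal sketch of how a class with this interface could behave, using plain Python integers. The name `IndexerSketch` is hypothetical and the body is an assumption built only from the formula $i = s \times |A| + a$ and the signatures documented here; it is not the library's implementation.

```python
class IndexerSketch:
    """Illustrative stand-in for Indexer (hypothetical, not the library code)."""

    def __init__(self, n_obs, n_act):
        self.n_obs = n_obs  # number of observations (states), |S|
        self.n_act = n_act  # number of actions, |A|

    @property
    def dimension(self):
        # Size of the state-action space, |S x A| = |S| * |A|.
        return self.n_obs * self.n_act

    def get_index(self, obs, act):
        # Flattened index with actions as the fast-changing dimension.
        return obs * self.n_act + act


indexer = IndexerSketch(n_obs=3, n_act=2)
index = indexer.get_index(obs=2, act=1)  # 2 * 2 + 1

# Enumerating all pairs visits every index in {0, ..., dimension - 1} once.
all_indices = [indexer.get_index(s, a) for s in range(3) for a in range(2)]
```

Because the formula uses only multiplication and addition, the same expression would also apply elementwise to array or tensor inputs of matching shape.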

🔧 External Supporting Functions

def obs_act_to_index(
    obs, act,
    n_obs=None, n_act=None,
    neighbours="act"):

Returns a flattened index $i$ for each state-action pair $(s, a)$.

Args:

  • obs (int or tensor): Observation (state) $s$.
  • act (int or tensor): Action $a$.
  • n_obs (int, optional): Number of observations (states) $|S|$.
  • n_act (int, optional): Number of actions $|A|$.
  • neighbours (str, optional): Which dimension varies fastest in the flattened index: actions ("act") or observations ("obs"). Defaults to "act".

Depending on whether neighbours is "act" or "obs", n_act or n_obs must be provided, respectively.
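
To make the two layouts concrete, here is a hedged sketch in plain Python. The function name `obs_act_to_index_sketch` is hypothetical. The "act" branch follows the formula $i = s \times |A| + a$ given for get_index; the "obs" branch, $i = a \times |S| + s$, and the `ValueError` raised for a missing size are assumptions made by symmetry, not confirmed behaviour of the library.

```python
def obs_act_to_index_sketch(obs, act, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical re-implementation for illustration only."""
    if neighbours == "act":
        # Actions vary fastest: i = s * |A| + a, so |A| is required.
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        return obs * n_act + act
    if neighbours == "obs":
        # Assumed mirror layout, observations vary fastest: i = a * |S| + s.
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        return act * n_obs + obs
    raise ValueError(f"unknown neighbours value: {neighbours!r}")


# Example with |S| = 3, |A| = 2 and the pair (s, a) = (2, 0).
index_act = obs_act_to_index_sketch(2, 0, n_act=2)                    # 2 * 2 + 0
index_obs = obs_act_to_index_sketch(2, 0, n_obs=3, neighbours="obs")  # 0 * 3 + 2
```

The same pair lands on different indices under the two layouts, so arrays built with one setting should be read back with the same setting.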

def index_to_obs_act(
    index,
    n_obs=None, n_act=None,
    neighbours="act"):

Inverse of obs_act_to_index. Recovers the state-action pair $(s, a)$ from a flattened index $i$.

Args:

  • index (int or tensor): Index $i$ to be converted into state-action pair $(s, a)$.
  • n_obs (int, optional): Number of observations (states) $|S|$.
  • n_act (int, optional): Number of actions $|A|$.
  • neighbours (str, optional): Which dimension varies fastest in the flattened index: actions ("act") or observations ("obs"). Defaults to "act".

Depending on whether neighbours is "act" or "obs", n_act or n_obs must be provided, respectively.
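
The inverse mapping can be sketched with integer division and remainder. The name `index_to_obs_act_sketch` is hypothetical, and the "obs" layout ($i = a \times |S| + s$) and the error handling are assumptions rather than documented behaviour. For the "act" layout, $i = s \times |A| + a$ with $0 \le a < |A|$ gives $s = \lfloor i / |A| \rfloor$ and $a = i \bmod |A|$.

```python
def index_to_obs_act_sketch(index, n_obs=None, n_act=None, neighbours="act"):
    """Hypothetical inverse for illustration only; returns (obs, act)."""
    if neighbours == "act":
        if n_act is None:
            raise ValueError('n_act is required when neighbours="act"')
        # i = s * |A| + a  ->  s = i // |A|, a = i % |A|
        obs, act = divmod(index, n_act)
        return obs, act
    if neighbours == "obs":
        if n_obs is None:
            raise ValueError('n_obs is required when neighbours="obs"')
        # Assumed layout i = a * |S| + s  ->  a = i // |S|, s = i % |S|
        act, obs = divmod(index, n_obs)
        return obs, act
    raise ValueError(f"unknown neighbours value: {neighbours!r}")


# With |S| = 3 and |A| = 2, both calls recover the pair (s, a) = (2, 0).
pair_act = index_to_obs_act_sketch(4, n_act=2)
pair_obs = index_to_obs_act_sketch(2, n_obs=3, neighbours="obs")

# Round trip over the whole space for the default "act" layout.
round_trip_ok = all(
    index_to_obs_act_sketch(s * 2 + a, n_act=2) == (s, a)
    for s in range(3)
    for a in range(2)
)
```

`divmod` works on Python integers; for array or tensor inputs, elementwise floor division and modulo would play the same role.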