AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::Experience Class Reference

This class keeps track of registered events and rewards.

#include <AIToolbox/MDP/Experience.hpp>

Public Member Functions

 Experience (size_t S, size_t A)
 Basic constructor.
 
template<IsNaive3DTable V>
void setVisitsTable (const V &v)
 This function sets the internal visits table to the input.
 
void setVisitsTable (const Table3D &v)
 This function sets the internal visits table to the input.
 
template<IsNaive2DMatrix R>
void setRewardMatrix (const R &r)
 This function sets the internal reward matrix to the input.
 
void setRewardMatrix (const Matrix2D &r)
 This function sets the internal reward matrix to the input.
 
template<IsNaive2DMatrix MM>
void setM2Matrix (const MM &mm)
 This function sets the internal m2 matrix to the input.
 
void setM2Matrix (const Matrix2D &mm)
 This function sets the internal m2 matrix to the input.
 
void record (size_t s, size_t a, size_t s1, double rew)
 This function adds a new event to the recordings.
 
void reset ()
 This function resets all experienced rewards, transitions and M2s.
 
unsigned long getTimesteps () const
 This function returns the number of times the record function has been called.
 
unsigned long getVisits (size_t s, size_t a, size_t s1) const
 This function returns the current recorded visits for a transition.
 
unsigned long getVisitsSum (size_t s, size_t a) const
 This function returns the current recorded visits for a state-action pair.
 
double getReward (size_t s, size_t a) const
 This function returns the average reward for a state-action pair.
 
double getM2 (size_t s, size_t a) const
 This function returns the M2 statistic for a state-action pair.
 
const Table3D & getVisitsTable () const
 This function returns the visits table for inspection.
 
const Table2D & getVisitsTable (size_t a) const
 This function returns the visits table for inspection.
 
const Table2D & getVisitsSumTable () const
 This function returns the visits sum table for inspection.
 
const QFunction & getRewardMatrix () const
 This function returns the rewards matrix for inspection.
 
const Matrix2D & getM2Matrix () const
 This function returns the rewards squared matrix for inspection.
 
size_t getS () const
 This function returns the number of states of the world.
 
size_t getA () const
 This function returns the number of available actions to the agent.
 

Friends

std::istream & operator>> (std::istream &is, Experience &)
 

Detailed Description

This class keeps track of registered events and rewards.

This class is a simple aggregator of events. It keeps track of both the number of times each transition has been visited and the average reward gained per state-action pair (i.e. the maximum likelihood estimator of a QFunction from the data). It also computes the M2 statistic for the rewards (the average of the squared rewards minus the square of the average reward).

It does not record each event separately (i.e. you can't extract the results of a particular transition in the past).
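The aggregation described above can be sketched with a small stand-in class. This is a minimal illustration, not the AIToolbox implementation: the name `MiniExperience`, the flat `std::vector` storage and the use of Welford's update are all assumptions. In particular, the sketch keeps M2 as a running sum of squared deviations from the mean; dividing that sum by the visit count gives the average of squares minus the squared average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration; not the AIToolbox implementation.
class MiniExperience {
public:
    MiniExperience(std::size_t S, std::size_t A)
        : S_(S), A_(A), timesteps_(0),
          visits_(S * A * S, 0), visitsSum_(S * A, 0),
          rewards_(S * A, 0.0), m2_(S * A, 0.0) {}

    // Fold one (s, a, s1, rew) event into the aggregates; the event is not stored.
    void record(std::size_t s, std::size_t a, std::size_t s1, double rew) {
        assert(s < S_ && a < A_ && s1 < S_);
        const std::size_t sa = s * A_ + a;
        ++timesteps_;
        ++visits_[sa * S_ + s1];
        ++visitsSum_[sa];
        // Welford's update (assumed here): running mean and sum of squared deviations.
        const double delta = rew - rewards_[sa];
        rewards_[sa] += delta / static_cast<double>(visitsSum_[sa]);
        m2_[sa] += delta * (rew - rewards_[sa]);
    }

    unsigned long getTimesteps() const { return timesteps_; }
    unsigned long getVisits(std::size_t s, std::size_t a, std::size_t s1) const {
        return visits_[(s * A_ + a) * S_ + s1];
    }
    unsigned long getVisitsSum(std::size_t s, std::size_t a) const {
        return visitsSum_[s * A_ + a];
    }
    double getReward(std::size_t s, std::size_t a) const { return rewards_[s * A_ + a]; }
    double getM2(std::size_t s, std::size_t a) const { return m2_[s * A_ + a]; }

private:
    std::size_t S_, A_;
    unsigned long timesteps_;
    std::vector<unsigned long> visits_, visitsSum_;
    std::vector<double> rewards_, m2_;
};
```

Recording rewards 1.0 and 3.0 for the same state-action pair leaves the running average at 2.0 and the sum of squared deviations at 2.0. Dividing the latter by the two visits gives 1.0, which equals the average of the squares (5.0) minus the squared average (4.0).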

Constructor & Destructor Documentation

◆ Experience()

AIToolbox::MDP::Experience::Experience ( size_t  S,
size_t  A 
)

Basic constructor.

Parameters
S  The number of states of the world.
A  The number of actions available to the agent.

Member Function Documentation

◆ getA()

size_t AIToolbox::MDP::Experience::getA ( ) const

This function returns the number of available actions to the agent.

Returns
The total number of actions.

◆ getM2()

double AIToolbox::MDP::Experience::getM2 ( size_t  s,
size_t  a 
) const

This function returns the M2 statistic for a state-action pair.

Parameters
s  Old state.
a  Performed action.

◆ getM2Matrix()

const Matrix2D& AIToolbox::MDP::Experience::getM2Matrix ( ) const

This function returns the rewards squared matrix for inspection.

Returns
The rewards squared matrix.

◆ getReward()

double AIToolbox::MDP::Experience::getReward ( size_t  s,
size_t  a 
) const

This function returns the average reward for a state-action pair.

Parameters
s  Old state.
a  Performed action.

◆ getRewardMatrix()

const QFunction& AIToolbox::MDP::Experience::getRewardMatrix ( ) const

This function returns the rewards matrix for inspection.

The reward matrix contains the current average reward computed for each state-action pair.

Returns
The rewards matrix.

◆ getS()

size_t AIToolbox::MDP::Experience::getS ( ) const

This function returns the number of states of the world.

Returns
The total number of states.

◆ getTimesteps()

unsigned long AIToolbox::MDP::Experience::getTimesteps ( ) const

This function returns the number of times the record function has been called.

Returns
The number of recorded timesteps.

◆ getVisits()

unsigned long AIToolbox::MDP::Experience::getVisits ( size_t  s,
size_t  a,
size_t  s1 
) const

This function returns the current recorded visits for a transition.

Parameters
s  Old state.
a  Performed action.
s1  New state.

◆ getVisitsSum()

unsigned long AIToolbox::MDP::Experience::getVisitsSum ( size_t  s,
size_t  a 
) const

This function returns the current recorded visits for a state-action pair.

Parameters
s  Old state.
a  Performed action.
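
The per-transition counts and the per-pair sum relate in a simple way: for a fixed state-action pair, the visit sum is the total of the transition counts over all new states. As a hedged illustration of how the two numbers can be combined, the sketch below turns the counts of one pair into relative frequencies. The function name `transitionFrequencies` and the plain `std::vector` input are hypothetical and not part of the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// counts[s1] holds the visit count of (s, a, s1) for one fixed (s, a).
// Returns the relative frequency of each new state, or all zeros
// when the pair was never visited.
std::vector<double> transitionFrequencies(const std::vector<unsigned long> & counts) {
    assert(!counts.empty());
    unsigned long sum = 0;  // plays the role of the per-pair visit sum
    for (const unsigned long n : counts) sum += n;

    std::vector<double> freq(counts.size(), 0.0);
    if (sum == 0) return freq;
    for (std::size_t s1 = 0; s1 < counts.size(); ++s1)
        freq[s1] = static_cast<double>(counts[s1]) / static_cast<double>(sum);
    return freq;
}
```

With counts of 1 and 3 for two possible new states, the resulting frequencies are 0.25 and 0.75.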

◆ getVisitsSumTable()

const Table2D& AIToolbox::MDP::Experience::getVisitsSumTable ( ) const

This function returns the visits sum table for inspection.

This table contains per state-action pair visit counts.

Returns
The visits sum table.

◆ getVisitsTable() [1/2]

const Table3D& AIToolbox::MDP::Experience::getVisitsTable ( ) const

This function returns the visits table for inspection.

Returns
The visits table.

◆ getVisitsTable() [2/2]

const Table2D& AIToolbox::MDP::Experience::getVisitsTable ( size_t  a) const

This function returns the visits table of a single action for inspection.

Parameters
a  The action requested.
Returns
The visits table.

◆ record()

void AIToolbox::MDP::Experience::record ( size_t  s,
size_t  a,
size_t  s1,
double  rew 
)

This function adds a new event to the recordings.

Parameters
s  Old state.
a  Performed action.
s1  New state.
rew  Obtained reward.

◆ reset()

void AIToolbox::MDP::Experience::reset ( )

This function resets all experienced rewards, transitions and M2s.

◆ setM2Matrix() [1/2]

void AIToolbox::MDP::Experience::setM2Matrix ( const Matrix2D &  mm)

This function sets the internal m2 matrix to the input.

The dimensions of the input must match the ones specified during the Experience construction (for two dimensions: S, A). BE CAREFUL. The matrix MUST be SxA.

This is important, as this function DOES NOT perform any size checks on the external containers.

Parameters
mm  The external M2 container.

◆ setM2Matrix() [2/2]

template<IsNaive2DMatrix MM>
void AIToolbox::MDP::Experience::setM2Matrix ( const MM &  mm)

This function sets the internal m2 matrix to the input.

This function takes an arbitrary two dimensional container and tries to copy its contents into the M2 matrix.

The container needs to support data access through operator[]. In addition, the dimensions of the container must match the ones specified during the Experience construction (for two dimensions: S,A).

This is important, as this function DOES NOT perform any size checks on the external containers.

Template Parameters
MM  The external M2 container type.
Parameters
mm  The external M2 container.
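
Because the setters perform no size checks, a caller may want to validate a container before handing it over. The sketch below shows one way to do that for a two dimensional container indexed as `m[s][a]`. The helpers `hasShape2D` and `copyChecked` are hypothetical names, not part of AIToolbox, and the sketch assumes the container and its rows expose `size()` in addition to `operator[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: true when m has S rows of A entries each.
// Assumes the container and its rows expose size().
template <typename M>
bool hasShape2D(const M & m, std::size_t S, std::size_t A) {
    if (m.size() != S) return false;
    for (const auto & row : m)
        if (row.size() != A) return false;
    return true;
}

// Copies m[s][a] into a flat row-major S x A buffer. The shape is
// asserted first, so debug builds catch a mismatch before any read.
template <typename M>
std::vector<double> copyChecked(const M & m, std::size_t S, std::size_t A) {
    assert(hasShape2D(m, S, A));
    std::vector<double> flat(S * A);
    for (std::size_t s = 0; s < S; ++s)
        for (std::size_t a = 0; a < A; ++a)
            flat[s * A + a] = m[s][a];
    return flat;
}
```

The same check applies to a rewards container, since both matrices share the S, A layout.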

◆ setRewardMatrix() [1/2]

void AIToolbox::MDP::Experience::setRewardMatrix ( const Matrix2D &  r)

This function sets the internal reward matrix to the input.

The dimensions of the input must match the ones specified during the Experience construction (for two dimensions: S, A). BE CAREFUL. The matrix MUST be SxA.

This is important, as this function DOES NOT perform any size checks on the external containers.

Parameters
r  The external rewards container.

◆ setRewardMatrix() [2/2]

template<IsNaive2DMatrix R>
void AIToolbox::MDP::Experience::setRewardMatrix ( const R &  r)

This function sets the internal reward matrix to the input.

This function takes an arbitrary two dimensional container and tries to copy its contents into the rewards matrix.

The container needs to support data access through operator[]. In addition, the dimensions of the container must match the ones specified during the Experience construction (for two dimensions: S,A).

This is important, as this function DOES NOT perform any size checks on the external containers.

Template Parameters
R  The external rewards container type.
Parameters
r  The external rewards container.

◆ setVisitsTable() [1/2]

void AIToolbox::MDP::Experience::setVisitsTable ( const Table3D &  v)

This function sets the internal visits table to the input.

This function copies the input Table3D into the visits table. It automatically updates the visitsSum table as well.

The dimensions of the input must match the ones specified during the Experience construction (for three dimensions: A, S, S). BE CAREFUL. The tables MUST be SxS, while the std::vector containing them MUST be of size A.

This is important, as this function DOES NOT perform any size checks on the external containers.

Parameters
v  The external visits container.

◆ setVisitsTable() [2/2]

template<IsNaive3DTable V>
void AIToolbox::MDP::Experience::setVisitsTable ( const V &  v)

This function sets the internal visits table to the input.

This function takes an arbitrary three dimensional container and tries to copy its contents into the visits table. It automatically updates the visitsSum table as well.

The container needs to support data access through operator[]. In addition, the dimensions of the container must match the ones specified during the Experience construction (for three dimensions: S,A,S).

This is important, as this function DOES NOT perform any size checks on the external containers.

Template Parameters
V  The external visits container type.
Parameters
v  The external visits container.
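
The copy described above, including the automatic update of the visit sums, can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the input is read as `v[s][a][s1]`, the output is stored as one SxS table per action, and the names `Naive3D`, `VisitTables` and `copyVisits` are hypothetical. Nested `std::vector`s stand in for the library's table types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Naive3D = std::vector<std::vector<std::vector<unsigned long>>>;

// Hypothetical result type: per-action S x S tables plus S x A sums.
struct VisitTables {
    std::vector<std::vector<std::vector<unsigned long>>> visits;  // [a][s][s1]
    std::vector<std::vector<unsigned long>> visitsSum;            // [s][a]
};

// Reads v[s][a][s1] through operator[] only and derives the sums
// while copying. The extents of v itself are never checked.
template <typename V>
VisitTables copyVisits(const V & v, std::size_t S, std::size_t A) {
    assert(S > 0 && A > 0);
    VisitTables out;
    out.visits.assign(A, std::vector<std::vector<unsigned long>>(
                             S, std::vector<unsigned long>(S, 0)));
    out.visitsSum.assign(S, std::vector<unsigned long>(A, 0));
    for (std::size_t s = 0; s < S; ++s)
        for (std::size_t a = 0; a < A; ++a)
            for (std::size_t s1 = 0; s1 < S; ++s1) {
                const unsigned long n = v[s][a][s1];
                out.visits[a][s][s1] = n;
                out.visitsSum[s][a] += n;
            }
    return out;
}
```

Since the function only indexes into the input, passing a container smaller than S, A, S reads out of bounds, which is why the dimension requirement matters.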

Friends And Related Function Documentation

◆ operator>>

std::istream& operator>> ( std::istream &  is,
Experience &  
)
friend
