AIToolbox
A library that offers tools for AI problem solving.
This class keeps track of registered events and rewards.
#include <AIToolbox/MDP/SparseExperience.hpp>
Public Member Functions
SparseExperience (size_t S, size_t A)
Basic constructor.
template<IsNaive3DTable V> void setVisitsTable (const V &v)
This function sets the internal visits table to the input.
void setVisitsTable (const SparseTable3D &v)
This function sets the internal visits table to the input.
template<IsNaive2DMatrix R> void setRewardMatrix (const R &r)
This function sets the internal reward matrix to the input.
void setRewardMatrix (const SparseMatrix2D &r)
This function sets the internal reward matrix to the input.
template<IsNaive2DMatrix MM> void setM2Matrix (const MM &mm)
This function sets the internal M2 matrix to the input.
void setM2Matrix (const SparseMatrix2D &mm)
This function sets the internal M2 matrix to the input.
void record (size_t s, size_t a, size_t s1, double rew)
This function adds a new event to the recordings.
void reset ()
This function resets all experienced rewards, transitions and M2s.
unsigned long getTimesteps () const
This function returns the number of times the record function has been called.
unsigned long getVisits (size_t s, size_t a, size_t s1) const
This function returns the current recorded visits for a transition.
unsigned long getVisitsSum (size_t s, size_t a) const
This function returns the current recorded visits for a state-action pair.
double getReward (size_t s, size_t a) const
This function returns the average reward for a state-action pair.
double getM2 (size_t s, size_t a) const
This function returns the M2 statistic for a state-action pair.
const SparseTable3D & getVisitsTable () const
This function returns the visits table for inspection.
const SparseTable2D & getVisitsTable (size_t a) const
This function returns the visits table of a single action for inspection.
const SparseTable2D & getVisitsSumTable () const
This function returns the visits sum table for inspection.
const SparseMatrix2D & getRewardMatrix () const
This function returns the rewards matrix for inspection.
const SparseMatrix2D & getM2Matrix () const
This function returns the M2 matrix for inspection.
size_t getS () const
This function returns the number of states of the world.
size_t getA () const
This function returns the number of available actions to the agent.
Friends
std::istream & operator>> (std::istream &is, SparseExperience &)
This class keeps track of registered events and rewards.
This class is a simple aggregator of events. It keeps track of both the number of times each state-action pair has been visited and the average reward gained in transitions from it (i.e. the maximum likelihood estimator of a QFunction from the data). It also computes the M2 statistic of those rewards (the average of the squared rewards minus the square of the average reward).
It does not record each event separately, so you cannot extract the result of a particular past transition.
The difference between this class and the MDP::Experience class is that this class stores recorded events in sparse matrices. This saves a large amount of space when the state space of the environment being logged is very large but only a small subset of the states is actually reachable, at the cost of some efficiency (possibly offset by cache savings).
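The space trade-off described above can be illustrated with a small sketch. `SparseCounts` is a hypothetical stand-in written for this example with the standard library only; it is not the container the library uses, and it only shows that a sparse table stores the entries that were actually written, while a dense table always needs S x A of them.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Hypothetical stand-in for a sparse S x A table of visit counts:
// only state-action pairs that were actually seen occupy memory.
class SparseCounts {
public:
    SparseCounts(std::size_t S, std::size_t A) : S_(S), A_(A) {}

    // Increment the count for (s, a), creating the entry on first use.
    void add(std::size_t s, std::size_t a) {
        assert(s < S_ && a < A_);
        ++counts_[std::make_pair(s, a)];
    }

    // Entries that were never written read as zero and are not stored.
    unsigned long get(std::size_t s, std::size_t a) const {
        const auto it = counts_.find(std::make_pair(s, a));
        return it == counts_.end() ? 0ul : it->second;
    }

    // Number of entries actually stored.
    std::size_t stored() const { return counts_.size(); }

    // Number of entries a dense S x A table would need.
    std::size_t denseSize() const { return S_ * A_; }

private:
    std::size_t S_, A_;
    std::map<std::pair<std::size_t, std::size_t>, unsigned long> counts_;
};
```

With a million states and ten actions, touching two state-action pairs leaves `stored()` at 2 while `denseSize()` is ten million. Each lookup goes through the map instead of a direct array index, which is the efficiency cost mentioned above.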
AIToolbox::MDP::SparseExperience::SparseExperience (size_t S, size_t A)
Basic constructor.
S | The number of states of the world. |
A | The number of actions available to the agent. |
size_t AIToolbox::MDP::SparseExperience::getA () const
This function returns the number of available actions to the agent.
double AIToolbox::MDP::SparseExperience::getM2 (size_t s, size_t a) const
This function returns the M2 statistic for a state-action pair.
s | Old state. |
a | Performed action. |
const SparseMatrix2D& AIToolbox::MDP::SparseExperience::getM2Matrix () const
This function returns the M2 matrix for inspection.
double AIToolbox::MDP::SparseExperience::getReward (size_t s, size_t a) const
This function returns the average reward for a state-action pair.
s | Old state. |
a | Performed action. |
const SparseMatrix2D& AIToolbox::MDP::SparseExperience::getRewardMatrix () const
This function returns the rewards matrix for inspection.
size_t AIToolbox::MDP::SparseExperience::getS () const
This function returns the number of states of the world.
unsigned long AIToolbox::MDP::SparseExperience::getTimesteps () const
This function returns the number of times the record function has been called.
unsigned long AIToolbox::MDP::SparseExperience::getVisits (size_t s, size_t a, size_t s1) const
This function returns the current recorded visits for a transition.
s | Old state. |
a | Performed action. |
s1 | New state. |
unsigned long AIToolbox::MDP::SparseExperience::getVisitsSum (size_t s, size_t a) const
This function returns the current recorded visits for a state-action pair.
s | Old state. |
a | Performed action. |
const SparseTable2D& AIToolbox::MDP::SparseExperience::getVisitsSumTable () const
This function returns the visits sum table for inspection.
This table contains per state-action pair visit counts.
const SparseTable3D& AIToolbox::MDP::SparseExperience::getVisitsTable () const
This function returns the visits table for inspection.
const SparseTable2D& AIToolbox::MDP::SparseExperience::getVisitsTable (size_t a) const
This function returns the visits table of a single action for inspection.
a | The action requested. |
void AIToolbox::MDP::SparseExperience::record (size_t s, size_t a, size_t s1, double rew)
This function adds a new event to the recordings.
The new state is not really used, but is left in the API for clarity.
s | Old state. |
a | Performed action. |
s1 | New state. |
rew | Obtained reward. |
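As an illustration of what a single recorded event can update, here is a minimal sketch. `ToyExperience` is a hypothetical stand-in that borrows the documented method names; it is not the library class. The running-average and M2 updates use Welford's online method, which is an assumption of this sketch: it keeps M2 as a running sum of squared deviations, so M2 divided by the visit count equals the average of the squared rewards minus the squared average. Whether the library stores the sum or the normalized value is not stated in this page.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <tuple>
#include <utility>

// Hypothetical stand-in, NOT the library class.
class ToyExperience {
public:
    ToyExperience(std::size_t S, std::size_t A) : S_(S), A_(A) {}

    void record(std::size_t s, std::size_t a, std::size_t s1, double rew) {
        assert(s < S_ && a < A_ && s1 < S_);
        ++timesteps_;
        ++visits_[std::make_tuple(s, a, s1)];
        Stats & st = stats_[std::make_pair(s, a)];
        ++st.visitsSum;
        // Assumed updates: running mean plus Welford's M2.
        const double delta = rew - st.reward;
        st.reward += delta / static_cast<double>(st.visitsSum);
        st.m2 += delta * (rew - st.reward);
    }

    unsigned long getTimesteps() const { return timesteps_; }

    unsigned long getVisits(std::size_t s, std::size_t a, std::size_t s1) const {
        const auto it = visits_.find(std::make_tuple(s, a, s1));
        return it == visits_.end() ? 0ul : it->second;
    }

    unsigned long getVisitsSum(std::size_t s, std::size_t a) const { return find(s, a).visitsSum; }
    double getReward(std::size_t s, std::size_t a) const { return find(s, a).reward; }
    double getM2(std::size_t s, std::size_t a) const { return find(s, a).m2; }
    std::size_t getS() const { return S_; }
    std::size_t getA() const { return A_; }

private:
    struct Stats {
        unsigned long visitsSum = 0;
        double reward = 0.0;
        double m2 = 0.0;
    };

    // Pairs that were never recorded read as all-zero statistics.
    Stats find(std::size_t s, std::size_t a) const {
        const auto it = stats_.find(std::make_pair(s, a));
        return it == stats_.end() ? Stats{} : it->second;
    }

    std::size_t S_, A_;
    unsigned long timesteps_ = 0;
    std::map<std::tuple<std::size_t, std::size_t, std::size_t>, unsigned long> visits_;
    std::map<std::pair<std::size_t, std::size_t>, Stats> stats_;
};
```

Recording rewards 1, 3 and 5 for the pair (0, 1) gives an average reward of 3 and, under the Welford assumption, an M2 of 8 (the squared deviations 4 + 0 + 4).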
void AIToolbox::MDP::SparseExperience::reset ()
This function resets all experienced rewards, transitions and M2s.
template<IsNaive2DMatrix MM> void AIToolbox::MDP::SparseExperience::setM2Matrix (const MM & mm)
This function sets the internal M2 matrix to the input.
This function takes an arbitrary two dimensional container and tries to copy its contents into the M2 matrix.
The container must support element access through operator[], and its dimensions must match the ones specified when the SparseExperience was constructed (S x A).
This is important, as this function DOES NOT perform any size checks on the external containers.
MM | The external M2 container type. |
mm | The external M2 container. |
void AIToolbox::MDP::SparseExperience::setM2Matrix (const SparseMatrix2D & mm)
This function sets the internal M2 matrix to the input.
The dimensions of the input must match the ones specified when the SparseExperience was constructed. BE CAREFUL: the matrix MUST be S x A.
This is important, as this function DOES NOT perform any size checks on the external containers.
mm | The external M2 container. |
template<IsNaive2DMatrix R> void AIToolbox::MDP::SparseExperience::setRewardMatrix (const R & r)
This function sets the internal reward matrix to the input.
This function takes an arbitrary two dimensional container and tries to copy its contents into the rewards matrix.
The container must support element access through operator[], and its dimensions must match the ones specified when the SparseExperience was constructed (S x A).
This is important, as this function DOES NOT perform any size checks on the external containers.
R | The external rewards container type. |
r | The external rewards container. |
void AIToolbox::MDP::SparseExperience::setRewardMatrix (const SparseMatrix2D & r)
This function sets the internal reward matrix to the input.
The dimensions of the input must match the ones specified when the SparseExperience was constructed. BE CAREFUL: the matrix MUST be S x A.
This is important, as this function DOES NOT perform any size checks on the external containers.
r | The external rewards container. |
void AIToolbox::MDP::SparseExperience::setVisitsTable (const SparseTable3D & v)
This function sets the internal visits table to the input.
This function copies the input SparseTable3D into the visits table. It automatically updates the visitsSum table as well.
The dimensions of the input must match the ones specified when the SparseExperience was constructed (for three dimensions: A, S, S). BE CAREFUL: each table MUST be S x S, and the std::vector containing them MUST be of size A.
This is important, as this function DOES NOT perform any size checks on the external containers.
v | The external visits container. |
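The relationship between the visits table and the visitsSum table can be sketched as follows. The type aliases below are hypothetical stand-ins built from standard containers (a std::vector of size A holding one S x S sparse table per action, as the layout above describes); they are not the library's `SparseTable3D` or `SparseTable2D`, and the function only shows one plausible way the per state-action sums could be derived.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the library's sparse table types.
using ToySparseTable2D = std::map<std::pair<std::size_t, std::size_t>, unsigned long>;
using ToySparseTable3D = std::vector<ToySparseTable2D>; // one S x S table per action

// Sum the visits over the new state s1, giving one count per (s, a).
// The input is indexed as visits[a][(s, s1)]; the output as sums[(s, a)].
ToySparseTable2D computeVisitsSum(const ToySparseTable3D & visits, std::size_t S) {
    ToySparseTable2D sums;
    for (std::size_t a = 0; a < visits.size(); ++a) {
        for (const auto & entry : visits[a]) {
            const std::size_t s  = entry.first.first;
            const std::size_t s1 = entry.first.second;
            assert(s < S && s1 < S); // each per-action table must be S x S
            sums[std::make_pair(s, a)] += entry.second;
        }
    }
    return sums;
}
```

Only the non-zero entries of each per-action table are visited, so the cost scales with the number of recorded transitions and not with A x S x S.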
template<IsNaive3DTable V> void AIToolbox::MDP::SparseExperience::setVisitsTable (const V & v)
This function sets the internal visits table to the input.
This function takes an arbitrary three dimensional container and tries to copy its contents into the visits table. It automatically updates the visitsSum table as well.
The container must support element access through operator[], and its dimensions must match the ones specified when the SparseExperience was constructed (for three dimensions: S, A, S).
This is important, as this function DOES NOT perform any size checks on the external containers.
V | The external visits container type. |
v | The external visits container. |
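Because the setter performs no size checks, a caller may want to validate the container first. The sketch below assumes a nested std::vector indexed as v[s][a][s1], which is one reading of the S, A, S dimension order; `Naive3D`, `hasShapeSAS` and `requireShapeSAS` are hypothetical helpers written for this example and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical naive container: v[s][a][s1] holds a visit count.
using Naive3D = std::vector<std::vector<std::vector<unsigned long>>>;

// Returns true only if v can be indexed as v[s][a][s1]
// for every s < S, a < A and s1 < S.
bool hasShapeSAS(const Naive3D & v, std::size_t S, std::size_t A) {
    if (v.size() != S) return false;
    for (const auto & perState : v) {
        if (perState.size() != A) return false;
        for (const auto & perAction : perState)
            if (perAction.size() != S) return false;
    }
    return true;
}

// Debug-build guard to call before handing v to a setter
// that does not check sizes itself.
void requireShapeSAS(const Naive3D & v, std::size_t S, std::size_t A) {
    assert(hasShapeSAS(v, S, A));
    (void)v; (void)S; (void)A; // silence unused warnings when NDEBUG is set
}
```

The check walks every inner vector, so a ragged container (one row shorter than the others) is rejected as well as one with the wrong outer size.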