AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Factored::MDP::CooperativeExperience Class Reference

This class keeps track of registered events and rewards.

#include <AIToolbox/Factored/MDP/CooperativeExperience.hpp>

Public Types

using RewardMatrix = std::vector< Vector >
 
using VisitsTable = std::vector< Table2D >
 
using Indeces = std::vector< size_t >
 

Public Member Functions

 CooperativeExperience (const DDNGraph &graph)
 Basic constructor.
 
const Indeces & record (const State &s, const Action &a, const State &s1, const Rewards &rew)
 This function adds a new event to the recordings.
 
void reset ()
 This function resets all experienced rewards and transitions.
 
unsigned long getTimesteps () const
 This function returns the number of times the record function has been called.
 
const VisitsTable & getVisitsTable () const
 This function returns the visits table for inspection.
 
const RewardMatrix & getRewardMatrix () const
 This function returns the rewards matrix for inspection.
 
const RewardMatrix & getM2Matrix () const
 This function returns the rewards squared matrix for inspection.
 
const State & getS () const
 This function returns the number of states of the world.
 
const Action & getA () const
 This function returns the number of available actions to the agent.
 
const DDNGraph & getGraph () const
 This function returns the underlying DDNGraph of the CooperativeExperience.
 

Detailed Description

This class keeps track of registered events and rewards.

This class is a simple logger of events. It keeps track of both the number of times each transition has happened and the average reward gained in that transition (i.e. the maximum likelihood estimator of a QFunction from the data). It also computes the M2 statistic for the rewards (the average of the squared rewards minus the square of the average reward).

However, it does not record each event separately (i.e. you can't extract the results of a particular transition in the past).

The events are recorded with respect to a given structure, which should match the one of the generative model.

Note that since this class contains data in a DDN format, it's probably only usable by directly inspecting the stored VisitsTable and RewardMatrix. Thus we don't yet provide general getters for state/action pairs.

Member Typedef Documentation

◆ Indeces

◆ RewardMatrix

◆ VisitsTable

Constructor & Destructor Documentation

◆ CooperativeExperience()

AIToolbox::Factored::MDP::CooperativeExperience::CooperativeExperience ( const DDNGraph & graph )

Basic constructor.

Note that the structure input does not need to pre-allocate the value matrices, nor to fill their values, since we do that internally. Here we only need the structure of the problem.

Parameters
graph    The coordination graph of the cooperative problem.

Member Function Documentation

◆ getA()

const Action& AIToolbox::Factored::MDP::CooperativeExperience::getA ( ) const

This function returns the number of available actions to the agent.

Returns
The total number of actions.

◆ getGraph()

const DDNGraph& AIToolbox::Factored::MDP::CooperativeExperience::getGraph ( ) const

This function returns the underlying DDNGraph of the CooperativeExperience.

Returns
The underlying DDNGraph.

◆ getM2Matrix()

const RewardMatrix& AIToolbox::Factored::MDP::CooperativeExperience::getM2Matrix ( ) const

This function returns the rewards squared matrix for inspection.

Returns
The rewards squared matrix.

◆ getRewardMatrix()

const RewardMatrix& AIToolbox::Factored::MDP::CooperativeExperience::getRewardMatrix ( ) const

This function returns the rewards matrix for inspection.

Returns
The rewards matrix.

◆ getS()

const State& AIToolbox::Factored::MDP::CooperativeExperience::getS ( ) const

This function returns the number of states of the world.

Returns
The total number of states.

◆ getTimesteps()

unsigned long AIToolbox::Factored::MDP::CooperativeExperience::getTimesteps ( ) const

This function returns the number of times the record function has been called.

Returns
The number of recorded timesteps.

◆ getVisitsTable()

const VisitsTable& AIToolbox::Factored::MDP::CooperativeExperience::getVisitsTable ( ) const

This function returns the visits table for inspection.

Returns
The visits table.

◆ record()

const Indeces& AIToolbox::Factored::MDP::CooperativeExperience::record ( const State & s,
const Action & a,
const State & s1,
const Rewards & rew
)

This function adds a new event to the recordings.

Note that here we expect a vector of rewards, of the same size as the state space.

This function additionally returns a reference to the indices updated for each element of the underlying DDN. This is useful, for example, when updating the CoordinatedRLModel, since these indices then do not need to be recomputed.

Parameters
s      Old state.
a      Performed action.
s1     New state.
rew    Obtained rewards.
Returns
The indices of s and a updated in the DDN.
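
To picture what a per-element index might look like, here is a minimal sketch. It assumes that each element of the DDN depends on a subset of the state and action factors, and that the values of that subset are packed into a single flat index in mixed-radix fashion. The helper name flatIndex, its parameters, and the factor ordering are illustrative assumptions; they are not taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of AIToolbox: maps the values of the
// selected factors to one flat index, with the first selected factor
// varying fastest.
//   space  - number of possible values of each factor
//   ids    - which factors this element depends on
//   values - the full joint assignment (one value per factor)
std::size_t flatIndex(const std::vector<std::size_t> & space,
                      const std::vector<std::size_t> & ids,
                      const std::vector<std::size_t> & values) {
    std::size_t index = 0;
    std::size_t stride = 1;
    for (const auto id : ids) {
        index += values[id] * stride; // contribution of this factor
        stride *= space[id];          // weight of the next selected factor
    }
    return index;
}
```

Under these assumptions, computing one such index per DDN element during record and returning them lets a caller address the same table entries again without repeating the computation.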

◆ reset()

void AIToolbox::Factored::MDP::CooperativeExperience::reset ( )

This function resets all experienced rewards and transitions.

