This class keeps track of registered events and rewards.

#include <AIToolbox/Factored/MDP/CooperativeExperience.hpp>

Public Types
using	RewardMatrix = std::vector< Vector >

using	VisitsTable = std::vector< Table2D >

using	Indeces = std::vector< size_t >

Public Member Functions
	CooperativeExperience (const DDNGraph &graph)
	Basic constructor.

const Indeces &	record (const State &s, const Action &a, const State &s1, const Rewards &rew)
	This function adds a new event to the recordings.

void	reset ()
	This function resets all experienced rewards and transitions.

unsigned long	getTimesteps () const
	This function returns the number of times the record function has been called.

const VisitsTable &	getVisitsTable () const
	This function returns the visits table for inspection.

const RewardMatrix &	getRewardMatrix () const
	This function returns the rewards matrix for inspection.

const RewardMatrix &	getM2Matrix () const
	This function returns the rewards squared matrix for inspection.

const State &	getS () const
	This function returns the state space of the world (the number of values of each state factor).

const Action &	getA () const
	This function returns the action space of the agents (the number of actions available to each agent).

const DDNGraph &	getGraph () const
	This function returns the underlying DDNGraph of the CooperativeExperience.

Detailed Description

This class keeps track of registered events and rewards.

This class is a simple logger of events. It keeps track of both the number of times a particular transition has happened and the average reward gained in that transition (i.e., the maximum likelihood estimator of a QFunction from the data). It also computes the M2 statistic for the rewards (average of squares minus squared average).
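As an illustration of how an average reward and an M2 statistic can be maintained without storing individual events, here is a minimal sketch using Welford's incremental update. It is an assumption about one possible approach, not the library's implementation; `RunningStats` and its members are hypothetical names that do not exist in AIToolbox.

```cpp
#include <cassert>

// Hypothetical helper, not part of AIToolbox: running statistics for one table cell.
struct RunningStats {
    unsigned long visits = 0; // number of rewards recorded so far
    double mean = 0.0;        // running average reward
    double m2 = 0.0;          // running sum of squared deviations from the mean

    // Welford's update: folds one reward into the statistics in constant time.
    void record(double reward) {
        ++visits;
        const double delta = reward - mean;
        mean += delta / static_cast<double>(visits);
        m2 += delta * (reward - mean);
    }

    // Population variance, which equals the average of squares minus the squared average.
    double variance() const {
        assert(visits > 0 && "variance is undefined without samples");
        return m2 / static_cast<double>(visits);
    }
};
```

In this sketch, dividing `m2` by the visit count recovers the "average of squares minus squared average" quantity described above, while the update itself avoids subtracting two large, nearly equal numbers.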

However, it does not record each event separately (i.e. you can't extract the results of a particular transition in the past).

The events are recorded with respect to a given structure, which should match the one of the generative model.

Note that since this class contains data in a DDN format, it's probably only usable by directly inspecting the stored VisitsTable and RewardMatrix. Thus we don't yet provide general getters for state/action pairs.
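To give an idea of what inspecting the stored tables directly can look like, the sketch below turns one row of visit counts into maximum likelihood transition estimates. The table layout used here (rows are parent configurations, columns are values of the next-state factor) is an assumption made for illustration, and `Counts` and `estimateRow` are hypothetical names rather than part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one factor's visit table (assumed layout):
// rows are parent configurations, columns are values of the next-state factor.
using Counts = std::vector<std::vector<unsigned long>>;

// Maximum likelihood estimate of P(next value | parent configuration) for one row.
// An unvisited row yields all zeros instead of dividing by zero.
std::vector<double> estimateRow(const Counts & visits, std::size_t row) {
    assert(row < visits.size());

    unsigned long total = 0;
    for (unsigned long count : visits[row]) total += count;

    std::vector<double> probs(visits[row].size(), 0.0);
    if (total == 0) return probs;

    for (std::size_t v = 0; v < probs.size(); ++v)
        probs[v] = static_cast<double>(visits[row][v]) / static_cast<double>(total);
    return probs;
}
```

With the real class, the counts would come from `getVisitsTable()`; how its `Table2D` entries are laid out should be checked against the library before adapting this.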

Member Typedef Documentation

◆ Indeces

using AIToolbox::Factored::MDP::CooperativeExperience::Indeces = std::vector<size_t>

◆ RewardMatrix

using AIToolbox::Factored::MDP::CooperativeExperience::RewardMatrix = std::vector<Vector>

◆ VisitsTable

using AIToolbox::Factored::MDP::CooperativeExperience::VisitsTable = std::vector<Table2D>

Constructor & Destructor Documentation

◆ CooperativeExperience()

AIToolbox::Factored::MDP::CooperativeExperience::CooperativeExperience ( const DDNGraph & graph )

Basic constructor.

Note that the structure input does not need to pre-allocate the value matrices, nor to fill their values, since we do that internally. Here we only need the structure of the problem.

Parameters

graph The coordination graph of the cooperative problem.

Member Function Documentation

◆ getA()

const Action& AIToolbox::Factored::MDP::CooperativeExperience::getA ( ) const

This function returns the action space of the agents.

Returns: The action space, i.e. the number of actions available to each agent.

◆ getGraph()

const DDNGraph& AIToolbox::Factored::MDP::CooperativeExperience::getGraph ( ) const

This function returns the underlying DDNGraph of the CooperativeExperience.

Returns: The underlying DDNGraph.

◆ getM2Matrix()

const RewardMatrix& AIToolbox::Factored::MDP::CooperativeExperience::getM2Matrix ( ) const

This function returns the rewards squared matrix for inspection.

Returns: The rewards squared matrix.

◆ getRewardMatrix()

const RewardMatrix& AIToolbox::Factored::MDP::CooperativeExperience::getRewardMatrix ( ) const

This function returns the rewards matrix for inspection.

Returns: The rewards matrix.

◆ getS()

const State& AIToolbox::Factored::MDP::CooperativeExperience::getS ( ) const

This function returns the state space of the world.

Returns: The state space, i.e. the number of values of each state factor.

◆ getTimesteps()

unsigned long AIToolbox::Factored::MDP::CooperativeExperience::getTimesteps ( ) const

This function returns the number of times the record function has been called.

Returns: The number of recorded timesteps.

◆ getVisitsTable()

const VisitsTable& AIToolbox::Factored::MDP::CooperativeExperience::getVisitsTable ( ) const

This function returns the visits table for inspection.

Returns: The visits table.

◆ record()

const Indeces& AIToolbox::Factored::MDP::CooperativeExperience::record	(	const State &	s,
		const Action &	a,
		const State &	s1,
		const Rewards &	rew
	)

This function adds a new event to the recordings.

Note that here we expect a vector of rewards of the same size as the state space, i.e. with one entry per state factor.

This function additionally returns a reference to the indices updated for each element of the underlying DDN. This is useful, for example, when updating the CoordinatedRLModel, since these indices then do not need to be recomputed.
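The following self-contained sketch illustrates the idea of recording a factored transition and returning the updated per-factor indices. It is a deliberately simplified toy and not the library's code: `ToyExperience` is a hypothetical type, it ignores action parents entirely, it stores reward sums rather than averages, and it assumes each table row is the mixed-radix encoding of the parent state factors' values.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Factors = std::vector<std::size_t>;

// Hypothetical, simplified stand-in: state parents only, no action parents.
struct ToyExperience {
    Factors space;                                  // number of values of each state factor
    std::vector<std::vector<std::size_t>> parents;  // parents[i]: state factors that factor i depends on
    std::vector<std::vector<std::vector<unsigned long>>> visits; // visits[i][row][next value]
    std::vector<std::vector<double>> rewardSums;    // rewardSums[i][row]
    std::vector<std::size_t> indices;               // rows touched by the last record()
    unsigned long timesteps = 0;

    ToyExperience(Factors s, std::vector<std::vector<std::size_t>> p)
        : space(std::move(s)), parents(std::move(p)), indices(space.size(), 0)
    {
        assert(parents.size() == space.size());
        for (std::size_t i = 0; i < space.size(); ++i) {
            // One row per joint configuration of factor i's parents.
            std::size_t rows = 1;
            for (std::size_t parent : parents[i]) rows *= space[parent];
            visits.emplace_back(rows, std::vector<unsigned long>(space[i], 0));
            rewardSums.emplace_back(rows, 0.0);
        }
    }

    // Records one transition; rew holds one reward per state factor.
    const std::vector<std::size_t> & record(const Factors & s, const Factors & s1,
                                            const std::vector<double> & rew) {
        assert(s.size() == space.size() && s1.size() == space.size());
        assert(rew.size() == space.size());
        for (std::size_t i = 0; i < space.size(); ++i) {
            // Mixed-radix encoding of the parents' values selects the row.
            std::size_t row = 0;
            for (std::size_t parent : parents[i]) row = row * space[parent] + s[parent];
            ++visits[i][row][s1[i]];
            rewardSums[i][row] += rew[i];
            indices[i] = row;
        }
        ++timesteps;
        return indices;
    }
};
```

A caller can keep the returned reference and use `indices[i]` to update only the touched row of each factor in a dependent structure, which is the kind of reuse the paragraph above describes.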