AIToolbox
A library that offers tools for AI problem solving.
This class models a cooperative MDP.
#include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
Public Member Functions

CooperativeModel(DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount = 1.0)
    Basic constructor.
CooperativeModel(const CooperativeModel &)
    Copy constructor.
std::tuple<State, double> sampleSR(const State & s, const Action & a) const
    This function samples the MDP with the specified state-action pair.
double sampleSR(const State & s, const Action & a, State * s1) const
    This function samples the MDP with the specified state-action pair.
std::tuple<State, Rewards> sampleSRs(const State & s, const Action & a) const
    This function samples the MDP with the specified state-action pair.
void sampleSRs(const State & s, const Action & a, State * s1, Rewards * rews) const
    This function samples the MDP with the specified state-action pair.
void setDiscount(double d)
    This function sets a new discount factor for the Model.
const State & getS() const
    This function returns the state space of the world.
const Action & getA() const
    This function returns the action space of the MDP.
double getDiscount() const
    This function returns the currently set discount factor.
double getTransitionProbability(const State & s, const Action & a, const State & s1) const
    This function returns the stored transition probability for the specified transition.
double getExpectedReward(const State & s, const Action & a, const State & s1) const
    This function returns the stored expected reward for the specified transition.
const DDN & getTransitionFunction() const
    This function returns the transition function of the MDP.
const FactoredMatrix2D & getRewardFunction() const
    This function returns the reward function of the MDP.
const DDNGraph & getGraph() const
    This function returns the underlying DDNGraph of the CooperativeModel.
This class models a cooperative MDP.
This class can be used to model problems where multiple agents cooperate to achieve a common goal. In particular, we model problems where each agent only cares about a specific subset of the state space, which allows us to build a coordination graph that stores the dependencies.
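As a usage sketch, the loop below rolls a model forward for a few steps using sampleSR. The model, the initial state, and the joint action are assumed to already exist (model construction is covered in the constructor section below); the helper name and the rollout length are illustrative only.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <utility>

    using namespace AIToolbox::Factored;

    // A minimal sketch: roll out a short trajectory from an existing model,
    // always taking the same joint action 'a', and accumulate the rewards.
    double rollout(const MDP::CooperativeModel & model, State s, const Action & a) {
        double ret = 0.0;
        for (int t = 0; t < 10; ++t) {
            // Draw a successor state and the scalar joint reward for (s, a).
            auto [s1, r] = model.sampleSR(s, a);
            ret += r;            // undiscounted return of the rollout
            s = std::move(s1);
        }
        return ret;
    }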
AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel(DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount = 1.0)

Basic constructor.
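A hedged construction sketch: the graph, transition matrix, and reward function below are assumed to have been built elsewhere, since their construction is problem-specific and not covered on this page. The constructor takes its arguments by value, so moving them in avoids copies; 0.9 is just an example discount.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <utility>

    using namespace AIToolbox::Factored;

    // 'graph', 'transitions' and 'rewards' are assumed to be pre-built; see
    // DDNGraph, DDN::TransitionMatrix and FactoredMatrix2D for their details.
    MDP::CooperativeModel makeModel(DDNGraph graph,
                                    DDN::TransitionMatrix transitions,
                                    FactoredMatrix2D rewards) {
        // The constructor takes everything by value, so we move to avoid copies.
        return MDP::CooperativeModel(std::move(graph), std::move(transitions),
                                     std::move(rewards), 0.9);
    }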
AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel(const CooperativeModel &)
Copy constructor.
We must manually copy the DDN, as it contains a reference to the graph; if this class were default-copied, that reference would no longer point to the internal graph, which would break everything.
Note: we copy over the same random state as the copied model; this mostly mirrors the behaviour of all other models that lack an explicit copy constructor. In addition, it makes it somewhat easier to reproduce results while moving models around, without worrying about whether RVO or copies are taking place.
If you want a copy and want to change the random state, just use the other constructor.
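A small sketch of the behaviour described above; the helper name is illustrative:

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // The copy shares the copied random state: given the same inputs, the
    // returned model produces the same sample sequence as 'model' from this
    // point on. For an independently seeded model, rebuild one through the
    // basic constructor instead.
    MDP::CooperativeModel duplicate(const MDP::CooperativeModel & model) {
        return MDP::CooperativeModel(model);
    }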
const Action& AIToolbox::Factored::MDP::CooperativeModel::getA() const
This function returns the action space of the MDP.
double AIToolbox::Factored::MDP::CooperativeModel::getDiscount() const
This function returns the currently set discount factor.
double AIToolbox::Factored::MDP::CooperativeModel::getExpectedReward(const State & s, const Action & a, const State & s1) const
This function returns the stored expected reward for the specified transition.
Parameters:
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
const DDNGraph& AIToolbox::Factored::MDP::CooperativeModel::getGraph() const
This function returns the underlying DDNGraph of the CooperativeModel.
const FactoredMatrix2D& AIToolbox::Factored::MDP::CooperativeModel::getRewardFunction() const
This function returns the reward function of the MDP.
const State& AIToolbox::Factored::MDP::CooperativeModel::getS() const
This function returns the state space of the world.
const DDN& AIToolbox::Factored::MDP::CooperativeModel::getTransitionFunction() const
This function returns the transition function of the MDP.
double AIToolbox::Factored::MDP::CooperativeModel::getTransitionProbability(const State & s, const Action & a, const State & s1) const
This function returns the stored transition probability for the specified transition.
Parameters:
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
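For instance, the two getters above can be combined to weigh the reward of one particular successor state; s, a, and s1 are assumed to be valid factored values for the model, and the helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Contribution of the single successor s1 to the expected immediate
    // reward of (s, a): P(s1 | s, a) * R(s, a, s1).
    double transitionContribution(const MDP::CooperativeModel & model,
                                  const State & s, const Action & a,
                                  const State & s1) {
        return model.getTransitionProbability(s, a, s1) *
               model.getExpectedReward(s, a, s1);
    }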
std::tuple<State, double> AIToolbox::Factored::MDP::CooperativeModel::sampleSR(const State & s, const Action & a) const
This function samples the MDP with the specified state-action pair.
This function samples the model to produce simulated experience. The transition and reward functions are used to generate, from the state-action pair passed as arguments, a possible new state together with its reward. The new state is picked from all states the MDP allows transitioning to, each with probability equal to that transition's probability in the model. Once the new state is picked, the reward is the corresponding reward contained in the reward function.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
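A sketch of simulated experience in practice: averaging sampled rewards gives a Monte Carlo estimate of the expected immediate reward of (s, a). The helper name and the default sample count are illustrative assumptions.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <tuple>

    using namespace AIToolbox::Factored;

    // Monte Carlo estimate of E[R | s, a] via repeated sampling.
    double estimateReward(const MDP::CooperativeModel & model,
                          const State & s, const Action & a,
                          unsigned samples = 1000) {
        double total = 0.0;
        for (unsigned i = 0; i < samples; ++i)
            total += std::get<1>(model.sampleSR(s, a));
        return total / samples;
    }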
double AIToolbox::Factored::MDP::CooperativeModel::sampleSR(const State & s, const Action & a, State * s1) const
This function samples the MDP with the specified state-action pair.
This function is equivalent to sampleSR(const State &, const Action &). The only difference is that it writes the new State into a pre-allocated State, avoiding an allocation at every sample.
NO CHECKS for nullptr are done.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
    s1  The new state.
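A sketch of this pre-allocated variant in a hot sampling loop. Presizing s1 to the number of state factors is an assumption about how State buffers are sized; the helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Allocation-free sampling: s1 is sized once (one entry per state factor,
    // assuming State is a per-factor vector) and reused for every draw.
    double sampleTotal(const MDP::CooperativeModel & model,
                       const State & s, const Action & a, unsigned n) {
        State s1(model.getS().size());
        double total = 0.0;
        for (unsigned i = 0; i < n; ++i)
            total += model.sampleSR(s, a, &s1);
        return total;
    }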
std::tuple<State, Rewards> AIToolbox::Factored::MDP::CooperativeModel::sampleSRs(const State & s, const Action & a) const
This function samples the MDP with the specified state-action pair.
This function samples the model to produce simulated experience. The transition and reward functions are used to generate, from the state-action pair passed as arguments, a possible new state together with its rewards. The new state is picked from all states the MDP allows transitioning to, each with probability equal to that transition's probability in the model.
Once the new state is picked, the reward is the vector of corresponding rewards contained in the reward function. This means the vector will have a length equal to the number of bases of the reward function.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
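A sketch showing the factored rewards: each entry corresponds to one basis of the reward function, and (assuming Rewards is an Eigen vector type, as elsewhere in AIToolbox) summing the entries recovers the scalar reward that sampleSR would return.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Inspect the per-basis reward decomposition of a single sample.
    double sampleDecomposed(const MDP::CooperativeModel & model,
                            const State & s, const Action & a) {
        auto [s1, rews] = model.sampleSRs(s, a);
        // rews has one entry per reward basis; the sum is the joint reward
        // (assumes Rewards supports Eigen's sum()).
        return rews.sum();
    }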
void AIToolbox::Factored::MDP::CooperativeModel::sampleSRs(const State & s, const Action & a, State * s1, Rewards * rews) const
This function samples the MDP with the specified state-action pair.
This function is equivalent to sampleSRs(const State &, const Action &). The only difference is that it writes the new State and Rewards into pre-allocated buffers, avoiding an allocation at every sample.
NO CHECKS for nullptr are done.
Parameters:
    s     The state that needs to be sampled.
    a     The action that needs to be sampled.
    s1    The new state.
    rews  The new rewards.
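A sketch of the fully pre-allocated variant. Sizing rews from getRewardFunction().bases assumes FactoredMatrix2D exposes its bases as a public container; verify against the header. The helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Reuse both output buffers across many samples.
    void sampleInPlace(const MDP::CooperativeModel & model,
                       const State & s, const Action & a, unsigned n) {
        State s1(model.getS().size());
        // One reward entry per basis of the reward function (assumed layout).
        Rewards rews(model.getRewardFunction().bases.size());
        for (unsigned i = 0; i < n; ++i)
            model.sampleSRs(s, a, &s1, &rews);
    }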
void AIToolbox::Factored::MDP::CooperativeModel::setDiscount(double d)
This function sets a new discount factor for the Model.
Parameters:
    d   The new discount factor for the Model.