AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Factored::MDP::CooperativeModel Class Reference

This class models a cooperative MDP.

#include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

Public Member Functions

 CooperativeModel (DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount=1.0)
 	Basic constructor.

 CooperativeModel (const CooperativeModel &)
 	Copy constructor.

std::tuple< State, double > sampleSR (const State &s, const Action &a) const
 	This function samples the MDP with the specified state action pair.

double sampleSR (const State &s, const Action &a, State *s1) const
 	This function samples the MDP with the specified state action pair.

std::tuple< State, Rewards > sampleSRs (const State &s, const Action &a) const
 	This function samples the MDP with the specified state action pair.

void sampleSRs (const State &s, const Action &a, State *s1, Rewards *rews) const
 	This function samples the MDP with the specified state action pair.

void setDiscount (double d)
 	This function sets a new discount factor for the Model.

const State & getS () const
 	This function returns the state space of the world.

const Action & getA () const
 	This function returns the action space of the MDP.

double getDiscount () const
 	This function returns the currently set discount factor.

double getTransitionProbability (const State &s, const Action &a, const State &s1) const
 	This function returns the stored transition probability for the specified transition.

double getExpectedReward (const State &s, const Action &a, const State &s1) const
 	This function returns the stored expected reward for the specified transition.

const DDN & getTransitionFunction () const
 	This function returns the transition function of the MDP.

const FactoredMatrix2D & getRewardFunction () const
 	This function returns the reward function of the MDP.

const DDNGraph & getGraph () const
 	This function returns the underlying DDNGraph of the CooperativeModel.


Detailed Description

This class models a cooperative MDP.

This class can be used to model problems where multiple agents cooperate to achieve a common goal. In particular, we model problems where each agent only cares about a specific subset of the state space, which lets us build a coordination graph to store the dependencies.
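For orientation, here is a minimal usage sketch (not taken from the library's documentation). It assumes a fully constructed model, and that State and Action are the factored vector types used throughout AIToolbox::Factored, with one entry per state/action factor.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    // Hypothetical sketch: simulate a discounted trajectory by repeatedly
    // sampling the model with a fixed joint action.
    double simulateReturn(const AIToolbox::Factored::MDP::CooperativeModel & model,
                          AIToolbox::Factored::State s,
                          const AIToolbox::Factored::Action & a,
                          unsigned steps = 100) {
        double ret = 0.0, discount = 1.0;
        for (unsigned t = 0; t < steps; ++t) {
            auto [s1, r] = model.sampleSR(s, a);  // sample next state and scalar reward
            ret += discount * r;
            discount *= model.getDiscount();
            s = std::move(s1);                    // advance the simulation
        }
        return ret;
    }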

Constructor & Destructor Documentation

◆ CooperativeModel() [1/2]

AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel ( DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount = 1.0 )

Basic constructor.

Parameters
    graph        The DDNGraph of the underlying MDP.
    transitions  The transition function.
    rewards      The reward function.
    discount     The discount factor for the MDP.
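
A hedged construction sketch follows; the building of the graph, transitions, and rewards is covered elsewhere and is elided here, so `graph`, `transitions`, and `rewards` are assumed to have been built consistently with each other beforehand.

    // Hypothetical sketch: hand the pre-built components to the model.
    using namespace AIToolbox::Factored;
    MDP::CooperativeModel model(
        std::move(graph),        // DDNGraph of the underlying MDP
        std::move(transitions),  // DDN::TransitionMatrix matching the graph
        std::move(rewards),      // FactoredMatrix2D reward function
        0.9                      // discount factor
    );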

◆ CooperativeModel() [2/2]

AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel ( const CooperativeModel & )

Copy constructor.

We must manually copy the DDN as it contains a reference to the graph; if this class gets default copied the reference will not point to the internal graph anymore, which will break everything.

Note: the copy uses the same random state as the original; this mostly mirrors the behaviour of the other models, which have no explicit copy constructor. In addition, it makes it somewhat easier to reproduce results while moving models around, without worrying about whether RVO or copies are taking place.

If you want a copy and want to change the random state, just use the other constructor.
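
Given the note above, a copy should replay the same random sequence as the original from the moment of copying. A small sketch under that assumption, given some valid State s and Action a:

    // Hypothetical sketch: the copy shares the original's random state, so
    // both models should produce identical samples from this point on.
    MDP::CooperativeModel copy(model);
    auto [sA, rA] = model.sampleSR(s, a);
    auto [sB, rB] = copy.sampleSR(s, a);
    // Expected: sA == sB && rA == rB.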

Member Function Documentation

◆ getA()

const Action& AIToolbox::Factored::MDP::CooperativeModel::getA ( ) const

This function returns the action space of the MDP.

Returns
The action space.

◆ getDiscount()

double AIToolbox::Factored::MDP::CooperativeModel::getDiscount ( ) const

This function returns the currently set discount factor.

Returns
The currently set discount factor.

◆ getExpectedReward()

double AIToolbox::Factored::MDP::CooperativeModel::getExpectedReward ( const State & s, const Action & a, const State & s1 ) const

This function returns the stored expected reward for the specified transition.

Parameters
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
Returns
The expected reward of the specified transition.

◆ getGraph()

const DDNGraph& AIToolbox::Factored::MDP::CooperativeModel::getGraph ( ) const

This function returns the underlying DDNGraph of the CooperativeModel.

Returns
The underlying DDNGraph.

◆ getRewardFunction()

const FactoredMatrix2D& AIToolbox::Factored::MDP::CooperativeModel::getRewardFunction ( ) const

This function returns the reward function of the MDP.

Returns
The reward function of the MDP.

◆ getS()

const State& AIToolbox::Factored::MDP::CooperativeModel::getS ( ) const

This function returns the state space of the world.

Returns
The state space.

◆ getTransitionFunction()

const DDN& AIToolbox::Factored::MDP::CooperativeModel::getTransitionFunction ( ) const

This function returns the transition function of the MDP.

Returns
The transition function of the MDP.

◆ getTransitionProbability()

double AIToolbox::Factored::MDP::CooperativeModel::getTransitionProbability ( const State & s, const Action & a, const State & s1 ) const

This function returns the stored transition probability for the specified transition.

Parameters
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
Returns
The probability of the specified transition.
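
A short sketch combining this getter with getExpectedReward() documented above; s, a, and s1 are assumed to be valid factored values:

    // Hypothetical sketch: query the model about a specific transition.
    const double p = model.getTransitionProbability(s, a, s1);  // P(s1 | s, a)
    const double r = model.getExpectedReward(s, a, s1);         // E[R | s, a, s1]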

◆ sampleSR() [1/2]

std::tuple<State, double> AIToolbox::Factored::MDP::CooperativeModel::sampleSR ( const State & s, const Action & a ) const

This function samples the MDP with the specified state action pair.

This function samples the model for simulated experience. The transition and reward functions are used to produce, from the state action pair passed as arguments, a possible new state with its respective reward. The new state is picked among all the states the MDP allows transitioning to, each with probability equal to that transition's probability in the model. Once the new state is picked, the reward is the corresponding reward contained in the reward function.

Parameters
    s  The state that needs to be sampled.
    a  The action that needs to be sampled.
Returns
A tuple containing a new state and a reward.
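
For example (a hedged sketch, assuming s and a are valid factored values):

    // Structured bindings unpack the returned tuple.
    auto [s1, reward] = model.sampleSR(s, a);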

◆ sampleSR() [2/2]

double AIToolbox::Factored::MDP::CooperativeModel::sampleSR ( const State & s, const Action & a, State * s1 ) const

This function samples the MDP with the specified state action pair.

This function is equivalent to sampleSR(const State &, const Action &).

The only difference is that it writes the new State into a pre-allocated State, avoiding the need for an allocation at every sample.

NO CHECKS for nullptr are done.

Parameters
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
    s1  The new state.
Returns
The reward for the sampled transition.
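
A sketch of the pre-allocated pattern this overload enables (the loop body and the sizing via getS() are assumptions):

    // Hypothetical sketch: reuse one pre-allocated State across many samples.
    AIToolbox::Factored::State s1(model.getS().size());
    for (unsigned t = 0; t < 100000; ++t) {
        const double r = model.sampleSR(s, a, &s1);  // writes the new state into s1
        std::swap(s, s1);                            // advance without reallocating
        (void)r;                                     // use the reward as needed
    }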

◆ sampleSRs() [1/2]

std::tuple<State, Rewards> AIToolbox::Factored::MDP::CooperativeModel::sampleSRs ( const State & s, const Action & a ) const

This function samples the MDP with the specified state action pair.

This function samples the model for simulated experience. The transition and reward functions are used to produce, from the state action pair passed as arguments, a possible new state with its respective reward. The new state is picked among all the states the MDP allows transitioning to, each with probability equal to that transition's probability in the model.

After a new state is picked, the reward is the vector of corresponding rewards contained in the reward function. This means that the vector will have a length equal to the number of bases of the reward function.

Parameters
    s  The state that needs to be sampled.
    a  The action that needs to be sampled.
Returns
A tuple containing a new state and a reward.
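
A hedged sketch; it assumes Rewards is an Eigen-style vector, so that summing its per-basis entries yields the scalar reward that sampleSR() would have returned:

    // Hypothetical sketch: sample the factored rewards and reduce them.
    auto [s1, rews] = model.sampleSRs(s, a);
    const double scalarReward = rews.sum();  // assumes an Eigen-style vector type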

◆ sampleSRs() [2/2]

void AIToolbox::Factored::MDP::CooperativeModel::sampleSRs ( const State & s, const Action & a, State * s1, Rewards * rews ) const

This function samples the MDP with the specified state action pair.

This function is equivalent to sampleSRs(const State &, const Action &).

The only difference is that it writes the new State and Rewards into pre-allocated containers, avoiding the need for allocations at every sample.

NO CHECKS for nullptr are done.

Parameters
    s     The state that needs to be sampled.
    a     The action that needs to be sampled.
    s1    The new state.
    rews  The new rewards.
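
A sketch of pre-allocating both outputs; using getRewardFunction().bases to size the Rewards vector is an assumption based on the note in sampleSRs() above that the vector has one entry per reward basis:

    // Hypothetical sketch: allocate outputs once, then sample repeatedly.
    AIToolbox::Factored::State s1(model.getS().size());
    AIToolbox::Factored::Rewards rews(model.getRewardFunction().bases.size());
    for (unsigned t = 0; t < 100000; ++t) {
        model.sampleSRs(s, a, &s1, &rews);  // fills both pre-allocated outputs
        std::swap(s, s1);                   // advance the simulation
    }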

◆ setDiscount()

void AIToolbox::Factored::MDP::CooperativeModel::setDiscount ( double d )

This function sets a new discount factor for the Model.

Parameters
    d  The new discount factor for the Model.

The documentation for this class was generated from the following file:

AIToolbox/Factored/MDP/CooperativeModel.hpp