AIToolbox
A library that offers tools for AI problem solving.
This class models a cooperative MDP.
#include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
Public Member Functions

CooperativeModel(DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount = 1.0)
    Basic constructor.
CooperativeModel(const CooperativeModel &)
    Copy constructor.
std::tuple<State, double> sampleSR(const State & s, const Action & a) const
    This function samples the MDP with the specified state-action pair.
double sampleSR(const State & s, const Action & a, State * s1) const
    This function samples the MDP with the specified state-action pair.
std::tuple<State, Rewards> sampleSRs(const State & s, const Action & a) const
    This function samples the MDP with the specified state-action pair.
void sampleSRs(const State & s, const Action & a, State * s1, Rewards * rews) const
    This function samples the MDP with the specified state-action pair.
void setDiscount(double d)
    This function sets a new discount factor for the Model.
const State & getS() const
    This function returns the state space of the world.
const Action & getA() const
    This function returns the action space of the MDP.
double getDiscount() const
    This function returns the currently set discount factor.
double getTransitionProbability(const State & s, const Action & a, const State & s1) const
    This function returns the stored transition probability for the specified transition.
double getExpectedReward(const State & s, const Action & a, const State & s1) const
    This function returns the stored expected reward for the specified transition.
const DDN & getTransitionFunction() const
    This function returns the transition function of the MDP.
const FactoredMatrix2D & getRewardFunction() const
    This function returns the reward function of the MDP.
const DDNGraph & getGraph() const
    This function returns the underlying DDNGraph of the CooperativeModel.
This class models a cooperative MDP.
This class can be used to model problems where multiple agents cooperate to achieve a common goal. In particular, we model problems where each agent only cares about a specific subset of the state space, which allows us to build a coordination graph that stores the dependencies.
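As a usage sketch, the loop below rolls a model forward for a few steps using sampleSR. The model, the initial state, and the joint action are assumed to already exist (model construction is covered in the constructor section below); the helper name and the rollout length are illustrative only.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <utility>

    using namespace AIToolbox::Factored;

    // A minimal sketch: roll out a short trajectory from an existing model,
    // always taking the same joint action 'a', and accumulate the rewards.
    double rollout(const MDP::CooperativeModel & model, State s, const Action & a) {
        double ret = 0.0;
        for (int t = 0; t < 10; ++t) {
            // Draw a successor state and the scalar joint reward for (s, a).
            auto [s1, r] = model.sampleSR(s, a);
            ret += r;            // undiscounted return of the rollout
            s = std::move(s1);
        }
        return ret;
    }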
AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel(DDNGraph graph, DDN::TransitionMatrix transitions, FactoredMatrix2D rewards, double discount = 1.0)

Basic constructor.
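A hedged construction sketch: the graph, transition matrix, and reward function below are assumed to have been built elsewhere, since their construction is problem-specific and not covered on this page. The constructor takes its arguments by value, so moving them in avoids copies; 0.9 is just an example discount.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <utility>

    using namespace AIToolbox::Factored;

    // 'graph', 'transitions' and 'rewards' are assumed to be pre-built; see
    // DDNGraph, DDN::TransitionMatrix and FactoredMatrix2D for their details.
    MDP::CooperativeModel makeModel(DDNGraph graph,
                                    DDN::TransitionMatrix transitions,
                                    FactoredMatrix2D rewards) {
        // The constructor takes everything by value, so we move to avoid copies.
        return MDP::CooperativeModel(std::move(graph), std::move(transitions),
                                     std::move(rewards), 0.9);
    }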
AIToolbox::Factored::MDP::CooperativeModel::CooperativeModel(const CooperativeModel &)
Copy constructor.
We must manually copy the DDN, as it contains a reference to the graph; if this class were default-copied, that reference would no longer point to the internal graph, which would break everything.
Note: we copy over the same random state as the copied model; this mostly mirrors the behaviour of all other models that lack an explicit copy constructor. In addition, it makes it somewhat easier to reproduce results while moving models around, without worrying about whether RVO or copies are taking place.
If you want a copy and want to change the random state, just use the other constructor.
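A small sketch of the behaviour described above; the helper name is illustrative:

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // The copy shares the copied random state: given the same inputs, the
    // returned model produces the same sample sequence as 'model' from this
    // point on. For an independently seeded model, rebuild one through the
    // basic constructor instead.
    MDP::CooperativeModel duplicate(const MDP::CooperativeModel & model) {
        return MDP::CooperativeModel(model);
    }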
const Action& AIToolbox::Factored::MDP::CooperativeModel::getA() const
This function returns the action space of the MDP.
double AIToolbox::Factored::MDP::CooperativeModel::getDiscount() const
This function returns the currently set discount factor.
double AIToolbox::Factored::MDP::CooperativeModel::getExpectedReward(const State & s, const Action & a, const State & s1) const
This function returns the stored expected reward for the specified transition.
Parameters:
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
const DDNGraph& AIToolbox::Factored::MDP::CooperativeModel::getGraph() const
This function returns the underlying DDNGraph of the CooperativeModel.
const FactoredMatrix2D& AIToolbox::Factored::MDP::CooperativeModel::getRewardFunction() const
This function returns the reward function of the MDP.
const State& AIToolbox::Factored::MDP::CooperativeModel::getS() const
This function returns the state space of the world.
const DDN& AIToolbox::Factored::MDP::CooperativeModel::getTransitionFunction() const
This function returns the transition function of the MDP.
double AIToolbox::Factored::MDP::CooperativeModel::getTransitionProbability(const State & s, const Action & a, const State & s1) const
This function returns the stored transition probability for the specified transition.
Parameters:
    s   The initial state of the transition.
    a   The action performed in the transition.
    s1  The final state of the transition.
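For instance, the two getters above can be combined to weigh the reward of one particular successor state; s, a, and s1 are assumed to be valid factored values for the model, and the helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Contribution of the single successor s1 to the expected immediate
    // reward of (s, a): P(s1 | s, a) * R(s, a, s1).
    double transitionContribution(const MDP::CooperativeModel & model,
                                  const State & s, const Action & a,
                                  const State & s1) {
        return model.getTransitionProbability(s, a, s1) *
               model.getExpectedReward(s, a, s1);
    }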
std::tuple<State, double> AIToolbox::Factored::MDP::CooperativeModel::sampleSR(const State & s, const Action & a) const
This function samples the MDP with the specified state-action pair.
This function samples the model to produce simulated experience. The transition and reward functions are used to generate, from the state-action pair passed as arguments, a possible new state together with its reward. The new state is picked from all states the MDP allows transitioning to, each with probability equal to that transition's probability in the model. Once the new state is picked, the reward is the corresponding reward contained in the reward function.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
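A sketch of simulated experience in practice: averaging sampled rewards gives a Monte Carlo estimate of the expected immediate reward of (s, a). The helper name and the default sample count are illustrative assumptions.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>
    #include <tuple>

    using namespace AIToolbox::Factored;

    // Monte Carlo estimate of E[R | s, a] via repeated sampling.
    double estimateReward(const MDP::CooperativeModel & model,
                          const State & s, const Action & a,
                          unsigned samples = 1000) {
        double total = 0.0;
        for (unsigned i = 0; i < samples; ++i)
            total += std::get<1>(model.sampleSR(s, a));
        return total / samples;
    }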
double AIToolbox::Factored::MDP::CooperativeModel::sampleSR(const State & s, const Action & a, State * s1) const
This function samples the MDP with the specified state-action pair.
This function is equivalent to sampleSR(const State &, const Action &). The only difference is that it writes the new State into a pre-allocated State, avoiding an allocation at every sample.
NO CHECKS for nullptr are done.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
    s1  The new state.
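A sketch of this pre-allocated variant in a hot sampling loop. Presizing s1 to the number of state factors is an assumption about how State buffers are sized; the helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Allocation-free sampling: s1 is sized once (one entry per state factor,
    // assuming State is a per-factor vector) and reused for every draw.
    double sampleTotal(const MDP::CooperativeModel & model,
                       const State & s, const Action & a, unsigned n) {
        State s1(model.getS().size());
        double total = 0.0;
        for (unsigned i = 0; i < n; ++i)
            total += model.sampleSR(s, a, &s1);
        return total;
    }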
std::tuple<State, Rewards> AIToolbox::Factored::MDP::CooperativeModel::sampleSRs(const State & s, const Action & a) const
This function samples the MDP with the specified state-action pair.
This function samples the model to produce simulated experience. The transition and reward functions are used to generate, from the state-action pair passed as arguments, a possible new state together with its rewards. The new state is picked from all states the MDP allows transitioning to, each with probability equal to that transition's probability in the model.
Once the new state is picked, the reward is the vector of corresponding rewards contained in the reward function. This means the vector will have a length equal to the number of bases of the reward function.
Parameters:
    s   The state that needs to be sampled.
    a   The action that needs to be sampled.
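A sketch showing the factored rewards: each entry corresponds to one basis of the reward function, and (assuming Rewards is an Eigen vector type, as elsewhere in AIToolbox) summing the entries recovers the scalar reward that sampleSR would return.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Inspect the per-basis reward decomposition of a single sample.
    double sampleDecomposed(const MDP::CooperativeModel & model,
                            const State & s, const Action & a) {
        auto [s1, rews] = model.sampleSRs(s, a);
        // rews has one entry per reward basis; the sum is the joint reward
        // (assumes Rewards supports Eigen's sum()).
        return rews.sum();
    }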
void AIToolbox::Factored::MDP::CooperativeModel::sampleSRs(const State & s, const Action & a, State * s1, Rewards * rews) const
This function samples the MDP with the specified state-action pair.
This function is equivalent to sampleSRs(const State &, const Action &). The only difference is that it writes the new State and Rewards into pre-allocated buffers, avoiding an allocation at every sample.
NO CHECKS for nullptr are done.
Parameters:
    s     The state that needs to be sampled.
    a     The action that needs to be sampled.
    s1    The new state.
    rews  The new rewards.
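A sketch of the fully pre-allocated variant. Sizing rews from getRewardFunction().bases assumes FactoredMatrix2D exposes its bases as a public container; verify against the header. The helper name is illustrative.

    #include <AIToolbox/Factored/MDP/CooperativeModel.hpp>

    using namespace AIToolbox::Factored;

    // Reuse both output buffers across many samples.
    void sampleInPlace(const MDP::CooperativeModel & model,
                       const State & s, const Action & a, unsigned n) {
        State s1(model.getS().size());
        // One reward entry per basis of the reward function (assumed layout).
        Rewards rews(model.getRewardFunction().bases.size());
        for (unsigned i = 0; i < n; ++i)
            model.sampleSRs(s, a, &s1, &rews);
    }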
void AIToolbox::Factored::MDP::CooperativeModel::setDiscount(double d)
This function sets a new discount factor for the Model.
Parameters:
    d   The new discount factor for the Model.