AIToolbox
A library that offers tools for AI problem solving.
This class represents a factored multi-armed bandit.
#include <AIToolbox/Factored/Bandit/Model.hpp>
Public Member Functions
template<typename... TupleArgs>
Model (Action A, std::vector< PartialKeys > deps, std::vector< AIToolbox::Bandit::Model< Dist >> arms)
Basic constructor.
const Rewards & sampleR (const Action &a) const
This function samples the specified joint bandit arm.
const Action & getA () const
This function returns the joint action space.
const std::vector< PartialKeys > & getGroups () const
This function returns a reference to the agent groupings.
const std::vector< AIToolbox::Bandit::Model< Dist > > & getArms () const
This function returns a reference to the internal local arms.
This class represents a factored multi-armed bandit.
A factored multi-armed bandit is a specific bandit class, where the reward function is factored into independent components, each of which only depends on a subset of agents. The goal is generally to maximize the sum of the rewards of all local arms.
It effectively behaves as a collection of multi-armed bandits, except that each agent takes the same action in every bandit it participates in. Each "local" bandit's effective action is the combination of the actions of all its participating agents.
This structure can make learning how to act much more efficient, since exploiting the factorization extracts more information from each joint action the agents perform.
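The relationship between a joint action and the local bandits can be illustrated with a minimal sketch. It uses plain standard-library types rather than the library's own; the names `JointAction`, `Group`, `localAction` and `totalReward` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not AIToolbox types.
using JointAction = std::vector<std::size_t>; // one action per agent
using Group = std::vector<std::size_t>;       // ids of the agents in one local bandit

// Extracts, from the joint action, the actions of the agents in `group`.
// An agent that belongs to several groups contributes the same action to each.
std::vector<std::size_t> localAction(const JointAction & joint, const Group & group) {
    std::vector<std::size_t> local;
    local.reserve(group.size());
    for (auto agent : group) {
        assert(agent < joint.size());
        local.push_back(joint[agent]);
    }
    return local;
}

// Sums the rewards of all local bandits, the quantity the agents
// generally try to maximize.
double totalReward(const std::vector<double> & localRewards) {
    double sum = 0.0;
    for (auto r : localRewards) sum += r;
    return sum;
}
```

With three agents playing the joint action (1, 0, 2) and two groups {0, 1} and {1, 2}, agent 1 contributes action 0 to both local bandits, so a single joint action yields one reward sample for each of them.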
Dist | The distribution to use for all local arms. |
AIToolbox::Factored::Bandit::Model< Dist >::Model (Action A, std::vector< PartialKeys > deps, std::vector< AIToolbox::Bandit::Model< Dist >> arms)
Basic constructor.
This constructor creates the factored multi-armed bandit from a set of standard bandits, each associated with a group of agents.
Note that the action space of each bandit must be equal to the product of the action spaces of all agents in its group. For example, a bandit associated with agents with action spaces 2, 3, 2 should have 12 arms in total.
A | The joint action space. |
deps | The agents associated with each bandit. |
arms | The local bandits to use. |
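The size requirement on each local bandit can be checked before construction with a small helper along these lines. This is only a sketch: `requiredArms` is a hypothetical function, not part of AIToolbox, and it assumes the joint action space and the agent groups are available as plain vectors of sizes and agent ids.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of AIToolbox: returns the number of arms a
// local bandit needs, given the number of actions of every agent and the
// ids of the agents in the bandit's group.
std::size_t requiredArms(const std::vector<std::size_t> & actionSpace,
                         const std::vector<std::size_t> & group) {
    std::size_t arms = 1;
    for (auto agent : group) {
        assert(agent < actionSpace.size());
        arms *= actionSpace[agent]; // product of the group's action spaces
    }
    return arms;
}
```

For agents with action spaces 2, 3, 2 that all belong to one group, `requiredArms({2, 3, 2}, {0, 1, 2})` gives 12, matching the example above.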
const Action & AIToolbox::Factored::Bandit::Model< Dist >::getA () const
This function returns the joint action space.
const std::vector< AIToolbox::Bandit::Model< Dist > > & AIToolbox::Factored::Bandit::Model< Dist >::getArms () const
This function returns a reference to the internal local arms.
const std::vector< PartialKeys > & AIToolbox::Factored::Bandit::Model< Dist >::getGroups () const
This function returns a reference to the agent groupings.
const Rewards & AIToolbox::Factored::Bandit::Model< Dist >::sampleR (const Action & a) const
This function samples the specified joint bandit arm.
a | The joint arm to sample. |