AIToolbox
A library that offers tools for AI problem solving.
This class extends a Bandit policy so that it can be called from MDP code.
#include <AIToolbox/Factored/MDP/Policies/BanditPolicyAdaptor.hpp>
Public Types
using | Base = PolicyInterface< State, State, Action >
Public Member Functions
template<typename... Args>
BanditPolicyAdaptor (State s, Args &&... params)
Basic constructor.
virtual Action | sampleAction (const State &s) const override
This function chooses a random action using the underlying bandit policy.
virtual double | getActionProbability (const State &s, const Action &a) const override
This function returns the probability of taking the specified action.
BanditPolicy & | getBanditPolicy ()
This function returns a reference to the underlying BanditPolicy.
const BanditPolicy & | getBanditPolicy () const
This function returns a reference to the underlying BanditPolicy.
Public Member Functions inherited from PolicyInterface< State, State, Action >
PolicyInterface (State s, Action a)
Basic constructor.
virtual | ~PolicyInterface ()
Basic virtual destructor.
virtual Action | sampleAction (const State &s) const =0
This function chooses a random action for state s, following the policy distribution.
virtual double | getActionProbability (const State &s, const Action &a) const =0
This function returns the probability of taking the specified action in the specified state.
const State & | getS () const
This function returns the number of states of the world.
const Action & | getA () const
This function returns the number of available actions to the agent.
Additional Inherited Members
Attributes inherited from PolicyInterface< State, State, Action >
State | S
Action | A
RandomEngine | rand_
This class extends a Bandit policy so that it can be called from MDP code.
This class ignores every state that is passed to it: sampling delegates directly to the underlying Bandit policy, and probability queries forward only the action.
BanditPolicy | The Bandit policy to wrap. |
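The state-ignoring behavior described above can be illustrated with a minimal, self-contained sketch. It does not use AIToolbox itself: `FixedBanditPolicy`, `StateIgnoringAdaptor`, and the `State`/`Action` aliases are hypothetical stand-ins invented for this example, and the real class may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in aliases for this sketch only (assumption, not the library's definitions).
using State  = std::vector<std::size_t>;
using Action = std::vector<std::size_t>;

// Hypothetical stateless bandit policy: always selects one fixed joint action.
class FixedBanditPolicy {
    public:
        explicit FixedBanditPolicy(Action fixed) : fixed_(std::move(fixed)) {}

        Action sampleAction() const { return fixed_; }

        double getActionProbability(const Action & a) const {
            // A joint action must have one entry per agent.
            assert(a.size() == fixed_.size());
            return a == fixed_ ? 1.0 : 0.0;
        }

    private:
        Action fixed_;
};

// Hypothetical adaptor: offers a state-taking interface, but never reads the state.
template <typename BanditPolicy>
class StateIgnoringAdaptor {
    public:
        explicit StateIgnoringAdaptor(BanditPolicy policy) : policy_(std::move(policy)) {}

        // The state parameter exists only to satisfy the MDP-style signature.
        Action sampleAction(const State & /* unused */) const {
            return policy_.sampleAction();
        }

        // Only the action is forwarded to the wrapped policy.
        double getActionProbability(const State & /* unused */, const Action & a) const {
            return policy_.getActionProbability(a);
        }

    private:
        BanditPolicy policy_;
};
```

With this sketch, calling `sampleAction` with two different states yields the same result, since the state never reaches the wrapped policy.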
using AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::Base = PolicyInterface<State, State, Action>
AIToolbox::Factored::MDP::BanditPolicyAdaptor< BP >::BanditPolicyAdaptor (State s, Args &&... params)
Basic constructor.
s | The size of the state space. |
params | The arguments for the underlying BanditPolicy. |
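As a rough illustration of how a constructor of this shape can pass its trailing arguments on to the wrapped policy, here is a self-contained sketch using perfect forwarding. `EpsilonPolicy` and `ForwardingAdaptor` are hypothetical names made up for the example, and whether the library forwards its arguments in exactly this way is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in alias for this sketch only (assumption, not the library's definition).
using State = std::vector<std::size_t>;

// Hypothetical bandit policy whose constructor takes two parameters.
struct EpsilonPolicy {
    EpsilonPolicy(std::size_t n, double eps) : nActions(n), epsilon(eps) {
        assert(eps >= 0.0 && eps <= 1.0);
    }

    std::size_t nActions;
    double epsilon;
};

// Hypothetical adaptor: the first argument describes the state space,
// all remaining arguments are forwarded untouched to the wrapped policy.
template <typename BP>
class ForwardingAdaptor {
    public:
        template <typename... Args>
        ForwardingAdaptor(State s, Args &&... params)
            : S_(std::move(s)), policy_(std::forward<Args>(params)...) {}

        const State & getS() const { return S_; }
        const BP & getBanditPolicy() const { return policy_; }

    private:
        State S_;
        BP policy_;
};
```

Under these assumptions, `ForwardingAdaptor<EpsilonPolicy> adaptor(State{2, 2}, std::size_t{4}, 0.1);` constructs the wrapped policy as if `EpsilonPolicy(4, 0.1)` had been called directly.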
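virtual double AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::getActionProbability (const State & s, const Action & a) const override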
This function returns the probability of taking the specified action.
s | The unused selected state. |
a | The selected action. |
BanditPolicy & AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::getBanditPolicy ()
This function returns a non-const reference to the underlying BanditPolicy.
const BanditPolicy & AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::getBanditPolicy () const
This function returns a reference to the underlying BanditPolicy.
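The pair of accessors can be sketched as follows: the non-const overload lets calling code modify the wrapped policy, while the const overload gives read-only access. `WeightedPolicy` and `AccessorAdaptor` are hypothetical names for this example only and are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical bandit policy holding one adjustable weight per action.
class WeightedPolicy {
    public:
        explicit WeightedPolicy(std::size_t nActions) : weights_(nActions, 1.0) {}

        void setWeight(std::size_t a, double w) {
            assert(a < weights_.size());
            weights_[a] = w;
        }

        double getWeight(std::size_t a) const {
            assert(a < weights_.size());
            return weights_[a];
        }

    private:
        std::vector<double> weights_;
};

// Hypothetical adaptor exposing the wrapped policy through two overloads.
template <typename BP>
class AccessorAdaptor {
    public:
        explicit AccessorAdaptor(BP policy) : policy_(std::move(policy)) {}

        // Chosen on non-const adaptors: the caller may modify the policy.
        BP & getBanditPolicy() { return policy_; }

        // Chosen on const adaptors: the caller may only read from the policy.
        const BP & getBanditPolicy() const { return policy_; }

    private:
        BP policy_;
};
```

In this sketch, `adaptor.getBanditPolicy().setWeight(1, 2.5)` compiles only when `adaptor` is non-const; through a `const AccessorAdaptor &` only `getWeight` is callable.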
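virtual Action AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::sampleAction (const State & s) const override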
This function chooses a random action using the underlying bandit policy.
s | The unused sampled state of the policy. |