AIToolbox
A library that offers tools for AI problem solving.
This class extends a Bandit policy so that it can be called from MDP code.
#include <AIToolbox/MDP/Policies/BanditPolicyAdaptor.hpp>
Public Member Functions
template<typename... Args>
BanditPolicyAdaptor (size_t s, Args &&... params)
Basic constructor.
virtual size_t | sampleAction (const size_t &s) const override
This function chooses a random action using the underlying bandit policy.
virtual double | getActionProbability (const size_t &s, const size_t &a) const override
This function returns the probability of taking the specified action.
virtual Matrix2D | getPolicy () const override
This function returns a matrix containing all probabilities of the policy.
BanditPolicy & | getBanditPolicy ()
This function returns a reference to the underlying BanditPolicy.
const BanditPolicy & | getBanditPolicy () const
This function returns a reference to the underlying BanditPolicy.
Public Member Functions inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
PolicyInterface (size_t s, size_t a)
Basic constructor.
virtual | ~PolicyInterface ()
Basic virtual destructor.
const size_t & | getS () const
This function returns the number of states of the world.
const size_t & | getA () const
This function returns the number of available actions to the agent.
Additional Inherited Members
Public Types inherited from AIToolbox::MDP::PolicyInterface
using | Base = AIToolbox::PolicyInterface< size_t, size_t, size_t >
Protected Attributes inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
size_t | S
size_t | A
RandomEngine | rand_
This class extends a Bandit policy so that it can be called from MDP code.
This class ignores every state passed to it. Only the action arguments are forwarded to the underlying Bandit policy, which is used both to sample actions and to report action probabilities.
BanditPolicy | The Bandit policy to wrap. |
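The adaptor idea described above can be illustrated with a minimal, self-contained sketch. This is not the AIToolbox implementation: FixedBanditPolicy and StatelessAdaptor are hypothetical names invented for this example, and the toy policy only mimics a stateless interface with a fixed action distribution. The sketch assumes the wrapped policy exposes sampleAction() and getActionProbability(a) without a state argument.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Bandit policy (not an AIToolbox class):
// a fixed probability distribution over actions, with no notion of state.
class FixedBanditPolicy {
    public:
        explicit FixedBanditPolicy(std::vector<double> probs) :
                probs_(std::move(probs)),
                dist_(probs_.begin(), probs_.end()) {}

        std::size_t sampleAction() const { return dist_(rand_); }

        double getActionProbability(std::size_t a) const {
            assert(a < probs_.size());
            return probs_[a];
        }

    private:
        std::vector<double> probs_;
        // Sampling mutates the distribution and the engine, hence mutable.
        mutable std::discrete_distribution<std::size_t> dist_;
        mutable std::mt19937 rand_{42};
};

// Hypothetical adaptor: accepts MDP-style calls that carry a state,
// drops the state, and delegates to the wrapped stateless policy.
template <typename BP>
class StatelessAdaptor {
    public:
        template <typename... Args>
        StatelessAdaptor(std::size_t s, Args &&... params) :
                S_(s), policy_(std::forward<Args>(params)...) {}

        std::size_t sampleAction(const std::size_t & /* s, unused */) const {
            return policy_.sampleAction();
        }

        double getActionProbability(const std::size_t & /* s, unused */,
                                    const std::size_t & a) const {
            return policy_.getActionProbability(a);
        }

        std::size_t getS() const { return S_; }

        BP & getBanditPolicy() { return policy_; }
        const BP & getBanditPolicy() const { return policy_; }

    private:
        std::size_t S_;
        BP policy_;
};
```

With this sketch, StatelessAdaptor<FixedBanditPolicy> p(3, std::vector<double>{0.0, 1.0}) gives the same answer for p.getActionProbability(0, 1) and p.getActionProbability(2, 1), since the state argument never reaches the wrapped policy.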
template<typename... Args>
AIToolbox::MDP::BanditPolicyAdaptor< BP >::BanditPolicyAdaptor (size_t s, Args &&... params)
Basic constructor.
s | The size of the state space. |
params | The arguments for the underlying BanditPolicy. |
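The constructor signature suggests that everything after the state-space size is handed to the wrapped policy's own constructor. The following sketch shows that forwarding pattern in isolation. ToyEpsilonPolicy and ForwardingAdaptor are hypothetical names used only for illustration; they are not AIToolbox types, and the real class may differ in its details.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical Bandit-style policy whose constructor takes two arguments.
struct ToyEpsilonPolicy {
    ToyEpsilonPolicy(std::size_t actions, double epsilon) :
            A(actions), eps(epsilon)
    {
        assert(epsilon >= 0.0 && epsilon <= 1.0);
    }

    std::size_t A;
    double eps;
};

// Hypothetical adaptor: the first argument is kept as the state-space size,
// the remaining ones are perfectly forwarded to the wrapped policy.
template <typename BP>
struct ForwardingAdaptor {
    template <typename... Args>
    ForwardingAdaptor(std::size_t s, Args &&... params) :
            S(s), policy(std::forward<Args>(params)...) {}

    std::size_t S;
    BP policy;
};
```

For instance, ForwardingAdaptor<ToyEpsilonPolicy> p(5, 3, 0.1) would store 5 as the number of states and construct the wrapped policy as ToyEpsilonPolicy(3, 0.1).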
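virtual double AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >::getActionProbability (const size_t &s, const size_t &a) const override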
This function returns the probability of taking the specified action.
s | The unused selected state. |
a | The selected action. |
Implements AIToolbox::PolicyInterface< size_t, size_t, size_t >.
const BP & AIToolbox::MDP::BanditPolicyAdaptor< BP >::getBanditPolicy |
This function returns a reference to the underlying BanditPolicy.
const BanditPolicy& AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >::getBanditPolicy | ( | ) | const |
This function returns a reference to the underlying BanditPolicy.
virtual Matrix2D AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >::getPolicy () const override
This function returns a matrix containing all probabilities of the policy.
Since the Bandit policy does not depend on the state, every row of the returned matrix (one row per state) contains the same action probabilities.
Implements AIToolbox::MDP::PolicyInterface.
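The row replication can be sketched as follows. This is an illustration under stated assumptions, not the library code: a nested std::vector named ToyMatrix stands in for Matrix2D, and replicatePolicy is a hypothetical helper that receives the per-action probabilities of the Bandit policy directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for Matrix2D in this sketch: rows are states, columns are actions.
using ToyMatrix = std::vector<std::vector<double>>;

// Hypothetical helper: builds an S x A matrix in which every row is a copy
// of the same state-independent action distribution.
ToyMatrix replicatePolicy(std::size_t S, const std::vector<double> & actionProbs) {
    assert(S > 0);
    assert(!actionProbs.empty());
    return ToyMatrix(S, actionProbs);
}
```

Calling replicatePolicy(3, {0.2, 0.8}) yields three identical rows, so looking up any state returns the same distribution over the two actions.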
virtual size_t AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >::sampleAction (const size_t &s) const override
This function chooses a random action using the underlying bandit policy.
s | The unused sampled state of the policy. |
Implements AIToolbox::PolicyInterface< size_t, size_t, size_t >.