AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy > Class Template Reference

This class extends a Bandit policy so that it can be called from MDP code.

#include <AIToolbox/MDP/Policies/BanditPolicyAdaptor.hpp>

Inheritance diagram for AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >:
AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy > → AIToolbox::MDP::PolicyInterface → AIToolbox::PolicyInterface< size_t, size_t, size_t >

Public Member Functions

template<typename... Args>
 BanditPolicyAdaptor (size_t s, Args &&... params)
 Basic constructor.
 
virtual size_t sampleAction (const size_t &s) const override
 This function chooses a random action using the underlying bandit policy.
 
virtual double getActionProbability (const size_t &s, const size_t &a) const override
 This function returns the probability of taking the specified action.
 
virtual Matrix2D getPolicy () const override
 This function returns a matrix containing all probabilities of the policy.
 
BanditPolicy & getBanditPolicy ()
 This function returns a reference to the underlying BanditPolicy.
 
const BanditPolicy & getBanditPolicy () const
 This function returns a const reference to the underlying BanditPolicy.
 
Public Member Functions inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
 PolicyInterface (size_t s, size_t a)
 Basic constructor.
 
virtual ~PolicyInterface ()
 Basic virtual destructor.
 
const size_t & getS () const
 This function returns the number of states of the world.
 
const size_t & getA () const
 This function returns the number of actions available to the agent.
 

Additional Inherited Members

Public Types inherited from AIToolbox::MDP::PolicyInterface
using Base = AIToolbox::PolicyInterface< size_t, size_t, size_t >
 
Protected Attributes inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
size_t S
 
size_t A
 
RandomEngine rand_
 

Detailed Description

template<typename BanditPolicy>
class AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >

This class extends a Bandit policy so that it can be called from MDP code.

This class ignores every state that is passed to it. Only the action arguments are forwarded to the underlying Bandit policy, both when sampling actions and when querying action probabilities.

Template Parameters
BanditPolicy    The Bandit policy to wrap.
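
The state-ignoring behaviour described above can be illustrated with a minimal, library-independent sketch. The names `FixedBanditPolicy` and `StateIgnoringAdaptor` are hypothetical and exist only for this example; they are not part of AIToolbox, and the real class additionally derives from AIToolbox::MDP::PolicyInterface.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bandit policy: it has no notion of state.
class FixedBanditPolicy {
    public:
        explicit FixedBanditPolicy(std::vector<double> probs) : probs_(std::move(probs)) {}

        // Deterministic for the sake of the example: returns the most likely action.
        std::size_t sampleAction() const {
            std::size_t best = 0;
            for (std::size_t a = 1; a < probs_.size(); ++a)
                if (probs_[a] > probs_[best]) best = a;
            return best;
        }

        double getActionProbability(std::size_t a) const {
            assert(a < probs_.size());
            return probs_[a];
        }

    private:
        std::vector<double> probs_;
};

// Hypothetical adaptor: exposes a (state, action) interface and drops the state.
template <typename BanditPolicy>
class StateIgnoringAdaptor {
    public:
        template <typename... Args>
        explicit StateIgnoringAdaptor(std::size_t s, Args&&... params)
            : S_(s), policy_(std::forward<Args>(params)...) {}

        // The state is accepted for interface compatibility and never read.
        std::size_t sampleAction(const std::size_t & /*s*/) const {
            return policy_.sampleAction();
        }

        double getActionProbability(const std::size_t & /*s*/, const std::size_t & a) const {
            return policy_.getActionProbability(a);
        }

        std::size_t getS() const { return S_; }

    private:
        std::size_t S_;
        BanditPolicy policy_;
};
```

In this sketch, querying the adaptor with two different states and the same action always yields the same answer, which is the property MDP code can rely on when it is handed a wrapped bandit policy.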

Constructor & Destructor Documentation

◆ BanditPolicyAdaptor()

template<typename BP >
template<typename... Args>
AIToolbox::MDP::BanditPolicyAdaptor< BP >::BanditPolicyAdaptor ( size_t  s,
Args &&...  params 
)

Basic constructor.

Parameters
s         The size of the state space.
params    The arguments for the underlying BanditPolicy.
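
The constructor signature suggests that `s` is kept by the adaptor while the remaining arguments are forwarded to the wrapped policy's constructor. The following is a sketch of that pattern under this assumption; `EpsilonBandit` and `ForwardingAdaptor` are hypothetical names used only for illustration, and the actual implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical bandit policy that takes two constructor arguments.
struct EpsilonBandit {
    EpsilonBandit(std::size_t actions, double epsilon) : A(actions), eps(epsilon) {
        assert(epsilon >= 0.0 && epsilon <= 1.0);
    }
    std::size_t A;
    double eps;
};

// Hypothetical adaptor: keeps `s` for itself and forwards the rest unchanged.
template <typename BanditPolicy>
struct ForwardingAdaptor {
    template <typename... Args>
    explicit ForwardingAdaptor(std::size_t s, Args&&... params)
        : S(s), policy(std::forward<Args>(params)...) {}

    std::size_t S;        // size of the state space, stored by the adaptor
    BanditPolicy policy;  // built from the forwarded arguments
};
```

With this shape, `ForwardingAdaptor<EpsilonBandit> adaptor(10, 4, 0.1);` would store 10 as the state-space size and construct the inner policy as `EpsilonBandit(4, 0.1)`.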

Member Function Documentation

◆ getActionProbability()

template<typename BP >
double AIToolbox::MDP::BanditPolicyAdaptor< BP >::getActionProbability ( const size_t &  s,
const size_t &  a 
) const
override virtual

This function returns the probability of taking the specified action.

Parameters
s    The selected state (unused).
a    The selected action.
Returns
The probability of taking the selected action in the specified state.

Implements AIToolbox::PolicyInterface< size_t, size_t, size_t >.

◆ getBanditPolicy() [1/2]

template<typename BP >
const BP & AIToolbox::MDP::BanditPolicyAdaptor< BP >::getBanditPolicy

This function returns a reference to the underlying BanditPolicy.

◆ getBanditPolicy() [2/2]

template<typename BanditPolicy >
const BanditPolicy& AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >::getBanditPolicy ( ) const

This function returns a const reference to the underlying BanditPolicy.

◆ getPolicy()

template<typename BP >
Matrix2D AIToolbox::MDP::BanditPolicyAdaptor< BP >::getPolicy
override virtual

This function returns a matrix containing all probabilities of the policy.

This function returns a matrix replicating the Bandit policy for every row.

Implements AIToolbox::MDP::PolicyInterface.
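
Replicating the Bandit policy for every row means that each of the S rows of the returned matrix holds the same action distribution. A minimal sketch of the idea follows; it uses a nested `std::vector` as a hypothetical stand-in for the library's Matrix2D type, and `replicateBanditPolicy` is an illustrative helper, not a library function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's Matrix2D: S rows, A columns.
using Matrix = std::vector<std::vector<double>>;

// Builds an S x A policy matrix whose rows are all copies of the
// state-independent bandit action distribution.
Matrix replicateBanditPolicy(std::size_t S, const std::vector<double> & banditProbs) {
    assert(!banditProbs.empty());
    return Matrix(S, banditProbs);
}
```

For example, calling `replicateBanditPolicy(3, {0.25, 0.75})` yields three identical rows, so looking up any state gives the same pair of action probabilities.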

◆ sampleAction()

template<typename BP >
size_t AIToolbox::MDP::BanditPolicyAdaptor< BP >::sampleAction ( const size_t &  s) const
override virtual

This function chooses a random action using the underlying bandit policy.

Parameters
s    The sampled state of the policy (unused).
Returns
The chosen action.

Implements AIToolbox::PolicyInterface< size_t, size_t, size_t >.
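
As an illustration of sampling that does not depend on the state, the sketch below draws an action from a fixed set of weights with the standard library. `StatelessSampler` is a hypothetical class written for this example; how the wrapped bandit policy actually samples depends on the BanditPolicy supplied as template parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical sampler; not the library's implementation.
class StatelessSampler {
    public:
        StatelessSampler(std::vector<double> weights, unsigned seed)
            : weights_(std::move(weights)), rand_(seed)
        {
            assert(!weights_.empty());
        }

        // The state argument is accepted for interface compatibility and never read.
        std::size_t sampleAction(const std::size_t & /*s*/) const {
            std::discrete_distribution<std::size_t> dist(weights_.begin(), weights_.end());
            return dist(rand_);
        }

    private:
        std::vector<double> weights_;
        mutable std::mt19937 rand_; // mutable so that a const method can draw numbers
};
```

If all the weight is placed on a single action, for instance `{0.0, 1.0, 0.0}`, every call returns action 1 whatever state is passed in.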


The documentation for this class was generated from the following file: