AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy > Class Template Reference

This class extends a Bandit policy so that it can be called from MDP code.

#include <AIToolbox/Factored/MDP/Policies/BanditPolicyAdaptor.hpp>

Inheritance diagram for AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >:
AIToolbox::PolicyInterface< State, State, Action >

Public Types

using Base = PolicyInterface< State, State, Action >
 

Public Member Functions

template<typename... Args>
 BanditPolicyAdaptor (State s, Args &&... params)
 Basic constructor.
 
virtual Action sampleAction (const State &s) const override
 This function chooses a random action using the underlying bandit policy.
 
virtual double getActionProbability (const State &s, const Action &a) const override
 This function returns the probability of taking the specified action.
 
BanditPolicy & getBanditPolicy ()
 This function returns a mutable reference to the underlying BanditPolicy.
 
const BanditPolicy & getBanditPolicy () const
 This function returns a const reference to the underlying BanditPolicy.
 
- Public Member Functions inherited from AIToolbox::PolicyInterface< State, State, Action >
 PolicyInterface (State s, Action a)
 Basic constructor.
 
virtual ~PolicyInterface ()
 Basic virtual destructor.
 
virtual Action sampleAction (const State &s) const=0
 This function chooses a random action for state s, following the policy distribution.
 
virtual double getActionProbability (const State &s, const Action &a) const=0
 This function returns the probability of taking the specified action in the specified state.
 
const State & getS () const
 This function returns the number of states of the world.
 
const Action & getA () const
 This function returns the number of available actions to the agent.
 

Additional Inherited Members

- Protected Attributes inherited from AIToolbox::PolicyInterface< State, State, Action >
State S
 
Action A
 
RandomEngine rand_
 

Detailed Description

template<typename BanditPolicy>
class AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >

This class extends a Bandit policy so that it can be called from MDP code.

This class ignores every state that is passed to it. Only the action is forwarded to the underlying Bandit policy, both when sampling and when querying action probabilities.

Template Parameters
BanditPolicy  The Bandit policy to wrap.
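
To illustrate the behaviour described in this section, here is a minimal, self-contained sketch of the adaptor idea. It does not use AIToolbox itself: `ToyState`, `ToyAction`, `FixedBanditPolicy` and `ToyBanditAdaptor` are hypothetical stand-ins invented for this example, and the real class may differ in its details.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the factored State and Action types.
using ToyState  = std::vector<std::size_t>;
using ToyAction = std::vector<std::size_t>;

// Hypothetical stateless bandit policy: it always plays one fixed joint action.
class FixedBanditPolicy {
    public:
        explicit FixedBanditPolicy(ToyAction a) : action_(std::move(a)) {}

        ToyAction sampleAction() const { return action_; }

        double getActionProbability(const ToyAction & a) const {
            return a == action_ ? 1.0 : 0.0;
        }

    private:
        ToyAction action_;
};

// Sketch of the adaptor: the MDP-style calls take a state, drop it,
// and delegate to the wrapped bandit policy.
template <typename BanditPolicy>
class ToyBanditAdaptor {
    public:
        explicit ToyBanditAdaptor(BanditPolicy policy) : policy_(std::move(policy)) {}

        ToyAction sampleAction(const ToyState & /*unused*/) const {
            return policy_.sampleAction();
        }

        double getActionProbability(const ToyState & /*unused*/, const ToyAction & a) const {
            return policy_.getActionProbability(a);
        }

    private:
        BanditPolicy policy_;
};
```

With this sketch, calling `sampleAction` or `getActionProbability` with two different states gives the same answer, because the state never reaches the wrapped policy.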

Member Typedef Documentation

◆ Base

template<typename BanditPolicy >
using AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::Base = PolicyInterface<State, State, Action>

Constructor & Destructor Documentation

◆ BanditPolicyAdaptor()

template<typename BP >
template<typename... Args>
AIToolbox::Factored::MDP::BanditPolicyAdaptor< BP >::BanditPolicyAdaptor ( State  s,
Args &&...  params 
)

Basic constructor.

Parameters
s  The size of the state space.
params  The arguments for the underlying BanditPolicy.
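
The constructor signature suggests that `s` is kept for the policy interface while `params` is passed on to the wrapped policy. A minimal sketch of that pattern with perfect forwarding follows. `ToyState`, `TwoArgPolicy` and `ForwardingAdaptor` are hypothetical names for this example only, and storing `s` directly in the adaptor is an assumption made for brevity (the real class presumably hands it to its `PolicyInterface` base).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the factored State type.
using ToyState = std::vector<std::size_t>;

// Hypothetical wrapped policy whose constructor takes two arguments.
struct TwoArgPolicy {
    TwoArgPolicy(std::size_t nArms, double eps) : arms(nArms), epsilon(eps) {}
    std::size_t arms;
    double epsilon;
};

// Sketch of a constructor shaped like (State s, Args &&... params):
// the first argument is kept by the adaptor, the remaining ones are
// perfectly forwarded to the constructor of the wrapped policy.
template <typename BP>
class ForwardingAdaptor {
    public:
        template <typename... Args>
        ForwardingAdaptor(ToyState s, Args &&... params)
            : S(std::move(s)), policy_(std::forward<Args>(params)...) {}

        const ToyState & getS() const { return S; }
        const BP & getBanditPolicy() const { return policy_; }

    private:
        ToyState S;   // Assumption: stored here instead of in a base class.
        BP policy_;
};
```

Under these assumptions, `ForwardingAdaptor<TwoArgPolicy> p(ToyState{2, 3}, std::size_t{4}, 0.1);` builds the wrapped policy with 4 arms and an epsilon of 0.1, without the adaptor needing to know the wrapped constructor's parameter list.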

Member Function Documentation

◆ getActionProbability()

template<typename BP >
double AIToolbox::Factored::MDP::BanditPolicyAdaptor< BP >::getActionProbability ( const State & s,
const Action & a
) const
override virtual

This function returns the probability of taking the specified action.

Parameters
s  The selected state (unused).
a  The selected action.
Returns
The probability of taking the selected action. The state is ignored, so the result does not depend on it.

◆ getBanditPolicy() [1/2]

template<typename BP >
BP & AIToolbox::Factored::MDP::BanditPolicyAdaptor< BP >::getBanditPolicy ( )

This function returns a mutable reference to the underlying BanditPolicy.

◆ getBanditPolicy() [2/2]

template<typename BanditPolicy >
const BanditPolicy& AIToolbox::Factored::MDP::BanditPolicyAdaptor< BanditPolicy >::getBanditPolicy ( ) const

This function returns a const reference to the underlying BanditPolicy.
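
The two `getBanditPolicy()` overloads follow the usual C++ const / non-const accessor pattern. The sketch below shows how overload resolution picks between them. `TunablePolicy` and `AccessorAdaptor` are hypothetical names used only for this illustration and are not part of AIToolbox.

```cpp
// Hypothetical wrapped policy with one tunable parameter.
struct TunablePolicy {
    double epsilon = 0.1;
};

// Sketch of the const / non-const accessor pair.
template <typename BP>
class AccessorAdaptor {
    public:
        // Non-const overload: callers may modify the wrapped policy in place.
        BP & getBanditPolicy() { return policy_; }

        // Const overload: read-only access, selected when the adaptor is const.
        const BP & getBanditPolicy() const { return policy_; }

    private:
        BP policy_{};
};
```

On a non-const adaptor, `a.getBanditPolicy().epsilon = 0.5;` changes the wrapped policy directly. Through a `const` reference to the same adaptor the value can be read but not assigned.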

◆ sampleAction()

template<typename BP >
Action AIToolbox::Factored::MDP::BanditPolicyAdaptor< BP >::sampleAction ( const State & s ) const
override virtual

This function chooses a random action using the underlying bandit policy.

Parameters
s  The unused sampled state of the policy.
Returns
The chosen action.
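
`sampleAction` is declared `const`, yet sampling has to advance a random number generator. One common way to reconcile the two in C++ is a `mutable` generator member, sketched below. Whether AIToolbox does exactly this is not stated on this page. `UniformBanditPolicy` and `SamplingAdaptor` are hypothetical names, and the action is a single integer here purely for brevity.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the factored State type.
using ToyState = std::vector<std::size_t>;

// Hypothetical bandit policy: picks one of `arms` actions uniformly at random.
// Precondition: arms > 0.
class UniformBanditPolicy {
    public:
        UniformBanditPolicy(std::size_t arms, unsigned seed) : arms_(arms), rand_(seed) {}

        std::size_t sampleAction() const {
            std::uniform_int_distribution<std::size_t> dist(0, arms_ - 1);
            return dist(rand_);
        }

        double getActionProbability(std::size_t a) const {
            return a < arms_ ? 1.0 / static_cast<double>(arms_) : 0.0;
        }

    private:
        std::size_t arms_;
        // mutable so that the const sampleAction() can advance the generator.
        mutable std::mt19937 rand_;
};

// Sketch of the adaptor side: the state parameter is accepted and ignored.
template <typename BP>
class SamplingAdaptor {
    public:
        explicit SamplingAdaptor(BP policy) : policy_(std::move(policy)) {}

        std::size_t sampleAction(const ToyState & /*unused*/) const {
            return policy_.sampleAction();
        }

        double getActionProbability(const ToyState & /*unused*/, std::size_t a) const {
            return policy_.getActionProbability(a);
        }

    private:
        BP policy_;
};
```

In this sketch every sampled action lies in the range `[0, arms)` whatever state is passed, and the reported probability of each valid action is `1.0 / arms`.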
