AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Bandit::PolicyInterface Class Referenceabstract

Simple typedef for most of a normal Bandit's policy needs. More...

#include <AIToolbox/Bandit/Policies/PolicyInterface.hpp>

Inheritance diagram for AIToolbox::Bandit::PolicyInterface:
AIToolbox::PolicyInterface< void, void, size_t > AIToolbox::Bandit::EpsilonPolicy AIToolbox::Bandit::ESRLPolicy AIToolbox::Bandit::LRPPolicy AIToolbox::Bandit::QGreedyPolicy AIToolbox::Bandit::QSoftmaxPolicy AIToolbox::Bandit::RandomPolicy AIToolbox::Bandit::SuccessiveRejectsPolicy AIToolbox::Bandit::T3CPolicy AIToolbox::Bandit::ThompsonSamplingPolicy AIToolbox::Bandit::TopTwoThompsonSamplingPolicy

Public Types

using Base = AIToolbox::PolicyInterface< void, void, size_t >
 

Public Member Functions

virtual Vector getPolicy () const =0
 This function returns a vector containing all probabilities of the policy. More...
 
- Public Member Functions inherited from AIToolbox::PolicyInterface< void, void, size_t >
 PolicyInterface (void s, size_t a)
 Basic constructor. More...
 
virtual ~PolicyInterface ()
 Basic virtual destructor. More...
 
virtual size_t sampleAction (const void &s) const=0
 This function chooses a random action for state s, following the policy distribution. More...
 
virtual double getActionProbability (const void &s, const size_t &a) const=0
 This function returns the probability of taking the specified action in the specified state. More...
 
const void & getS () const
 This function returns the number of states of the world. More...
 
const size_t & getA () const
 This function returns the number of available actions to the agent. More...
 

Additional Inherited Members

- Protected Attributes inherited from AIToolbox::PolicyInterface< void, void, size_t >
void S
 
size_t A
 
RandomEngine rand_
 

Detailed Description

Simple typedef for most of a normal Bandit's policy needs.

Member Typedef Documentation

◆ Base

Member Function Documentation

◆ getPolicy()

virtual Vector AIToolbox::Bandit::PolicyInterface::getPolicy ( ) const
pure virtual

This function returns a vector containing all probabilities of the policy.

Note that this may be expensive to compute, and should not be called often (aside from the fact that it needs to allocate a new Vector each time).

Ideally this function can be called only when there is a repeated need to access the same policy values in an efficient manner.

Implemented in AIToolbox::Bandit::ESRLPolicy, AIToolbox::Bandit::LRPPolicy, AIToolbox::Bandit::SuccessiveRejectsPolicy, AIToolbox::Bandit::QSoftmaxPolicy, AIToolbox::Bandit::TopTwoThompsonSamplingPolicy, AIToolbox::Bandit::T3CPolicy, AIToolbox::Bandit::ThompsonSamplingPolicy, AIToolbox::Bandit::QGreedyPolicy, AIToolbox::Bandit::RandomPolicy, and AIToolbox::Bandit::EpsilonPolicy.


The documentation for this class was generated from the following file: