AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Bandit::EpsilonPolicy Class Reference

#include <AIToolbox/Bandit/Policies/EpsilonPolicy.hpp>

Inheritance diagram for AIToolbox::Bandit::EpsilonPolicy:
AIToolbox::Bandit::EpsilonPolicy inherits from AIToolbox::Bandit::PolicyInterface and AIToolbox::EpsilonPolicyInterface< void, void, size_t >, both of which derive from AIToolbox::PolicyInterface< void, void, size_t >.

Public Types

using EpsilonBase = EpsilonPolicyInterface< void, void, size_t >
 
- Public Types inherited from AIToolbox::Bandit::PolicyInterface
using Base = AIToolbox::PolicyInterface< void, void, size_t >
 
- Public Types inherited from AIToolbox::EpsilonPolicyInterface< void, void, size_t >
using Base = PolicyInterface< void, void, size_t >
 

Public Member Functions

 EpsilonPolicy (const PolicyInterface &p, double epsilon=0.1)
 Basic constructor.
 
virtual Vector getPolicy () const override
 This function returns a vector containing all probabilities of the policy.
 
- Public Member Functions inherited from AIToolbox::PolicyInterface< void, void, size_t >
 PolicyInterface (void s, size_t a)
 Basic constructor.
 
virtual ~PolicyInterface ()
 Basic virtual destructor.
 
const void & getS () const
 This function returns the number of states of the world.
 
const size_t & getA () const
 This function returns the number of available actions to the agent.
 
- Public Member Functions inherited from AIToolbox::EpsilonPolicyInterface< void, void, size_t >
 EpsilonPolicyInterface (const Base &p, double epsilon=0.1)
 Basic constructor.
 
virtual size_t sampleAction (const void &s) const override
 This function chooses an action for state s, following the policy distribution and epsilon.
 
virtual double getActionProbability (const void &s, const size_t &a) const override
 This function returns the probability of taking the specified action in the specified state.
 
void setEpsilon (double e)
 This function sets the epsilon parameter.
 
double getEpsilon () const
 This function returns the currently set epsilon parameter.
 

Protected Member Functions

virtual size_t sampleRandomAction () const override
 This function returns a random action in the Action space.
 
virtual double getRandomActionProbability () const override
 This function returns the probability of picking a random action.
 
- Protected Member Functions inherited from AIToolbox::EpsilonPolicyInterface< void, void, size_t >
virtual size_t sampleRandomAction () const=0
 This function returns a random action in the Action space.
 
virtual double getRandomActionProbability () const=0
 This function returns the probability of picking a random action.
 

Protected Attributes

std::uniform_int_distribution< size_t > randomDistribution_
 
- Protected Attributes inherited from AIToolbox::PolicyInterface< void, void, size_t >
void S
 
size_t A
 
RandomEngine rand_
 
- Protected Attributes inherited from AIToolbox::EpsilonPolicyInterface< void, void, size_t >
const Base & policy_
 
double epsilon_
 

Member Typedef Documentation

◆ EpsilonBase

Constructor & Destructor Documentation

◆ EpsilonPolicy()

AIToolbox::Bandit::EpsilonPolicy::EpsilonPolicy ( const PolicyInterface & p,
double  epsilon = 0.1 
)

Basic constructor.

This constructor saves the input policy and the epsilon parameter for later use.

The epsilon parameter must be >= 0.0 and <= 1.0, otherwise the constructor will throw an std::invalid_argument.

Parameters
p: The policy that is being extended.
epsilon: The parameter that controls the amount of exploration.
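
The range check described above can be illustrated with a small standalone sketch. It does not use AIToolbox: `EpsilonHolder` and `isValidEpsilon` are hypothetical names introduced only for this example, and the only behaviour mirrored from the documentation is that a value outside [0.0, 1.0] results in a std::invalid_argument.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in, NOT the AIToolbox class. It only mirrors the
// documented contract that epsilon must be >= 0.0 and <= 1.0.
class EpsilonHolder {
    public:
        explicit EpsilonHolder(std::size_t actions, double epsilon = 0.1)
            : A_(actions), epsilon_(checked(epsilon))
        {
            assert(A_ > 0); // a bandit needs at least one action
        }

        double getEpsilon() const { return epsilon_; }
        std::size_t getA() const { return A_; }

    private:
        static double checked(double e) {
            // The negated form also rejects NaN, which fails every comparison.
            if (!(e >= 0.0 && e <= 1.0))
                throw std::invalid_argument("epsilon must be >= 0.0 and <= 1.0");
            return e;
        }

        std::size_t A_;
        double epsilon_;
};

// Convenience wrapper for callers that prefer a bool over an exception.
inline bool isValidEpsilon(double e) {
    try {
        EpsilonHolder probe(1, e);
        (void)probe;
        return true;
    } catch (const std::invalid_argument &) {
        return false;
    }
}
```

Validating in the member initializer list means an invalid value never reaches the stored field, so a successfully constructed object always holds a usable epsilon.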

Member Function Documentation

◆ getPolicy()

virtual Vector AIToolbox::Bandit::EpsilonPolicy::getPolicy ( ) const
override virtual

This function returns a vector containing all probabilities of the policy.

Note that this may be expensive to compute and should not be called often, not least because it needs to allocate a new Vector each time.

Ideally, call this function only when there is a repeated need to access the same policy values efficiently.

Implements AIToolbox::Bandit::PolicyInterface.
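
As a rough illustration of what such a probability vector could contain, the sketch below mixes a wrapped policy's distribution with a uniform one. It is an assumption, not the library's code: it supposes that a random action is taken with probability epsilon and the wrapped policy is followed otherwise, and it uses std::vector<double> in place of the library's Vector type. The name `mixWithUniform` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption (not taken from the library source): with probability epsilon
// a uniformly random action is chosen, otherwise the wrapped policy decides.
// Under that assumption each action's probability is
//     (1 - epsilon) * base[a] + epsilon / A
std::vector<double> mixWithUniform(const std::vector<double> & base, double epsilon) {
    assert(!base.empty());
    const double uniform = 1.0 / static_cast<double>(base.size());

    std::vector<double> mixed(base.size());
    for (std::size_t a = 0; a < base.size(); ++a)
        mixed[a] = (1.0 - epsilon) * base[a] + epsilon * uniform;
    return mixed;
}
```

Each call allocates and fills a fresh vector, which is the cost the note above warns about; a caller that needs the values repeatedly can compute the vector once and reuse it.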

◆ getRandomActionProbability()

virtual double AIToolbox::Bandit::EpsilonPolicy::getRandomActionProbability ( ) const
override protected virtual

This function returns the probability of picking a random action.

Returns
The probability of picking an action at random.

◆ sampleRandomAction()

virtual size_t AIToolbox::Bandit::EpsilonPolicy::sampleRandomAction ( ) const
override protected virtual

This function returns a random action in the Action space.

Returns
A valid random action.
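
The two protected hooks documented above, getRandomActionProbability() and sampleRandomAction(), suggest a template-method arrangement: a base class owns epsilon and the sampling logic, and the derived class supplies the "random" part. The standalone sketch below shows one way this could look. All class names, `sampleWrappedAction`, the fixed seed, and the "explore with probability epsilon" rule are assumptions made for illustration and are not taken from the AIToolbox source.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical base class: owns epsilon and defers the random part to hooks.
class EpsilonBaseSketch {
    public:
        explicit EpsilonBaseSketch(double epsilon)
            : epsilon_(epsilon), coin_(0.0, 1.0), rng_(42) {}
        virtual ~EpsilonBaseSketch() = default;

        // Assumption: explore with probability epsilon, otherwise defer
        // to the wrapped policy.
        std::size_t sampleAction() const {
            if (coin_(rng_) < epsilon_) return sampleRandomAction();
            return sampleWrappedAction();
        }

    protected:
        virtual std::size_t sampleRandomAction() const = 0;
        virtual double getRandomActionProbability() const = 0;
        virtual std::size_t sampleWrappedAction() const = 0;

        double epsilon_;
        // mutable: drawing a sample changes generator state, yet sampling
        // is exposed through const member functions.
        mutable std::uniform_real_distribution<double> coin_;
        mutable std::mt19937 rng_;
};

// Hypothetical bandit specialisation: random actions are uniform over [0, A).
class UniformBanditSketch : public EpsilonBaseSketch {
    public:
        UniformBanditSketch(std::size_t actions, std::size_t wrappedChoice, double epsilon)
            : EpsilonBaseSketch(epsilon), A_(actions), wrapped_(wrappedChoice),
              randomDistribution_(0, actions - 1)
        {
            assert(actions > 0 && wrappedChoice < actions);
        }

        // Public accessor so the hook's value can be inspected in examples.
        double randomProbability() const { return getRandomActionProbability(); }

    protected:
        std::size_t sampleRandomAction() const override {
            return randomDistribution_(rng_);
        }
        double getRandomActionProbability() const override {
            return 1.0 / static_cast<double>(A_);
        }
        // Stand-in for the wrapped policy: always picks the same action.
        std::size_t sampleWrappedAction() const override { return wrapped_; }

    private:
        std::size_t A_;
        std::size_t wrapped_;
        mutable std::uniform_int_distribution<std::size_t> randomDistribution_;
};
```

With epsilon set to 0.0 this sketch always returns the wrapped choice, and with 1.0 it always draws uniformly, which makes the two extremes easy to check by hand.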

Member Data Documentation

◆ randomDistribution_

std::uniform_int_distribution<size_t> AIToolbox::Bandit::EpsilonPolicy::randomDistribution_
mutable protected

The documentation for this class was generated from the following file: