AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::PolicyInterface Class Reference (abstract)

Abstract base class for most of MDP's policy needs.

#include <AIToolbox/MDP/Policies/PolicyInterface.hpp>

Inheritance diagram for AIToolbox::MDP::PolicyInterface:
Base class: AIToolbox::PolicyInterface< size_t, size_t, size_t >
Derived classes (direct or indirect): AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >, AIToolbox::MDP::EpsilonPolicy, AIToolbox::MDP::PolicyWrapper, AIToolbox::MDP::QPolicyInterface, AIToolbox::MDP::Policy, AIToolbox::MDP::PGAAPPPolicy, AIToolbox::MDP::QGreedyPolicy, AIToolbox::MDP::QSoftmaxPolicy, AIToolbox::MDP::WoLFPolicy

Public Types

using Base = AIToolbox::PolicyInterface< size_t, size_t, size_t >
 

Public Member Functions

virtual Matrix2D getPolicy () const = 0
 This function returns a matrix containing all probabilities of the policy.
 
- Public Member Functions inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
 PolicyInterface (size_t s, size_t a)
 Basic constructor.
 
virtual ~PolicyInterface ()
 Basic virtual destructor.
 
virtual size_t sampleAction (const size_t &s) const = 0
 This function chooses a random action for state s, following the policy distribution.
 
virtual double getActionProbability (const size_t &s, const size_t &a) const = 0
 This function returns the probability of taking the specified action in the specified state.
 
const size_t & getS () const
 This function returns the number of states of the world.
 
const size_t & getA () const
 This function returns the number of available actions to the agent.
 

Additional Inherited Members

- Protected Attributes inherited from AIToolbox::PolicyInterface< size_t, size_t, size_t >
size_t S
 
size_t A
 
RandomEngine rand_
 

Detailed Description

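Abstract base class for most of MDP's policy needs. It derives from AIToolbox::PolicyInterface< size_t, size_t, size_t > (aliased as Base) and adds the pure virtual getPolicy() to the inherited interface.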
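
As a rough illustration of the interface summarized above, the following sketch mirrors the documented member signatures (sampleAction, getActionProbability, getS, getA) in a stand-in base class and derives a trivial policy from it. It does not include the AIToolbox headers: the names PolicyInterfaceSketch and UniformPolicySketch are hypothetical, std::mt19937 is assumed in place of RandomEngine, and getPolicy() is left out to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Stand-in for the documented interface; NOT the AIToolbox header.
// Only the member signatures listed in this reference are mirrored.
class PolicyInterfaceSketch {
    public:
        PolicyInterfaceSketch(std::size_t s, std::size_t a) : S(s), A(a), rand_(0) {}
        virtual ~PolicyInterfaceSketch() {}

        virtual std::size_t sampleAction(const std::size_t & s) const = 0;
        virtual double getActionProbability(const std::size_t & s, const std::size_t & a) const = 0;

        const std::size_t & getS() const { return S; }
        const std::size_t & getA() const { return A; }

    protected:
        std::size_t S, A;
        // Assumption: mutable, so that const sampling can advance the generator.
        mutable std::mt19937 rand_;
};

// Hypothetical policy: every action is equally likely in every state.
class UniformPolicySketch : public PolicyInterfaceSketch {
    public:
        using PolicyInterfaceSketch::PolicyInterfaceSketch;

        std::size_t sampleAction(const std::size_t & s) const override {
            assert(s < S);
            std::uniform_int_distribution<std::size_t> dist(0, A - 1);
            return dist(rand_);
        }

        double getActionProbability(const std::size_t & s, const std::size_t & a) const override {
            assert(s < S && a < A);
            return 1.0 / static_cast<double>(A);
        }
};
```

A derived class only has to supply the two pure virtual functions; the state and action counts come from the base constructor and are read back through getS() and getA().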

Member Typedef Documentation

◆ Base

using AIToolbox::MDP::PolicyInterface::Base = AIToolbox::PolicyInterface< size_t, size_t, size_t >

Member Function Documentation

◆ getPolicy()

virtual Matrix2D AIToolbox::MDP::PolicyInterface::getPolicy ( ) const
pure virtual

This function returns a matrix containing all probabilities of the policy.

Note that this may be expensive to compute and allocates a new Matrix2D on every call, so it should not be called often.

Ideally, call this function only when the same policy values must be accessed repeatedly and efficiently, and reuse the returned matrix.

Implemented in AIToolbox::MDP::WoLFPolicy, AIToolbox::MDP::QSoftmaxPolicy, AIToolbox::MDP::PolicyWrapper, AIToolbox::MDP::PGAAPPPolicy, AIToolbox::MDP::QGreedyPolicy, AIToolbox::MDP::BanditPolicyAdaptor< BanditPolicy >, and AIToolbox::MDP::EpsilonPolicy.
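
The advice on call frequency can be sketched as follows: getPolicy() is called once, outside the loops, and the returned matrix is reused. This is a minimal stand-in and not the library's code. Matrix2DSketch is a plain nested vector rather than the library's Matrix2D, the layout (rows as states, columns as actions) is an assumption, and MDPPolicySketch, ModuloPolicySketch and expectedRewards are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the library's Matrix2D (assumption: rows are states, columns are actions).
using Matrix2DSketch = std::vector<std::vector<double>>;

// Stand-in interface with just the two members this example needs.
class MDPPolicySketch {
    public:
        virtual ~MDPPolicySketch() {}
        virtual double getActionProbability(const std::size_t & s, const std::size_t & a) const = 0;
        virtual Matrix2DSketch getPolicy() const = 0;
};

// Hypothetical deterministic policy: state s always picks action s % A.
class ModuloPolicySketch : public MDPPolicySketch {
    public:
        ModuloPolicySketch(std::size_t s, std::size_t a) : S(s), A(a) {}

        double getActionProbability(const std::size_t & s, const std::size_t & a) const override {
            return (s % A == a) ? 1.0 : 0.0;
        }

        // Builds and returns a fresh matrix on every call.
        Matrix2DSketch getPolicy() const override {
            Matrix2DSketch m(S, std::vector<double>(A, 0.0));
            for (std::size_t s = 0; s < S; ++s)
                for (std::size_t a = 0; a < A; ++a)
                    m[s][a] = getActionProbability(s, a);
            return m;
        }

    private:
        std::size_t S, A;
};

// Expected reward per state under the policy, given per-(state, action) rewards.
// getPolicy() is called once and its result is reused inside the loops.
std::vector<double> expectedRewards(const MDPPolicySketch & policy, const Matrix2DSketch & rewards) {
    const Matrix2DSketch probs = policy.getPolicy();
    std::vector<double> out(probs.size(), 0.0);
    for (std::size_t s = 0; s < probs.size(); ++s)
        for (std::size_t a = 0; a < probs[s].size(); ++a)
            out[s] += probs[s][a] * rewards[s][a];
    return out;
}
```

Calling policy.getPolicy() inside the inner loop instead would rebuild and reallocate the whole matrix for every (state, action) pair, which is the pattern the note above warns against.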

