AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Bandit::Experience Class Reference

This class computes averages and counts for a Bandit problem.

#include <AIToolbox/Bandit/Experience.hpp>

Public Types

using VisitsTable = std::vector< unsigned long >
 

Public Member Functions

 Experience (size_t A)
 Basic constructor.

void record (size_t a, double rew)
 This function updates the reward matrix and counts.

void reset ()
 This function resets the QFunction and counts to zero.

unsigned long getTimesteps () const
 This function returns the number of times the record function has been called.

const QFunction& getRewardMatrix () const
 This function returns a reference to the internal reward matrix.

const VisitsTable& getVisitsTable () const
 This function returns a reference to the counts for the actions.

const Vector& getM2Matrix () const
 This function returns the estimated squared distance of the samples from the mean.

size_t getA () const
 This function returns the size of the action space.
 

Detailed Description

This class computes averages and counts for a Bandit problem.

This class can be used to compute the averages and counts for all actions in a Bandit problem over time.
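
As a rough illustration of what keeping averages and counts over time involves, the following is a minimal self-contained sketch. `ExperienceSketch` and its public fields are hypothetical names chosen for this example; it is not the AIToolbox implementation, and it uses plain `std::vector` storage rather than the library's `QFunction` type. It assumes the average is maintained incrementally, which is one common way to do it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration only; not the AIToolbox class.
struct ExperienceSketch {
    std::vector<double> avg;            // current average reward per action
    std::vector<unsigned long> visits;  // number of samples per action
    unsigned long timesteps = 0;        // total number of record() calls

    explicit ExperienceSketch(std::size_t A) : avg(A, 0.0), visits(A, 0) {}

    // Fold one (action, reward) sample into the running average.
    void record(std::size_t a, double rew) {
        assert(a < avg.size());
        ++timesteps;
        ++visits[a];
        // Incremental mean: new = old + (x - old) / n
        avg[a] += (rew - avg[a]) / visits[a];
    }

    // Set all averages and counts back to zero.
    void reset() {
        avg.assign(avg.size(), 0.0);
        visits.assign(visits.size(), 0);
        timesteps = 0;
    }
};
```

With three actions, recording rewards 1.0 and 0.0 for action 0 leaves its average at 0.5 and its visit count at 2, while actions that were never selected keep an average of 0 and a count of 0.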

Member Typedef Documentation

◆ VisitsTable

using AIToolbox::Bandit::Experience::VisitsTable = std::vector<unsigned long>

Constructor & Destructor Documentation

◆ Experience()

AIToolbox::Bandit::Experience::Experience ( size_t  A)

Basic constructor.

Parameters
A    The size of the action space.

Member Function Documentation

◆ getA()

size_t AIToolbox::Bandit::Experience::getA ( ) const

This function returns the size of the action space.

Returns
The size of the action space.

◆ getM2Matrix()

const Vector& AIToolbox::Bandit::Experience::getM2Matrix ( ) const

This function returns the estimated squared distance of the samples from the mean.

The returned values estimate sum_i (x_i - mean_x)^2 for the rewards of each action. Note that these values only have meaning when the respective action has at least 2 samples.

Returns
A reference to the estimated squared distance from the mean.
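
A quantity of the form sum_i (x_i - mean_x)^2 can be maintained online with Welford's algorithm, and dividing it by the sample count minus one gives the unbiased sample variance. The sketch below shows that technique in isolation. `M2Sketch` and `sampleVariance` are hypothetical names for this example, and whether AIToolbox uses this exact update internally is an assumption, not something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of AIToolbox.
struct M2Sketch {
    std::vector<double> mean;           // running average reward per action
    std::vector<double> m2;             // running sum of squared deviations per action
    std::vector<unsigned long> visits;  // number of samples per action

    explicit M2Sketch(std::size_t A) : mean(A, 0.0), m2(A, 0.0), visits(A, 0) {}

    // Welford's update: adjusts mean and M2 in one pass, without storing samples.
    void record(std::size_t a, double rew) {
        assert(a < mean.size());
        ++visits[a];
        const double delta = rew - mean[a];  // deviation from the old mean
        mean[a] += delta / visits[a];
        m2[a] += delta * (rew - mean[a]);    // second factor uses the updated mean
    }

    // Unbiased sample variance; requires at least 2 samples for the action.
    double sampleVariance(std::size_t a) const {
        assert(a < mean.size() && visits[a] >= 2);
        return m2[a] / (visits[a] - 1);
    }
};
```

For rewards 1, 3 and 5 on a single action, the mean is 3, the accumulated M2 is 8, and the sample variance is 8 / (3 - 1) = 4. With only one sample M2 is 0 and the division is undefined, which is why the values are only meaningful from the second sample onward.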

◆ getRewardMatrix()

const QFunction& AIToolbox::Bandit::Experience::getRewardMatrix ( ) const

This function returns a reference to the internal reward matrix.

The reward matrix contains the current average rewards computed for each action.

Returns
A reference to the internal reward matrix.

◆ getTimesteps()

unsigned long AIToolbox::Bandit::Experience::getTimesteps ( ) const

This function returns the number of times the record function has been called.

Returns
The number of recorded timesteps.

◆ getVisitsTable()

const VisitsTable& AIToolbox::Bandit::Experience::getVisitsTable ( ) const

This function returns a reference to the counts for the actions.

Returns
A reference to the counts of the actions.

◆ record()

void AIToolbox::Bandit::Experience::record ( size_t  a, double  rew )

This function updates the reward matrix and counts.

Parameters
a      The action taken.
rew    The reward obtained.

◆ reset()

void AIToolbox::Bandit::Experience::reset ( )

This function resets the QFunction and counts to zero.

