AIToolbox
A library that offers tools for AI problem solving.
This class computes averages and counts for a Bandit problem.
#include <AIToolbox/Bandit/Experience.hpp>
Public Types | |
using | VisitsTable = std::vector< unsigned long > |
Public Member Functions | |
Experience (size_t A) | |
Basic constructor. | |
void | record (size_t a, double rew) |
This function updates the reward matrix and counts. | |
void | reset () |
This function resets the QFunction and counts to zero. | |
unsigned long | getTimesteps () const |
This function returns the number of times the record function has been called. | |
const QFunction & | getRewardMatrix () const |
This function returns a reference to the internal reward matrix. | |
const VisitsTable & | getVisitsTable () const |
This function returns a reference to the visit counts of the actions. | |
const Vector & | getM2Matrix () const |
This function returns the estimated squared distance of the samples from the mean. | |
size_t | getA () const |
This function returns the size of the action space. | |
This class can be used to compute the averages and counts for all actions in a Bandit problem over time.
using AIToolbox::Bandit::Experience::VisitsTable = std::vector<unsigned long> |
AIToolbox::Bandit::Experience::Experience | ( | size_t | A | ) |
Basic constructor.
A | The size of the action space. |
size_t AIToolbox::Bandit::Experience::getA | ( | ) | const |
This function returns the size of the action space.
const Vector& AIToolbox::Bandit::Experience::getM2Matrix | ( | ) | const |
This function returns the estimated squared distance of the samples from the mean.
For each action, the returned value estimates sum_i (x_i - mean_x)^2 over the rewards recorded for that action. Note that a value is only meaningful once the respective action has at least 2 samples.
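As an illustration of how these values might be consumed, the sketch below turns per-action M2 values and visit counts into unbiased sample variances by dividing by (count - 1). The helper sampleVariances is hypothetical and not part of AIToolbox, and it uses std::vector<double> as a stand-in for the library's Vector type. Actions with fewer than 2 samples are reported as NaN, in line with the note above.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper for illustration; NOT part of AIToolbox.
// m2[a]     : sum of squared deviations from the mean for action a
// visits[a] : number of samples recorded for action a
std::vector<double> sampleVariances(const std::vector<double> & m2,
                                    const std::vector<unsigned long> & visits) {
    assert(m2.size() == visits.size());

    // Default every entry to NaN: the variance is undefined below 2 samples.
    std::vector<double> out(m2.size(), std::numeric_limits<double>::quiet_NaN());

    for (std::size_t a = 0; a < m2.size(); ++a) {
        if (visits[a] >= 2) {
            // Unbiased sample variance: M2 / (n - 1).
            out[a] = m2[a] / static_cast<double>(visits[a] - 1);
        }
    }
    return out;
}
```

With the real class, the inputs would presumably come from getM2Matrix() and getVisitsTable(), after copying or adapting the Vector to whatever container the helper expects.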
const QFunction& AIToolbox::Bandit::Experience::getRewardMatrix | ( | ) | const |
This function returns a reference to the internal reward matrix.
The reward matrix contains the current average rewards computed for each action.
unsigned long AIToolbox::Bandit::Experience::getTimesteps | ( | ) | const |
This function returns the number of times the record function has been called.
const VisitsTable& AIToolbox::Bandit::Experience::getVisitsTable | ( | ) | const |
This function returns a reference to the visit counts of the actions.
void AIToolbox::Bandit::Experience::record | ( | size_t | a, |
double | rew | ||
) |
This function updates the reward matrix and counts.
a | The action taken. |
rew | The reward obtained. |
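To make the bookkeeping concrete, here is a minimal sketch of what a record-style update could look like, assuming a Welford-style online update for the average and the sum of squared deviations. ExperienceSketch and its public fields are hypothetical stand-ins written for this example; the source does not state how AIToolbox implements record internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for illustration; NOT the AIToolbox implementation.
struct ExperienceSketch {
    std::vector<double> mean;           // running average reward per action
    std::vector<double> m2;             // sum of squared deviations per action
    std::vector<unsigned long> visits;  // samples recorded per action
    unsigned long timesteps = 0;        // total number of record() calls

    explicit ExperienceSketch(std::size_t A)
        : mean(A, 0.0), m2(A, 0.0), visits(A, 0) {}

    // Assumed update rule (Welford's online algorithm).
    void record(std::size_t a, double rew) {
        assert(a < mean.size());
        ++timesteps;
        ++visits[a];

        const double delta = rew - mean[a];               // distance from old mean
        mean[a] += delta / static_cast<double>(visits[a]); // incremental average
        m2[a] += delta * (rew - mean[a]);                 // uses old and new mean
    }
};
```

An online update of this kind needs no stored history: each call touches only the entries for action a, so the cost per sample is constant regardless of how many rewards have been recorded.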
void AIToolbox::Bandit::Experience::reset | ( | ) |
This function resets the QFunction and counts to zero.