AIToolbox
A library that offers tools for AI problem solving.
This class computes averages and counts for a Bandit problem.
#include <AIToolbox/Bandit/Experience.hpp>
Public Types | |
| using | VisitsTable = std::vector< unsigned long > |
Public Member Functions | |
| Experience (size_t A) | |
| Basic constructor. | |
| void | record (size_t a, double rew) |
| This function updates the reward matrix and counts. | |
| void | reset () |
| This function resets the reward matrix (a QFunction) and counts to zero. | |
| unsigned long | getTimesteps () const |
| This function returns the number of times the record function has been called. | |
| const QFunction & | getRewardMatrix () const |
| This function returns a reference to the internal reward matrix. | |
| const VisitsTable & | getVisitsTable () const |
| This function returns a reference to the counts for the actions. | |
| const Vector & | getM2Matrix () const |
| This function returns the estimated squared distance of the samples from the mean. | |
| size_t | getA () const |
| This function returns the size of the action space. | |
This class can be used to compute the averages and counts for all actions in a Bandit problem over time.
| using AIToolbox::Bandit::Experience::VisitsTable = std::vector<unsigned long> |
| AIToolbox::Bandit::Experience::Experience | ( | size_t | A | ) |
Basic constructor.
| A | The size of the action space. |
| size_t AIToolbox::Bandit::Experience::getA | ( | ) | const |
This function returns the size of the action space.
| const Vector& AIToolbox::Bandit::Experience::getM2Matrix | ( | ) | const |
This function returns the estimated squared distance of the samples from the mean.
The returned values estimate sum_i (x_i - mean_x)^2 for the rewards of each action. Note that these values are only meaningful when the respective action has at least 2 samples.
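As an illustration of why at least 2 samples are needed, the sketch below keeps a per-action count, mean and M2 with Welford's online update and then derives the sample variance as M2 / (n - 1). This is an assumption made for illustration: the documentation above does not state how AIToolbox computes these values, and `RunningStats`, `add` and `sampleVariance` are hypothetical names that are not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, NOT the AIToolbox implementation: tracks, per action,
// the sample count, the running mean and M2 = sum_i (x_i - mean_x)^2.
struct RunningStats {
    std::vector<unsigned long> counts;
    std::vector<double> means;
    std::vector<double> m2;

    explicit RunningStats(std::size_t A) : counts(A, 0), means(A, 0.0), m2(A, 0.0) {}

    // Welford's online update for one new reward of action a.
    void add(std::size_t a, double rew) {
        assert(a < counts.size());
        ++counts[a];
        const double delta = rew - means[a];          // distance from the old mean
        means[a] += delta / static_cast<double>(counts[a]);
        m2[a] += delta * (rew - means[a]);            // uses the updated mean
    }

    // Unbiased sample variance; dividing by (n - 1) requires n >= 2.
    double sampleVariance(std::size_t a) const {
        assert(a < counts.size() && counts[a] >= 2);
        return m2[a] / static_cast<double>(counts[a] - 1);
    }
};
```

With a single sample the divisor n - 1 is zero, which is one way to read the note that the M2 values only carry meaning from the second sample onward.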
| const QFunction& AIToolbox::Bandit::Experience::getRewardMatrix | ( | ) | const |
This function returns a reference to the internal reward matrix.
The reward matrix contains the current average rewards computed for each action.
| unsigned long AIToolbox::Bandit::Experience::getTimesteps | ( | ) | const |
This function returns the number of times the record function has been called.
| const VisitsTable& AIToolbox::Bandit::Experience::getVisitsTable | ( | ) | const |
This function returns a reference to the counts for the actions.
| void AIToolbox::Bandit::Experience::record | ( | size_t | a, | double | rew | ) |
This function updates the reward matrix and counts.
| a | The action taken. |
| rew | The reward obtained. |
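The bookkeeping described for record can be pictured with a small stand-in class. This is only a sketch under stated assumptions: `SketchExperience` is a hypothetical name, it uses `std::vector<double>` in place of the library's QFunction, and the incremental-average formula is one common way to maintain a running mean, not necessarily the one AIToolbox uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in that mirrors the documented interface; it is NOT the
// AIToolbox class and stores averages in a std::vector<double>.
class SketchExperience {
    public:
        explicit SketchExperience(std::size_t A)
            : q_(A, 0.0), visits_(A, 0), timesteps_(0) {}

        // Fold one observed reward into the running average of action a.
        void record(std::size_t a, double rew) {
            assert(a < q_.size());
            ++timesteps_;   // one more call to record
            ++visits_[a];   // one more sample for this action
            // new_mean = old_mean + (rew - old_mean) / n
            q_[a] += (rew - q_[a]) / static_cast<double>(visits_[a]);
        }

        unsigned long getTimesteps() const { return timesteps_; }
        const std::vector<double> & getRewardMatrix() const { return q_; }
        const std::vector<unsigned long> & getVisitsTable() const { return visits_; }

    private:
        std::vector<double> q_;
        std::vector<unsigned long> visits_;
        unsigned long timesteps_;
};
```

In this sketch every call to record increments exactly one visit count, so the value returned by getTimesteps equals the sum of the entries in the visits table.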
| void AIToolbox::Bandit::Experience::reset | ( | ) |
This function resets the reward matrix (a QFunction) and counts to zero.