AIToolbox
A library that offers tools for AI problem solving.
Experience.hpp
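#ifndef AI_TOOLBOX_BANDIT_EXPERIENCE_HEADER_FILE
#define AI_TOOLBOX_BANDIT_EXPERIENCE_HEADER_FILE

#include <AIToolbox/Bandit/Types.hpp>

namespace AIToolbox::Bandit {
    // Computes averages and counts for a Bandit problem.
    class Experience {
        public:
            using VisitsTable = std::vector<unsigned long>;

            // Basic constructor; A is the size of the action space.
            Experience(size_t A);

            // Updates the reward matrix and counts with a new sample.
            void record(size_t a, double rew);

            // Resets the QFunction and counts to zero.
            void reset();

            // Returns the number of times record() has been called.
            unsigned long getTimesteps() const;

            // Returns a reference to the internal reward matrix.
            const QFunction & getRewardMatrix() const;

            // Returns a reference to the per-action visit counts.
            const VisitsTable & getVisitsTable() const;

            // Returns the estimated squared distance of the samples from the mean.
            const Vector & getM2Matrix() const;

            // Returns the size of the action space.
            size_t getA() const;

        private:
            QFunction q_;
            Vector M2s_;
            VisitsTable counts_;
            unsigned long timesteps_;
    };
}

#endif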
AIToolbox::Bandit::Experience::getVisitsTable
const VisitsTable & getVisitsTable() const
This function returns a reference to the per-action visit counts.
AIToolbox::Bandit::QFunction
Vector QFunction
Definition: Types.hpp:16
AIToolbox::Bandit::Experience::VisitsTable
std::vector< unsigned long > VisitsTable
Definition: Experience.hpp:15
AIToolbox::Vector
Eigen::Matrix< double, Eigen::Dynamic, 1 > Vector
Definition: Types.hpp:16
AIToolbox::Bandit::Experience::getM2Matrix
const Vector & getM2Matrix() const
This function returns, for each action, the estimated sum of squared distances of the reward samples from their mean.
AIToolbox::Bandit::Experience::getRewardMatrix
const QFunction & getRewardMatrix() const
This function returns a reference to the internal reward matrix.
AIToolbox::Bandit
Definition: Experience.hpp:6
AIToolbox::Bandit::Experience::getA
size_t getA() const
This function returns the size of the action space.
AIToolbox::Bandit::Experience::reset
void reset()
This function resets the QFunction and counts to zero.
AIToolbox::Bandit::Experience::Experience
Experience(size_t A)
Basic constructor.
Types.hpp
AIToolbox::Bandit::Experience
This class computes averages and counts for a Bandit problem.
Definition: Experience.hpp:13
AIToolbox::Bandit::Experience::record
void record(size_t a, double rew)
This function updates the reward matrix and counts.
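The listing shows only the declaration of `record`, so the update rule itself is not visible here. One common way to maintain a running mean and M2 from a single sample at a time is Welford's online algorithm. The sketch below is a stand-in built on `std::vector`, not the AIToolbox implementation; the class name `ExperienceSketch` and its accessor names are hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Stand-in for illustration only; NOT the AIToolbox implementation.
// Tracks, per action, the running mean reward, the sum of squared
// deviations from the mean (M2), and the visit count.
class ExperienceSketch {
    public:
        explicit ExperienceSketch(std::size_t A)
            : q_(A, 0.0), M2s_(A, 0.0), counts_(A, 0), timesteps_(0) {}

        // Welford's online update for action a with reward rew.
        void record(std::size_t a, double rew) {
            ++timesteps_;
            ++counts_[a];
            const double delta = rew - q_[a];      // deviation from the old mean
            q_[a] += delta / static_cast<double>(counts_[a]);
            M2s_[a] += delta * (rew - q_[a]);      // second factor uses the new mean
        }

        double mean(std::size_t a) const { return q_[a]; }
        double m2(std::size_t a) const { return M2s_[a]; }
        unsigned long visits(std::size_t a) const { return counts_[a]; }
        unsigned long timesteps() const { return timesteps_; }

    private:
        std::vector<double> q_, M2s_;
        std::vector<unsigned long> counts_;
        unsigned long timesteps_;
};
```

Recording rewards 1 and 3 for the same action leaves a mean of 2 and an M2 of 2, which matches the direct computation (1 - 2)^2 + (3 - 2)^2. The update needs no stored history, which is why a per-action mean, M2, and count are enough state.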
AIToolbox::Bandit::Experience::getTimesteps
unsigned long getTimesteps() const
This function returns the number of times the record function has been called.