AIToolbox
A library that offers tools for AI problem solving.
ThompsonSamplingPolicy.hpp
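#ifndef AI_TOOLBOX_BANDIT_THOMPSON_SAMPLING_POLICY_HEADER_FILE
#define AI_TOOLBOX_BANDIT_THOMPSON_SAMPLING_POLICY_HEADER_FILE

#include <random>

// Also depends on Types.hpp, Experience.hpp and PolicyInterface.hpp;
// their include lines are not shown in this listing.

namespace AIToolbox::Bandit {
    // This class implements a Thompson sampling policy.
    class ThompsonSamplingPolicy : public PolicyInterface {
        public:
            // Basic constructor.
            ThompsonSamplingPolicy(const Experience & exp);

            // Chooses an action using Thompson sampling.
            virtual size_t sampleAction() const override;

            // Returns the probability of taking the specified action.
            virtual double getActionProbability(const size_t & a) const override;

            // Returns a vector containing all probabilities of the policy.
            virtual Vector getPolicy() const override;

            // Returns a reference to the underlying Experience we use.
            const Experience & getExperience() const;

        private:
            const Experience & exp_;
    };
}

#endif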
AIToolbox::Bandit::ThompsonSamplingPolicy::getPolicy
virtual Vector getPolicy() const override
This function returns a vector containing all probabilities of the policy.
AIToolbox::Bandit::ThompsonSamplingPolicy::ThompsonSamplingPolicy
ThompsonSamplingPolicy(const Experience &exp)
Basic constructor.
AIToolbox::Bandit::ThompsonSamplingPolicy::sampleAction
virtual size_t sampleAction() const override
This function chooses an action using Thompson sampling.
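As a rough illustration of what choosing an action by Thompson sampling involves, the sketch below draws one value per arm from a posterior over that arm's mean reward and picks the arm with the largest draw. It is a minimal sketch under stated assumptions, not the library's implementation: `ArmStats` and `thompsonSample` are hypothetical names, and the Gaussian posterior with standard deviation 1/sqrt(count + 1) is an assumption made purely for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for the per-arm statistics an Experience would hold.
struct ArmStats {
    double mean;          // empirical average reward of the arm
    unsigned long count;  // number of times the arm was pulled
};

// Draws one value per arm from an assumed Gaussian posterior and returns the
// index of the arm with the largest draw. Expects a non-empty `arms`.
template <typename Rng>
std::size_t thompsonSample(const std::vector<ArmStats> & arms, Rng & rng) {
    std::size_t best = 0;
    double bestDraw = 0.0;
    for (std::size_t a = 0; a < arms.size(); ++a) {
        // Assumption: posterior spread shrinks as 1/sqrt(count + 1).
        const double stddev =
            1.0 / std::sqrt(static_cast<double>(arms[a].count) + 1.0);
        std::normal_distribution<double> posterior(arms[a].mean, stddev);
        const double draw = posterior(rng);
        if (a == 0 || draw > bestDraw) {
            best = a;
            bestDraw = draw;
        }
    }
    return best;
}
```

Arms that have been pulled rarely have wide posteriors and so occasionally win the draw, which is how the method balances exploration against exploitation.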
AIToolbox::Bandit::ThompsonSamplingPolicy
This class implements a Thompson sampling policy.
Definition: ThompsonSamplingPolicy.hpp:19
AIToolbox::Bandit::PolicyInterface
Simple typedef for most of a normal Bandit's policy needs.
Definition: PolicyInterface.hpp:11
AIToolbox::Bandit::ThompsonSamplingPolicy::getExperience
const Experience & getExperience() const
This function returns a reference to the underlying Experience we use.
AIToolbox::Vector
Eigen::Matrix< double, Eigen::Dynamic, 1 > Vector
Definition: Types.hpp:16
AIToolbox::Bandit
Definition: Experience.hpp:6
AIToolbox::Bandit::Experience
This class computes averages and counts for a Bandit problem.
Definition: Experience.hpp:13
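To make "averages and counts" concrete, here is a minimal sketch of a per-arm tracker that updates a running mean incrementally. `RunningStats` and its member names are hypothetical and chosen for illustration; they are not the interface of `AIToolbox::Bandit::Experience`, which this page does not show.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical miniature of a bandit experience tracker.
class RunningStats {
    public:
        explicit RunningStats(std::size_t arms)
            : means_(arms, 0.0), counts_(arms, 0) {}

        // Folds a new reward into the running average of arm `a`
        // without storing the individual rewards.
        void record(std::size_t a, double reward) {
            ++counts_[a];
            means_[a] += (reward - means_[a]) / static_cast<double>(counts_[a]);
        }

        double mean(std::size_t a) const { return means_[a]; }
        unsigned long count(std::size_t a) const { return counts_[a]; }

    private:
        std::vector<double> means_;         // average reward per arm
        std::vector<unsigned long> counts_; // number of pulls per arm
};
```

The incremental update keeps memory constant per arm, which is all a sampling policy needs if it only reads means and pull counts.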
AIToolbox::Bandit::ThompsonSamplingPolicy::getActionProbability
virtual double getActionProbability(const size_t &a) const override
This function returns the probability of taking the specified action.
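The probability that Thompson sampling picks a given action generally has no simple closed form, so one common approach is to estimate it by repeating the sampling step many times and counting how often each arm wins. The sketch below does that; whether the library computes `getActionProbability` or `getPolicy` this way is not stated on this page. `estimatePolicy` is a hypothetical helper, and the Gaussian posterior with standard deviation 1/sqrt(count + 1) is an assumption made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Estimates, by repeated sampling, how often each arm wins the Thompson draw.
// `means` and `counts` are hypothetical per-arm statistics of equal length.
template <typename Rng>
std::vector<double> estimatePolicy(const std::vector<double> & means,
                                   const std::vector<unsigned long> & counts,
                                   unsigned trials, Rng & rng) {
    std::vector<double> probs(means.size(), 0.0);
    if (means.empty() || trials == 0) return probs;

    for (unsigned t = 0; t < trials; ++t) {
        std::size_t best = 0;
        double bestDraw = 0.0;
        for (std::size_t a = 0; a < means.size(); ++a) {
            // Assumption: posterior spread shrinks as 1/sqrt(count + 1).
            const double stddev =
                1.0 / std::sqrt(static_cast<double>(counts[a]) + 1.0);
            std::normal_distribution<double> posterior(means[a], stddev);
            const double draw = posterior(rng);
            if (a == 0 || draw > bestDraw) {
                best = a;
                bestDraw = draw;
            }
        }
        probs[best] += 1.0;  // tally the winning arm for this trial
    }

    // Turn win counts into relative frequencies.
    for (double & p : probs) p /= static_cast<double>(trials);
    return probs;
}
```

The entry at index `a` of the returned vector plays the role of a single action probability, and the whole vector plays the role of the policy; more trials give a tighter estimate at a higher cost.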