AIToolbox
A library that offers tools for AI problem solving.
SuccessiveRejectsPolicy.hpp
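#ifndef SUCCESSIVE_REJECTS_POLICY_HEADER_FILE
#define SUCCESSIVE_REJECTS_POLICY_HEADER_FILE

#include <AIToolbox/Types.hpp>
#include <AIToolbox/Bandit/Experience.hpp>
#include <AIToolbox/Bandit/PolicyInterface.hpp>

namespace AIToolbox::Bandit {
    // This class implements the successive rejects algorithm.
    class SuccessiveRejectsPolicy : public PolicyInterface {
        public:
            // Basic constructor.
            SuccessiveRejectsPolicy(const Experience & experience, unsigned budget);

            // Selects the current action to explore.
            virtual size_t sampleAction() const override;

            // Updates the current phase, nK_, and prunes actions from the pool.
            void stepUpdateQ();

            // Returns whether a single action remains in the pool.
            bool canRecommendAction() const;

            // If the pool has a single element, returns the best estimated action.
            size_t recommendAction() const;

            // Returns the current phase.
            size_t getCurrentPhase() const;

            // Returns the nK_ for the current phase.
            size_t getCurrentNk() const;

            // Returns the nK_ for the previous phase.
            size_t getPreviousNk() const;

            // Fairly useless for SR; returns either 1.0 or 0.0.
            virtual double getActionProbability(const size_t & a) const override;

            // Probably should not be called.
            virtual Vector getPolicy() const override;

            // Returns a reference to the underlying Experience we use.
            const Experience & getExperience() const;

        private:
            void updateNks();

            const Experience & exp_;
            unsigned budget_;

            unsigned currentPhase_;
            size_t currentActionId_;
            unsigned currentArmPulls_;

            unsigned nKOld_, nKNew_;
            double logBarK_;
            std::vector<size_t> availableActions_;
    };
}

#endif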
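The private members logBarK_, nKOld_, and nKNew_ suggest the usual Successive Rejects pull schedule: with K arms and a budget of n pulls, each arm still in the pool at the end of phase k should have been pulled n_k = ceil((n - K) / (logBar(K) * (K + 1 - k))) times in total, where logBar(K) = 1/2 + sum over i = 2..K of 1/i. The sketch below computes that schedule under this assumption. The names logBar and cumulativePulls are hypothetical helpers, not AIToolbox functions, and the library's updateNks() may differ in detail.

```cpp
#include <cassert>
#include <cmath>

// logBar(K) = 1/2 + sum_{i=2}^{K} 1/i (hypothetical helper, not AIToolbox API).
double logBar(unsigned K) {
    double value = 0.5;
    for (unsigned i = 2; i <= K; ++i)
        value += 1.0 / i;
    return value;
}

// Cumulative number of pulls that every surviving arm should have received
// by the end of phase k, for k in 1 .. K-1 (hypothetical helper).
unsigned cumulativePulls(unsigned K, unsigned budget, unsigned k) {
    assert(K >= 2 && budget >= K && k >= 1 && k < K);
    const double nk = (budget - K) / (logBar(K) * (K + 1 - k));
    return static_cast<unsigned>(std::ceil(nk));
}
```

With 4 arms and a budget of 100, this gives cumulative targets of 16, 21, and 31 pulls per surviving arm, so 99 pulls are used in total. The number of pulls per arm inside one phase is the difference between consecutive targets, which is presumably why the class keeps both the old and the new value.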
AIToolbox::Bandit::SuccessiveRejectsPolicy::canRecommendAction
bool canRecommendAction() const
This function returns whether a single action remains in the pool.
AIToolbox::Bandit::SuccessiveRejectsPolicy::getPolicy
virtual Vector getPolicy() const override
This function probably should not be called, but otherwise is what you would expect given the current...
AIToolbox::Bandit::SuccessiveRejectsPolicy::sampleAction
virtual size_t sampleAction() const override
This function selects the current action to explore.
AIToolbox::Bandit::SuccessiveRejectsPolicy::recommendAction
size_t recommendAction() const
If the pool has a single element, this function returns the best estimated action after the SR explor...
Experience.hpp
AIToolbox::Bandit::PolicyInterface
Simple typedef for most of a normal Bandit's policy needs.
Definition: PolicyInterface.hpp:11
AIToolbox::Bandit::SuccessiveRejectsPolicy::getExperience
const Experience & getExperience() const
This function returns a reference to the underlying Experience we use.
AIToolbox::Vector
Eigen::Matrix< double, Eigen::Dynamic, 1 > Vector
Definition: Types.hpp:16
AIToolbox::Bandit::SuccessiveRejectsPolicy::getCurrentNk
size_t getCurrentNk() const
This function returns the nK_ for the current phase.
AIToolbox::Bandit
Definition: Experience.hpp:6
AIToolbox::Bandit::SuccessiveRejectsPolicy::getCurrentPhase
size_t getCurrentPhase() const
This function returns the current phase.
PolicyInterface.hpp
Types.hpp
AIToolbox::Bandit::SuccessiveRejectsPolicy::SuccessiveRejectsPolicy
SuccessiveRejectsPolicy(const Experience &experience, unsigned budget)
Basic constructor.
AIToolbox::Bandit::SuccessiveRejectsPolicy::stepUpdateQ
void stepUpdateQ()
This function updates the current phase, nK_, and prunes actions from the pool.
AIToolbox::Bandit::SuccessiveRejectsPolicy::getPreviousNk
size_t getPreviousNk() const
This function returns the nK_ for the previous phase.
AIToolbox::Bandit::SuccessiveRejectsPolicy
This class implements the successive rejects algorithm.
Definition: SuccessiveRejectsPolicy.hpp:28
AIToolbox::Bandit::Experience
This class computes averages and counts for a Bandit problem.
Definition: Experience.hpp:13
AIToolbox::Bandit::SuccessiveRejectsPolicy::getActionProbability
virtual double getActionProbability(const size_t &a) const override
This function is fairly useless for SR, but it returns either 1.0 or 0.0 depending on which action is...
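The member descriptions above outline a loop: sample the action to explore, record its reward, update the phase, and recommend once a single action remains in the pool. The following standalone sketch shows that loop for the Successive Rejects algorithm in general. It does not use AIToolbox; the function successiveRejects and its pull callback are hypothetical names chosen for illustration, and the library's internals may be organized differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Standalone sketch of Successive Rejects (not the AIToolbox implementation).
// pull(a) returns one reward sample for arm a. Requires budget >= K >= 2.
std::size_t successiveRejects(std::size_t K, unsigned budget,
                              const std::function<double(std::size_t)> & pull) {
    assert(K >= 2 && budget >= K);

    double logBarK = 0.5;
    for (std::size_t i = 2; i <= K; ++i)
        logBarK += 1.0 / i;

    std::vector<double> sums(K, 0.0);
    std::vector<std::size_t> pool(K);   // arms that have not been rejected yet
    for (std::size_t a = 0; a < K; ++a)
        pool[a] = a;

    unsigned nKOld = 0;
    for (std::size_t phase = 1; phase < K; ++phase) {
        const unsigned nKNew = static_cast<unsigned>(
            std::ceil((budget - K) / (logBarK * (K + 1 - phase))));

        // Pull every surviving arm (nKNew - nKOld) more times.
        for (std::size_t a : pool)
            for (unsigned i = nKOld; i < nKNew; ++i)
                sums[a] += pull(a);

        // Every surviving arm now has nKNew pulls, so comparing reward sums
        // is equivalent to comparing averages. Reject the worst arm.
        std::size_t worst = 0;
        for (std::size_t j = 1; j < pool.size(); ++j)
            if (sums[pool[j]] < sums[pool[worst]])
                worst = j;
        pool.erase(pool.begin() + static_cast<std::ptrdiff_t>(worst));

        nKOld = nKNew;
    }
    return pool.front();   // the single arm left in the pool
}
```

One arm is rejected per phase, so after K - 1 phases exactly one arm is left, which matches the condition that canRecommendAction() is described as checking.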