AIToolbox
A library that offers tools for AI problem solving.
LLRPolicy.hpp
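#ifndef AI_TOOLBOX_FACTORED_BANDIT_LEARNING_WITH_LINEAR_REWARDS_POLICY_HEADER_FILE
#define AI_TOOLBOX_FACTORED_BANDIT_LEARNING_WITH_LINEAR_REWARDS_POLICY_HEADER_FILE

// The #include directives and documentation comments are not shown in this listing.

namespace AIToolbox::Factored::Bandit {
    class LLRPolicy : public PolicyInterface {
        public:
            // Basic constructor.
            LLRPolicy(const Experience & exp);

            // Selects an action using LLR.
            virtual Action sampleAction() const override;

            // Returns the probability of taking the specified action.
            virtual double getActionProbability(const Action & a) const override;

            // Returns the Experience we use to learn.
            const Experience & getExperience() const;

        private:
            // Held by reference: the Experience must outlive this policy.
            const Experience & exp_;
            unsigned L;
    };
}

#endif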
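sampleAction() is documented as selecting an action using LLR, but the listing does not show how. As a rough illustration only, and not AIToolbox's implementation, the sketch below assumes the index usually given for Learning with Linear Rewards: each local entry is scored as its average reward plus sqrt((L + 1) * ln(t) / n), and the joint action with the highest total score is chosen. Every name here (`LocalFactor`, `llrIndex`, `localOffset`, `selectJointAction`) is hypothetical, and the brute-force enumeration is a stand-in for whatever maximization the library actually performs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-ins; none of these names come from AIToolbox.
using JointAction = std::vector<std::size_t>;

struct LocalFactor {
    std::vector<std::size_t> agents;  // indices of the agents in this factor's scope
    std::vector<double> meanReward;   // average observed reward per local joint action
    std::vector<unsigned> counts;     // number of observations per local joint action
};

// Assumed LLR index for one entry: mean + sqrt((L + 1) * ln(t) / n), with t >= 1.
// Entries that were never tried get +infinity so they are explored first.
double llrIndex(double mean, unsigned n, unsigned long t, unsigned L) {
    if (n == 0) return std::numeric_limits<double>::infinity();
    return mean + std::sqrt((L + 1) * std::log(static_cast<double>(t)) / n);
}

// Flatten the part of `a` inside the factor's scope into a table offset
// (the first agent in the scope varies fastest).
std::size_t localOffset(const LocalFactor & f, const JointAction & a,
                        const std::vector<std::size_t> & actionCounts) {
    std::size_t offset = 0, stride = 1;
    for (auto agent : f.agents) {
        offset += a[agent] * stride;
        stride *= actionCounts[agent];
    }
    return offset;
}

// Brute-force argmax of the summed indexes over every joint action.
// Ties are resolved in favor of the first joint action enumerated.
JointAction selectJointAction(const std::vector<LocalFactor> & factors,
                              const std::vector<std::size_t> & actionCounts,
                              unsigned long t, unsigned L) {
    JointAction current(actionCounts.size(), 0), best = current;
    double bestValue = -std::numeric_limits<double>::infinity();
    while (true) {
        double value = 0.0;
        for (const auto & f : factors) {
            const auto i = localOffset(f, current, actionCounts);
            value += llrIndex(f.meanReward[i], f.counts[i], t, L);
        }
        if (value > bestValue) { bestValue = value; best = current; }

        // Advance `current` like an odometer; stop once every digit has wrapped.
        std::size_t agent = 0;
        while (agent < current.size() && ++current[agent] == actionCounts[agent])
            current[agent++] = 0;
        if (agent == current.size()) break;
    }
    return best;
}
```

Calling `selectJointAction(factors, actionCounts, t, L)` with one `LocalFactor` per group of interacting agents returns one action index per agent. Enumeration grows exponentially with the number of agents, so this form is only practical for very small problems.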
AIToolbox::Factored::Bandit::LLRPolicy
This class represents the Learning with Linear Rewards algorithm.
Definition: LLRPolicy.hpp:31

AIToolbox::Factored::Bandit::LLRPolicy::LLRPolicy
LLRPolicy(const Experience &exp)
Basic constructor.

AIToolbox::Factored::Bandit::LLRPolicy::sampleAction
virtual Action sampleAction() const override
This function selects an action using LLR.

AIToolbox::Factored::Bandit::LLRPolicy::getActionProbability
virtual double getActionProbability(const Action &a) const override
This function returns the probability of taking the specified action.

AIToolbox::Factored::Bandit::LLRPolicy::getExperience
const Experience & getExperience() const
This function returns the Experience we use to learn.

AIToolbox::Factored::Bandit::Experience
This class computes averages and counts for a multi-agent cooperative Bandit problem.
Definition: Experience.hpp:14

AIToolbox::Factored::Bandit::PolicyInterface
Simple typedef for most of a normal Bandit's policy needs.
Definition: PolicyInterface.hpp:11

AIToolbox::Factored::Action
Factors Action
Definition: Types.hpp:69

AIToolbox::Factored::Bandit
Definition: GraphUtils.hpp:12

Referenced headers: Types.hpp, FilterMap.hpp, Experience.hpp, PolicyInterface.hpp
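The class stores its Experience as a `const Experience &` member and exposes getActionProbability() alongside sampleAction(). The sketch below illustrates both points with toy types. It assumes the selection is deterministic given the current data, which this listing does not state, so the probability is 1 for the selected action and 0 for any other. `ToyExperience` and `GreedyToyPolicy` are hypothetical names, and plain greedy selection is swapped in for LLR to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not AIToolbox types.
using Action = std::vector<std::size_t>;

struct ToyExperience {
    std::vector<Action> actions;     // candidate joint actions
    std::vector<double> meanReward;  // running average reward for each candidate
};

class GreedyToyPolicy {
    public:
        // Stores a reference, so `exp` must outlive the policy.
        explicit GreedyToyPolicy(const ToyExperience & exp) : exp_(exp) {}

        // Deterministic choice: the first candidate with the highest average.
        Action sampleAction() const {
            std::size_t best = 0;
            for (std::size_t i = 1; i < exp_.meanReward.size(); ++i)
                if (exp_.meanReward[i] > exp_.meanReward[best]) best = i;
            return exp_.actions[best];
        }

        // With a deterministic choice, the probability is 1 for the chosen
        // action and 0 for every other action.
        double getActionProbability(const Action & a) const {
            return a == sampleAction() ? 1.0 : 0.0;
        }

        const ToyExperience & getExperience() const { return exp_; }

    private:
        const ToyExperience & exp_;
};
```

Because the policy holds a reference rather than a copy, rewards recorded in the experience object after the policy is constructed change what `sampleAction()` returns on the next call. The same property means the experience object must not be destroyed while the policy is still in use.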