AIToolbox
A library that offers tools for AI problem solving.
This class implements a simple greedy policy.
#include <AIToolbox/Bandit/Policies/QGreedyPolicy.hpp>
Public Member Functions
| | QGreedyPolicy (const QFunction &q) | Basic constructor. |
| virtual size_t | sampleAction () const override | This function chooses the greediest action. |
| virtual double | getActionProbability (const size_t &a) const override | This function returns the probability of taking the specified action. |
| virtual Vector | getPolicy () const override | This function returns a vector containing all probabilities of the policy. |
Public Member Functions inherited from AIToolbox::PolicyInterface< void, void, size_t >
| | PolicyInterface (void s, size_t a) | Basic constructor. |
| virtual | ~PolicyInterface () | Basic virtual destructor. |
| virtual size_t | sampleAction (const void &s) const=0 | This function chooses a random action for state s, following the policy distribution. |
| virtual double | getActionProbability (const void &s, const size_t &a) const=0 | This function returns the probability of taking the specified action in the specified state. |
| const void & | getS () const | This function returns the number of states of the world. |
| const size_t & | getA () const | This function returns the number of available actions to the agent. |
Additional Inherited Members
Public Types inherited from AIToolbox::Bandit::PolicyInterface
| using | Base = AIToolbox::PolicyInterface< void, void, size_t > |
Protected Attributes inherited from AIToolbox::PolicyInterface< void, void, size_t >
| void | S |
| size_t | A |
| RandomEngine | rand_ |
This class implements a simple greedy policy.
This class always selects the greediest action with respect to the experience gathered so far.
AIToolbox::Bandit::QGreedyPolicy::QGreedyPolicy (const QFunction & q)
Basic constructor.
| q | The QFunction to act upon. |
virtual double AIToolbox::Bandit::QGreedyPolicy::getActionProbability (const size_t & a) const [override, virtual]
This function returns the probability of taking the specified action.
If several actions are tied for greediest, sampleAction() returns one of them at random, so this function reports the corresponding probability of picking each tied action.
| a | The selected action. |
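The tie handling described above can be illustrated with a minimal, library-independent sketch. It assumes that ties are broken uniformly, which this page does not state explicitly, and it uses a plain `std::vector<double>` of action values in place of the library's QFunction. The name `greedyActionProbability` is hypothetical and is not part of AIToolbox.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of AIToolbox.
// Returns the probability that a greedy policy selects action a,
// ASSUMING ties between greedy actions are broken uniformly.
double greedyActionProbability(const std::vector<double>& q, std::size_t a) {
    assert(a < q.size());
    // Value of the greediest action(s).
    const double best = *std::max_element(q.begin(), q.end());
    // Non-greedy actions are never selected.
    if (q[a] != best) return 0.0;
    // Each of the tied greedy actions gets an equal share.
    const auto ties = std::count(q.begin(), q.end(), best);
    return 1.0 / static_cast<double>(ties);
}
```

With action values `{0.1, 0.7, 0.7, 0.3}`, this sketch assigns probability 0.5 to actions 1 and 2 and 0 to the others.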
virtual Vector AIToolbox::Bandit::QGreedyPolicy::getPolicy () const [override, virtual]
This function returns a vector containing all probabilities of the policy.
Ideally, call this function only when the same policy values need to be accessed repeatedly and efficiently.
Implements AIToolbox::Bandit::PolicyInterface.
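As a rough illustration of why a full probability vector helps with repeated lookups, the sketch below computes all probabilities in one pass so that later reads are plain index accesses. It is not the library's implementation: `greedyPolicyVector` is a hypothetical name, a `std::vector<double>` stands in for both the QFunction and the returned Vector, and uniform tie-breaking is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of AIToolbox.
// Builds the whole greedy policy at once: tied greedy actions share the
// probability mass equally (an assumption), all other actions get 0.
std::vector<double> greedyPolicyVector(const std::vector<double>& q) {
    assert(!q.empty());
    const double best = *std::max_element(q.begin(), q.end());
    const auto ties = std::count(q.begin(), q.end(), best);
    std::vector<double> p(q.size(), 0.0);
    for (std::size_t a = 0; a < q.size(); ++a)
        if (q[a] == best) p[a] = 1.0 / static_cast<double>(ties);
    return p;
}
```

A caller that needs many probabilities for the same values can compute the vector once and index into it, instead of rescanning the action values on every query.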
virtual size_t AIToolbox::Bandit::QGreedyPolicy::sampleAction () const [override, virtual]
This function chooses the greediest action.
If multiple actions are equally greedy, a random one of them is returned.
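Greedy selection with random tie-breaking can be sketched as follows. This is a standalone illustration rather than the library's code: `greedyActions` and `sampleGreedy` are hypothetical names, the action values are a plain `std::vector<double>`, the random engine is a `std::mt19937` passed in by the caller, and the tie-break is assumed to be uniform.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper, not part of AIToolbox.
// Returns the indices of all actions whose value equals the maximum of q.
std::vector<std::size_t> greedyActions(const std::vector<double>& q) {
    assert(!q.empty());
    const double best = *std::max_element(q.begin(), q.end());
    std::vector<std::size_t> ties;
    for (std::size_t a = 0; a < q.size(); ++a)
        if (q[a] == best) ties.push_back(a);
    return ties;
}

// Hypothetical helper, not part of AIToolbox.
// Picks one of the tied greedy actions uniformly at random (an assumption).
std::size_t sampleGreedy(const std::vector<double>& q, std::mt19937& rng) {
    const std::vector<std::size_t> ties = greedyActions(q);
    std::uniform_int_distribution<std::size_t> pick(0, ties.size() - 1);
    return ties[pick(rng)];
}
```

When a single action has the highest value, the sketch always returns it; randomness only matters when two or more actions share the maximum.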