AIToolbox
A library that offers tools for AI problem solving.
This class represents the base interface for epsilon policies in games and bandits.
#include <AIToolbox/EpsilonPolicyInterface.hpp>
Public Types | |
using | Base = PolicyInterface< void, void, Action > |
Public Member Functions | |
EpsilonPolicyInterface (const Base &p, double epsilon=0.1) | |
Basic constructor.
virtual Action | sampleAction () const override |
This function chooses an action, following the policy distribution and epsilon.
virtual double | getActionProbability (const Action &a) const override |
This function returns the probability of taking the specified action.
void | setEpsilon (double e) |
This function sets the epsilon parameter.
double | getEpsilon () const |
This function will return the currently set epsilon parameter.
Public Member Functions inherited from AIToolbox::PolicyInterface< void, void, Action > | |
PolicyInterface (Action a) | |
Basic constructor.
virtual | ~PolicyInterface () |
Basic virtual destructor.
const Action & | getA () const |
This function returns the number of available actions to the agent.
Protected Member Functions | |
virtual Action | sampleRandomAction () const =0 |
This function returns a random action in the Action space.
virtual double | getRandomActionProbability () const =0 |
This function returns the probability of picking a random action.
Protected Attributes | |
const Base & | policy_ |
double | epsilon_ |
Protected Attributes inherited from AIToolbox::PolicyInterface< void, void, Action > | |
Action | A |
RandomEngine | rand_ |
This class represents the base interface for epsilon policies in games and bandits.
This specialization simply removes the states from the EpsilonPolicyInterface, since in normal games and bandits there is no state, and we can simplify the interface. The rest is the same.
Action | This defines the type that is used to handle actions. |
using AIToolbox::EpsilonPolicyInterface< void, void, Action >::Base = PolicyInterface<void, void, Action> |
AIToolbox::EpsilonPolicyInterface< void, void, Action >::EpsilonPolicyInterface | ( | const Base & | p, | double | epsilon = 0.1 | ) |
Basic constructor.
This constructor saves the input policy and the epsilon parameter for later use.
The epsilon parameter must be >= 0.0 and <= 1.0, otherwise the constructor will throw an std::invalid_argument.
p | The policy that is being extended. |
epsilon | The parameter that controls the amount of exploration. |
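virtual double AIToolbox::EpsilonPolicyInterface< void, void, Action >::getActionProbability | ( | const Action & | a | ) | const |
override virtual |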
This function returns the probability of taking the specified action.
This function takes into account parameter epsilon while computing the final probability.
a | The selected action. |
Implements AIToolbox::PolicyInterface< void, void, Action >.
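As a rough illustration of how epsilon could enter the final probability, the sketch below mixes the two cases described on this page: with probability epsilon the action comes from random selection, otherwise from the wrapped policy. The free function `mixedActionProbability` and its parameter names are hypothetical and are not part of AIToolbox; the library's actual implementation may differ.

```cpp
#include <cassert>

// Hypothetical helper, not an AIToolbox function.
//   epsilon    - exploration probability, assumed to lie in [0, 1]
//   policyProb - probability the wrapped policy assigns to the action
//   randomProb - probability of the action under random selection
double mixedActionProbability(double epsilon, double policyProb, double randomProb) {
    assert(epsilon >= 0.0 && epsilon <= 1.0);
    // Random branch weighted by epsilon, policy branch by (1 - epsilon).
    return epsilon * randomProb + (1.0 - epsilon) * policyProb;
}
```

Under this reading, epsilon = 0 returns the wrapped policy's probability unchanged, and epsilon = 1 returns the random-selection probability.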
double AIToolbox::EpsilonPolicyInterface< void, void, Action >::getEpsilon | ( | ) | const |
This function will return the currently set epsilon parameter.
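virtual double AIToolbox::EpsilonPolicyInterface< void, void, Action >::getRandomActionProbability | ( | ) | const |
protected pure virtual |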
This function returns the probability of picking a random action.
This is simply one over the size of the action space, but since the action space may not be a single number, we leave it to the implementation to decide how best to compute this.
Implemented in AIToolbox::Factored::Bandit::EpsilonPolicy.
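To make the "may not be a single number" remark concrete, here is a minimal sketch that assumes the action space is described by a vector holding the number of choices for each action factor. That representation, and the helper name `uniformJointActionProbability`, are assumptions made for illustration only; they are not taken from the AIToolbox sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: factorSizes[i] is the number of choices for factor i.
// A uniformly random joint action then has probability 1 / (product of sizes).
double uniformJointActionProbability(const std::vector<std::size_t> & factorSizes) {
    double jointActions = 1.0;
    for (const auto size : factorSizes) {
        assert(size > 0);  // every factor is assumed to offer at least one choice
        jointActions *= static_cast<double>(size);
    }
    return 1.0 / jointActions;
}
```

With a single factor this reduces to the familiar one over the number of actions.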
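virtual Action AIToolbox::EpsilonPolicyInterface< void, void, Action >::sampleAction | ( | ) | const |
override virtual |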
This function chooses an action, following the policy distribution and epsilon.
This function has a probability of epsilon of selecting a random action. Otherwise, it selects an action according to the distribution specified by the wrapped policy.
Implements AIToolbox::PolicyInterface< void, void, Action >.
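The selection rule described above can be sketched as a stand-alone function. Everything named here (`sampleWithEpsilon`, the two callables, the engine parameter) is hypothetical and only mirrors the documented behaviour; it is not the library's code.

```cpp
#include <cassert>
#include <random>

// Hypothetical stand-alone version of the rule: explore with probability
// epsilon, otherwise defer to the wrapped policy's own sampler.
// Both callables are assumed to return the same action type.
template <typename PolicySampler, typename RandomSampler, typename Engine>
auto sampleWithEpsilon(double epsilon, PolicySampler && samplePolicy,
                       RandomSampler && sampleRandom, Engine & rand) {
    assert(epsilon >= 0.0 && epsilon <= 1.0);
    std::bernoulli_distribution explore(epsilon);  // true with probability epsilon
    return explore(rand) ? sampleRandom() : samplePolicy();
}
```

A caller would pass something like a `std::mt19937` engine plus two lambdas, one sampling from the wrapped policy and one sampling uniformly from the action space.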
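virtual Action AIToolbox::EpsilonPolicyInterface< void, void, Action >::sampleRandomAction | ( | ) | const |
protected pure virtual |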
This function returns a random action in the Action space.
Implemented in AIToolbox::Factored::Bandit::EpsilonPolicy.
void AIToolbox::EpsilonPolicyInterface< void, void, Action >::setEpsilon | ( | double | e | ) |
This function sets the epsilon parameter.
The epsilon parameter determines the amount of exploration this policy will enforce when selecting actions. In particular, actions are going to be selected randomly with probability epsilon, and are going to be selected following the underlying policy with probability 1-epsilon.
The epsilon parameter must be >= 0.0 and <= 1.0, otherwise the function will throw std::invalid_argument.
e | The new epsilon parameter. |
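The range check documented for setEpsilon (and for the constructor) might look roughly like the sketch below. `EpsilonHolder` is a hypothetical stand-in class written for this example, not the AIToolbox class itself, and the exception message is invented.

```cpp
#include <stdexcept>

// Hypothetical stand-in that only stores and validates epsilon.
class EpsilonHolder {
    public:
        // The constructor reuses the setter so both paths share one check.
        explicit EpsilonHolder(double epsilon = 0.1) { setEpsilon(epsilon); }

        void setEpsilon(double e) {
            // Written as a negated range test so that NaN is rejected as well.
            if (!(e >= 0.0 && e <= 1.0))
                throw std::invalid_argument("Epsilon must be >= 0.0 and <= 1.0");
            epsilon_ = e;
        }

        double getEpsilon() const { return epsilon_; }

    private:
        double epsilon_;
};
```

In this sketch a rejected value leaves the previously stored epsilon untouched, since the assignment happens only after the check passes.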
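double AIToolbox::EpsilonPolicyInterface< void, void, Action >::epsilon_ |
protected |
const Base& AIToolbox::EpsilonPolicyInterface< void, void, Action >::policy_ |
protected |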