#include <AIToolbox/Bandit/Policies/EpsilonPolicy.hpp>
◆ EpsilonBase
◆ EpsilonPolicy()
AIToolbox::Bandit::EpsilonPolicy::EpsilonPolicy(const PolicyInterface & p, double epsilon = 0.1)
Basic constructor.
This constructor saves the input policy and the epsilon parameter for later use.
The epsilon parameter must be >= 0.0 and <= 1.0; otherwise the constructor throws std::invalid_argument.
- Parameters
- p: The policy that is being extended.
- epsilon: The parameter that controls the amount of exploration.
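The epsilon-greedy mechanism this constructor sets up can be sketched without the library. The `GreedyStub` policy and `EpsilonGreedy` wrapper below are hypothetical stand-ins, illustrating only the documented contract: epsilon must lie in [0, 1], and with probability epsilon a random action is taken instead of the wrapped policy's choice.

```cpp
#include <cstddef>
#include <random>
#include <stdexcept>

// Hypothetical stand-in for a base PolicyInterface: always picks action 0.
struct GreedyStub {
    size_t sampleAction() const { return 0; }
};

// Minimal epsilon-greedy wrapper mirroring the constructor contract above:
// epsilon must lie in [0.0, 1.0], otherwise std::invalid_argument is thrown.
class EpsilonGreedy {
    public:
        EpsilonGreedy(const GreedyStub & p, size_t actions, double epsilon = 0.1)
                : policy_(p), epsilon_(epsilon),
                  randomDistribution_(0, actions - 1)
        {
            if (epsilon < 0.0 || epsilon > 1.0)
                throw std::invalid_argument("epsilon must be in [0, 1]");
        }

        size_t sampleAction(std::mt19937 & rng) const {
            std::bernoulli_distribution explore(epsilon_);
            if (explore(rng)) return randomDistribution_(rng); // explore uniformly
            return policy_.sampleAction();                     // exploit wrapped policy
        }

    private:
        const GreedyStub & policy_;
        double epsilon_;
        mutable std::uniform_int_distribution<size_t> randomDistribution_;
};
```

With epsilon = 0.0 the wrapper degenerates to the underlying policy; with epsilon = 1.0 it is a purely uniform policy.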
◆ getPolicy()
virtual Vector AIToolbox::Bandit::EpsilonPolicy::getPolicy() const [override, virtual]
This function returns a vector containing all probabilities of the policy.
Note that this may be expensive to compute, and it needs to allocate a new Vector on each call, so it should not be called often.
Ideally, call this function only when there is a repeated need to access the same policy values in an efficient manner.
Implements AIToolbox::Bandit::PolicyInterface.
◆ getRandomActionProbability()
virtual double AIToolbox::Bandit::EpsilonPolicy::getRandomActionProbability() const [override, protected, virtual]
This function returns the probability of picking a random action.
- Returns
- The probability of picking an action at random.
◆ sampleRandomAction()
virtual size_t AIToolbox::Bandit::EpsilonPolicy::sampleRandomAction() const [override, protected, virtual]
This function returns a random action in the Action space.
- Returns
- A valid random action.
◆ randomDistribution_
std::uniform_int_distribution<size_t> AIToolbox::Bandit::EpsilonPolicy::randomDistribution_ [mutable, protected]
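A `std::uniform_int_distribution<size_t>` over [0, A-1] assigns each of the A actions equal probability 1/A, which matches the role of the protected `randomDistribution_` member used by `sampleRandomAction()`. A minimal standalone sketch (the function name is illustrative, not part of the library):

```cpp
#include <cstddef>
#include <random>

// Sample one of A actions uniformly at random, as sampleRandomAction()
// presumably does via its stored uniform_int_distribution member.
size_t sampleUniformAction(std::mt19937 & rng, size_t A) {
    std::uniform_int_distribution<size_t> dist(0, A - 1);
    return dist(rng);
}
```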
The documentation for this class was generated from the following file:
- AIToolbox/Bandit/Policies/EpsilonPolicy.hpp