AIToolbox
A library that offers tools for AI problem solving.
This class represents the Policy Iteration algorithm.
#include <AIToolbox/MDP/Algorithms/PolicyIteration.hpp>
Public Member Functions
PolicyIteration (unsigned horizon, double tolerance=0.001)
    Basic constructor.
template<IsModel M> QFunction operator() (const M &m)
    Applies policy iteration to an MDP to solve it.
void setTolerance (double t)
    Sets the tolerance parameter.
void setHorizon (unsigned h)
    Sets the horizon parameter.
double getTolerance () const
    Returns the currently set tolerance parameter.
unsigned getHorizon () const
    Returns the currently set horizon parameter.
This algorithm begins with an arbitrary (random) policy and uses the PolicyEvaluation algorithm to compute the value of each state under that policy.
Once this is done, the policy is improved by acting greedily with respect to the resulting QFunction. The improved policy is then evaluated again, and the process repeats.
When the policy no longer changes, it is guaranteed to be optimal, and the QFunction found is returned.
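The evaluate-then-improve loop described above can be sketched in plain C++. This is a minimal illustration under stated assumptions, not the library's implementation: `TinyMDP`, `QTable`, `computeQ`, `evaluatePolicy` and `policyIteration` are hypothetical names, the starting policy is fixed to action 0 rather than random so the result is reproducible, and the cap on improvement rounds is a safety measure added for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical tabular MDP for this sketch only; not an AIToolbox type.
// T[s][a][s1] is a transition probability, R[s][a][s1] the matching reward.
struct TinyMDP {
    std::size_t S = 0, A = 0;
    double discount = 0.9;
    std::vector<std::vector<std::vector<double>>> T, R;
};

using QTable = std::vector<std::vector<double>>;

// Q(s,a) = sum over s1 of T(s,a,s1) * (R(s,a,s1) + discount * V(s1)).
QTable computeQ(const TinyMDP & m, const std::vector<double> & V) {
    QTable Q(m.S, std::vector<double>(m.A, 0.0));
    for (std::size_t s = 0; s < m.S; ++s)
        for (std::size_t a = 0; a < m.A; ++a)
            for (std::size_t s1 = 0; s1 < m.S; ++s1)
                Q[s][a] += m.T[s][a][s1] * (m.R[s][a][s1] + m.discount * V[s1]);
    return Q;
}

// Evaluates a fixed policy: at most `horizon` sweeps, stopping early once
// the largest value change in a sweep is <= tolerance.
std::vector<double> evaluatePolicy(const TinyMDP & m,
                                   const std::vector<std::size_t> & policy,
                                   unsigned horizon, double tolerance) {
    assert(policy.size() == m.S);
    std::vector<double> V(m.S, 0.0);
    for (unsigned t = 0; t < horizon; ++t) {
        const QTable Q = computeQ(m, V);
        double delta = 0.0;
        for (std::size_t s = 0; s < m.S; ++s) {
            delta = std::max(delta, std::fabs(Q[s][policy[s]] - V[s]));
            V[s] = Q[s][policy[s]];
        }
        if (delta <= tolerance) break;
    }
    return V;
}

// Alternates evaluation and greedy improvement until the policy is stable.
QTable policyIteration(const TinyMDP & m, unsigned horizon, double tolerance) {
    std::vector<std::size_t> policy(m.S, 0);  // arbitrary start: always action 0
    QTable Q;
    for (unsigned round = 0; round < 1000; ++round) {  // safety cap (assumption)
        Q = computeQ(m, evaluatePolicy(m, policy, horizon, tolerance));
        bool changed = false;
        for (std::size_t s = 0; s < m.S; ++s) {
            std::size_t best = policy[s];
            // Switch only on a strict improvement so ties keep the old action.
            for (std::size_t a = 0; a < m.A; ++a)
                if (Q[s][a] > Q[s][best] + 1e-9) best = a;
            if (best != policy[s]) { policy[s] = best; changed = true; }
        }
        if (!changed) break;  // stable policy: return its Q-table
    }
    return Q;
}
```

Requiring a strict improvement before switching actions keeps tied actions from flipping back and forth, which is what lets the "policy did not change" test terminate the loop.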
AIToolbox::MDP::PolicyIteration::PolicyIteration (unsigned horizon, double tolerance = 0.001)
Basic constructor.
Parameters
    horizon    The horizon parameter to use during the PolicyEvaluation phase.
    tolerance  The tolerance parameter to use during the PolicyEvaluation phase.
unsigned AIToolbox::MDP::PolicyIteration::getHorizon () const
This function returns the currently set horizon parameter.
double AIToolbox::MDP::PolicyIteration::getTolerance () const
This function returns the currently set tolerance parameter.
template<IsModel M> QFunction AIToolbox::MDP::PolicyIteration::operator() (const M & m)
This function applies policy iteration on an MDP to solve it.
void AIToolbox::MDP::PolicyIteration::setHorizon (unsigned h)
This function sets the horizon parameter.
void AIToolbox::MDP::PolicyIteration::setTolerance (double t)
This function sets the tolerance parameter.
The tolerance parameter must be >= 0 or the function will throw.
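The validation rule above can be illustrated with a small stand-alone sketch. `ToleranceSetting` and `toleranceExample` are hypothetical names, not part of AIToolbox, and the use of `std::invalid_argument` is an assumption: the documentation only states that the function throws.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the tolerance handling; not the AIToolbox class.
class ToleranceSetting {
public:
    explicit ToleranceSetting(double t = 0.001) { setTolerance(t); }

    // Rejects negative values and leaves the stored tolerance untouched.
    // The exception type is an assumption; the text only says it throws.
    void setTolerance(double t) {
        if (t < 0.0) throw std::invalid_argument("Tolerance must be >= 0");
        tolerance_ = t;
    }

    double getTolerance() const { return tolerance_; }

private:
    double tolerance_ = 0.0;
};

// Usage: zero is accepted, a negative value throws and changes nothing.
void toleranceExample() {
    ToleranceSetting setting;
    setting.setTolerance(0.0);
    assert(setting.getTolerance() == 0.0);

    bool threw = false;
    try {
        setting.setTolerance(-1.0);
    } catch (const std::invalid_argument &) {
        threw = true;
    }
    assert(threw);
    assert(setting.getTolerance() == 0.0);
}
```

Checking the argument before assigning it means a failed call leaves the previously set tolerance in place.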