AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::PolicyIteration Class Reference

This class represents the Policy Iteration algorithm.

#include <AIToolbox/MDP/Algorithms/PolicyIteration.hpp>

Public Member Functions

 PolicyIteration (unsigned horizon, double tolerance=0.001)
 Basic constructor.
 
template<IsModel M>
QFunction operator() (const M &m)
 This function applies policy iteration on an MDP to solve it.
 
void setTolerance (double t)
 This function sets the tolerance parameter.
 
void setHorizon (unsigned h)
 This function sets the horizon parameter.
 
double getTolerance () const
 This function returns the currently set tolerance parameter.
 
unsigned getHorizon () const
 This function returns the currently set horizon parameter.
 

Detailed Description

This class represents the Policy Iteration algorithm.

This algorithm begins with an arbitrary (random) policy and uses the PolicyEvaluation algorithm to compute the value of each state under that policy.

Once this is done, the policy is improved by acting greedily with respect to the resulting QFunction. The new policy is then evaluated in turn, and the process repeats.

When the policy no longer changes, it is guaranteed to be optimal, and the QFunction found is returned.
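The evaluate/improve loop described above can be sketched on a small tabular MDP. This is a minimal, self-contained illustration and not the AIToolbox implementation: `TinyMDP`, `Table2D`, `Table3D`, `backup`, `evaluatePolicy`, `policyIterationSketch`, `exampleMDP` and the `maxImprovements` cap are all hypothetical names introduced here, and the sketch starts from a fixed policy (action 0 everywhere) rather than a random one.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in types for illustration; these are not AIToolbox types.
using Table2D = std::vector<std::vector<double>>; // [state][action]
using Table3D = std::vector<Table2D>;             // [state][action][next state]

struct TinyMDP {
    std::size_t S;   // number of states
    std::size_t A;   // number of actions
    double discount; // discount factor, assumed to be in [0, 1)
    Table3D T;       // T[s][a][s1]: transition probability
    Table3D R;       // R[s][a][s1]: reward for that transition
};

// Expected return of taking action a in state s, then collecting values V.
double backup(const TinyMDP& m, const std::vector<double>& V,
              std::size_t s, std::size_t a) {
    double q = 0.0;
    for (std::size_t s1 = 0; s1 < m.S; ++s1)
        q += m.T[s][a][s1] * (m.R[s][a][s1] + m.discount * V[s1]);
    return q;
}

// Approximate policy evaluation: sweep until the largest value change is
// within `tolerance`, or until `horizon` sweeps have been performed.
std::vector<double> evaluatePolicy(const TinyMDP& m,
                                   const std::vector<std::size_t>& policy,
                                   unsigned horizon, double tolerance) {
    std::vector<double> V(m.S, 0.0);
    for (unsigned t = 0; t < horizon; ++t) {
        std::vector<double> next(m.S, 0.0);
        double delta = 0.0;
        for (std::size_t s = 0; s < m.S; ++s) {
            next[s] = backup(m, V, s, policy[s]);
            delta = std::max(delta, std::fabs(next[s] - V[s]));
        }
        V.swap(next);
        if (delta <= tolerance) break;
    }
    return V;
}

// Policy iteration: evaluate the current policy, improve it greedily, and
// stop once no state changes its action. `maxImprovements` is a safety cap
// added for this sketch only.
Table2D policyIterationSketch(const TinyMDP& m, unsigned horizon,
                              double tolerance,
                              unsigned maxImprovements = 1000) {
    assert(tolerance >= 0.0);
    std::vector<std::size_t> policy(m.S, 0); // arbitrary starting policy
    Table2D Q(m.S, std::vector<double>(m.A, 0.0));
    for (unsigned i = 0; i < maxImprovements; ++i) {
        const std::vector<double> V = evaluatePolicy(m, policy, horizon, tolerance);
        bool changed = false;
        for (std::size_t s = 0; s < m.S; ++s) {
            for (std::size_t a = 0; a < m.A; ++a)
                Q[s][a] = backup(m, V, s, a);
            // Keep the current action on (near-)ties to avoid flip-flopping.
            std::size_t best = policy[s];
            for (std::size_t a = 0; a < m.A; ++a)
                if (Q[s][a] > Q[s][best] + 1e-12) best = a;
            if (best != policy[s]) { policy[s] = best; changed = true; }
        }
        if (!changed) break; // policy is stable
    }
    return Q;
}

// Two-state example: state 1 pays 2 per step for staying (action 0);
// state 0 pays 1 for moving to state 1 (action 1).
TinyMDP exampleMDP() {
    Table3D T = {{{1.0, 0.0}, {0.0, 1.0}}, {{0.0, 1.0}, {1.0, 0.0}}};
    Table3D R = {{{0.0, 0.0}, {0.0, 1.0}}, {{0.0, 2.0}, {0.0, 0.0}}};
    return TinyMDP{2, 2, 0.5, T, R};
}
```

Calling `policyIterationSketch(exampleMDP(), 1000, 1e-9)` on this toy problem converges after one improvement step, with state 0 preferring action 1 and state 1 preferring action 0. Keeping the current action on ties is a design choice of the sketch to help the stopping test trigger reliably.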

Constructor & Destructor Documentation

◆ PolicyIteration()

AIToolbox::MDP::PolicyIteration::PolicyIteration ( unsigned horizon, double tolerance = 0.001 )

Basic constructor.

Parameters
horizon    The horizon parameter to use during the PolicyEvaluation phase.
tolerance  The tolerance parameter to use during the PolicyEvaluation phase.

Member Function Documentation

◆ getHorizon()

unsigned AIToolbox::MDP::PolicyIteration::getHorizon ( ) const

This function returns the currently set horizon parameter.

◆ getTolerance()

double AIToolbox::MDP::PolicyIteration::getTolerance ( ) const

This function returns the currently set tolerance parameter.

◆ operator()()

template<IsModel M>
QFunction AIToolbox::MDP::PolicyIteration::operator() ( const M &  m)

This function applies policy iteration on an MDP to solve it.

The PolicyEvaluation phase of the algorithm is bounded by the currently set horizon and tolerance parameters.

Parameters
m  The MDP that needs to be solved.
Returns
The QFunction of the optimal policy found.
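
Since the return value is a QFunction rather than a policy, a caller typically takes the best action per state to recover the policy. The following is a minimal sketch of that step, assuming the Q-values are laid out as one row per state and one column per action; `QTable`, `greedyPolicy` and `stateValues` are hypothetical names used only for illustration, with a plain nested vector standing in for the library's QFunction type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Q-function: rows are states, columns are actions.
using QTable = std::vector<std::vector<double>>;

// For each state, pick the action with the highest Q-value
// (the lowest action index wins on ties).
std::vector<std::size_t> greedyPolicy(const QTable& q) {
    std::vector<std::size_t> policy(q.size(), 0);
    for (std::size_t s = 0; s < q.size(); ++s) {
        assert(!q[s].empty());
        for (std::size_t a = 1; a < q[s].size(); ++a)
            if (q[s][a] > q[s][policy[s]]) policy[s] = a;
    }
    return policy;
}

// The value of each state under the greedy policy: V(s) = max_a Q(s, a).
std::vector<double> stateValues(const QTable& q) {
    const std::vector<std::size_t> policy = greedyPolicy(q);
    std::vector<double> values(q.size(), 0.0);
    for (std::size_t s = 0; s < q.size(); ++s)
        values[s] = q[s][policy[s]];
    return values;
}
```

For a table such as `{{1.5, 3.0}, {4.0, 1.5}}`, `greedyPolicy` selects action 1 in state 0 and action 0 in state 1.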

◆ setHorizon()

void AIToolbox::MDP::PolicyIteration::setHorizon ( unsigned  h)

This function sets the horizon parameter.

◆ setTolerance()

void AIToolbox::MDP::PolicyIteration::setTolerance ( double  t)

This function sets the tolerance parameter.

The tolerance parameter must be >= 0 or the function will throw.
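
The validation described above can be illustrated with a small stand-alone class. This is a sketch only: `ToleranceHolder` is a hypothetical name, and the use of `std::invalid_argument` is an assumption, since the text above does not state which exception type is thrown.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical illustration class; not the AIToolbox implementation.
class ToleranceHolder {
public:
    explicit ToleranceHolder(double t = 0.001) { setTolerance(t); }

    // Rejects negative tolerances. The exception type is an assumption.
    void setTolerance(double t) {
        if (t < 0.0) throw std::invalid_argument("Tolerance must be >= 0");
        tolerance_ = t;
    }

    double getTolerance() const { return tolerance_; }

private:
    double tolerance_ = 0.0;
};
```

In this sketch a rejected value leaves the previously stored tolerance untouched, because the check runs before the assignment.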

