This class applies the value iteration algorithm on a Model.

#include <AIToolbox/MDP/Algorithms/ValueIteration.hpp>

Public Member Functions
	ValueIteration (unsigned horizon, double tolerance=0.001, ValueFunction v={Values(), Actions(0)})
	Basic constructor.

template<IsModel M>
std::tuple< double, ValueFunction, QFunction >	operator() (const M &m)
	This function applies value iteration on an MDP to solve it.

void	setTolerance (double e)
	This function sets the tolerance parameter.

void	setHorizon (unsigned h)
	This function sets the horizon parameter.

void	setValueFunction (ValueFunction v)
	This function sets the starting value function.

double	getTolerance () const
	This function returns the currently set tolerance parameter.

unsigned	getHorizon () const
	This function returns the currently set horizon parameter.

const ValueFunction &	getValueFunction () const
	This function returns the currently set default value function.

Detailed Description

This class applies the value iteration algorithm on a Model.

This algorithm solves an MDP model for the specified horizon, or in fewer iterations if it converges earlier.

The algorithm iteratively computes the ValueFunction of the MDP's optimal policy. The first iteration yields the ValueFunction for horizon 1, the second the one for horizon 2, and so on. The process repeats until the ValueFunction has converged to within the requested tolerance, or until the requested horizon is reached.
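The loop described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the AIToolbox implementation: `TinyMDP` and `valueIterationSketch` are hypothetical names, the model is stored in plain nested vectors, and the "difference between two iterations" is assumed to be the largest absolute per-state change.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <tuple>
#include <vector>

// Hypothetical dense MDP used only for this sketch (not an AIToolbox type).
struct TinyMDP {
    std::size_t S = 0;      // number of states
    std::size_t A = 0;      // number of actions
    double discount = 1.0;  // discount factor
    std::vector<std::vector<std::vector<double>>> T; // T[s][a][s1]: transition probability
    std::vector<std::vector<double>> R;              // R[s][a]: expected immediate reward
};

// Returns (variation of the last iteration, state values, Q-values).
std::tuple<double, std::vector<double>, std::vector<std::vector<double>>>
valueIterationSketch(const TinyMDP & m, unsigned horizon, double tolerance) {
    std::vector<double> v(m.S, 0.0); // start from all zeroes
    std::vector<std::vector<double>> q(m.S, std::vector<double>(m.A, 0.0));

    // A tolerance of 0.0 disables the convergence check entirely.
    const bool useTolerance = tolerance > 0.0;
    // Stays infinite if no iteration is performed (horizon == 0).
    double variation = std::numeric_limits<double>::infinity();

    for (unsigned t = 0; t < horizon && (!useTolerance || variation >= tolerance); ++t) {
        variation = 0.0;
        std::vector<double> next(m.S, 0.0);
        for (std::size_t s = 0; s < m.S; ++s) {
            for (std::size_t a = 0; a < m.A; ++a) {
                // Bellman backup: immediate reward plus discounted expected value.
                double expected = 0.0;
                for (std::size_t s1 = 0; s1 < m.S; ++s1)
                    expected += m.T[s][a][s1] * v[s1];
                q[s][a] = m.R[s][a] + m.discount * expected;
            }
            next[s] = *std::max_element(q[s].begin(), q[s].end());
            variation = std::max(variation, std::fabs(next[s] - v[s]));
        }
        v = next;
    }
    return std::make_tuple(variation, v, q);
}
```

After `k` iterations without early stopping, `v` holds the optimal values for horizon `k`, which mirrors the horizon-by-horizon description above.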

This implementation is a simplified port of the one in the MATLAB MDPToolbox.

Constructor & Destructor Documentation

◆ ValueIteration()

AIToolbox::MDP::ValueIteration::ValueIteration	(	unsigned	horizon,
		double	tolerance = `0.001`,
		ValueFunction	v = `{Values(), Actions(0)}`
	)

Basic constructor.

The tolerance parameter must be >= 0.0; otherwise the constructor throws a std::runtime_error. The tolerance sets the convergence criterion. A tolerance of 0.0 forces ValueIteration to perform exactly as many iterations as the specified horizon. Otherwise, ValueIteration stops as soon as the difference between two consecutive iterations is less than the specified tolerance.

Note that the size of the initial value function must match the number of states of the Model; otherwise it is ignored. An empty value function defaults to all zeroes.

Parameters

horizon	The maximum number of iterations to perform.
tolerance	The tolerance factor to stop the value iteration loop.
v	The initial value function from which to start the loop.
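The two parameter rules described for the constructor can be illustrated with a small standalone sketch. The helper names `checkedTolerance` and `startingValues` are hypothetical and not part of AIToolbox, plain `std::vector<double>` stands in for the library's value storage, and the sketch assumes that an "ignored" value function is replaced by all zeroes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: rejects negative tolerances, mirroring the documented rule.
double checkedTolerance(double tolerance) {
    if (tolerance < 0.0)
        throw std::runtime_error("Tolerance must be >= 0.0");
    return tolerance;
}

// Hypothetical helper: chooses the starting values for a model with numStates states.
// Assumption: a value function of the wrong size (including an empty one) is
// replaced by all zeroes.
std::vector<double> startingValues(const std::vector<double> & supplied,
                                   std::size_t numStates) {
    if (supplied.size() == numStates)
        return supplied;
    return std::vector<double>(numStates, 0.0);
}
```

For a model with three states, for instance, a supplied vector of two entries would be discarded in favor of three zeroes, while a vector of three entries would be used as the starting point.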

Member Function Documentation

◆ getHorizon()

unsigned AIToolbox::MDP::ValueIteration::getHorizon ( ) const

This function returns the currently set horizon parameter.

Returns: The currently set horizon parameter.

◆ getTolerance()

double AIToolbox::MDP::ValueIteration::getTolerance ( ) const

This function returns the currently set tolerance parameter.

Returns: The currently set tolerance parameter.

◆ getValueFunction()

const ValueFunction& AIToolbox::MDP::ValueIteration::getValueFunction ( ) const

This function returns the currently set default value function.

Returns: The currently set default value function.

◆ operator()()

template<IsModel M>

std::tuple< double, ValueFunction, QFunction > AIToolbox::MDP::ValueIteration::operator() ( const M & m )

This function applies value iteration on an MDP to solve it.

The algorithm is constrained by the currently set parameters.

Template Parameters

M	The type of the solvable MDP.

Parameters

m	The MDP that needs to be solved.

Returns: A tuple containing the maximum variation of the ValueFunction in the last iteration, the ValueFunction, and the QFunction for the Model.
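One common use of the returned Q-values is to read off a greedy action for each state. The sketch below shows that step in isolation; `greedyActions` is a hypothetical helper, and a nested `std::vector` indexed as `q[state][action]` stands in for the library's QFunction type, whose actual interface is not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: for each state, returns the index of the action with
// the highest Q-value (ties resolve to the lowest action index).
std::vector<std::size_t> greedyActions(const std::vector<std::vector<double>> & q) {
    std::vector<std::size_t> actions;
    actions.reserve(q.size());
    for (const auto & row : q) {
        // max_element returns an iterator to the first maximum in the row.
        const auto best = std::max_element(row.begin(), row.end());
        actions.push_back(static_cast<std::size_t>(std::distance(row.begin(), best)));
    }
    return actions;
}
```

The state values in the returned ValueFunction correspond to the per-state maxima of the same Q-values, so the greedy actions are the ones that attain them.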

◆ setHorizon()

void AIToolbox::MDP::ValueIteration::setHorizon ( unsigned h )

This function sets the horizon parameter.

Parameters

h	The new horizon parameter.

◆ setTolerance()

void AIToolbox::MDP::ValueIteration::setTolerance ( double e )

This function sets the tolerance parameter.

The tolerance parameter must be >= 0.0; otherwise this function throws a std::runtime_error. The tolerance sets the convergence criterion. A tolerance of 0.0 forces ValueIteration to perform exactly as many iterations as the specified horizon. Otherwise, ValueIteration stops as soon as the difference between two consecutive iterations is less than the specified tolerance.

Parameters

e	The new tolerance parameter.

◆ setValueFunction()

void AIToolbox::MDP::ValueIteration::setValueFunction ( ValueFunction v )

This function sets the starting value function.

An empty value function defaults to all zeroes. Note that the size of the value function must match the number of states of the Model to be solved; otherwise it is ignored.

Parameters

v	The new starting value function.

The documentation for this class was generated from the following file:

include/AIToolbox/MDP/Algorithms/ValueIteration.hpp
