This class applies the policy evaluation algorithm on a policy.

#include <AIToolbox/MDP/Algorithms/Utils/PolicyEvaluation.hpp>

Public Member Functions
	PolicyEvaluation (const M &m, unsigned horizon, double tolerance=0.001, Values v=Values())
	Basic constructor.

std::tuple< double, Values, QFunction >	operator() (const PolicyInterface &p)
	This function applies policy evaluation on a policy.

void	setTolerance (double e)
	This function sets the tolerance parameter.

void	setHorizon (unsigned h)
	This function sets the horizon parameter.

void	setValues (Values v)
	This function sets the starting value function.

double	getTolerance () const
	This function will return the currently set tolerance parameter.

unsigned	getHorizon () const
	This function will return the current horizon parameter.

const Values &	getValues () const
	This function will return the currently set default values.

Detailed Description

template<IsModel M>
class AIToolbox::MDP::PolicyEvaluation< M >

This class applies the policy evaluation algorithm on a policy.

Policy Evaluation computes the values and QFunction for a particular policy used on a given Model.

This class is set up so that it is easy to reuse on multiple policies for the same Model, so that no redundant computations have to be performed.
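As a rough illustration of what policy evaluation computes, the following self-contained sketch repeatedly applies Q(s,a) = R(s,a) + discount * sum over s1 of T(s,a,s1) * V(s1) and V(s) = Q(s, policy(s)) on a small tabular model. It does not use AIToolbox: `TinyMdp`, `Vec`, `Mat`, `evaluatePolicy` and `exampleValues` are hypothetical stand-ins, the policy is assumed to be deterministic, and the stopping rule follows the tolerance and horizon semantics documented for this class, which is not necessarily how the library implements it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-ins; NOT the AIToolbox Values / QFunction types.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // indexed [state][action]

struct TinyMdp {
    std::size_t S;        // number of states
    std::size_t A;        // number of actions
    double discount;      // discount factor in [0, 1)
    std::vector<Mat> T;   // T[s][a][s1]: transition probability
    Mat R;                // R[s][a]: expected immediate reward
};

// Evaluates a deterministic policy, where policy[s] is the action taken in s.
// Returns (max variation of V in the last sweep, V, Q).
std::tuple<double, Vec, Mat> evaluatePolicy(const TinyMdp & m,
                                            const std::vector<std::size_t> & policy,
                                            unsigned horizon, double tolerance) {
    Vec v(m.S, 0.0);
    Mat q(m.S, Vec(m.A, 0.0));
    double variation = 0.0;

    for (unsigned t = 0; t < horizon; ++t) {
        // Q(s,a) = R(s,a) + discount * sum_s1 T(s,a,s1) * V(s1)
        for (std::size_t s = 0; s < m.S; ++s) {
            for (std::size_t a = 0; a < m.A; ++a) {
                double next = 0.0;
                for (std::size_t s1 = 0; s1 < m.S; ++s1)
                    next += m.T[s][a][s1] * v[s1];
                q[s][a] = m.R[s][a] + m.discount * next;
            }
        }
        // V(s) = Q(s, policy(s)); track the largest change in this sweep.
        variation = 0.0;
        for (std::size_t s = 0; s < m.S; ++s) {
            const double updated = q[s][policy[s]];
            variation = std::max(variation, std::fabs(updated - v[s]));
            v[s] = updated;
        }
        // A tolerance of 0.0 disables early stopping.
        if (tolerance > 0.0 && variation < tolerance) break;
    }
    return std::make_tuple(variation, std::move(v), std::move(q));
}

// One state, one action, reward 1, discount 0.5: the true value is 2.
Vec exampleValues(unsigned horizon, double tolerance) {
    TinyMdp m;
    m.S = 1;
    m.A = 1;
    m.discount = 0.5;
    m.T = std::vector<Mat>(1, Mat(1, Vec(1, 1.0)));
    m.R = Mat(1, Vec(1, 1.0));
    const std::vector<std::size_t> policy(m.S, 0);
    return std::get<1>(evaluatePolicy(m, policy, horizon, tolerance));
}
```

In this toy model, `exampleValues(3, 0.0)` runs exactly three sweeps and yields 1.75, while a small positive tolerance with a generous horizon approaches the true value of 2.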

Template Parameters

M	The type of model that is solved by the algorithm.

Constructor & Destructor Documentation

◆ PolicyEvaluation()

template<IsModel M>

AIToolbox::MDP::PolicyEvaluation< M >::PolicyEvaluation	(	const M &	m,
		unsigned	horizon,
		double	tolerance = `0.001`,
		Values	v = `Values()`
	)

Basic constructor.

The tolerance parameter must be >= 0.0, otherwise the constructor will throw an std::runtime_error. The tolerance parameter sets the convergence criterion. A tolerance of 0.0 forces PolicyEvaluation to perform a number of iterations equal to the horizon specified. Otherwise, PolicyEvaluation will stop as soon as the difference between two consecutive iterations is less than the tolerance specified, or once the horizon is reached.

Note that the size of the initial value function must match the number of states of the Model; otherwise it will be ignored. An empty value function defaults to all zeroes.

Parameters

m	The MDP to evaluate a policy for.
horizon	The maximum number of iterations to perform.
tolerance	The tolerance factor to stop the policy evaluation loop.
v	The initial value function from which to start the loop.
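To make the interplay between `horizon` and `tolerance` concrete, the sketch below counts how many sweeps a loop with the stopping rule described above would perform. It is a stand-alone illustration, not AIToolbox code: `sweepsPerformed` is a hypothetical helper, and the variation is assumed to start at 1 and shrink by a factor of `discount` after every sweep.

```cpp
#include <cassert>

// Hypothetical helper: counts sweeps under the documented stopping rule,
// assuming the variation measured in sweep k equals discount^(k-1).
unsigned sweepsPerformed(unsigned horizon, double tolerance, double discount) {
    double variation = 1.0;
    unsigned sweeps = 0;
    while (sweeps < horizon) {
        ++sweeps;
        // With tolerance 0.0 only the horizon can end the loop.
        if (tolerance > 0.0 && variation < tolerance) break;
        variation *= discount;
    }
    return sweeps;
}
```

Under these assumptions, `sweepsPerformed(100, 0.1, 0.5)` stops after 5 sweeps, `sweepsPerformed(10, 0.0, 0.5)` runs all 10, and `sweepsPerformed(3, 0.1, 0.5)` is cut off by the horizon at 3.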

Member Function Documentation

◆ getHorizon()

template<IsModel M>

unsigned AIToolbox::MDP::PolicyEvaluation< M >::getHorizon

This function will return the current horizon parameter.

Returns: The currently set horizon parameter.

◆ getTolerance()

template<IsModel M>

double AIToolbox::MDP::PolicyEvaluation< M >::getTolerance

This function will return the currently set tolerance parameter.

Returns: The currently set tolerance parameter.

◆ getValues()

template<IsModel M>

const Values & AIToolbox::MDP::PolicyEvaluation< M >::getValues

This function will return the currently set default values.

Returns: The currently set default values.

◆ operator()()

template<IsModel M>

std::tuple< double, Values, QFunction > AIToolbox::MDP::PolicyEvaluation< M >::operator() ( const PolicyInterface & p )

This function applies policy evaluation on a policy.

The algorithm is constrained by the currently set parameters.

Parameters

p	The policy to be evaluated.

Returns: A tuple containing the maximum variation for the ValueFunction, the ValueFunction and the QFunction for the Model and policy.
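One way to consume a three-element result of this shape is with C++17 structured bindings. The sketch below is self-contained and does not call AIToolbox: `Vec`, `Mat`, `EvalResult` and `looksConverged` are hypothetical stand-ins built on `std::vector`, and reading a variation below the tolerance as "stopped on convergence, not on the horizon" is an interpretation of the documented stopping rule.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// Hypothetical stand-ins for the library's Values and QFunction types.
using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // one row per state
using EvalResult = std::tuple<double, Vec, Mat>;

// Returns true when the result is shaped consistently and the last
// variation fell below the tolerance.
bool looksConverged(const EvalResult & result, double tolerance) {
    // Unpack in the documented order: variation, values, Q-function.
    const auto & [variation, values, q] = result;
    // Expect one Q row per state value.
    if (values.size() != q.size()) return false;
    return variation < tolerance;
}
```

If the horizon runs out first, the variation in the result may still exceed the tolerance, so checking it is a cheap way to tell the two cases apart.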

◆ setHorizon()

template<IsModel M>

void AIToolbox::MDP::PolicyEvaluation< M >::setHorizon ( unsigned h )

This function sets the horizon parameter.

Parameters

h	The new horizon parameter.

◆ setTolerance()

template<IsModel M>

void AIToolbox::MDP::PolicyEvaluation< M >::setTolerance ( double e )

This function sets the tolerance parameter.

The tolerance parameter must be >= 0.0, otherwise the function will throw an std::runtime_error. The tolerance parameter sets the convergence criterion. A tolerance of 0.0 forces PolicyEvaluation to perform a number of iterations equal to the horizon specified. Otherwise, PolicyEvaluation will stop as soon as the difference between two consecutive iterations is less than the tolerance specified, or once the horizon is reached.

Parameters

e	The new tolerance parameter.
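The validation described above can be sketched as a setter that rejects negative input before storing it. `ToleranceHolder` and `rejects` are hypothetical names used only for illustration; the exception type mirrors the one named in this documentation, and the error message is made up.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical holder for the tolerance parameter; not the library class.
class ToleranceHolder {
    public:
        void setTolerance(double e) {
            // Reject negative values and leave the stored tolerance unchanged.
            if (e < 0.0) throw std::runtime_error("Tolerance must be >= 0");
            tolerance_ = e;
        }
        double getTolerance() const { return tolerance_; }

    private:
        double tolerance_ = 0.001;
};

// Returns true if setting `e` was rejected with std::runtime_error.
bool rejects(double e) {
    ToleranceHolder h;
    try {
        h.setTolerance(e);
    } catch (const std::runtime_error &) {
        return true;
    }
    return false;
}
```

Note that 0.0 is accepted: it is the documented way to force a fixed number of iterations.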

◆ setValues()

template<IsModel M>

void AIToolbox::MDP::PolicyEvaluation< M >::setValues ( Values v )

This function sets the starting value function.

An empty value function defaults to all zeroes. Note that the size of the starting value function must match the number of states of the Model that needs to be solved; otherwise it will be ignored.

Parameters

v	The new starting value function.
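The fallback rules for the starting value function can be sketched as follows. This is an illustration under assumptions, not the library's code: `Vec` stands in for `Values`, `startingValues` is a hypothetical helper, and an "ignored" vector is assumed to be replaced by all zeroes, the same as an empty one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;  // hypothetical stand-in for Values

// Picks the vector an evaluation loop would start from for a model with
// `numStates` states. Assumption: a mismatched vector falls back to zeroes.
Vec startingValues(const Vec & requested, std::size_t numStates) {
    // A vector of the right size is used as given.
    if (requested.size() == numStates) return requested;
    // Empty or mismatched input: start from all zeroes.
    return Vec(numStates, 0.0);
}
```

A good starting vector, for example the values obtained for a similar policy, can reduce the number of iterations needed to reach the tolerance.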

The documentation for this class was generated from the following file:

include/AIToolbox/MDP/Algorithms/Utils/PolicyEvaluation.hpp
