This class implements the Point Based Value Iteration algorithm. More...

#include <AIToolbox/POMDP/Algorithms/PBVI.hpp>

Public Member Functions
	PBVI (size_t nBeliefs, unsigned h, double tolerance)
	Basic constructor. More...

void	setTolerance (double tolerance)
	This function sets the tolerance parameter. More...

void	setHorizon (unsigned h)
	This function sets a new horizon parameter. More...

void	setBeliefSize (size_t nBeliefs)
	This function sets a new number of support beliefs. More...

double	getTolerance () const
	This function returns the currently set tolerance parameter. More...

unsigned	getHorizon () const
	This function returns the currently set horizon parameter. More...

size_t	getBeliefSize () const
	This function returns the currently set number of support beliefs to use during a solve pass. More...

template<IsModel M>
std::tuple< double, ValueFunction >	operator() (const M &model, ValueFunction v={})
	This function solves a POMDP::Model approximately. More...

template<IsModel M>
std::tuple< double, ValueFunction >	operator() (const M &model, const std::vector< Belief > &beliefs, ValueFunction v={})
	This function solves a POMDP::Model approximately. More...

Detailed Description

This class implements the Point Based Value Iteration algorithm.

The idea behind this algorithm is to solve a POMDP Model approximately. When computing a perfect solution, the main problem is pruning the resulting ValueFunction in order to contain only a parsimonious representation. What this means is that many vectors inside can be dominated by others, and so they do not add any additional information, while at the same time occupying memory and computational time.

The way this method tries to fix the problem is by solving the Model in a set of specified Beliefs. Doing so results in no need for pruning at all, since every belief uniquely identifies one of the optimal solution vectors (only uniqueness in the final set is required, but it is way cheaper than linear programming).

The Beliefs can be given as input, or stochastically computed as to cover as much as possible of the belief space, to ensure minimization of the final error. The final solution will be correct 100% in the Beliefs that have been selected, and will (possibly) undershoot in non-covered Beliefs.

In addition, the fact that we solve only for a fixed set of Beliefs guarantees that our final solution is limited in size, which is useful since even small POMDP true solutions can explode in size with high horizons, for very little gain.

There is no convergence guarantee of this method, but the error is bounded.

Constructor & Destructor Documentation

◆ PBVI()

AIToolbox::POMDP::PBVI::PBVI	(	size_t	nBeliefs,
		unsigned	h,
		double	tolerance
	)

Basic constructor.

This constructor sets the default horizon/tolerance used to solve a POMDP::Model and the number of beliefs used to approximate the ValueFunction.

Parameters

nBeliefs	The number of support beliefs to use.
h	The horizon chosen.
tolerance	The tolerance factor to stop the PBVI loop.

Member Function Documentation

◆ getBeliefSize()

size_t AIToolbox::POMDP::PBVI::getBeliefSize ( ) const

This function returns the currently set number of support beliefs to use during a solve pass.

Returns: The number of support beliefs.

◆ getHorizon()

unsigned AIToolbox::POMDP::PBVI::getHorizon ( ) const

This function returns the currently set horizon parameter.

Returns: The current horizon.

◆ getTolerance()

double AIToolbox::POMDP::PBVI::getTolerance ( ) const

This function returns the currently set tolerance parameter.

Returns: The current tolerance.

◆ operator()() [1/2]

template<IsModel M>

std::tuple< double, ValueFunction > AIToolbox::POMDP::PBVI::operator()	(	const M &	model,
		const std::vector< Belief > &	beliefs,
		ValueFunction	v = `{}`
	)

This function solves a POMDP::Model approximately.

This function uses and evaluates the input beliefs.

The final solution will only contain ValueFunctions for those Beliefs and will interpolate them for points it did not solve for. Even though the resulting solution is approximate very often it is good enough, and this comes with an incredible increase in speed.

Template Parameters

M	The type of POMDP model that needs to be solved.

Parameters

model	The POMDP model that needs to be solved.
beliefs	The list of beliefs to evaluate.
v	The ValueFunction to startup the process from, if needed.

Returns: A tuple containing the maximum variation for the ValueFunction and the computed ValueFunction.

◆ operator()() [2/2]

template<IsModel M>

std::tuple< double, ValueFunction > AIToolbox::POMDP::PBVI::operator()	(	const M &	model,
		ValueFunction	v = `{}`
	)

This function solves a POMDP::Model approximately.

This function computes a set of beliefs for which to solve the input model. The beliefs are chosen stochastically, trying to cover as much as possible of the belief space in order to offer as precise a solution as possible. The final solution will only contain ValueFunctions for those Beliefs and will interpolate them for points it did not solve for. Even though the resulting solution is approximate very often it is good enough, and this comes with an incredible increase in speed.

Note that even in the beliefs sampled the solution is not guaranteed to be optimal. This is because a solution for horizon h can only be computed with the true solution from horizon h-1. If such a solution is approximate (and it is here), then the solution for h will not be optimal by definition.

Template Parameters

M	The type of POMDP model that needs to be solved.

Parameters

model	The POMDP model that needs to be solved.
v	The ValueFunction to startup the process from, if needed.

Returns: A tuple containing the maximum variation for the ValueFunction and the computed ValueFunction.

◆ setBeliefSize()

void AIToolbox::POMDP::PBVI::setBeliefSize ( size_t nBeliefs )

This function sets a new number of support beliefs.

Parameters

nBeliefs The new number of support beliefs.

◆ setHorizon()

void AIToolbox::POMDP::PBVI::setHorizon ( unsigned h )

This function sets a new horizon parameter.

Parameters

h	The new horizon parameter.

◆ setTolerance()

void AIToolbox::POMDP::PBVI::setTolerance ( double tolerance )

This function sets the tolerance parameter.

The tolerance parameter must be >= 0.0, otherwise the function will throw an std::runtime_error. The tolerance parameter sets the convergence criterion. A tolerance of 0.0 forces PBVI to perform a number of iterations equal to the horizon specified. Otherwise, PBVI will stop as soon as the difference between two iterations is less than the tolerance specified.

Parameters

tolerance The new tolerance parameter.

The documentation for this class was generated from the following file:

include/AIToolbox/POMDP/Algorithms/PBVI.hpp

Public Member Functions

Detailed Description

Constructor & Destructor Documentation

◆ PBVI()

Member Function Documentation

◆ getBeliefSize()

◆ getHorizon()

◆ getTolerance()

◆ operator()() [1/2]

◆ operator()() [2/2]

◆ setBeliefSize()

◆ setHorizon()

◆ setTolerance()