AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::POMDP::BlindStrategies Class Reference

This class implements the blind strategies lower bound. More...

#include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>

Public Member Functions

 BlindStrategies (unsigned horizon, double tolerance=0.001)
 Basic constructor. More...
 
template<IsModel M>
std::tuple< double, VList > operator() (const M &m, bool fasterConvergence)
 This function computes the blind strategies for the input POMDP. More...
 
void setTolerance (double tolerance)
 This function sets the tolerance parameter. More...
 
void setHorizon (unsigned h)
 This function sets the horizon parameter. More...
 
double getTolerance () const
 This function returns the currently set tolerance parameter. More...
 
unsigned getHorizon () const
 This function returns the current horizon parameter. More...
 

Detailed Description

This class implements the blind strategies lower bound.

This class is useful in order to obtain a very simple lower bound for a POMDP. The values for each action assume that the agent is always going to take that same action forever afterwards.

While this bound is somewhat loose, it can be a good starting point for other algorithms as it's incredibly cheap to compute.

We return the alphavectors for all actions. It is very likely that many of the resulting alphavectors will be dominated, but we leave the pruning to the clients, as the additional per-action information may be useful to somebody (and it also makes for easier testing ;) )
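For reference, here is a minimal usage sketch (not taken from the library's examples; the model construction is assumed, and any type satisfying the IsModel concept works):

    #include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>

    // Sketch only: 'M' must satisfy the IsModel concept, e.g. an
    // AIToolbox::POMDP::Model you have built elsewhere.
    template <typename M>
    AIToolbox::POMDP::VList computeBlindLowerBound(const M & pomdp) {
        // Up to 100 iterations, stopping early once the per-iteration
        // change drops below 0.001.
        AIToolbox::POMDP::BlindStrategies bs(100, 0.001);

        // 'variation' is the residual variation over all actions at the
        // last iteration; 'bounds' holds one alphavector per action
        // (possibly dominated, pruning is left to the caller).
        auto [variation, bounds] = bs(pomdp, false);
        (void)variation;

        return bounds;
    }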

Constructor & Destructor Documentation

◆ BlindStrategies()

AIToolbox::POMDP::BlindStrategies::BlindStrategies (unsigned horizon, double tolerance = 0.001)

Basic constructor.

Parameters
horizon      The maximum number of iterations to perform.
tolerance    The tolerance factor to stop the value iteration loop.
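For illustration (a sketch of the documented defaults), the two declarations below are equivalent:

    // Uses the default tolerance of 0.001.
    AIToolbox::POMDP::BlindStrategies bsDefault(50);

    // Explicit tolerance; passing 0.0 instead would force exactly 'horizon' iterations.
    AIToolbox::POMDP::BlindStrategies bsExplicit(50, 0.001);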

Member Function Documentation

◆ getHorizon()

unsigned AIToolbox::POMDP::BlindStrategies::getHorizon ( ) const

This function returns the current horizon parameter.

Returns
The currently set horizon parameter.

◆ getTolerance()

double AIToolbox::POMDP::BlindStrategies::getTolerance ( ) const

This function returns the currently set tolerance parameter.

Returns
The currently set tolerance parameter.

◆ operator()()

template<IsModel M>
std::tuple< double, VList > AIToolbox::POMDP::BlindStrategies::operator() (const M & m, bool fasterConvergence)

This function computes the blind strategies for the input POMDP.

Here we return a simple VList for the specified horizon/tolerance. Returning a ValueFunction would be pretty pointless, as the implied policy here is obvious (always execute the same action), so there's little sense in wrapping the bounds up.

The bounds are still returned in a VList since, at the moment, most POMDP utilities expect to work with one.

An optional parameter for faster convergence can be specified. If true, the algorithm won't initialize the values for each action from zero, but from the minimum possible value for that action divided by one minus the model's discount (adjusted so that division by zero is impossible).

This will make the algorithm converge faster, but the returned values won't be the correct ones for the horizon specified (the horizon will simply represent a bound on the number of iterations performed by the algorithm).

Parameters
m                    The POMDP to be solved.
fasterConvergence    Whether to initialize the internal vector for faster convergence.
Returns
A tuple containing the maximum variation over all actions and the VList containing the found bounds.
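The fasterConvergence initialization described above can be sketched as follows (an illustrative helper, not part of the class; the exact safeguard against division by zero used internally is assumed here):

    #include <algorithm>

    // Starting value for an action when fasterConvergence is true (sketch):
    //   v_a = min_s R(s, a) / (1 - discount)
    // with the denominator kept strictly positive to avoid division by zero.
    double blindInitialValue(double minRewardForAction, double discount) {
        const double denom = std::max(1.0 - discount, 1e-9); // assumed safeguard
        return minRewardForAction / denom;
    }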

◆ setHorizon()

void AIToolbox::POMDP::BlindStrategies::setHorizon ( unsigned  h)

This function sets the horizon parameter.

Parameters
h    The new horizon parameter.

◆ setTolerance()

void AIToolbox::POMDP::BlindStrategies::setTolerance ( double  tolerance)

This function sets the tolerance parameter.

The tolerance parameter must be >= 0.0, otherwise the function will throw an std::invalid_argument. The tolerance parameter sets the convergence criterion. A tolerance of 0.0 forces BlindStrategies to perform a number of iterations equal to the horizon specified. Otherwise, BlindStrategies will stop as soon as the difference between two iterations is less than the tolerance specified.

Parameters
tolerance    The new tolerance parameter.
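A short sketch of the documented behavior:

    #include <stdexcept>
    #include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>

    void toleranceExamples() {
        AIToolbox::POMDP::BlindStrategies bs(100);

        bs.setTolerance(0.0);    // run exactly getHorizon() iterations
        bs.setTolerance(1e-6);   // stop once the per-iteration change drops below 1e-6

        try {
            bs.setTolerance(-0.1);               // invalid: must be >= 0.0
        } catch (const std::invalid_argument &) {
            // documented failure mode for negative tolerances
        }
    }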

The documentation for this class was generated from the following file:
AIToolbox/POMDP/Algorithms/BlindStrategies.hpp