AIToolbox
A library that offers tools for AI problem solving.
This class implements the blind strategies lower bound.
#include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>
Public Member Functions

    BlindStrategies(unsigned horizon, double tolerance = 0.001)
        Basic constructor.

    template <IsModel M>
    std::tuple<double, VList> operator()(const M & m, bool fasterConvergence)
        This function computes the blind strategies for the input POMDP.

    void setTolerance(double tolerance)
        This function sets the tolerance parameter.

    void setHorizon(unsigned h)
        This function sets the horizon parameter.

    double getTolerance() const
        This function returns the currently set tolerance parameter.

    unsigned getHorizon() const
        This function returns the current horizon parameter.
This class implements the blind strategies lower bound.
This class is useful for obtaining a very simple lower bound for a POMDP. The values for each action assume that the agent is always going to take that same action forever afterwards.
While this bound is somewhat loose, it can be a good starting point for other algorithms, as it's incredibly cheap to compute.
We return the alphavectors for all actions. Many of the resulting alphavectors are very likely to be dominated, but we leave the pruning to the clients, as the additional per-action information may be useful to somebody (and it also makes for easier testing ;) )
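A minimal usage sketch follows. It assumes a POMDP model satisfying the IsModel concept has been built elsewhere; the helper name printBlindBound is purely illustrative, and the returned double is treated here as the final variation reached by the value iteration loop (its precise meaning is not spelled out on this page).

    #include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>
    #include <iostream>

    // Illustrative helper: compute and report the blind-strategies
    // lower bound for any model satisfying the IsModel concept.
    template <AIToolbox::POMDP::IsModel M>
    void printBlindBound(const M & model) {
        // At most 100 iterations, stopping early below a 0.001 variation.
        AIToolbox::POMDP::BlindStrategies bs(100, 0.001);

        // The second argument enables the faster-convergence
        // initialization documented under operator() below.
        const auto [variation, vlist] = bs(model, true);

        std::cout << "Computed " << vlist.size() << " alphavectors "
                  << "(one per action), final variation " << variation << '\n';
    }

The resulting VList can then be pruned by the caller, or fed directly to other algorithms as a starting lower bound.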
AIToolbox::POMDP::BlindStrategies::BlindStrategies(unsigned horizon, double tolerance = 0.001)

Basic constructor.

Parameters:
    horizon   The maximum number of iterations to perform.
    tolerance The tolerance factor to stop the value iteration loop.
unsigned AIToolbox::POMDP::BlindStrategies::getHorizon() const
This function returns the current horizon parameter.
double AIToolbox::POMDP::BlindStrategies::getTolerance() const

This function returns the currently set tolerance parameter.
template <IsModel M>
std::tuple<double, VList> AIToolbox::POMDP::BlindStrategies::operator()(const M & m, bool fasterConvergence)
This function computes the blind strategies for the input POMDP.
Here we return a simple VList for the specified horizon/tolerance. Returning a ValueFunction would be pretty pointless, as the implied policy is obvious (always execute the same action), so there's little sense in wrapping the bounds up.
The bounds are still returned in a VList since at the moment most POMDP utilities expect this format.
An optional parameter for faster convergence can be specified. If true, the algorithm won't initialize the values for each action from zero, but from the minimum possible reward for that action divided by one minus the model's discount (clamped so that division by zero is impossible). A sketch of this initialization follows the parameter list below.
This will make the algorithm converge faster, but the returned values won't be the correct ones for the horizon specified (the horizon will simply represent a bound on the number of iterations performed by the algorithm).

Parameters:
    m                 The POMDP to be solved.
    fasterConvergence Whether to initialize the internal vector for faster convergence.
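As a rough illustration of the faster-convergence initialization, here is a hypothetical helper (not part of the library). The accessor names (getS(), getExpectedReward(), getDiscount()) follow the usual AIToolbox model interface, and the exact clamp value on the discount is an assumption for illustration:

    #include <algorithm>
    #include <cstddef>
    #include <limits>

    // Hypothetical helper: the faster-convergence starting value for a
    // single action `a`, per the description above.
    template <typename M>
    double blindInitialValue(const M & model, size_t a) {
        // Find the worst one-step reward obtainable with action `a`.
        double minR = std::numeric_limits<double>::max();
        for (size_t s = 0; s < model.getS(); ++s)
            for (size_t s1 = 0; s1 < model.getS(); ++s1)
                minR = std::min(minR, model.getExpectedReward(s, a, s1));

        // Clamping the discount below 1.0 is the "fix" that makes
        // division by zero impossible (clamp value assumed here).
        const double discount = std::min(model.getDiscount(), 0.999999);

        // Repeating the worst one-step reward forever yields
        // minR / (1 - discount): a safe starting point at or below
        // the true value of always playing `a`.
        return minR / (1.0 - discount);
    }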
void AIToolbox::POMDP::BlindStrategies::setHorizon(unsigned h)

This function sets the horizon parameter.

Parameters:
    h The new horizon parameter.
void AIToolbox::POMDP::BlindStrategies::setTolerance(double tolerance)
This function sets the tolerance parameter.
The tolerance parameter must be >= 0.0, otherwise the function will throw an std::invalid_argument. The tolerance parameter sets the convergence criterion. A tolerance of 0.0 forces BlindStrategies to perform a number of iterations equal to the horizon specified. Otherwise, BlindStrategies will stop as soon as the difference between two iterations is less than the tolerance specified.
Parameters:
    tolerance The new tolerance parameter.
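For instance, using only the constructor and setter documented above:

    #include <AIToolbox/POMDP/Algorithms/BlindStrategies.hpp>

    int main() {
        AIToolbox::POMDP::BlindStrategies bs(50); // tolerance defaults to 0.001

        bs.setTolerance(0.0);  // force exactly `horizon` (50) iterations
        bs.setTolerance(1e-6); // or stop as soon as two successive
                               // iterations differ by less than 1e-6

        // bs.setTolerance(-1.0); // would throw std::invalid_argument
    }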