AIToolbox
A library that offers tools for AI problem solving.
This class implements the SARSOP algorithm.
#include <AIToolbox/POMDP/Algorithms/SARSOP.hpp>
Public Member Functions
SARSOP(double tolerance, double delta = 0.1)
    Basic constructor.
void setTolerance(double tolerance)
    This function sets the tolerance to reach when solving a POMDP.
double getTolerance() const
    This function returns the currently set tolerance to reach when solving a POMDP.
void setDelta(double delta)
    This function sets the delta for pruning to use at the start of a solving process.
double getDelta() const
    This function returns the delta for pruning to use at the start of a solving process.
template<IsModel M>
std::tuple<double, double, VList, MDP::QFunction> operator()(const M & model, const Belief & initialBelief)
    This function efficiently computes bounds for the optimal value of the input belief for the input POMDP.
This class implements the SARSOP algorithm.
This algorithm works by computing lower and upper bounds on the value of what is believed to be the optimal policy.
SARSOP tries to keep computational costs in check by computing alphavectors and upper bounds only for the future action/observation pairs that are believed to lie on the path of the optimal policy.
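Following an action/observation pair means moving from one belief to the next. As a rough illustration of that step, the sketch below applies the standard Bayesian belief update for a fixed action and observation. It is not AIToolbox code: `BeliefVec`, `Matrix` and `updateBelief` are hypothetical names, and the transition and observation probabilities are assumed to be already sliced for the chosen action and the received observation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in types, not the AIToolbox ones.
using BeliefVec = std::vector<double>;              // b[s]
using Matrix    = std::vector<std::vector<double>>; // T[s][s1]

// Hypothetical helper: Bayesian belief update for one fixed action and
// one fixed observation.
//   T[s][s1] : probability of moving from s to s1 under the chosen action.
//   O[s1]    : probability of the received observation when landing in s1.
// Returns the normalized next belief, or an empty vector when the
// observation has zero probability under the current belief.
BeliefVec updateBelief(const BeliefVec & b, const Matrix & T, const std::vector<double> & O) {
    const std::size_t S = b.size();
    assert(T.size() == S && O.size() == S);

    BeliefVec next(S, 0.0);
    double norm = 0.0;
    for (std::size_t s1 = 0; s1 < S; ++s1) {
        double reach = 0.0;
        for (std::size_t s = 0; s < S; ++s)
            reach += T[s][s1] * b[s];   // probability of reaching s1
        next[s1] = O[s1] * reach;       // weight by observation likelihood
        norm += next[s1];
    }
    if (norm <= 0.0) return {};         // impossible observation
    for (auto & v : next) v /= norm;    // renormalize to a distribution
    return next;
}
```

With an identity transition matrix, a uniform belief over two states and observation likelihoods 0.8 and 0.2, the updated belief puts weight 0.8 on the first state.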
Since the optimal policy is not known at the start, SARSOP employs a series of heuristics to steer exploration towards the paths that policy is likely to follow. At the same time, it aggressively prunes the alphavectors and beliefs it has found to keep further exploration cheap.
The result should be lower/upper bounds that are reasonably close to optimal as long as one remains in the part of the belief space reachable via the optimal policy. Once a non-optimal action is taken, the bounds are likely to be loose.
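To make the idea of a lower bound built from alphavectors concrete, here is a minimal sketch of how such a bound is usually evaluated at a belief: take the maximum dot product between the belief and each alphavector. This is the textbook construction, not the library's implementation; `AlphaVector` and `lowerBoundAt` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <numeric>
#include <vector>

// Hypothetical stand-in: one value per state.
using AlphaVector = std::vector<double>;

// Hypothetical helper: value of the piecewise-linear lower bound at
// belief b, i.e. the best dot product over all alphavectors.
// Returns -infinity when no alphavector is supplied.
double lowerBoundAt(const std::vector<AlphaVector> & alphas, const std::vector<double> & b) {
    double best = -std::numeric_limits<double>::infinity();
    for (const auto & alpha : alphas) {
        assert(alpha.size() == b.size());
        const double value = std::inner_product(alpha.begin(), alpha.end(), b.begin(), 0.0);
        best = std::max(best, value);
    }
    return best;
}
```

Because each alphavector is only accurate around the beliefs it was computed for, the maximum is tight near explored beliefs and can be loose elsewhere, which matches the caveat above about leaving the optimally reachable part of the belief space.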
AIToolbox::POMDP::SARSOP::SARSOP(double tolerance, double delta = 0.1)
Basic constructor.
Parameters:
    tolerance    The tolerance to reach when solving a POMDP.
    delta        The initial delta to use for pruning.
double AIToolbox::POMDP::SARSOP::getDelta() const
This function returns the delta for pruning to use at the start of a solving process.
double AIToolbox::POMDP::SARSOP::getTolerance() const
This function returns the currently set tolerance to reach when solving a POMDP.
std::tuple<double, double, VList, MDP::QFunction> AIToolbox::POMDP::SARSOP::operator()(const M & model, const Belief & initialBelief)
This function efficiently computes bounds for the optimal value of the input belief for the input POMDP.
Parameters:
    model            The model to compute the gap for.
    initialBelief    The belief to compute the gap for.
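As a hedged sketch of how a caller might inspect the returned tuple, the code below checks whether the gap between the two bounds is within a tolerance. It relies on two assumptions that the text above does not state: that the first `double` is the lower bound and the second the upper bound, and that the tolerance is meant to be compared against their difference. The `VListStandIn` and `QFunctionStandIn` types are placeholders so the example compiles without the library; they are not the AIToolbox definitions.

```cpp
#include <cassert>
#include <tuple>
#include <vector>

// Placeholder types, for illustration only (not the AIToolbox ones).
using VListStandIn     = std::vector<std::vector<double>>;
using QFunctionStandIn = std::vector<std::vector<double>>;
using SolveResult      = std::tuple<double, double, VListStandIn, QFunctionStandIn>;

// Hypothetical helper.
// Assumption: element 0 is the lower bound, element 1 the upper bound.
bool withinTolerance(const SolveResult & result, double tolerance) {
    assert(tolerance >= 0.0);
    const double lower = std::get<0>(result);
    const double upper = std::get<1>(result);
    return (upper - lower) <= tolerance;
}
```

A caller holding the real return value could apply the same check after unpacking it with `std::get` or structured bindings.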
void AIToolbox::POMDP::SARSOP::setDelta(double delta)
This function sets the delta for pruning to use at the start of a solving process.
Note that during the solving process the delta is modified dynamically based on heuristics.
Parameters:
    delta    The new delta to use.
void AIToolbox::POMDP::SARSOP::setTolerance(double tolerance)
This function sets the tolerance to reach when solving a POMDP.
Parameters:
    tolerance    The new tolerance.