AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::PrioritizedSweeping< M > Class Template Reference

This class represents the PrioritizedSweeping algorithm. More...

#include <AIToolbox/MDP/Algorithms/PrioritizedSweeping.hpp>

Public Member Functions

 PrioritizedSweeping (const M &m, double theta=0.5, unsigned n=50)
 Basic constructor. More...
 
void stepUpdateQ (size_t s, size_t a)
 This function updates the PrioritizedSweeping internal update queue. More...
 
void batchUpdateQ ()
 This function updates a QFunction based on simulated experience. More...
 
void setQueueThreshold (double t)
 This function sets the theta parameter. More...
 
double getQueueThreshold () const
 This function will return the currently set theta parameter. More...
 
void setN (unsigned n)
 This function sets the number of sampling passes during batchUpdateQ(). More...
 
unsigned getN () const
 This function returns the currently set number of sampling passes during batchUpdateQ(). More...
 
size_t getQueueLength () const
 This function returns the current number of elements unprocessed in the queue. More...
 
const M & getModel () const
 This function returns a reference to the referenced Model. More...
 
const QFunctiongetQFunction () const
 This function returns a reference to the internal QFunction. More...
 
void setQFunction (const QFunction &q)
 This function allows you to set the value of the internal QFunction. More...
 
const ValueFunctiongetValueFunction () const
 This function returns a reference to the internal ValueFunction. More...
 

Detailed Description

template<IsModel M>
class AIToolbox::MDP::PrioritizedSweeping< M >

This class represents the PrioritizedSweeping algorithm.

This algorithm is a refinement of the DynaQ algorithm. Instead of randomly sampling experienced state action pairs to get more information, we order each pair based on an estimate of how much information we can still extract from them.

In particular, pairs are sorted based on the amount they modified the estimated ValueFunction on their last sample. This ensures that we always try to sample from useful pairs instead of randomly, extracting knowledge much faster.

At the same time, this algorithm keeps a threshold for each state-action pair, so that it does not have to internally store all the pairs and save some memory/cpu time keeping the queue updated. Only pairs which obtained an amount of change higher than this treshold are kept in the queue.

Differently from the QLearning and DynaQ algorithm, this class automatically computes the ValueFunction since it is useful to determine which state-action pairs are actually useful, so there's no need to compute it manually.

Given how this algorithm updates the QFunction, the only problems supported by this approach are ones with an infinite horizon.

Constructor & Destructor Documentation

◆ PrioritizedSweeping()

template<IsModel M>
AIToolbox::MDP::PrioritizedSweeping< M >::PrioritizedSweeping ( const M &  m,
double  theta = 0.5,
unsigned  n = 50 
)

Basic constructor.

Parameters
mThe model to be used to update the QFunction.
thetaThe queue threshold.
nThe number of sampling passes to do on the model upon batchUpdateQ().

Member Function Documentation

◆ batchUpdateQ()

template<IsModel M>
void AIToolbox::MDP::PrioritizedSweeping< M >::batchUpdateQ

This function updates a QFunction based on simulated experience.

In PrioritizedSweeping we sample from the queue at most N times for state action pairs that need updating. For each one of them we update the QFunction and recursively check whether this produces new changes worth updating. If so, they are inserted in the queue_ and the function proceeds to the next most urgent iteration.

◆ getModel()

template<IsModel M>
const M & AIToolbox::MDP::PrioritizedSweeping< M >::getModel

This function returns a reference to the referenced Model.

Returns
The internal Model.

◆ getN()

template<IsModel M>
unsigned AIToolbox::MDP::PrioritizedSweeping< M >::getN

This function returns the currently set number of sampling passes during batchUpdateQ().

Returns
The current number of updates().

◆ getQFunction()

template<IsModel M>
const QFunction & AIToolbox::MDP::PrioritizedSweeping< M >::getQFunction

This function returns a reference to the internal QFunction.

Returns
The internal QFunction.

◆ getQueueLength()

template<IsModel M>
size_t AIToolbox::MDP::PrioritizedSweeping< M >::getQueueLength

This function returns the current number of elements unprocessed in the queue.

Returns
The current length of the queue.

◆ getQueueThreshold()

template<IsModel M>
double AIToolbox::MDP::PrioritizedSweeping< M >::getQueueThreshold

This function will return the currently set theta parameter.

Returns
The currently set theta parameter.

◆ getValueFunction()

template<IsModel M>
const ValueFunction & AIToolbox::MDP::PrioritizedSweeping< M >::getValueFunction

This function returns a reference to the internal ValueFunction.

Returns
The internal ValueFunction.

◆ setN()

template<IsModel M>
void AIToolbox::MDP::PrioritizedSweeping< M >::setN ( unsigned  n)

This function sets the number of sampling passes during batchUpdateQ().

Parameters
nThe new number of updates.

◆ setQFunction()

template<IsModel M>
void AIToolbox::MDP::PrioritizedSweeping< M >::setQFunction ( const QFunction q)

This function allows you to set the value of the internal QFunction.

This function can be useful in case you are starting with an already populated Experience/Model, which you can solve (for example with ValueIteration) and then improve the solution with new experience.

Parameters
qThe QFunction that will be copied.

◆ setQueueThreshold()

template<IsModel M>
void AIToolbox::MDP::PrioritizedSweeping< M >::setQueueThreshold ( double  t)

This function sets the theta parameter.

The discount parameter must be >= 0.0. otherwise the function will throw an std::invalid_argument.

Parameters
tThe new theta parameter.

◆ stepUpdateQ()

template<IsModel M>
void AIToolbox::MDP::PrioritizedSweeping< M >::stepUpdateQ ( size_t  s,
size_t  a 
)

This function updates the PrioritizedSweeping internal update queue.

This function updates the QFunction for the specified pair, and decides whether any parent couple that can lead to this state is worth pushing into the queue.

Parameters
sThe previous state.
aThe action performed.

The documentation for this class was generated from the following file: