This class represents the PrioritizedSweeping algorithm. More...

#include <AIToolbox/MDP/Algorithms/PrioritizedSweeping.hpp>

Public Member Functions
	PrioritizedSweeping (const M &m, double theta=0.5, unsigned n=50)
	Basic constructor. More...

void	stepUpdateQ (size_t s, size_t a)
	This function updates the PrioritizedSweeping internal update queue. More...

void	batchUpdateQ ()
	This function updates a QFunction based on simulated experience. More...

void	setQueueThreshold (double t)
	This function sets the theta parameter. More...

double	getQueueThreshold () const
	This function will return the currently set theta parameter. More...

void	setN (unsigned n)
	This function sets the number of sampling passes during batchUpdateQ(). More...

unsigned	getN () const
	This function returns the currently set number of sampling passes during batchUpdateQ(). More...

size_t	getQueueLength () const
	This function returns the current number of elements unprocessed in the queue. More...

const M &	getModel () const
	This function returns a reference to the referenced Model. More...

const QFunction &	getQFunction () const
	This function returns a reference to the internal QFunction. More...

void	setQFunction (const QFunction &q)
	This function allows you to set the value of the internal QFunction. More...

const ValueFunction &	getValueFunction () const
	This function returns a reference to the internal ValueFunction. More...

Detailed Description

template<IsModel M>
class AIToolbox::MDP::PrioritizedSweeping< M >

This class represents the PrioritizedSweeping algorithm.

This algorithm is a refinement of the DynaQ algorithm. Instead of randomly sampling experienced state action pairs to get more information, we order each pair based on an estimate of how much information we can still extract from them.

In particular, pairs are sorted based on the amount they modified the estimated ValueFunction on their last sample. This ensures that we always try to sample from useful pairs instead of randomly, extracting knowledge much faster.

At the same time, this algorithm keeps a threshold for each state-action pair, so that it does not have to internally store all the pairs and save some memory/cpu time keeping the queue updated. Only pairs which obtained an amount of change higher than this treshold are kept in the queue.

Differently from the QLearning and DynaQ algorithm, this class automatically computes the ValueFunction since it is useful to determine which state-action pairs are actually useful, so there's no need to compute it manually.

Given how this algorithm updates the QFunction, the only problems supported by this approach are ones with an infinite horizon.

Constructor & Destructor Documentation

◆ PrioritizedSweeping()

template<IsModel M>

AIToolbox::MDP::PrioritizedSweeping< M >::PrioritizedSweeping	(	const M &	m,
		double	theta = `0.5`,
		unsigned	n = `50`
	)

Basic constructor.

Parameters

m	The model to be used to update the QFunction.
theta	The queue threshold.
n	The number of sampling passes to do on the model upon batchUpdateQ().

Member Function Documentation

◆ batchUpdateQ()

template<IsModel M>

void AIToolbox::MDP::PrioritizedSweeping< M >::batchUpdateQ

This function updates a QFunction based on simulated experience.

In PrioritizedSweeping we sample from the queue at most N times for state action pairs that need updating. For each one of them we update the QFunction and recursively check whether this produces new changes worth updating. If so, they are inserted in the queue_ and the function proceeds to the next most urgent iteration.

◆ getModel()

template<IsModel M>

const M & AIToolbox::MDP::PrioritizedSweeping< M >::getModel

This function returns a reference to the referenced Model.

Returns: The internal Model.

◆ getN()

template<IsModel M>

unsigned AIToolbox::MDP::PrioritizedSweeping< M >::getN

This function returns the currently set number of sampling passes during batchUpdateQ().

Returns: The current number of updates().

◆ getQFunction()

template<IsModel M>

const QFunction & AIToolbox::MDP::PrioritizedSweeping< M >::getQFunction

This function returns a reference to the internal QFunction.

Returns: The internal QFunction.

◆ getQueueLength()

template<IsModel M>

size_t AIToolbox::MDP::PrioritizedSweeping< M >::getQueueLength

This function returns the current number of elements unprocessed in the queue.

Returns: The current length of the queue.

◆ getQueueThreshold()

template<IsModel M>

double AIToolbox::MDP::PrioritizedSweeping< M >::getQueueThreshold

This function will return the currently set theta parameter.

Returns: The currently set theta parameter.

◆ getValueFunction()

template<IsModel M>

const ValueFunction & AIToolbox::MDP::PrioritizedSweeping< M >::getValueFunction

This function returns a reference to the internal ValueFunction.

Returns: The internal ValueFunction.

◆ setN()

template<IsModel M>

void AIToolbox::MDP::PrioritizedSweeping< M >::setN ( unsigned n )

This function sets the number of sampling passes during batchUpdateQ().

Parameters

n	The new number of updates.

◆ setQFunction()

template<IsModel M>

void AIToolbox::MDP::PrioritizedSweeping< M >::setQFunction ( const QFunction & q )

This function allows you to set the value of the internal QFunction.

This function can be useful in case you are starting with an already populated Experience/Model, which you can solve (for example with ValueIteration) and then improve the solution with new experience.

Parameters

q	The QFunction that will be copied.

◆ setQueueThreshold()

template<IsModel M>

void AIToolbox::MDP::PrioritizedSweeping< M >::setQueueThreshold ( double t )

This function sets the theta parameter.

The discount parameter must be >= 0.0. otherwise the function will throw an std::invalid_argument.

Parameters

t	The new theta parameter.

◆ stepUpdateQ()

template<IsModel M>

void AIToolbox::MDP::PrioritizedSweeping< M >::stepUpdateQ	(	size_t	s,
		size_t	a
	)

This function updates the PrioritizedSweeping internal update queue.

This function updates the QFunction for the specified pair, and decides whether any parent couple that can lead to this state is worth pushing into the queue.

Parameters

s	The previous state.
a	The action performed.

The documentation for this class was generated from the following file:

include/AIToolbox/MDP/Algorithms/PrioritizedSweeping.hpp

Public Member Functions

Detailed Description

template<IsModel M> class AIToolbox::MDP::PrioritizedSweeping< M >

Constructor & Destructor Documentation

◆ PrioritizedSweeping()

Member Function Documentation

◆ batchUpdateQ()

◆ getModel()

◆ getN()

◆ getQFunction()

◆ getQueueLength()

◆ getQueueThreshold()

◆ getValueFunction()

◆ setN()

◆ setQFunction()

◆ setQueueThreshold()

◆ stepUpdateQ()

template<IsModel M>
class AIToolbox::MDP::PrioritizedSweeping< M >