AIToolbox
A library that offers tools for AI problem solving.
This class represents the DynaQ algorithm.
#include <AIToolbox/MDP/Algorithms/DynaQ.hpp>
Public Member Functions

    DynaQ (const M & m, double alpha = 0.5, unsigned n = 50)
        Basic constructor.
    void stepUpdateQ (size_t s, size_t a, size_t s1, double rew)
        This function updates the internal QFunction.
    void batchUpdateQ ()
        This function updates a QFunction based on simulated experience.
    void setLearningRate (double a)
        This function sets the learning rate parameter.
    double getLearningRate () const
        This function returns the currently set learning rate parameter.
    void setN (unsigned n)
        This function sets the current sample number parameter.
    unsigned getN () const
        This function returns the currently set number of sampling passes during batchUpdateQ().
    const QFunction & getQFunction () const
        This function returns a reference to the internal QFunction.
    const M & getModel () const
        This function returns a reference to the referenced Model.
This class represents the DynaQ algorithm.
This algorithm is a simple extension of the QLearning algorithm. It keeps track of every experienced state-action pair. Each QFunction update is exactly equivalent to the QLearning one; however, this algorithm allows for an additional learning phase that can take place, time permitting, before the agent takes another action.
The state-action pairs we have already explored are known to be possible, so we can use the generative model to obtain more and more data about them. This, of course, requires that the model can be sampled from, in contrast with QLearning, which does not require it.
The algorithm selects at random which of the experienced state-action pairs to sample again.
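The typical usage pattern alternates a direct update from real experience with a batch of simulated updates. The following is a minimal sketch of that loop, built only on the members documented on this page; the model type M, sampleAction() and sampleTransition() are hypothetical stand-ins for a real AIToolbox model, an action-selection policy and an environment.

    // Illustrative Dyna-Q control loop; sampleAction() and
    // sampleTransition() are hypothetical helpers, not library API.
    #include <AIToolbox/MDP/Algorithms/DynaQ.hpp>
    #include <cstddef>

    template <typename M>
    void runEpisode(const M & model, std::size_t s, unsigned steps) {
        // alpha = 0.5, with n = 50 planning passes after each real step.
        AIToolbox::MDP::DynaQ<M> solver(model, 0.5, 50);

        for (unsigned t = 0; t < steps; ++t) {
            const auto a = sampleAction(solver.getQFunction(), s); // hypothetical policy
            const auto [s1, rew] = sampleTransition(s, a);         // hypothetical environment

            solver.stepUpdateQ(s, a, s1, rew); // direct update from real experience
            solver.batchUpdateQ();             // n simulated updates from the model
            s = s1;
        }
    }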
AIToolbox::MDP::DynaQ< M >::DynaQ ( const M & m, double alpha = 0.5, unsigned n = 50 )  [explicit]
Basic constructor.
Parameters:
    m     - The model to be used to update the QFunction.
    alpha - The learning rate of the QLearning method.
    n     - The number of sampling passes to do on the model upon batchUpdateQ().
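For example, a construction with explicit parameters could look as follows; the model type and instance names are placeholders for whatever generative model is actually used.

    // MyModel and model are hypothetical; any type M satisfying the
    // library's generative model interface can be used.
    AIToolbox::MDP::DynaQ<MyModel> solver(model, /* alpha = */ 0.3, /* n = */ 100);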
void AIToolbox::MDP::DynaQ< M >::batchUpdateQ ( )
This function updates a QFunction based on simulated experience.
In DynaQ we sample n times from already experienced state-action pairs, and we update the QFunction as if this experience were real.
The idea is that, since we know which state-action pairs we have already explored, we know that those pairs are actually possible. Thus we use the generative model to sample them again, obtaining a better estimate of the QFunction.
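Conceptually, each of the n sampling passes looks like the sketch below. This is not the library's actual implementation: it assumes the visited pairs are stored in a set-like container accessed through a hypothetical pickRandomVisitedPair() helper, and that the model exposes a generative sampleSR(s, a) call returning a sampled next state and reward.

    // Conceptual sketch of batchUpdateQ(); all names are illustrative.
    for (unsigned i = 0; i < n_; ++i) {
        const auto [s, a]    = pickRandomVisitedPair();  // hypothetical helper
        const auto [s1, rew] = model_.sampleSR(s, a);    // resample the model
        stepUpdateQ(s, a, s1, rew);  // identical to an update on real experience
    }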
double AIToolbox::MDP::DynaQ< M >::getLearningRate ( ) const
This function returns the currently set learning rate parameter.
const M & AIToolbox::MDP::DynaQ< M >::getModel ( ) const
This function returns a reference to the referenced Model.
unsigned AIToolbox::MDP::DynaQ< M >::getN ( ) const
This function returns the currently set number of sampling passes during batchUpdateQ().
const QFunction & AIToolbox::MDP::DynaQ< M >::getQFunction ( ) const
This function returns a reference to the internal QFunction.
void AIToolbox::MDP::DynaQ< M >::setLearningRate ( double a )
This function sets the learning rate parameter.
The learning rate parameter must be > 0.0 and <= 1.0; otherwise the function will throw an std::invalid_argument.
Parameters:
    a - The new learning rate parameter.
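Because out-of-range values throw, code that takes the rate from configuration may want to guard the call; a minimal sketch, where solver and userAlpha are hypothetical:

    #include <stdexcept>

    try {
        solver.setLearningRate(userAlpha);  // throws if not in (0.0, 1.0]
    } catch (const std::invalid_argument &) {
        solver.setLearningRate(0.5);        // fall back to a sensible default
    }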
void AIToolbox::MDP::DynaQ< M >::setN ( unsigned n )
This function sets the current sample number parameter.
Parameters:
    n - The new sample number parameter.
void AIToolbox::MDP::DynaQ< M >::stepUpdateQ ( size_t s, size_t a, size_t s1, double rew )
This function updates the internal QFunction.
This function takes a single experience point and uses it to update a QFunction. This is a very efficient method to keep the QFunction up to date with the latest experience.
In addition, the sampling list is updated so that batch updating becomes possible as a second phase.
The sampling list in DynaQ is simply the collection of all visited state-action pairs; this function inserts each new pair into a set, keeping the entries unique.
Parameters:
    s   - The previous state.
    a   - The action performed.
    s1  - The new state.
    rew - The reward obtained.
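As the class description notes, this update is exactly the QLearning one. In code it amounts to something like the sketch below, where q_, alpha_ and gamma are illustrative names for the internal QFunction, the learning rate and the model's discount factor; this is not the library's literal implementation.

    // Standard QLearning update performed on a single experience point.
    // maxQ is the best estimated value attainable from the new state s1.
    const double maxQ = q_.row(s1).maxCoeff();
    q_(s, a) += alpha_ * (rew + gamma * maxQ - q_(s, a));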