AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::MDP::DynaQ< M > Class Template Reference

This class represents the DynaQ algorithm.

#include <AIToolbox/MDP/Algorithms/DynaQ.hpp>

Public Member Functions

 DynaQ (const M &m, double alpha=0.5, unsigned n=50)
 Basic constructor.
 
void stepUpdateQ (size_t s, size_t a, size_t s1, double rew)
 This function updates the internal QFunction.
 
void batchUpdateQ ()
 This function updates a QFunction based on simulated experience.
 
void setLearningRate (double a)
 This function sets the learning rate parameter.
 
double getLearningRate () const
 This function returns the currently set learning rate parameter.
 
void setN (unsigned n)
 This function sets the current sample number parameter.
 
unsigned getN () const
 This function returns the currently set number of sampling passes during batchUpdateQ().
 
const QFunction & getQFunction () const
 This function returns a reference to the internal QFunction.
 
const M & getModel () const
 This function returns a reference to the referenced Model.
 

Detailed Description

template<IsGenerativeModel M>
class AIToolbox::MDP::DynaQ< M >

This class represents the DynaQ algorithm.

This algorithm is a simple extension of the QLearning algorithm: it keeps track of every experienced state-action pair. Each QFunction update is exactly equivalent to the QLearning one; however, this algorithm allows for an additional learning phase that can take place, time permitting, before the agent takes another action.

The state-action pairs we have already explored are thus known to be possible, so we use the generative model to obtain more data about them. This, of course, requires a model that can be sampled from, in contrast with QLearning, which does not require one.

The algorithm randomly selects which of these state-action pairs to sample again.
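
For reference, the tabular Q-learning update that each DynaQ update is described as matching is usually written as below. Here \alpha is the learning rate, (s, a, s', r) is one experience tuple, and \gamma is a discount factor. The discount factor is an assumption of this sketch, since this page does not say where it comes from.

```latex
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
```

In the additional learning phase the same rule applies; only the source of the tuple changes: s and a are drawn from the visited pairs, while s' and r are sampled from the generative model.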

Constructor & Destructor Documentation

◆ DynaQ()

template<IsGenerativeModel M>
AIToolbox::MDP::DynaQ< M >::DynaQ ( const M &  m,
double  alpha = 0.5,
unsigned  n = 50 
)
explicit

Basic constructor.

Parameters
m      The model to be used to update the QFunction.
alpha  The learning rate of the QLearning method.
n      The number of sampling passes to do on the model upon batchUpdateQ().

Member Function Documentation

◆ batchUpdateQ()

template<IsGenerativeModel M>
void AIToolbox::MDP::DynaQ< M >::batchUpdateQ

This function updates a QFunction based on simulated experience.

In DynaQ we sample N times from already experienced state-action pairs, and we update the resulting QFunction as if this experience were real.

The idea is that, since we know which state-action pairs we have already explored, we know that those pairs are actually possible. Thus we use the generative model to sample them again and obtain a better estimate of the QFunction.
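
The planning phase described above might look roughly like the following sketch. It is not the AIToolbox implementation: `ToyChain`, its `sample()` method, the flat `q` table, and `batchUpdateSketch` are all hypothetical names chosen for illustration, and the random generator is passed in explicitly so that runs are reproducible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical toy generative model: a deterministic chain where action 1
// moves one state to the right and action 0 stays put. Entering the last
// state pays a reward of 1.0.
struct ToyChain {
    std::size_t S = 3, A = 2;
    double discount = 0.9;

    // Returns the sampled (next state, reward) for a state-action pair.
    std::pair<std::size_t, double> sample(std::size_t s, std::size_t a) const {
        const std::size_t s1 = (a == 1 && s + 1 < S) ? s + 1 : s;
        const double rew = (s1 == S - 1 && s != S - 1) ? 1.0 : 0.0;
        return {s1, rew};
    }
};

// Sketch of the planning phase: replay n randomly chosen visited pairs,
// treating each sampled outcome as if it were real experience.
template <typename M>
void batchUpdateSketch(const M & model,
                       const std::vector<std::pair<std::size_t, std::size_t>> & visited,
                       std::vector<double> & q,  // row-major S x A table
                       double alpha, unsigned n, std::mt19937 & rng) {
    assert(q.size() == model.S * model.A);
    if (visited.empty()) return;  // nothing experienced yet, nothing to replay

    std::uniform_int_distribution<std::size_t> pick(0, visited.size() - 1);
    for (unsigned i = 0; i < n; ++i) {
        const std::pair<std::size_t, std::size_t> sa = visited[pick(rng)];
        const std::pair<std::size_t, double> outcome = model.sample(sa.first, sa.second);

        // Best value reachable from the sampled next state.
        const double best = *std::max_element(q.begin() + outcome.first * model.A,
                                              q.begin() + (outcome.first + 1) * model.A);

        // Same update rule as a real Q-learning step.
        double & value = q[sa.first * model.A + sa.second];
        value += alpha * (outcome.second + model.discount * best - value);
    }
}
```

With a single visited pair, every pass replays that pair, so the estimate moves toward the sampled return a little more on each pass.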

◆ getLearningRate()

template<IsGenerativeModel M>
double AIToolbox::MDP::DynaQ< M >::getLearningRate

This function returns the currently set learning rate parameter.

Returns
The currently set learning rate parameter.

◆ getModel()

template<IsGenerativeModel M>
const M & AIToolbox::MDP::DynaQ< M >::getModel

This function returns a reference to the referenced Model.

Returns
The internal Model.

◆ getN()

template<IsGenerativeModel M>
unsigned AIToolbox::MDP::DynaQ< M >::getN

This function returns the currently set number of sampling passes during batchUpdateQ().

Returns
The current number of sampling passes.

◆ getQFunction()

template<IsGenerativeModel M>
const QFunction & AIToolbox::MDP::DynaQ< M >::getQFunction

This function returns a reference to the internal QFunction.

Returns
The internal QFunction.

◆ setLearningRate()

template<IsGenerativeModel M>
void AIToolbox::MDP::DynaQ< M >::setLearningRate ( double  a)

This function sets the learning rate parameter.

The learning rate parameter must be > 0.0 and <= 1.0, otherwise the function will throw an std::invalid_argument.

Parameters
a  The new learning rate parameter.
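
The range check described above could be sketched as follows. `LearningRateSketch` is a hypothetical stand-in class, not the library's code; only the accepted range (0.0, 1.0] and the exception type are taken from the text.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical holder for the learning rate; only the range check mirrors
// the behaviour documented above.
class LearningRateSketch {
public:
    explicit LearningRateSketch(double alpha = 0.5) { setLearningRate(alpha); }

    void setLearningRate(double a) {
        // Written as a negated "in range" test so that NaN is rejected too.
        if (!(a > 0.0 && a <= 1.0))
            throw std::invalid_argument("Learning rate must be in (0, 1]");
        alpha_ = a;
    }

    double getLearningRate() const {
        assert(alpha_ > 0.0 && alpha_ <= 1.0);  // invariant kept by the setter
        return alpha_;
    }

private:
    double alpha_ = 0.5;
};
```

In this sketch a rejected value leaves the previously stored learning rate untouched, because the check runs before the assignment.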

◆ setN()

template<IsGenerativeModel M>
void AIToolbox::MDP::DynaQ< M >::setN ( unsigned  n)

This function sets the current sample number parameter.

Parameters
n  The new sample number parameter.

◆ stepUpdateQ()

template<IsGenerativeModel M>
void AIToolbox::MDP::DynaQ< M >::stepUpdateQ ( size_t  s,
size_t  a,
size_t  s1,
double  rew 
)

This function updates the internal QFunction.

This function takes a single experience point and uses it to update a QFunction. This is a very efficient method to keep the QFunction up to date with the latest experience.

In addition, the sampling list is updated so that batch updating becomes possible as a second phase.

The sampling list in DynaQ is simply the collection of all visited state-action pairs. This function is responsible for inserting them into a set, which keeps them unique.

Parameters
s    The previous state.
a    The action performed.
s1   The new state.
rew  The reward obtained.
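
As a rough illustration of the two effects described above (one Q-learning update, plus recording the visited pair), here is a minimal self-contained sketch. `StepSketch`, its flat `q` table, and the `discount` member are hypothetical; the discount is passed in directly here, which is an assumption rather than something this page states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the class documented above; not the AIToolbox code.
struct StepSketch {
    std::size_t S, A;
    double alpha, discount;
    std::vector<double> q;                                  // row-major S x A table
    std::set<std::pair<std::size_t, std::size_t>> visited;  // unique (s, a) pairs

    StepSketch(std::size_t s, std::size_t a, double alpha_, double discount_)
        : S(s), A(a), alpha(alpha_), discount(discount_), q(s * a, 0.0) {}

    // Highest action value currently stored for state s.
    double maxQ(std::size_t s) const {
        return *std::max_element(q.begin() + s * A, q.begin() + (s + 1) * A);
    }

    void stepUpdateQ(std::size_t s, std::size_t a, std::size_t s1, double rew) {
        assert(s < S && a < A && s1 < S);

        // Standard Q-learning update from a single experience point.
        double & value = q[s * A + a];
        value += alpha * (rew + discount * maxQ(s1) - value);

        // Remember the pair for later batch updates; the set drops duplicates.
        visited.insert({s, a});
    }
};
```

Calling `stepUpdateQ` twice with the same state and action updates the stored value twice but leaves a single entry in `visited`.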
