AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::POMDP::SparseModel< M > Class Template Reference

This class represents a Partially Observable Markov Decision Process.

#include <AIToolbox/POMDP/SparseModel.hpp>

Public Types

using ObservationMatrix = SparseMatrix3D
 

Public Member Functions

template<typename... Args>
 SparseModel (size_t o, Args &&... parameters)
 Basic constructor.
 
template<IsNaive3DMatrix ObFun, typename... Args>
 SparseModel (size_t o, ObFun &&of, Args &&... parameters)
 Basic constructor.
 
template<typename PM >
requires IsModel< PM > && std::constructible_from< M, PM >
 SparseModel (const PM &model)
 Copy constructor from any valid POMDP model.
 
template<typename... Args>
 SparseModel (NoCheck, size_t o, ObservationMatrix &&ot, Args &&... parameters)
 Unchecked constructor.
 
template<IsNaive3DMatrix ObFun>
void setObservationFunction (const ObFun &of)
 This function replaces the SparseModel observation function with the one provided.
 
void setObservationFunction (const ObservationMatrix &of)
 This function sets the observation function using a SparseMatrix3D.
 
std::tuple< size_t, size_t, double > sampleSOR (size_t s, size_t a) const
 This function samples the POMDP for the specified state action pair.
 
std::tuple< size_t, double > sampleOR (size_t s, size_t a, size_t s1) const
 This function samples the POMDP for the specified state action pair.
 
double getObservationProbability (size_t s1, size_t a, size_t o) const
 This function returns the stored observation probability for the specified state-action pair.
 
double getObservationProbability (const Belief &b, size_t o, size_t a) const
 This function computes the probability of obtaining an observation given an action and an initial belief.
 
const SparseMatrix2D & getObservationFunction (size_t a) const
 This function returns the observation function for a given action.
 
size_t getO () const
 This function returns the number of observations possible.
 
const ObservationMatrix & getObservationFunction () const
 This function returns the observation matrix for inspection.
 

Detailed Description

template<MDP::IsModel M>
class AIToolbox::POMDP::SparseModel< M >

This class represents a Partially Observable Markov Decision Process.

This class inherits from any valid MDP model type, so that it can reuse the base class methods and build on them. Templated inheritance was chosen over composition to improve performance and keep the code small.

A POMDP is an MDP where the agent, at each timestep, does not know which state it is in. Instead, after each action is performed, it obtains an "observation", which offers some information about which new state the agent has transitioned to. This observation is determined by an "observation function", which maps S'xAxO to a probability: the probability of obtaining observation O after taking action A and landing in state S'.

Since its knowledge is now imperfect, the agent must represent what it knows about its current state using Beliefs: probability distributions over states.

A Belief works as follows: after each action and observation, the agent asks, given my previous Belief (a distribution over the states I might have been in), what is now the probability that I transitioned to each particular state? This new Belief can be computed from the Model, since the agent knows the transition and observation functions.
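
The update described above can be sketched as a Bayes filter. This is a minimal illustration under stated assumptions, not the library's implementation: the dense `Table3D` tables, the `BeliefVec` alias and the `updateBelief` function are hypothetical names introduced here, with T[s][a][s1] = P(s1 | s, a) and O[s1][a][o] = P(o | s1, a).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense tables used only for this sketch (not AIToolbox types):
// T[s][a][s1] = P(s1 | s, a), O[s1][a][o] = P(o | s1, a).
using Table3D = std::vector<std::vector<std::vector<double>>>;
using BeliefVec = std::vector<double>;

// Returns the belief after taking action a and receiving observation o.
// If the observation has zero probability under b, the all-zero
// (unnormalized) vector is returned.
BeliefVec updateBelief(const BeliefVec & b, std::size_t a, std::size_t o,
                       const Table3D & T, const Table3D & O) {
    const std::size_t S = b.size();
    assert(T.size() == S && O.size() == S); // sizes are the caller's duty

    BeliefVec next(S, 0.0);
    double norm = 0.0;
    for (std::size_t s1 = 0; s1 < S; ++s1) {
        double reach = 0.0; // P(s1 | b, a)
        for (std::size_t s = 0; s < S; ++s)
            reach += b[s] * T[s][a][s1];
        next[s1] = O[s1][a][o] * reach; // weight by observation likelihood
        norm += next[s1];
    }
    if (norm > 0.0)
        for (double & p : next) p /= norm;
    return next;
}
```

The normalization constant computed in the loop is the probability of receiving observation o from belief b after action a.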

It turns out that a POMDP can be viewed as an MDP with an infinite number of states, where each state is a Belief. Since a Belief is a vector of real numbers, there are infinitely many of them, hence the infinite number of states. While POMDPs can be much more powerful than MDPs for modeling real-world problems, where information is usually imperfect, this infinite-state property makes them much harder to solve exactly, and makes their solutions much more complex.

A POMDP solution is composed of several policies, which apply in different regions of the Belief space and suggest different actions depending on the observations received by the agent at each timestep. The values of those policies can likewise be represented as a set of value vectors (called alpha vectors in the literature) that apply in those same regions of the Belief space. Each alpha vector is somewhat similar to an MDP ValueFunction.

The difference between this class and the POMDP::Model class is that this class stores the observation function in a sparse matrix. This can make access to individual probabilities slower, but it can greatly speed up some classes of planning algorithms when the number of non-zero observation probabilities is very small compared to the full theoretical observation space of SxAxO. In such cases it also substantially reduces memory consumption, which may further improve speed through better cache behavior.
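
The trade-off can be illustrated with a toy sparse row. This is only a sketch of the general idea and says nothing about how SparseMatrix3D is laid out internally; `SparseRow`, `probabilityAt` and `dot` are hypothetical names introduced for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One sparse row: (observation index, probability) pairs sorted by index.
// Zero entries are simply not stored.
using SparseRow = std::vector<std::pair<std::size_t, double>>;

// Individual lookup: a binary search instead of a direct array index.
double probabilityAt(const SparseRow & row, std::size_t o) {
    auto it = std::lower_bound(row.begin(), row.end(), o,
        [](const std::pair<std::size_t, double> & e, std::size_t key) {
            return e.first < key;
        });
    return (it != row.end() && it->first == o) ? it->second : 0.0;
}

// Bulk computation: a dot product with a dense vector touches only the
// stored non-zero entries.
double dot(const SparseRow & row, const std::vector<double> & dense) {
    double sum = 0.0;
    for (const auto & e : row) {
        assert(e.first < dense.size());
        sum += e.second * dense[e.first];
    }
    return sum;
}
```

With two stored entries out of thousands of possible observations, `dot` performs two multiplications where a dense row would perform thousands, while `probabilityAt` pays a small search cost per lookup.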

Template Parameters
M: The particular MDP type that we want to extend.

Member Typedef Documentation

◆ ObservationMatrix

template<MDP::IsModel M>
using AIToolbox::POMDP::SparseModel< M >::ObservationMatrix = SparseMatrix3D

Constructor & Destructor Documentation

◆ SparseModel() [1/4]

template<MDP::IsModel M>
template<typename... Args>
AIToolbox::POMDP::SparseModel< M >::SparseModel ( size_t  o,
Args &&...  parameters 
)

Basic constructor.

This constructor initializes the observation function so that all actions will return observation 0.

Template Parameters
Args: All types of the parent constructor arguments.
Parameters
o: The number of possible observations the agent could make.
parameters: All arguments needed to build the parent Model.
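
As an illustration of the default described above, the following sketch builds a dense table in which every state-action pair yields observation 0 with probability 1. The `Table3D` alias and `makeDefaultObservationTable` are hypothetical names used only here; the class itself stores its observation function sparsely.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense S x A x O table used only for this sketch (not an AIToolbox type).
using Table3D = std::vector<std::vector<std::vector<double>>>;

// Builds a table where observation 0 has probability 1 for every (s, a).
Table3D makeDefaultObservationTable(std::size_t S, std::size_t A, std::size_t O) {
    assert(O > 0); // at least one observation must exist
    Table3D of(S, std::vector<std::vector<double>>(A, std::vector<double>(O, 0.0)));
    for (auto & perState : of)
        for (auto & row : perState)
            row[0] = 1.0;
    return of;
}
```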

◆ SparseModel() [2/4]

template<MDP::IsModel M>
template<IsNaive3DMatrix ObFun, typename... Args>
AIToolbox::POMDP::SparseModel< M >::SparseModel ( size_t  o,
ObFun &&  of,
Args &&...  parameters 
)

Basic constructor.

This constructor takes an arbitrary three dimensional container and tries to copy its contents into the observations matrix.

The container needs to support data access through operator[]. In addition, the dimensions of the container must match the ones provided as arguments both directly (o) and indirectly (s,a), in the order s, a, o.

This is important, as this constructor DOES NOT perform any size checks on the external containers.

Internal values of the containers will be converted to double, so these conversions must be possible.

In addition, the observation container must contain a valid observation function: for every state-action pair, the probabilities over observations must form a valid distribution.

Template Parameters
ObFun: The external observations container type.
Parameters
o: The number of possible observations the agent could make.
of: The observation probability matrix.
parameters: All arguments needed to build the parent Model.
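
Since the constructor performs no size checks, a caller may want to verify a container before passing it in. The sketch below is one way to do that for a nested std::vector indexed as of[s][a][o]; the function name, the `Table3D` alias and the tolerance value are assumptions made for this example and are not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense table indexed as of[s][a][o] (a stand-in for any container
// that supports operator[]).
using Table3D = std::vector<std::vector<std::vector<double>>>;

// Returns true if the table has dimensions S x A x O and every (s, a) row
// is a probability distribution over observations.
bool isValidObservationTable(const Table3D & of, std::size_t S, std::size_t A,
                             std::size_t O, double tolerance = 1e-9) {
    assert(tolerance >= 0.0);
    if (of.size() != S) return false;
    for (const auto & perState : of) {
        if (perState.size() != A) return false;
        for (const auto & row : perState) {
            if (row.size() != O) return false;
            double sum = 0.0;
            for (double p : row) {
                if (p < 0.0 || p > 1.0) return false;
                sum += p;
            }
            if (std::abs(sum - 1.0) > tolerance) return false;
        }
    }
    return true;
}
```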

◆ SparseModel() [3/4]

template<MDP::IsModel M>
template<typename PM >
requires IsModel< PM > && std::constructible_from< M, PM >
AIToolbox::POMDP::SparseModel< M >::SparseModel ( const PM &  model)

Copy constructor from any valid POMDP model.

This allows copying from any other model. A useful application is converting a model that computes probabilities on the fly into a POMDP::SparseModel, where all probabilities are stored for fast access. Such a conversion is only practical when the number of states, actions and observations is not too big.

This constructor is available only if the underlying MDP Model can be constructed from the input as well.

Template Parameters
PM: The type of the other model.
Parameters
model: The model that needs to be copied.

◆ SparseModel() [4/4]

template<MDP::IsModel M>
template<typename... Args>
AIToolbox::POMDP::SparseModel< M >::SparseModel ( NoCheck  ,
size_t  o,
ObservationMatrix &&  ot,
Args &&...  parameters 
)

Unchecked constructor.

This constructor takes ownership of the data passed to it, avoiding copies and additional work (sanity checks), in order to build a new Model as fast as possible.

Note that to use it you have to explicitly use the NO_CHECK tag parameter first.

Parameters
o: The number of possible observations the agent could make.
ot: The observation probability matrix.
parameters: All arguments needed to build the parent Model.

Member Function Documentation

◆ getO()

template<MDP::IsModel M>
size_t AIToolbox::POMDP::SparseModel< M >::getO ( ) const

This function returns the number of observations possible.

Returns
The total number of observations.

◆ getObservationFunction() [1/2]

template<MDP::IsModel M>
const SparseModel< M >::ObservationMatrix & AIToolbox::POMDP::SparseModel< M >::getObservationFunction ( ) const

This function returns the observation matrix for inspection.

Returns
The observation matrix.

◆ getObservationFunction() [2/2]

template<MDP::IsModel M>
const SparseMatrix2D & AIToolbox::POMDP::SparseModel< M >::getObservationFunction ( size_t  a) const

This function returns the observation function for a given action.

Parameters
a: The action requested.
Returns
The observation function for the input action.

◆ getObservationProbability() [1/2]

template<MDP::IsModel M>
double AIToolbox::POMDP::SparseModel< M >::getObservationProbability ( const Belief &  b,
size_t  o,
size_t  a 
) const

This function computes the probability of obtaining an observation given an action and an initial belief.

Parameters
b: The initial belief state.
a: The action performed.
o: The resulting observation.
Returns
The probability of obtaining the specified observation.
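
The quantity returned here is P(o | b, a): the belief is pushed through the transition function, and the result is weighted by the observation function. The sketch below computes it with dense tables. It is an illustration under assumptions, not the library's code: `Table3D` and `observationProbability` are hypothetical names, with T[s][a][s1] = P(s1 | s, a) and O[s1][a][o] = P(o | s1, a). The argument order (b, o, a) mirrors the documented signature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense tables used only for this sketch (not AIToolbox types).
using Table3D = std::vector<std::vector<std::vector<double>>>;

// P(o | b, a) = sum over s1 of O[s1][a][o] * sum over s of b[s] * T[s][a][s1].
double observationProbability(const std::vector<double> & b, std::size_t o,
                              std::size_t a, const Table3D & T, const Table3D & O) {
    const std::size_t S = b.size();
    assert(T.size() == S && O.size() == S);

    double p = 0.0;
    for (std::size_t s1 = 0; s1 < S; ++s1) {
        double reach = 0.0; // probability of landing in s1
        for (std::size_t s = 0; s < S; ++s)
            reach += b[s] * T[s][a][s1];
        p += reach * O[s1][a][o];
    }
    return p;
}
```

Summed over all observations for a fixed belief and action, these probabilities add up to 1 whenever the tables hold valid distributions.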

◆ getObservationProbability() [2/2]

template<MDP::IsModel M>
double AIToolbox::POMDP::SparseModel< M >::getObservationProbability ( size_t  s1,
size_t  a,
size_t  o 
) const

This function returns the stored observation probability for the specified state-action pair.

Parameters
s1: The final state of the transition.
a: The action performed in the transition.
o: The recorded observation for the transition.
Returns
The probability of the specified observation.

◆ sampleOR()

template<MDP::IsModel M>
std::tuple< size_t, double > AIToolbox::POMDP::SparseModel< M >::sampleOR ( size_t  s,
size_t  a,
size_t  s1 
) const

This function samples the POMDP for the specified state action pair.

This function samples the model for simulated experience. The transition, observation and reward functions are used to produce, from the state, action and new state passed as arguments, a possible new observation and reward. The observation and reward are picked so that they are consistent with the specified new state.

Parameters
s: The state that needs to be sampled.
a: The action that needs to be sampled.
s1: The resulting state of the s,a transition.
Returns
A tuple containing a new observation and reward.

◆ sampleSOR()

template<MDP::IsModel M>
std::tuple< size_t, size_t, double > AIToolbox::POMDP::SparseModel< M >::sampleSOR ( size_t  s,
size_t  a 
) const

This function samples the POMDP for the specified state action pair.

This function samples the model for simulated experience. The transition, observation and reward functions are used to produce, from the state-action pair passed as arguments, a possible new state with its respective observation and reward. The new state is picked from all states that the MDP allows transitioning to, each with the probability the model assigns to that transition. After a new state is picked, an observation is sampled from the observation function distribution, and finally the reward is the corresponding one contained in the reward function.

Parameters
s: The state that needs to be sampled.
a: The action that needs to be sampled.
Returns
A tuple containing a new state, observation and reward.
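
The sampling order described above (new state first, then observation, then reward) can be sketched with the standard library's random facilities. Everything here is an assumption made for illustration: the dense `Table3D` tables, the helper `sampleIndex`, the free function `sampleStateObservationReward`, and a reward table indexed as R[s][a][s1]. The library's own method takes no tables or generator as arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <tuple>
#include <vector>

// Dense tables used only for this sketch (not AIToolbox types):
// T[s][a][s1] = P(s1 | s, a), O[s1][a][o] = P(o | s1, a),
// R[s][a][s1] = reward for the transition (an assumed layout).
using Table3D = std::vector<std::vector<std::vector<double>>>;

// Draws an index with probability proportional to its weight.
std::size_t sampleIndex(const std::vector<double> & weights, std::mt19937 & rng) {
    std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}

std::tuple<std::size_t, std::size_t, double>
sampleStateObservationReward(std::size_t s, std::size_t a,
                             const Table3D & T, const Table3D & O,
                             const Table3D & R, std::mt19937 & rng) {
    assert(s < T.size() && a < T[s].size());
    const std::size_t s1 = sampleIndex(T[s][a], rng); // 1. new state
    const std::size_t o = sampleIndex(O[s1][a], rng); // 2. observation given s1
    return std::make_tuple(s1, o, R[s][a][s1]);       // 3. matching reward
}
```

Sampling only an observation and reward for an already known new state amounts to skipping step 1 and using the given s1 directly.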

◆ setObservationFunction() [1/2]

template<MDP::IsModel M>
template<IsNaive3DMatrix ObFun>
void AIToolbox::POMDP::SparseModel< M >::setObservationFunction ( const ObFun &  of)

This function replaces the SparseModel observation function with the one provided.

The container needs to support data access through operator[]. In addition, the dimensions of the container must match the ones used during construction (for three dimensions: S, A, O, in this order).

This is important, as this function DOES NOT perform any size checks on the external containers.

Internal values of the container will be converted to double, so these conversions must be possible.

Template Parameters
ObFun: The external observations container type.
Parameters
of: The external observations container.

◆ setObservationFunction() [2/2]

template<MDP::IsModel M>
void AIToolbox::POMDP::SparseModel< M >::setObservationFunction ( const ObservationMatrix &  of)

This function sets the observation function using a SparseMatrix3D.

This function will throw a std::invalid_argument if the matrix provided does not contain valid probabilities.

The dimensions of the container must match the ones used during construction (for three dimensions: A, S, O). BE CAREFUL: the matrices MUST be SxO, while the std::vector containing them MUST be of size A.

This function DOES NOT perform any size checks on the input.

Parameters
of: The external observations container.
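
Because this overload expects an action-major layout (a vector of size A holding SxO matrices), while the templated overload expects S, A, O order, it is easy to mix the two up. The sketch below shows the index permutation with plain nested vectors. `Matrix2D`, `Table3D` and `toActionMajor` are hypothetical names for this example; converting the result into the library's sparse matrix type is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix2D = std::vector<std::vector<double>>;
using Table3D = std::vector<Matrix2D>;

// Input is indexed as of[s][a][o]; output is indexed as result[a][s][o],
// i.e. A matrices of size S x O.
std::vector<Matrix2D> toActionMajor(const Table3D & of, std::size_t S,
                                    std::size_t A, std::size_t O) {
    assert(of.size() == S);
    std::vector<Matrix2D> result(A, Matrix2D(S, std::vector<double>(O, 0.0)));
    for (std::size_t s = 0; s < S; ++s) {
        assert(of[s].size() == A);
        for (std::size_t a = 0; a < A; ++a) {
            assert(of[s][a].size() == O);
            for (std::size_t o = 0; o < O; ++o)
                result[a][s][o] = of[s][a][o];
        }
    }
    return result;
}
```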

The documentation for this class was generated from the following file: