AIToolbox
A library that offers tools for AI problem solving.
Types.hpp File Reference
#include <vector>
#include <AIToolbox/Types.hpp>
#include <AIToolbox/TypeTraits.hpp>


Classes

struct  AIToolbox::MDP::ValueFunction
 

Namespaces

 AIToolbox
 
 AIToolbox::MDP
 

Typedefs

MDP Value Types

QFunctions and ValueFunctions are defined with respect to a policy: for any given state, they estimate how well that policy is expected to perform from there. Here, however, we do not represent the policy explicitly. Since we are usually interested in the best possible policy, we try to store only as much information as is needed to recover it.

A QFunction takes a state and an action and returns the value of that pair. The higher the value, the better we expect to perform. Obtaining the best policy from a QFunction is straightforward: in each state, we check which action yields the highest value and choose it. This assumes that all actions taken from that point on are also optimal, which is the assumption we want, since we are looking for the best policy.
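The greedy action lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `QTable` is a hypothetical stand-in built on `std::vector` (the library declares `QFunction` as `Matrix2D`), and `greedyActions` is a name introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a QFunction: q[s][a] is the value of
// taking action a in state s. Not the library's Matrix2D type.
using QTable = std::vector<std::vector<double>>;

// For every state, return the index of the action with the highest value.
// Ties are resolved in favour of the lowest action index.
std::vector<std::size_t> greedyActions(const QTable & q) {
    std::vector<std::size_t> best(q.size(), 0);
    for (std::size_t s = 0; s < q.size(); ++s) {
        assert(!q[s].empty()); // every state needs at least one action
        for (std::size_t a = 1; a < q[s].size(); ++a)
            if (q[s][a] > q[s][best[s]])
                best[s] = a;
    }
    return best;
}
```

The returned vector has the same shape as the `Actions` typedef listed on this page: one action index per state.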

In theory, a ValueFunction is the max over actions of the QFunction: it takes a state and returns the best value obtainable from that state (following the implied policy). On its own, however, that is not very useful in practice, so we store not only that value but also the action that produced it. Although a vector of tuples would be the more intuitive layout, we store the function as a tuple of vectors, which makes it easy to manipulate the underlying values (sums, products and so on).
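To illustrate the tuple-of-vectors layout, here is a small sketch under stated assumptions. `ValueFunctionSketch`, `fromQ` and `scaleValues` are hypothetical names used only for this example, and `QTable` is a `std::vector` stand-in for the library's `Matrix2D`; the actual `AIToolbox::MDP::ValueFunction` struct may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a QFunction: q[s][a].
using QTable = std::vector<std::vector<double>>;

// Hypothetical tuple-of-vectors layout: entry s of each vector refers to state s.
struct ValueFunctionSketch {
    std::vector<double>      values;  // best value obtainable from each state
    std::vector<std::size_t> actions; // action that achieves that value
};

// Take the max over actions for each state, remembering the maximizing action.
ValueFunctionSketch fromQ(const QTable & q) {
    ValueFunctionSketch v;
    v.values.reserve(q.size());
    v.actions.reserve(q.size());
    for (const auto & row : q) {
        assert(!row.empty()); // every state needs at least one action
        std::size_t best = 0;
        for (std::size_t a = 1; a < row.size(); ++a)
            if (row[a] > row[best]) best = a;
        v.values.push_back(row[best]);
        v.actions.push_back(best);
    }
    return v;
}

// Because the values sit in their own contiguous vector, they can be
// manipulated as a whole without touching the stored actions.
void scaleValues(ValueFunctionSketch & v, double factor) {
    for (double & x : v.values) x *= factor;
}
```

With a vector of (value, action) pairs, the same scaling would have to step over the action stored next to each value; keeping the values separate is what allows whole-vector arithmetic on them.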

using AIToolbox::MDP::Values = Vector
 
using AIToolbox::MDP::Actions = std::vector< size_t >
 
using AIToolbox::MDP::QFunction = Matrix2D
 