AIToolbox
A library that offers tools for AI problem solving.
AIToolbox::Factored::MDP Namespace Reference

Namespaces

 SysAdminUtils
 

Classes

class  BanditPolicyAdaptor
 This class extends a Bandit policy so that it can be called from MDP code. More...
 
class  CooperativeExperience
 This class keeps track of registered events and rewards. More...
 
class  CooperativeMaximumLikelihoodModel
 This class models CooperativeExperience as a CooperativeModel using Maximum Likelihood. More...
 
class  CooperativeModel
 This class models a cooperative MDP. More...
 
class  CooperativePrioritizedSweeping
 This class implements PrioritizedSweeping for cooperative environments. More...
 
class  CooperativeQLearning
 This class represents the Cooperative QLearning algorithm. More...
 
class  CooperativeThompsonModel
 This class models CooperativeExperience as a CooperativeModel using Thompson Sampling. More...
 
class  EpsilonPolicy
 This class represents an epsilon-greedy policy for Factored MDPs. More...
 
class  FactoredLP
 This class represents the Factored LP algorithm. More...
 
class  JointActionLearner
 This class represents a single Joint Action Learner agent. More...
 
class  LinearProgramming
 This class solves a factored MDP with Linear Programming. More...
 
struct  MakeGraph
 This class is the public interface for initializing the graph in generic code that uses the maximizers. More...
 
struct  MakeGraph< Bandit::MaxPlus >
 
struct  MakeGraph< Bandit::ReusingIterativeLocalSearch >
 
struct  MakeGraphImpl
 This class clumps all implementations that create graphs for data for certain Maximizers. More...
 
struct  MakeGraphImpl< Bandit::LocalSearch, Iterable >
 
struct  MakeGraphImpl< Bandit::LocalSearch, MDP::QFunction >
 
struct  MakeGraphImpl< Bandit::VariableElimination, Data >
 
struct  MOQFunctionRule
 This struct represents a single state/action/values tuple. More...
 
struct  QFunctionRule
 This struct represents a single state/action/value tuple. More...
 
class  QGreedyPolicy
 This class implements a greedy policy through a QFunction. More...
 
class  SparseCooperativeQLearning
 This class represents the Sparse Cooperative QLearning algorithm. More...
 
class  TigerAntelope
 This class represents a 2-agent tiger antelope environment. More...
 
struct  UpdateGraph
 This class is the public interface for updating the input graph with the input data in generic code that uses the maximizers. More...
 
struct  UpdateGraph< Bandit::MaxPlus >
 
struct  UpdateGraph< Bandit::ReusingIterativeLocalSearch >
 
struct  UpdateGraphImpl
 This class clumps all implementations that update graphs with data for certain Maximizers. More...
 
struct  UpdateGraphImpl< Bandit::LocalSearch, Iterable >
 
struct  UpdateGraphImpl< Bandit::LocalSearch, MDP::QFunction >
 
struct  UpdateGraphImpl< Bandit::VariableElimination, Iterable >
 
struct  UpdateGraphImpl< Bandit::VariableElimination, MDP::QFunction >
 
struct  ValueFunction
 This struct represents a factored ValueFunction. More...
 

Typedefs

using QFunction = FactoredMatrix2D
 This represents a factored QFunction. More...
 

Functions

CooperativeModel makeSysAdminUniRing (unsigned agents, double pFailBase, double pFailBonus, double pDeadBase, double pDeadBonus, double pLoad, double pDoneG, double pDoneF)
 This function creates a ring where each machine affects only the next adjacent one. More...
 
CooperativeModel makeSysAdminBiRing (unsigned agents, double pFailBase, double pFailBonus, double pDeadBase, double pDeadBonus, double pLoad, double pDoneG, double pDoneF)
 This function creates a ring where each machine affects the two adjacent ones. More...
 
std::string printSysAdminRing (const State &s)
 This function creates a graphical representation of a SysAdmin ring problem. More...
 
CooperativeModel makeSysAdminGrid (unsigned width, unsigned height, double pFailBase, double pFailBonus, double pDeadBase, double pDeadBonus, double pLoad, double pDoneG, double pDoneF)
 This function creates a grid where each machine is connected with its 4 neighbors. More...
 
CooperativeModel makeSysAdminTorus (unsigned width, unsigned height, double pFailBase, double pFailBonus, double pDeadBase, double pDeadBonus, double pLoad, double pDoneG, double pDoneF)
 This function creates a toroidal grid where each machine is connected with its 4 neighbors. More...
 
std::string printSysAdminGrid (const State &s, unsigned width)
 This function creates a graphical representation of a SysAdmin grid problem. More...
 
QFunction bellmanBackup (const CooperativeModel &m, const ValueFunction &v)
 This function applies a one-step backup on the input ValueFunction. More...
 
QFunction makeQFunction (const DDNGraph &graph, const std::vector< std::vector< size_t >> &basisDomains)
 This function creates a new factored QFunction from the given graph and basis domain. More...
 
Matrix2D makeA0MatrixStatus (unsigned neighbors, size_t id, double pFailBase, double pFailBonus, double pDeadBase, double pDeadBonus)
 This function builds the transition matrix for a single status state factor in the SysAdmin problem in case of action 0 (no-reboot). More...
 
Matrix2D makeA1MatrixStatus ()
 This function builds the transition matrix for a single status state factor in the SysAdmin problem in case of action 1 (reboot). More...
 
Matrix2D makeA0MatrixLoad (double pLoad, double pDoneG, double pDoneF)
 This function builds the transition matrix for a single load state factor in the SysAdmin problem in case of action 1 (reboot). More...
 
Matrix2D makeA1MatrixLoad ()
 This function builds the transition matrix for a single load state factor in the SysAdmin problem in case of action 1 (reboot). More...
 
Matrix2D makeRewardMatrix (const Matrix2D &la0Matrix)
 This function builds the reward function which is the same for all agents. More...
 
char printMachineStatus (unsigned s)
 This function returns a printable character of a machine's status. More...
 
char printMachineLoad (unsigned l)
 This function returns a printable character of a machine's load. More...
 

Variables

template<typename QR >
concept IsQFunctionRule
 This concept models the interface for a QFunctionRule. More...
 
template<typename T >
concept QFRuleRange = std::ranges::range<T> && IsQFunctionRule<std::ranges::range_value_t<T>>
 This concept represents a range of QFunctionRules. More...
 

Typedef Documentation

◆ QFunction

This represents a factored QFunction.

Function Documentation

◆ bellmanBackup()

QFunction AIToolbox::Factored::MDP::bellmanBackup ( const CooperativeModel m,
const ValueFunction v 
)

This function applies a one-step backup on the input ValueFunction.

Parameters
mThe model used to do the backup.
vThe ValueFunction to backup.
Returns
The QFunction resulting from the backup.

◆ makeA0MatrixLoad()

Matrix2D AIToolbox::Factored::MDP::makeA0MatrixLoad ( double  pLoad,
double  pDoneG,
double  pDoneF 
)
inline

This function builds the transition matrix for a single load state factor in the SysAdmin problem in case of action 1 (reboot).

This function assumes that the status factor (on which it depends) always comes before the load factor in the state space/tags.

Parameters
pLoadThe probability of receiving a job.
pDoneGThe probability of finishing a job when in good status.
pDoneFThe probability of finishing a job when in failing status.
Returns
The transition matrix of size (3*3, 3).

◆ makeA0MatrixStatus()

Matrix2D AIToolbox::Factored::MDP::makeA0MatrixStatus ( unsigned  neighbors,
size_t  id,
double  pFailBase,
double  pFailBonus,
double  pDeadBase,
double  pDeadBonus 
)
inline

This function builds the transition matrix for a single status state factor in the SysAdmin problem in case of action 0 (no-reboot).

Parameters
neighborsThe number of neighbors of this agent.
idA number in [0, neighbors), indicating at which position in the tag this state-factor is.
pFailBaseThe base probability of failing.
pFailBonusThe bonus probability of failing.
pDeadBaseThe base probability of dying.
pDeadBonusThe bonus probability of dying.
Returns
The transition matrix of size ((neighbors+1)^3, 3).

◆ makeA1MatrixLoad()

Matrix2D AIToolbox::Factored::MDP::makeA1MatrixLoad ( )
inline

This function builds the transition matrix for a single load state factor in the SysAdmin problem in case of action 1 (reboot).

Note that this does not depend on anything, since we are rebooting the machine. Thus the matrix is always the same for all load state factors.

Returns
The transition matrix of size (3, 3).

◆ makeA1MatrixStatus()

Matrix2D AIToolbox::Factored::MDP::makeA1MatrixStatus ( )
inline

This function builds the transition matrix for a single status state factor in the SysAdmin problem in case of action 1 (reboot).

Note that this does not depend on anything, since we are rebooting the machine. Thus the matrix is always the same for all status state factors.

Returns
The transition matrix of size (3, 3).

◆ makeQFunction()

QFunction AIToolbox::Factored::MDP::makeQFunction ( const DDNGraph graph,
const std::vector< std::vector< size_t >> &  basisDomains 
)

This function creates a new factored QFunction from the given graph and basis domain.

The basisDomains is a set of state feature id groups. Each group should be unique.

For each such group, this function will identify the ids of the parent features and agents as defined by the input graph, and a new basis function is created inside the output QFunction. The parent features/agents ids are then set as the tags of a new basis function.

Parameters
graphThe DDNGraph containing the information about the factored MDP structure.
basisDomainsThe required domains for each of the outputted basis functions of the QFunction.
Returns
A newly built QFunction.

◆ makeRewardMatrix()

Matrix2D AIToolbox::Factored::MDP::makeRewardMatrix ( const Matrix2D la0Matrix)
inline

This function builds the reward function which is the same for all agents.

The parameter can be built using the makeA0MatrixLoad(double, double, double) function.

The reward matrix is all zero but for loaded states (since they are the only ones from which it is possible to complete a job).

We assume completing a job yelds 1.0 reward.

Parameters
la0MatrixThe previously constructed transition matrix for the load factor.
Returns
The reward matrix of size (3*3, 2).

◆ makeSysAdminBiRing()

CooperativeModel AIToolbox::Factored::MDP::makeSysAdminBiRing ( unsigned  agents,
double  pFailBase,
double  pFailBonus,
double  pDeadBase,
double  pDeadBonus,
double  pLoad,
double  pDoneG,
double  pDoneF 
)

This function creates a ring where each machine affects the two adjacent ones.

Note that pFailBonus and pDeadBonus are the total additional bonuses counted when all neighbors are faulty/dead, respectively. However, the bonuses are counted per-agent.

If a machine with 2 neighbors has a single faulty neighbor, it will get an additional failing probability of pFailBonus/2. If the same machine has one faulty neighbor and one dead neighbor, it will get a penalty of pFailBonus/2 + pDeadBonus/2.

Parameters
agentsThe number of agents in the ring.
pFailBaseThe base probability of a machine to fail.
pFailBonusThe total additional probability to fail/die when all neighbors are faulty (counted per-neighbor).
pDeadBaseThe base probability of a faulty machine to die.
pDeadBonusThe total additional probability to fail/die when all neighbors are dead (counted per-neighbor).
pLoadThe probability of getting a job when idle.
pDoneGThe probability of completing a job when good.
pDoneFThe probability of completing a job when faulty.
Returns
The CooperativeModel representing the problem.

◆ makeSysAdminGrid()

CooperativeModel AIToolbox::Factored::MDP::makeSysAdminGrid ( unsigned  width,
unsigned  height,
double  pFailBase,
double  pFailBonus,
double  pDeadBase,
double  pDeadBonus,
double  pLoad,
double  pDoneG,
double  pDoneF 
)

This function creates a grid where each machine is connected with its 4 neighbors.

Grids are notoriously hard to solve as the induced width of the VariableElimination graph is min(width, height), which usually results in extremely high computational costs.

Parameters
widthThe number of agents for the width of the grid.
heightThe number of agents for the height of the grid.
pFailBaseThe base probability of a machine to fail.
pFailBonusThe total additional probability to fail/die when all neighbors are faulty (counted per-neighbor).
pDeadBaseThe base probability of a faulty machine to die.
pDeadBonusThe total additional probability to fail/die when all neighbors are dead (counted per-neighbor).
pLoadThe probability of getting a job when idle.
pDoneGThe probability of completing a job when good.
pDoneFThe probability of completing a job when faulty.
Returns
The CooperativeModel representing the problem.

◆ makeSysAdminTorus()

CooperativeModel AIToolbox::Factored::MDP::makeSysAdminTorus ( unsigned  width,
unsigned  height,
double  pFailBase,
double  pFailBonus,
double  pDeadBase,
double  pDeadBonus,
double  pLoad,
double  pDoneG,
double  pDoneF 
)

This function creates a toroidal grid where each machine is connected with its 4 neighbors.

Toruses are notoriously hard to solve as the induced width of the VariableElimination graph is 2*min(width, height), which usually results in extremely high computational costs.

Parameters
widthThe number of agents for the width of the torus.
heightThe number of agents for the height of the torus.
pFailBaseThe base probability of a machine to fail.
pFailBonusThe total additional probability to fail/die when all neighbors are faulty (counted per-neighbor).
pDeadBaseThe base probability of a faulty machine to die.
pDeadBonusThe total additional probability to fail/die when all neighbors are dead (counted per-neighbor).
pLoadThe probability of getting a job when idle.
pDoneGThe probability of completing a job when good.
pDoneFThe probability of completing a job when faulty.
Returns
The CooperativeModel representing the problem.

◆ makeSysAdminUniRing()

CooperativeModel AIToolbox::Factored::MDP::makeSysAdminUniRing ( unsigned  agents,
double  pFailBase,
double  pFailBonus,
double  pDeadBase,
double  pDeadBonus,
double  pLoad,
double  pDoneG,
double  pDoneF 
)

This function creates a ring where each machine affects only the next adjacent one.

Note that pFailBonus and pDeadBonus are the total additional bonuses counted when all neighbors are faulty/dead, respectively. However, the bonuses are counted per-agent.

If a machine with 2 neighbors has a single faulty neighbor, it will get an additional failing probability of pFailBonus/2. If the same machine has one faulty neighbor and one dead neighbor, it will get a penalty of pFailBonus/2 + pDeadBonus/2.

Parameters
agentsThe number of agents in the ring.
pFailBaseThe base probability of a machine to fail.
pFailBonusThe total additional probability to fail/die when all neighbors are faulty (counted per-neighbor).
pDeadBaseThe base probability of a faulty machine to die.
pDeadBonusThe total additional probability to fail/die when all neighbors are dead (counted per-neighbor).
pLoadThe probability of getting a job when idle.
pDoneGThe probability of completing a job when good.
pDoneFThe probability of completing a job when faulty.
Returns
The CooperativeModel representing the problem.

◆ printMachineLoad()

char AIToolbox::Factored::MDP::printMachineLoad ( unsigned  l)
inline

This function returns a printable character of a machine's load.

◆ printMachineStatus()

char AIToolbox::Factored::MDP::printMachineStatus ( unsigned  s)
inline

This function returns a printable character of a machine's status.

◆ printSysAdminGrid()

std::string AIToolbox::Factored::MDP::printSysAdminGrid ( const State s,
unsigned  width 
)

This function creates a graphical representation of a SysAdmin grid problem.

Each agent is represented with 2 characters: the first represents the Status ('g'ood, 'f'aulty, 'd'ead), and the second represents the Load ('i'dle, 'l'oaded, 'd'one).

Parameters
sThe State to represent.
widthThe number of agents for the width of the grid.
Returns
A graphical representation to print on screen.

◆ printSysAdminRing()

std::string AIToolbox::Factored::MDP::printSysAdminRing ( const State s)

This function creates a graphical representation of a SysAdmin ring problem.

Each agent is represented with 2 characters: the first represents the Status ('g'ood, 'f'aulty, 'd'ead), and the second represents the Load ('i'dle, 'l'oaded, 'd'one).

Parameters
sThe State to represent.
Returns
A graphical representation to print on screen.

Variable Documentation

◆ IsQFunctionRule

template<typename QR >
concept AIToolbox::Factored::MDP::IsQFunctionRule
Initial value:
= Bandit::IsQFunctionRule<QR> && requires (const QR qr) {
{ qr.state } -> std::convertible_to<PartialState>;
}

This concept models the interface for a QFunctionRule.

See also
AIToolbox::Factored::Bandit::IsQFunctionRule

◆ QFRuleRange

template<typename T >
concept AIToolbox::Factored::MDP::QFRuleRange = std::ranges::range<T> && IsQFunctionRule<std::ranges::range_value_t<T>>

This concept represents a range of QFunctionRules.