This class implements a greedy policy through a QFunction.

#include <AIToolbox/Factored/MDP/Policies/QGreedyPolicy.hpp>

Inherits AIToolbox::PolicyInterface< State, State, Action >.

Public Types
using	Base = PolicyInterface< State, State, Action >

Public Member Functions
template<typename... Args>
	QGreedyPolicy (State s, Action a, const FilterMap< QFunctionRule > &q, Args &&...args)
	Basic constructor with QFunctionRules.

template<typename... Args>
	QGreedyPolicy (State s, Action a, const QFunction &q, Args &&...args)
	Basic constructor with QFunction.

virtual Action	sampleAction (const State &s) const override
	This function chooses the greediest action for state s.

virtual double	getActionProbability (const State &s, const Action &a) const override
	This function returns the probability of taking the specified action in the specified state.

Maximizer &	getMaximizer ()
	This function returns a reference to the internal maximizer.

const Maximizer &	getMaximizer () const
	This function returns a const reference to the internal maximizer.

const Maximizer::Graph &	getGraph () const
	This function returns the currently set graph.

Public Member Functions inherited from AIToolbox::PolicyInterface< State, State, Action >
	PolicyInterface (State s, Action a)
	Basic constructor.

virtual	~PolicyInterface ()
	Basic virtual destructor.

virtual Action	sampleAction (const State &s) const=0
	This function chooses a random action for state s, following the policy distribution.

virtual double	getActionProbability (const State &s, const Action &a) const=0
	This function returns the probability of taking the specified action in the specified state.

const State &	getS () const
	This function returns the number of states of the world.

const Action &	getA () const
	This function returns the number of available actions to the agent.

Additional Inherited Members
Protected Attributes inherited from AIToolbox::PolicyInterface< State, State, Action >
State	S

Action	A

RandomEngine	rand_

Detailed Description

template<typename Maximizer = Bandit::VariableElimination>
class AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >

This class implements a greedy policy through a QFunction.

This class lets you select the greedy action from a given list of QFunctionRules, or from a QFunction.

To compute the best action, or the probability of a given action, the QGreedyPolicy must run its Maximizer (Bandit::VariableElimination by default) on the stored rules, so each call can be expensive.
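
As a rough illustration of what greedy selection over rules computes, the self-contained sketch below sums the values of the rules that match a state and a joint action, then keeps the joint action with the highest total. It is not AIToolbox code: ToyRule, matches, qValue and greedyAction are hypothetical stand-ins, and the search is plain brute-force enumeration rather than VariableElimination.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-ins (assumptions for this sketch, not AIToolbox's definitions).
using Factors = std::vector<std::size_t>;                    // one value per factor
using PartialFactors = std::pair<std::vector<std::size_t>,   // factor ids
                                 std::vector<std::size_t>>;  // their values

struct ToyRule {
    PartialFactors state;   // state factors the rule is conditioned on
    PartialFactors action;  // action factors (agents) the rule is conditioned on
    double value;           // contribution to Q(s, a) when the rule matches
};

// True if every (id, value) pair in 'partial' agrees with 'full'.
bool matches(const PartialFactors & partial, const Factors & full) {
    for (std::size_t i = 0; i < partial.first.size(); ++i)
        if (full[partial.first[i]] != partial.second[i]) return false;
    return true;
}

// Q(s, a) is the sum of the values of all matching rules.
double qValue(const std::vector<ToyRule> & rules, const Factors & s, const Factors & a) {
    double q = 0.0;
    for (const auto & r : rules)
        if (matches(r.state, s) && matches(r.action, a)) q += r.value;
    return q;
}

// Brute-force greedy action: enumerate every joint action in actionSpace
// (actionSpace[i] = number of choices for agent i, each at least 1) and
// keep the one with the highest Q-value. Ties keep the first one found.
Factors greedyAction(const std::vector<ToyRule> & rules, const Factors & actionSpace, const Factors & s) {
    assert(!actionSpace.empty());
    Factors a(actionSpace.size(), 0), best = a;
    double bestQ = qValue(rules, s, a);
    while (true) {
        // Advance 'a' like an odometer; stop once every digit has wrapped.
        std::size_t i = 0;
        while (i < a.size() && ++a[i] == actionSpace[i]) a[i++] = 0;
        if (i == a.size()) break;
        const double q = qValue(rules, s, a);
        if (q > bestQ) { bestQ = q; best = a; }
    }
    return best;
}
```

The enumeration visits every joint action, so its cost grows exponentially with the number of agents. That is the cost a maximizer such as VariableElimination tries to reduce by exploiting the fact that each rule only touches a few agents.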

Member Typedef Documentation

◆ Base

template<typename Maximizer = Bandit::VariableElimination>

using AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::Base = PolicyInterface<State, State, Action>

Constructor & Destructor Documentation

◆ QGreedyPolicy() [1/2]

template<typename Maximizer >

template<typename... Args>

AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::QGreedyPolicy	(	State	s,
		Action	a,
		const FilterMap< QFunctionRule > &	q,
		Args &&...	args
	)

Basic constructor with QFunctionRules.

Parameters

s	The number of states of the world.
a	The number of actions available to the agent.
q	The QFunctionRules this policy is linked with.
...args	Parameters to pass to the maximizer on construction.

◆ QGreedyPolicy() [2/2]

template<typename Maximizer >

template<typename... Args>

AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::QGreedyPolicy	(	State	s,
		Action	a,
		const QFunction &	q,
		Args &&...	args
	)

Basic constructor with QFunction.

Parameters

s	The number of states of the world.
a	The number of actions available to the agent.
q	The QFunction this policy is linked with.
...args	Parameters to pass to the maximizer on construction.
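
The trailing ...args of both constructors are handed to the maximizer when it is built. A minimal sketch of that forwarding pattern follows. ToyMaximizer, its iterations parameter and ToyGreedyPolicy are hypothetical names invented for this example, and the real constructors additionally take the state space and the rules or QFunction.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical maximizer with one tunable construction parameter.
struct ToyMaximizer {
    explicit ToyMaximizer(std::size_t iterations = 10) : iterations(iterations) {}
    std::size_t iterations;
};

template <typename Maximizer = ToyMaximizer>
class ToyGreedyPolicy {
    public:
        // Any arguments after 'actions' are perfectly forwarded to the
        // Maximizer constructor; with none, it is default-constructed.
        template <typename... Args>
        explicit ToyGreedyPolicy(std::size_t actions, Args&&... args)
            : actions_(actions), maximizer_(std::forward<Args>(args)...)
        {
            assert(actions_ > 0);
        }

        std::size_t getA() const { return actions_; }
        const Maximizer & getMaximizer() const { return maximizer_; }

    private:
        std::size_t actions_;
        Maximizer maximizer_;
};
```

With these toy types, ToyGreedyPolicy<> p(3) builds the maximizer with its default settings, while ToyGreedyPolicy<> q(3, 25) passes 25 through to the ToyMaximizer constructor.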

Member Function Documentation

◆ getActionProbability()

template<typename Maximizer >

double AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::getActionProbability	(	const State &	s,
		const Action &	a
	)		const

overridevirtual

This function returns the probability of taking the specified action in the specified state.

Parameters

s	The selected state.
a	The selected action.

Returns: 1 if a is equal to the greediest action, and 0 otherwise.
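
Because the policy is deterministic, the returned probability is an indicator on the greedy action. The sketch below shows only that final comparison, assuming the greedy action for the state has already been computed. The Action alias and greedyActionProbability are hypothetical names for this example, not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Action = std::vector<std::size_t>;  // toy joint action, one entry per agent

// Probability that a deterministic greedy policy picks 'a', given the
// greedy action 'greedy' already computed for the state of interest.
double greedyActionProbability(const Action & greedy, const Action & a) {
    assert(greedy.size() == a.size());
    return greedy == a ? 1.0 : 0.0;
}
```

Obtaining the greedy action is the expensive part, since it requires a full maximization over the rules for the given state.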

◆ getGraph()

template<typename Maximizer >

const Maximizer::Graph & AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::getGraph ( ) const

This function returns the currently set graph.

◆ getMaximizer() [1/2]

template<typename Maximizer >

Maximizer & AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::getMaximizer ( )

This function returns a reference to the internal maximizer.

This can be used to set the parameters of the chosen maximizer.

◆ getMaximizer() [2/2]

template<typename Maximizer = Bandit::VariableElimination>

const Maximizer& AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::getMaximizer ( ) const

This function returns a const reference to the internal maximizer.

◆ sampleAction()

template<typename Maximizer >

Action AIToolbox::Factored::MDP::QGreedyPolicy< Maximizer >::sampleAction ( const State & s ) const

overridevirtual

This function chooses the greediest action for state s.

Parameters

s	The sampled state of the policy.

Returns: The chosen action.

The documentation for this class was generated from the following file:

include/AIToolbox/Factored/MDP/Policies/QGreedyPolicy.hpp
