simple_rl.mdp package

Submodules

simple_rl.mdp.MDPClass module

MDPClass.py: Contains the MDP Class.

class simple_rl.mdp.MDPClass.MDP(actions, transition_func, reward_func, init_state, gamma=0.99, step_cost=0)[source]

Bases: object

Abstract class for a Markov Decision Process.

end_of_instance()[source]
execute_agent_action(action)[source]
Args:
action (str)
Returns:
(tuple: <float,State>): reward, State
Summary:
The core method of simple_rl; facilitates interaction between the MDP and an agent.
get_actions()[source]
get_curr_state()[source]
get_gamma()[source]
get_init_state()[source]
get_num_state_feats()[source]
get_parameters()[source]
Returns:
(dict) key=param_name (str) --> val=param_val (object).
get_reward_func()[source]
get_slip_prob()[source]
get_transition_func()[source]
reset()[source]
set_gamma(new_gamma)[source]
set_slip_prob(slip_prob)[source]
set_step_cost(new_step_cost)[source]
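The interaction loop the MDP class supports can be sketched with a minimal stand-in that mirrors the documented constructor and execute_agent_action contract. MiniMDP and ChainState below are hypothetical illustrations, not simple_rl's actual classes, and the exact reward_func signature in simple_rl may differ:

```python
class ChainState:
    """Hypothetical state for a tiny 3-state chain (not simple_rl's State)."""
    def __init__(self, num, is_terminal=False):
        self.num = num
        self._terminal = is_terminal

    def is_terminal(self):
        return self._terminal


def transition_func(state, action):
    # "right" advances along the chain toward state 2; "left" moves back.
    if action == "right":
        return ChainState(min(state.num + 1, 2), is_terminal=(state.num + 1 >= 2))
    return ChainState(max(state.num - 1, 0))


def reward_func(state, action):
    # Reward 1.0 only for stepping right from state 1 into the goal state.
    return 1.0 if (state.num == 1 and action == "right") else 0.0


class MiniMDP:
    """Minimal sketch of the documented MDP interface (hypothetical stand-in)."""
    def __init__(self, actions, transition_func, reward_func, init_state,
                 gamma=0.99, step_cost=0):
        self.actions = actions
        self.transition_func = transition_func
        self.reward_func = reward_func
        self.init_state = self.cur_state = init_state
        self.gamma = gamma
        self.step_cost = step_cost

    def execute_agent_action(self, action):
        # Returns (reward, next State), as documented above.
        reward = self.reward_func(self.cur_state, action) - self.step_cost
        self.cur_state = self.transition_func(self.cur_state, action)
        return reward, self.cur_state

    def reset(self):
        self.cur_state = self.init_state


mdp = MiniMDP(["left", "right"], transition_func, reward_func, ChainState(0))
r1, s1 = mdp.execute_agent_action("right")  # 0 -> 1, reward 0.0
r2, s2 = mdp.execute_agent_action("right")  # 1 -> 2 (terminal goal), reward 1.0
```

An agent would call execute_agent_action in a loop, choosing its next action from the returned State until is_terminal() is true, then call reset() for the next episode.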

simple_rl.mdp.MDPDistributionClass module

MDPDistributionClass.py: Contains the MDP Distribution Class.

class simple_rl.mdp.MDPDistributionClass.MDPDistribution(mdp_prob_dict, horizon=0)[source]

Bases: object

Class for distributions over MDPs.

get_actions()[source]
get_all_mdps(prob_threshold=0)[source]
Args:
prob_threshold (float)
Returns:
(list): Contains all MDPs in the distribution with probability > @prob_threshold.
get_average_reward_func()[source]
get_gamma()[source]
Notes:
Not all MDPs in the distribution are guaranteed to share gamma.
get_horizon()[source]
get_init_state()[source]
Notes:
Not all MDPs in the distribution are guaranteed to share init states.
get_mdps()[source]
get_num_mdps()[source]
get_parameters()[source]
Returns:
(dict) key=param_name (str) --> val=param_val (object).
get_prob_of_mdp(mdp)[source]
get_reward_func(avg=True)[source]
remove_mdp(mdp)[source]
Args:
mdp (MDP)
Summary:
Removes @mdp from self.mdp_prob_dict and recomputes the distribution.
remove_mdps(mdp_list)[source]
Args:
mdp_list (list): Contains MDP instances.
Summary:
Removes each mdp in @mdp_list from self.mdp_prob_dict and recomputes the distribution.
sample(k=1)[source]
Args:
k (int)
Returns:
(list of MDP): @k MDPs sampled without replacement.
set_gamma(new_gamma)[source]
simple_rl.mdp.MDPDistributionClass.main()[source]
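The behavior documented above (a normalized distribution over MDPs, recomputed after removals, with weighted sampling without replacement) can be sketched as follows. MiniMDPDistribution is a hypothetical stand-in, and strings are used in place of real MDP instances purely for illustration:

```python
import random


class MiniMDPDistribution:
    """Sketch of the documented MDPDistribution behavior (hypothetical stand-in)."""
    def __init__(self, mdp_prob_dict, horizon=0):
        # Normalize so the probabilities sum to 1.
        total = sum(mdp_prob_dict.values())
        self.mdp_prob_dict = {m: p / total for m, p in mdp_prob_dict.items()}
        self.horizon = horizon

    def get_prob_of_mdp(self, mdp):
        return self.mdp_prob_dict.get(mdp, 0.0)

    def get_all_mdps(self, prob_threshold=0):
        return [m for m, p in self.mdp_prob_dict.items() if p > prob_threshold]

    def remove_mdp(self, mdp):
        # Drop @mdp and renormalize the remaining probabilities.
        del self.mdp_prob_dict[mdp]
        total = sum(self.mdp_prob_dict.values())
        self.mdp_prob_dict = {m: p / total for m, p in self.mdp_prob_dict.items()}

    def sample(self, k=1):
        # Weighted sampling without replacement, per the documented contract.
        remaining = dict(self.mdp_prob_dict)
        picks = []
        for _ in range(k):
            mdps, probs = zip(*remaining.items())
            choice = random.choices(mdps, weights=probs, k=1)[0]
            picks.append(choice)
            del remaining[choice]
        return picks


# Strings stand in for MDP instances here.
dist = MiniMDPDistribution({"mdp_a": 0.5, "mdp_b": 0.3, "mdp_c": 0.2})
dist.remove_mdp("mdp_c")
# After removal, mdp_a and mdp_b renormalize to 0.625 and 0.375.
```

This mirrors why get_gamma() and get_init_state() carry caveats: the MDPs in the dictionary are independent objects and need not agree on gamma or initial state.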

simple_rl.mdp.StateClass module

class simple_rl.mdp.StateClass.State(data=[], is_terminal=False)[source]

Bases: object

Abstract State class.

features()[source]
Summary:
Used by function approximators to represent the state. Override this method in State subclasses to have function approximators use a different set of features.
Returns:
(iterable)
get_data()[source]
get_num_feats()[source]
is_terminal()[source]
set_terminal(is_term=True)[source]
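Overriding features() as described above can be sketched with a minimal stand-in State and a hypothetical GridState subclass (both illustrations; GridState and its normalized-coordinate features are assumptions, not part of simple_rl):

```python
class State:
    """Minimal stand-in mirroring the documented abstract State interface."""
    def __init__(self, data=[], is_terminal=False):
        self.data = data
        self._is_terminal = is_terminal

    def features(self):
        # Default representation handed to function approximators: the raw data.
        return self.data

    def get_data(self):
        return self.data

    def get_num_feats(self):
        return len(self.features())

    def is_terminal(self):
        return self._is_terminal

    def set_terminal(self, is_term=True):
        self._is_terminal = is_term


class GridState(State):
    """Hypothetical subclass: overrides features() with normalized coordinates."""
    def __init__(self, x, y, width=10, height=10):
        State.__init__(self, data=[x, y])
        self.width, self.height = width, height

    def features(self):
        # Scaling coordinates into [0, 1] often suits linear approximators.
        x, y = self.data
        return [x / self.width, y / self.height]


s = GridState(5, 2)  # features() -> [0.5, 0.2], get_data() -> [5, 2]
```

Note that get_data() still returns the raw data; only the feature representation seen by a function approximator changes.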

Module contents