simple_rl.mdp package

Submodules

simple_rl.mdp.MDPClass module

MDPClass.py: Contains the MDP Class.

class simple_rl.mdp.MDPClass.MDP(actions, transition_func, reward_func, init_state, gamma=0.99, step_cost=0)[source]

Bases: object

Abstract class for a Markov Decision Process.

end_of_instance()[source]
execute_agent_action(action)[source]
Args:
action (str)
Returns:
(tuple: <float,State>): reward, State
Summary:
The core method of simple_rl; facilitates interaction between the MDP and an agent.
get_actions()[source]
get_curr_state()[source]
get_gamma()[source]
get_init_state()[source]
get_num_state_feats()[source]
get_parameters()[source]
Returns:
(dict) key=param_name (str) --> val=param_val (object).
get_reward_func()[source]
get_slip_prob()[source]
get_transition_func()[source]
reset()[source]
set_gamma(new_gamma)[source]
set_slip_prob(slip_prob)[source]
set_step_cost(new_step_cost)[source]
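The interaction loop the MDP class supports can be sketched with a minimal stand-in that mirrors the documented constructor and execute_agent_action contract. MiniMDP and ChainState below are hypothetical illustrations, not simple_rl's actual classes, and the exact reward_func signature in simple_rl may differ:

```python
class ChainState:
    """Hypothetical state for a tiny 3-state chain (not simple_rl's State)."""
    def __init__(self, num, is_terminal=False):
        self.num = num
        self._terminal = is_terminal

    def is_terminal(self):
        return self._terminal


def transition_func(state, action):
    # "right" advances along the chain toward state 2; "left" moves back.
    if action == "right":
        return ChainState(min(state.num + 1, 2), is_terminal=(state.num + 1 >= 2))
    return ChainState(max(state.num - 1, 0))


def reward_func(state, action):
    # Reward 1.0 only for stepping right from state 1 into the goal state.
    return 1.0 if (state.num == 1 and action == "right") else 0.0


class MiniMDP:
    """Minimal sketch of the documented MDP interface (hypothetical stand-in)."""
    def __init__(self, actions, transition_func, reward_func, init_state,
                 gamma=0.99, step_cost=0):
        self.actions = actions
        self.transition_func = transition_func
        self.reward_func = reward_func
        self.init_state = self.cur_state = init_state
        self.gamma = gamma
        self.step_cost = step_cost

    def execute_agent_action(self, action):
        # Returns (reward, next State), as documented above.
        reward = self.reward_func(self.cur_state, action) - self.step_cost
        self.cur_state = self.transition_func(self.cur_state, action)
        return reward, self.cur_state

    def reset(self):
        self.cur_state = self.init_state


mdp = MiniMDP(["left", "right"], transition_func, reward_func, ChainState(0))
r1, s1 = mdp.execute_agent_action("right")  # 0 -> 1, reward 0.0
r2, s2 = mdp.execute_agent_action("right")  # 1 -> 2 (terminal goal), reward 1.0
```

An agent would call execute_agent_action in a loop, choosing its next action from the returned State until is_terminal() is true, then call reset() for the next episode.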

simple_rl.mdp.MDPDistributionClass module

MDPDistributionClass.py: Contains the MDP Distribution Class.

class simple_rl.mdp.MDPDistributionClass.MDPDistribution(mdp_prob_dict, horizon=0)[source]

Bases: object

Class for distributions over MDPs.

get_actions()[source]
get_all_mdps(prob_threshold=0)[source]
Args:
prob_threshold (float)
Returns:
(list): Contains all MDPs in the distribution with probability > @prob_threshold.
get_average_reward_func()[source]
get_gamma()[source]
Notes:
Not all MDPs in the distribution are guaranteed to share gamma.
get_horizon()[source]
get_init_state()[source]
Notes:
Not all MDPs in the distribution are guaranteed to share init states.
get_mdps()[source]
get_num_mdps()[source]
get_parameters()[source]
Returns:
(dict) key=param_name (str) --> val=param_val (object).
get_prob_of_mdp(mdp)[source]
get_reward_func(avg=True)[source]
remove_mdp(mdp)[source]
Args:
mdp (MDP)
Summary:
Removes @mdp from self.mdp_prob_dict and recomputes the distribution.
remove_mdps(mdp_list)[source]
Args:
mdp_list (list): Contains MDP instances.
Summary:
Removes each mdp in @mdp_list from self.mdp_prob_dict and recomputes the distribution.
sample(k=1)[source]
Args:
k (int)
Returns:
(list of MDP): @k MDPs sampled without replacement.
set_gamma(new_gamma)[source]
simple_rl.mdp.MDPDistributionClass.main()[source]
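The behavior documented above (a normalized distribution over MDPs, recomputed after removals, with weighted sampling without replacement) can be sketched as follows. MiniMDPDistribution is a hypothetical stand-in, and strings are used in place of real MDP instances purely for illustration:

```python
import random


class MiniMDPDistribution:
    """Sketch of the documented MDPDistribution behavior (hypothetical stand-in)."""
    def __init__(self, mdp_prob_dict, horizon=0):
        # Normalize so the probabilities sum to 1.
        total = sum(mdp_prob_dict.values())
        self.mdp_prob_dict = {m: p / total for m, p in mdp_prob_dict.items()}
        self.horizon = horizon

    def get_prob_of_mdp(self, mdp):
        return self.mdp_prob_dict.get(mdp, 0.0)

    def get_all_mdps(self, prob_threshold=0):
        return [m for m, p in self.mdp_prob_dict.items() if p > prob_threshold]

    def remove_mdp(self, mdp):
        # Drop @mdp and renormalize the remaining probabilities.
        del self.mdp_prob_dict[mdp]
        total = sum(self.mdp_prob_dict.values())
        self.mdp_prob_dict = {m: p / total for m, p in self.mdp_prob_dict.items()}

    def sample(self, k=1):
        # Weighted sampling without replacement, per the documented contract.
        remaining = dict(self.mdp_prob_dict)
        picks = []
        for _ in range(k):
            mdps, probs = zip(*remaining.items())
            choice = random.choices(mdps, weights=probs, k=1)[0]
            picks.append(choice)
            del remaining[choice]
        return picks


# Strings stand in for MDP instances here.
dist = MiniMDPDistribution({"mdp_a": 0.5, "mdp_b": 0.3, "mdp_c": 0.2})
dist.remove_mdp("mdp_c")
# After removal, mdp_a and mdp_b renormalize to 0.625 and 0.375.
```

This mirrors why get_gamma() and get_init_state() carry caveats: the MDPs in the dictionary are independent objects and need not agree on gamma or initial state.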

simple_rl.mdp.StateClass module

class simple_rl.mdp.StateClass.State(data=[], is_terminal=False)[source]

Bases: object

Abstract State class.

features()[source]
Summary:
Used by function approximators to represent the state. Override this method in State subclasses to have function approximators use a different set of features.
Returns:
(iterable)
get_data()[source]
get_num_feats()[source]
is_terminal()[source]
set_terminal(is_term=True)[source]
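Overriding features() as described above can be sketched with a minimal stand-in State and a hypothetical GridState subclass (both illustrations; GridState and its normalized-coordinate features are assumptions, not part of simple_rl):

```python
class State:
    """Minimal stand-in mirroring the documented abstract State interface."""
    def __init__(self, data=[], is_terminal=False):
        self.data = data
        self._is_terminal = is_terminal

    def features(self):
        # Default representation handed to function approximators: the raw data.
        return self.data

    def get_data(self):
        return self.data

    def get_num_feats(self):
        return len(self.features())

    def is_terminal(self):
        return self._is_terminal

    def set_terminal(self, is_term=True):
        self._is_terminal = is_term


class GridState(State):
    """Hypothetical subclass: overrides features() with normalized coordinates."""
    def __init__(self, x, y, width=10, height=10):
        State.__init__(self, data=[x, y])
        self.width, self.height = width, height

    def features(self):
        # Scaling coordinates into [0, 1] often suits linear approximators.
        x, y = self.data
        return [x / self.width, y / self.height]


s = GridState(5, 2)  # features() -> [0.5, 0.2], get_data() -> [5, 2]
```

Note that get_data() still returns the raw data; only the feature representation seen by a function approximator changes.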

Module contents