simple_rl.utils package

Submodules

simple_rl.utils.additional_datastructures module

additional_datastructures.py: File containing custom utility data structures for use in simple_rl.

class simple_rl.utils.additional_datastructures.SimpleRLStack(_list=None)[source]

Bases: object

Implementation for a basic Stack data structure

is_empty()[source]
peek()[source]
pop()[source]
push(element)[source]
size()[source]
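
The method names above suggest a conventional last-in, first-out interface. The sketch below is a hypothetical stand-in (`StackSketch` is not part of simple_rl) that mirrors those names on top of a Python list. The behavior on an empty stack is an assumption, since the page does not document it.

```python
class StackSketch(object):
    """Hypothetical stand-in mirroring the method names listed above."""

    def __init__(self, _list=None):
        # Copy the initial items so the caller's list is not mutated.
        self._items = list(_list) if _list is not None else []

    def is_empty(self):
        return len(self._items) == 0

    def push(self, element):
        # The end of the list acts as the top of the stack.
        self._items.append(element)

    def pop(self):
        # Assumption: an empty stack raises IndexError (plain list behavior).
        return self._items.pop()

    def peek(self):
        # Look at the top element without removing it.
        return self._items[-1]

    def size(self):
        return len(self._items)


stack = StackSketch([1, 2])
stack.push(3)
top = stack.peek()        # top element, still on the stack
popped = stack.pop()      # removes and returns the top element
remaining = stack.size()  # number of elements left
```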

simple_rl.utils.chart_utils module

chart_utils.py: Charting utilities for RL experiments.

Functions:
load_data: Loads data from CSV files into lists.
average_data: Averages data across instances.
compute_conf_intervals: Confidence interval computation.
compute_single_conf_interval: Helper function for compute_conf_intervals.
_format_title()
plot: Creates (and opens) a single plot using matplotlib.pyplot.
make_plots: Puts everything in order to create the plot.
_get_agent_names: Grabs the agent names from the experiment parameter file, named @Experiment.EXP_PARAM_FILE_NAME.
_get_agent_colors: Determines the relevant colors/markers for the plot.
_is_episodic: Determines if the experiment was episodic from the experiment parameter file, named @Experiment.EXP_PARAM_FILE_NAME.
_is_disc_reward()
parse_args: Parses command line arguments.
main: Loads data from a given path and creates the plot.

Author: David Abel (cs.brown.edu/~dabel)

simple_rl.utils.chart_utils.average_data(data, cumulative=False)[source]
Args:
data (list): a 3D matrix, [algorithm][instance][episode]
cumulative (bool) *opt: if true, computes the average cumulative reward/cost; otherwise computes the plain per-episode average.
Returns:
(list): a 2D matrix, [algorithm][episode], where the instance rewards have been averaged.
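
As an illustration of the shapes described above, here is a minimal sketch of averaging across instances. `average_data_sketch` is a hypothetical name, not the library's implementation. It assumes every instance of an algorithm has the same number of episodes, and that the cumulative variant is a running sum of the per-episode averages.

```python
def average_data_sketch(data, cumulative=False):
    """data[algorithm][instance][episode] -> averaged[algorithm][episode]."""
    averaged = []
    for instances in data:
        num_instances = float(len(instances))
        # zip(*instances) groups each episode's values across all instances.
        per_episode = [sum(vals) / num_instances for vals in zip(*instances)]
        if cumulative:
            # Assumption: "cumulative" means a running sum of the averages.
            running, totals = 0.0, []
            for value in per_episode:
                running += value
                totals.append(running)
            per_episode = totals
        averaged.append(per_episode)
    return averaged


# One algorithm, two instances, three episodes.
data = [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]]
plain = average_data_sketch(data)                   # [[2.0, 3.0, 4.0]]
cumul = average_data_sketch(data, cumulative=True)  # [[2.0, 5.0, 9.0]]
```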
simple_rl.utils.chart_utils.compute_conf_intervals(data, cumulative=False)[source]
Args:
data (list): A 3D matrix, [algorithm][instance][episode]
cumulative (bool) *opt
simple_rl.utils.chart_utils.compute_single_conf_interval(datum)[source]
Args:
datum (list): A vector of data points to compute the confidence interval of.
Returns:
(float): Margin of error.
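
The page does not say which confidence level or distribution the helper uses. The sketch below therefore assumes a 95% interval under a normal approximation (z = 1.96) with the sample standard deviation. `margin_of_error_sketch` and its `z` parameter are hypothetical names chosen for illustration.

```python
import math


def margin_of_error_sketch(datum, z=1.96):
    """Margin of error of the mean of datum (assumed normal approximation)."""
    n = len(datum)
    if n < 2:
        # A single point has no sample variance; report zero width.
        return 0.0
    mean = sum(datum) / float(n)
    # Sample variance with Bessel's correction (divide by n - 1).
    variance = sum((x - mean) ** 2 for x in datum) / (n - 1)
    # Standard error of the mean, scaled by the z-value.
    return z * math.sqrt(variance) / math.sqrt(n)


moe = margin_of_error_sketch([1.0, 2.0, 3.0, 4.0, 5.0])
```

The resulting interval around the mean would be mean ± moe. A t-distribution critical value is the usual substitute when the number of instances is small.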
simple_rl.utils.chart_utils.drange(x_min, x_max, x_increment)[source]
Args:
x_min (float)
x_max (float)
x_increment (float)
Returns:
(generator): Yields floats starting at x_min, stepping by x_increment toward x_max.
Notes:
A range function for generating sequences of floats. Based on code from Stack Overflow user Sam Bruns:
https://stackoverflow.com/questions/16105485/unsupported-operand-types-for-float-and-decimal
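
A float range of this kind is commonly built on `decimal.Decimal` so that repeated additions do not accumulate binary rounding error. The sketch below shows one way to do that. `drange_sketch` is a hypothetical name, and the half-open interval (x_max excluded) is an assumption, not documented behavior.

```python
from decimal import Decimal


def drange_sketch(x_min, x_max, x_increment):
    """Yield floats from x_min up to, but not including, x_max."""
    # Going through str() keeps the decimal values exact (0.1 stays 0.1).
    current = Decimal(str(x_min))
    stop = Decimal(str(x_max))
    step = Decimal(str(x_increment))
    while current < stop:
        yield float(current)
        current += step


values = list(drange_sketch(0.0, 0.5, 0.1))  # [0.0, 0.1, 0.2, 0.3, 0.4]
```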
simple_rl.utils.chart_utils.load_data(experiment_dir, experiment_agents)[source]
Args:
experiment_dir (str): Points to the file containing all the data.
experiment_agents (list): Points to which results files will be plotted.
Returns:
result (list): A 3D matrix containing rewards, where the dimensions are [algorithm][instance][episode].
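
To make the returned shape concrete, here is a sketch of reading one "<agent-name>.csv" file per agent into the [algorithm][instance][episode] layout. The file layout is an assumption (one row per instance, one comma-separated value per episode), and `load_data_sketch` and the agent name "q_learner" are hypothetical.

```python
import csv
import os
import tempfile


def load_data_sketch(experiment_dir, experiment_agents):
    """Return result[algorithm][instance][episode] from per-agent CSV files."""
    result = []
    for agent in experiment_agents:
        path = os.path.join(experiment_dir, agent + ".csv")
        with open(path, "r") as csv_file:
            # Assumption: each non-empty row holds one instance's values.
            instances = [
                [float(cell) for cell in row if cell.strip() != ""]
                for row in csv.reader(csv_file)
                if row
            ]
        result.append(instances)
    return result


# Small demonstration with a temporary directory and a made-up agent.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "q_learner.csv"), "w") as demo_file:
    demo_file.write("1.0,2.0\n3.0,4.0\n")

loaded = load_data_sketch(demo_dir, ["q_learner"])  # [[[1.0, 2.0], [3.0, 4.0]]]
```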
simple_rl.utils.chart_utils.main()[source]
Summary:
For manual plotting.
simple_rl.utils.chart_utils.make_plots(experiment_dir, experiment_agents, plot_file_name='', cumulative=True, use_cost=False, episodic=True, open_plot=True, track_disc_reward=False)[source]
Args:
experiment_dir (str): path to results.
experiment_agents (list): agent names (looks for "<agent-name>.csv").
plot_file_name (str)
cumulative (bool): If true, plots show cumulative reward/cost.
use_cost (bool): If true, plots are in terms of cost. Otherwise, plots are in terms of reward.
episodic (bool): If true, labels the x-axis "Episode Number". Otherwise, "Step Number".
track_disc_reward (bool): If true, plots discounted reward (changes plot title, too).
Summary:
Creates plots for all agents run under the experiment. Stores the plot in results/<experiment_name>/<plot_name>.pdf
simple_rl.utils.chart_utils.parse_args()[source]
Summary:
Parses two arguments, 'dir' (directory pointer) and 'a' (bool to indicate avg. plot).
simple_rl.utils.chart_utils.plot(results, experiment_dir, agents, plot_file_name='', conf_intervals=[], use_cost=False, cumulative=False, episodic=True, open_plot=True, track_disc_reward=False)[source]
Args:
results (list of lists): each element is itself the reward from an episode for an algorithm.
experiment_dir (str): path to results.
agents (list): each element is an agent that was run in the experiment.
plot_file_name (str)
conf_intervals (list of floats) [optional]: confidence intervals to display with the chart.
use_cost (bool) [optional]: If true, plots are in terms of cost. Otherwise, plots are in terms of reward.
cumulative (bool) [optional]: If true, plots are cumulative cost/reward.
episodic (bool): If true, labels the x-axis "Episode Number". Otherwise, "Step Number".
open_plot (bool)
track_disc_reward (bool): If true, plots discounted reward.
Summary:
Makes (and opens) a single reward chart plotting all of the data in @results.

simple_rl.utils.make_mdp module

make_mdp.py

Utility for making MDP instances or distributions.

simple_rl.utils.make_mdp.make_markov_game(markov_game_class='grid_game')[source]
simple_rl.utils.make_mdp.make_mdp(mdp_class='grid', grid_dim=7)[source]
Returns:
(MDP)
simple_rl.utils.make_mdp.make_mdp_distr(mdp_class='grid', grid_dim=9, horizon=0, step_cost=0, gamma=0.99)[source]
Args:
mdp_class (str): one of {"grid", "random"}
horizon (int)
step_cost (float)
gamma (float)
Returns:
(MDPDistribution)

simple_rl.utils.mdp_visualizer module

Module contents