simple_rl.utils package¶

Submodules¶

simple_rl.utils.additional_datastructures module¶

additional_datastructures.py: File containing custom utility data structures for use in simple_rl.

class simple_rl.utils.additional_datastructures.SimpleRLStack(_list=None)[source]¶

Bases: object

Implementation of a basic stack data structure.

is_empty()[source]¶

peek()[source]¶

pop()[source]¶

push(element)[source]¶

size()[source]¶
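The listing above gives only method names, so the following is a minimal sketch of how such a stack might behave, not the library's implementation. `StackSketch` is a hypothetical stand-in. Copying the input list and returning `None` from `pop()`/`peek()` on an empty stack are assumptions made for illustration.

```python
class StackSketch:
    """Hypothetical stand-in that mirrors SimpleRLStack's method names."""

    def __init__(self, _list=None):
        # Assumption: copy the input so the caller's list is never mutated.
        self._items = list(_list) if _list is not None else []

    def is_empty(self):
        return len(self._items) == 0

    def push(self, element):
        # The end of the list acts as the top of the stack.
        self._items.append(element)

    def pop(self):
        # Assumption: return None on an empty stack instead of raising.
        return self._items.pop() if self._items else None

    def peek(self):
        # Look at the top element without removing it.
        return self._items[-1] if self._items else None

    def size(self):
        return len(self._items)


stack = StackSketch(["a", "b"])
stack.push("c")
top = stack.peek()       # "c" stays on the stack
removed = stack.pop()    # "c" is removed
remaining = stack.size()
```

A plain Python list already offers `append` and `pop`; a wrapper like this mainly gives the operations stack-specific names.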

simple_rl.utils.chart_utils module¶

chart_utils.py: Charting utilities for RL experiments.

Functions:

    load_data: Loads data from csv files into lists.
    average_data: Averages data across instances.
    compute_conf_intervals: Confidence interval computation.
    compute_single_conf_interval: Helper function for above.
    _format_title()
    plot: Creates (and opens) a single plot using matplotlib.pyplot.
    make_plots: Puts everything in order to create the plot.
    _get_agent_names: Grabs the agent names from the experiment parameter file, named @Experiment.EXP_PARAM_FILE_NAME.
    _get_agent_colors: Determines the relevant colors/markers for the plot.
    _is_episodic: Determines if the experiment was episodic from the experiment parameter file, named @Experiment.EXP_PARAM_FILE_NAME.
    _is_disc_reward()
    parse_args: Parses command line arguments.
    main: Loads data from a given path and creates the plot.

Author: David Abel (cs.brown.edu/~dabel)

simple_rl.utils.chart_utils.average_data(data, cumulative=False)[source]¶

Args:

    data (list): a 3D matrix, [algorithm][instance][episode].
    cumulative (bool) *opt: determines whether to compute the average cumulative reward/cost or the plain (non-cumulative) average.

Returns:

    (list): a 2D matrix, [algorithm][episode], where the instance rewards have been averaged.
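To make the shape change concrete, here is a minimal pure-Python sketch of averaging a [algorithm][instance][episode] matrix down to [algorithm][episode]. `average_data_sketch` is a hypothetical name, and taking the cumulative sum after averaging is an assumption; the library's own function may differ in detail.

```python
from itertools import accumulate


def average_data_sketch(data, cumulative=False):
    """Average over instances: [algorithm][instance][episode] -> [algorithm][episode]."""
    averaged = []
    for instances in data:  # one entry per algorithm
        num_instances = len(instances)
        # zip(*instances) groups the values of each episode across all instances.
        # Assumption: every instance has the same number of episodes.
        per_episode = [sum(values) / num_instances for values in zip(*instances)]
        if cumulative:
            # Running total of the averaged rewards.
            per_episode = list(accumulate(per_episode))
        averaged.append(per_episode)
    return averaged


# One algorithm, two instances, three episodes each.
data = [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]]
plain = average_data_sketch(data)                     # [[2.0, 3.0, 4.0]]
running = average_data_sketch(data, cumulative=True)  # [[2.0, 5.0, 9.0]]
```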

simple_rl.utils.chart_utils.compute_conf_intervals(data, cumulative=False)[source]¶

Args:

    data (list): A 3D matrix, [algorithm][instance][episode].
    cumulative (bool) *opt

simple_rl.utils.chart_utils.compute_single_conf_interval(datum)[source]¶

Args:

    datum (list): A vector of data points to compute the confidence interval of.

Returns:

    (float): Margin of error.
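The docstring does not say which confidence level or formula is used. As an illustration only, the sketch below computes a margin of error under the common normal approximation, z * s / sqrt(n), with z = 1.96 for roughly 95% confidence. The name `margin_of_error_sketch`, the `z` parameter, and the choice of the sample standard deviation are all assumptions.

```python
import math
import statistics


def margin_of_error_sketch(datum, z=1.96):
    """Margin of error of the mean under a normal approximation (assumed formula)."""
    n = len(datum)
    if n < 2:
        # A sample standard deviation needs at least two points.
        return 0.0
    std_dev = statistics.stdev(datum)  # sample standard deviation (n - 1 denominator)
    return z * std_dev / math.sqrt(n)


rewards = [1.0, 2.0, 3.0, 4.0, 5.0]
margin = margin_of_error_sketch(rewards)
mean = statistics.mean(rewards)
interval = (mean - margin, mean + margin)  # interval centered on the sample mean
```

With few instances, a Student's t critical value would give a wider and more appropriate interval than the fixed 1.96 used here.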

simple_rl.utils.chart_utils.drange(x_min, x_max, x_increment)[source]¶

Args:

    x_min (float)
    x_max (float)
    x_increment (float)

Returns:

    (generator): Yields the floats in the range one at a time; wrap it in list() to get a list.

Notes:

    A range function for generating lists of floats. Based on code from Stack Overflow user Sam Bruns: https://stackoverflow.com/questions/16105485/unsupported-operand-types-for-float-and-decimal
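A float range of this kind is often built on `decimal.Decimal` so that repeated additions do not accumulate binary rounding error. The sketch below shows one way to do that. `drange_sketch` is a hypothetical name, and the exclusive upper bound and the `str()` conversion are assumptions rather than the library's documented behavior.

```python
from decimal import Decimal


def drange_sketch(x_min, x_max, x_increment):
    """Yield floats from x_min up to (but excluding) x_max in steps of x_increment."""
    # Going through str() keeps binary floating-point noise out of the Decimal values.
    current = Decimal(str(x_min))
    stop = Decimal(str(x_max))
    step = Decimal(str(x_increment))
    if step <= 0:
        # Guard against an infinite loop; raised when iteration starts.
        raise ValueError("x_increment must be positive")
    while current < stop:
        yield float(current)
        current += step


values = list(drange_sketch(0.0, 0.5, 0.1))  # [0.0, 0.1, 0.2, 0.3, 0.4]
```

Adding 0.1 three times with plain floats gives 0.30000000000000004, which is the kind of drift the Decimal arithmetic avoids.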

simple_rl.utils.chart_utils.load_data(experiment_dir, experiment_agents)[source]¶

Args:

    experiment_dir (str): Points to the file containing all the data.
    experiment_agents (list): Points to which results files will be plotted.

Returns:

    result (list): A 3D matrix containing rewards, where the dimensions are [algorithm][instance][episode].
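As a rough illustration of the [algorithm][instance][episode] layout, the sketch below reads one "<agent-name>.csv" file per agent. The file layout (one row per instance, one numeric column per episode) is an assumption, and `load_data_sketch` and the agent name `q_learning` are hypothetical.

```python
import csv
import os
import tempfile


def load_data_sketch(experiment_dir, experiment_agents):
    """Read <agent-name>.csv files into a [algorithm][instance][episode] list."""
    result = []
    for agent in experiment_agents:
        path = os.path.join(experiment_dir, agent + ".csv")
        with open(path, newline="") as csv_file:
            # Assumed layout: each row is one instance, each cell one episode's reward.
            instances = [
                [float(cell) for cell in row if cell.strip()]
                for row in csv.reader(csv_file)
                if row
            ]
        result.append(instances)
    return result


# Usage: write a small results file to a temporary directory, then load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "q_learning.csv"), "w", newline="") as out_file:
        csv.writer(out_file).writerows([[1.0, 2.0], [3.0, 4.0]])
    loaded = load_data_sketch(tmp_dir, ["q_learning"])
```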

simple_rl.utils.chart_utils.main()[source]¶

Summary:

    For manual plotting.

simple_rl.utils.chart_utils.make_plots(experiment_dir, experiment_agents, plot_file_name='', cumulative=True, use_cost=False, episodic=True, open_plot=True, track_disc_reward=False)[source]¶

Args:

    experiment_dir (str): path to results.
    experiment_agents (list): agent names (looks for "<agent-name>.csv").
    plot_file_name (str)
    cumulative (bool): If true, plots show cumulative trr
    use_cost (bool): If true, plots are in terms of cost. Otherwise, plots are in terms of reward.
    episodic (bool): If true, labels the x-axis "Episode Number". Otherwise, "Step Number".
    track_disc_reward (bool): If true, plots discounted reward (changes plot title, too).

Summary:

    Creates plots for all agents run under the experiment. Stores the plot in results/<experiment_name>/<plot_name>.pdf

simple_rl.utils.chart_utils.parse_args()[source]¶

Summary:

    Parses two arguments, 'dir' (directory pointer) and 'a' (bool to indicate avg. plot).

simple_rl.utils.chart_utils.plot(results, experiment_dir, agents, plot_file_name='', conf_intervals=[], use_cost=False, cumulative=False, episodic=True, open_plot=True, track_disc_reward=False)[source]¶

Args:

    results (list of lists): each element is itself the reward from an episode for an algorithm.
    experiment_dir (str): path to results.
    agents (list): each element is an agent that was run in the experiment.
    plot_file_name (str)
    conf_intervals (list of floats) [optional]: confidence intervals to display with the chart.
    use_cost (bool) [optional]: If true, plots are in terms of cost. Otherwise, plots are in terms of reward.
    cumulative (bool) [optional]: If true, plots are cumulative cost/reward.
    episodic (bool): If true, labels the x-axis "Episode Number". Otherwise, "Step Number".
    open_plot (bool)
    track_disc_reward (bool): If true, plots discounted reward.

Summary:

    Makes (and opens) a single reward chart plotting all of the data in @results.

simple_rl.utils.make_mdp module¶

make_mdp.py

Utility for making MDP instances or distributions.

simple_rl.utils.make_mdp.make_markov_game(markov_game_class='grid_game')[source]¶

simple_rl.utils.make_mdp.make_mdp(mdp_class='grid', grid_dim=7)[source]¶

Returns:

    (MDP)

simple_rl.utils.make_mdp.make_mdp_distr(mdp_class='grid', grid_dim=9, horizon=0, step_cost=0, gamma=0.99)[source]¶

Args:

    mdp_class (str): one of {"grid", "random"}
    horizon (int)
    step_cost (float)
    gamma (float)

Returns:

    (MDPDistribution)

simple_rl.utils.mdp_visualizer module¶

Module contents¶
