plot_tools

A set of convenience functions used for producing plots in dabest.

sankeydiag

 sankeydiag (data:pandas.core.frame.DataFrame, xvar:str, yvar:str,
             left_idx:str, right_idx:str, leftLabels:list=None,
             rightLabels:list=None, palette:Union[str,dict]=None, ax=None,
             one_sankey:bool=False, width:float=0.4,
             rightColor:bool=False, align:str='center', alpha:float=0.65,
             **kwargs)

Read in a melted pd.DataFrame and draw multiple Sankey diagrams on a single axes, using the values in column yvar grouped by the values in column xvar. The values of xvar given in left_idx are drawn on the left side of each Sankey diagram, and the values given in right_idx are drawn on the right side.

| Parameter | Type | Default | Details |
|---|---|---|---|
| data | pd.DataFrame | | |
| xvar | str | | x column to be plotted. |
| yvar | str | | y column to be plotted. |
| left_idx | str | | The value in column xvar that is on the left side of each Sankey diagram. |
| right_idx | str | | The value in column xvar that is on the right side of each Sankey diagram. If len(left_idx) == 1, left_idx is broadcast to the same length as right_idx; otherwise it should have the same length as right_idx. |
| leftLabels | list | None | Labels for the left side of the diagram. The diagram will be sorted by these labels. |
| rightLabels | list | None | Labels for the right side of the diagram. The diagram will be sorted by these labels. |
| palette | Union[str, dict] | None | |
| ax | NoneType | None | matplotlib axes to be drawn on. |
| one_sankey | bool | False | Determined by the driver function in plotter.py. If True, draw the Sankey diagram across the whole raw data axes. |
| width | float | 0.4 | The width of each Sankey diagram. |
| rightColor | bool | False | If True, each strip of the diagram will be colored according to the corresponding left labels. |
| align | str | center | The alignment of each Sankey diagram, can be 'center' or 'left'. |
| alpha | float | 0.65 | The transparency of each strip. |
| kwargs | | | |
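The pairing rule described for left_idx and right_idx can be sketched in plain Python. This is only an illustration of the stated rule, not dabest's code: `broadcast_pairs` is a hypothetical helper, and raising `ValueError` on a length mismatch is an assumption.

```python
def broadcast_pairs(left_idx, right_idx):
    """Pair left and right group names (hypothetical helper, not part of dabest)."""
    # Treat a bare string as a one-element list.
    left = [left_idx] if isinstance(left_idx, str) else list(left_idx)
    right = [right_idx] if isinstance(right_idx, str) else list(right_idx)
    # A single left value is repeated once per right value.
    if len(left) == 1:
        left = left * len(right)
    # Assumption: a mismatch is reported as a ValueError.
    if len(left) != len(right):
        raise ValueError("left_idx must have length 1 or the same length as right_idx")
    return list(zip(left, right))


pairs = broadcast_pairs(["Control"], ["Test 1", "Test 2"])
print(pairs)  # [('Control', 'Test 1'), ('Control', 'Test 2')]
```

Each resulting pair would correspond to one Sankey diagram on the shared axes.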

single_sankey

 single_sankey (left:np.array, right:np.array, xpos:float=0,
                leftWeight:np.array=None, rightWeight:np.array=None,
                colorDict:dict=None, leftLabels:list=None,
                rightLabels:list=None, ax=None, width=0.5, alpha=0.65,
                bar_width=0.2, rightColor:bool=False, align:bool='center')

Make a single Sankey diagram showing the proportion flow from left to right. Original code from: https://github.com/anazalea/pySankey. Changes were added to normalize each diagram's height to 1.

| Parameter | Type | Default | Details |
|---|---|---|---|
| left | np.array | | Data on the left of the diagram. |
| right | np.array | | Data on the right of the diagram; len(left) == len(right). |
| xpos | float | 0 | The starting point on the x-axis. |
| leftWeight | np.array | None | Weights for the left labels. If None, all weights are 1. |
| rightWeight | np.array | None | Weights for the right labels. If None, the corresponding leftWeight values are used. |
| colorDict | dict | None | Input format: {'label': 'color'}. |
| leftLabels | list | None | Labels for the left side of the diagram. The diagram will be sorted by these labels. |
| rightLabels | list | None | Labels for the right side of the diagram. The diagram will be sorted by these labels. |
| ax | NoneType | None | matplotlib axes to be drawn on. |
| width | float | 0.5 | |
| alpha | float | 0.65 | |
| bar_width | float | 0.2 | |
| rightColor | bool | False | If True, each strip of the diagram will be colored according to the corresponding left labels. |
| align | bool | center | If 'center', the diagram will be centered on each xtick; if 'edge', the diagram will be aligned with the left edge of each xtick. |
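To make the idea of a diagram normalized to height 1 concrete, here is a minimal sketch that turns paired left/right observations into flow proportions that sum to 1. It is not the library's implementation: `normalized_flows` is a hypothetical helper, and the weighting shown (one weight per observation, defaulting to 1) is an assumption based on the leftWeight description.

```python
from collections import defaultdict


def normalized_flows(left, right, left_weight=None):
    """Return {(left_label, right_label): proportion} summing to 1 (illustrative only)."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    # Assumption: with no weights given, every observation counts as 1.
    if left_weight is None:
        left_weight = [1] * len(left)
    flows = defaultdict(float)
    for l, r, w in zip(left, right, left_weight):
        flows[(l, r)] += w
    total = sum(flows.values())
    # Dividing by the total makes the strips stack to a height of 1.
    return {pair: w / total for pair, w in flows.items()}


flows = normalized_flows([1, 1, 0, 0], [1, 0, 0, 0])
print(flows)  # {(1, 1): 0.25, (1, 0): 0.25, (0, 0): 0.5}
```

Each key is one strip of the diagram, and its value is the share of the total height that the strip would occupy.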

normalize_dict

 normalize_dict (nested_dict, target)

check_data_matches_labels

 check_data_matches_labels (labels, data, side:str)

Check that the labels and data match in the Sankey diagram, and enforce that labels and data are lists. Raises an exception if the labels and data do not match.

| Parameter | Type | Details |
|---|---|---|
| labels | | list of input labels |
| data | | Pandas Series of input data |
| side | str | 'left' or 'right' on the Sankey diagram |
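A check of this kind might look like the sketch below. The function name `labels_match_data`, the set-based comparison, and the use of `ValueError` are all assumptions made for illustration; the documentation only states that labels and data are coerced to lists and that an exception is raised on a mismatch.

```python
def labels_match_data(labels, data, side):
    """Coerce both inputs to lists and verify they hold the same values (illustrative)."""
    labels, data = list(labels), list(data)
    # Values present in the data but absent from the labels, and vice versa.
    missing = set(data) - set(labels)
    unused = set(labels) - set(data)
    if missing or unused:
        raise ValueError(
            f"{side} labels and data do not match: "
            f"missing from labels {sorted(missing, key=str)}, "
            f"not present in data {sorted(unused, key=str)}"
        )
    return labels, data


labels, data = labels_match_data((0, 1), [1, 0, 0, 1], side="left")
print(labels)  # [0, 1]
```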

error_bar

 error_bar (data:pandas.core.frame.DataFrame, x:str, y:str,
            type:str='mean_sd', offset:float=0.2, ax=None,
            line_color='black', gap_width_percent=1, pos:list=[0, 1],
            method:str='gapped_lines', **kwargs:dict)

Plot the standard deviations as vertical error bars. The mean is shown as a gap, defined by negative space, in each bar.

This function combines the functionality of gapped_lines(), proportional_error_bar(), and sankey_error_bar().

| Parameter | Type | Default | Details |
|---|---|---|---|
| data | pd.DataFrame | | This DataFrame should be in 'long' format. |
| x | str | | x column to be plotted. |
| y | str | | y column to be plotted. |
| type | str | mean_sd | Choose from ['mean_sd', 'median_quartiles']. Plots the summary statistics for each group. If 'mean_sd', then the mean and standard deviation of each group are plotted as a gapped line. If 'median_quartiles', then the median and the 25th and 75th percentiles of each group are plotted instead. |
| offset | float | 0.2 | Give a single float (that will be used as the x-offset of all gapped lines), or an iterable containing the list of x-offsets. |
| ax | NoneType | None | If a matplotlib Axes object is specified, the gapped lines will be plotted in order on this axes. If None, the current axes (plt.gca()) is used. |
| line_color | str | black | |
| gap_width_percent | int | 1 | |
| pos | list | [0, 1] | The positions of the error bars for the sankey_error_bar method. |
| method | str | gapped_lines | The method to use for drawing the error bars. Options are: 'gapped_lines', 'proportional_error_bar', and 'sankey_error_bar'. |
| kwargs | dict | | |
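The two summary types can be illustrated with the standard library alone. This is a sketch under stated assumptions, not dabest's code: `summary_span` and `gapped_segments` are hypothetical helpers, the sample standard deviation is used, the quartiles use the "inclusive" interpolation of `statistics.quantiles`, and the gap is given as an absolute y-distance rather than derived from gap_width_percent.

```python
import statistics


def summary_span(values, kind="mean_sd"):
    """Return (low, centre, high) for one group (illustrative only)."""
    if kind == "mean_sd":
        centre = statistics.mean(values)
        sd = statistics.stdev(values)  # assumption: sample standard deviation
        return centre - sd, centre, centre + sd
    if kind == "median_quartiles":
        # assumption: "inclusive" quartile interpolation
        q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return q1, median, q3
    raise ValueError("kind must be 'mean_sd' or 'median_quartiles'")


def gapped_segments(low, centre, high, gap):
    """Split the bar into two segments, leaving a gap around the centre value."""
    half = gap / 2
    return (low, centre - half), (centre + half, high)


low, centre, high = summary_span([1, 2, 3, 4, 5], kind="median_quartiles")
print(gapped_segments(low, centre, high, gap=0.5))  # ((2.0, 2.75), (3.25, 4.0))
```

The two returned segments are the vertical lines that would be drawn for a group; the undrawn interval between them marks the mean or median.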

get_swarm_spans

 get_swarm_spans (coll)

Given a matplotlib Collection, obtain the x and y spans for the collection. Returns None if this fails.


halfviolin

 halfviolin (v, half='right', fill_color='k', alpha=1, line_color='k',
             line_width=0)