plot_tools
sankeydiag
sankeydiag (data:pandas.core.frame.DataFrame, xvar:str, yvar:str, left_idx:str, right_idx:str, leftLabels:list=None, rightLabels:list=None, palette:Union[str,dict]=None, ax=None, one_sankey:bool=False, width:float=0.4, rightColor:bool=False, align:str='center', alpha:float=0.65, **kwargs)
Read in a melted pd.DataFrame and draw multiple sankey diagrams on a single axes, using the values in column yvar grouped according to the values in column xvar. The left_idx value in column xvar is placed on the left side of each sankey diagram, and the right_idx value in column xvar is placed on the right side.
| Parameter | Type | Default | Details |
|---|---|---|---|
| data | pd.DataFrame | | |
| xvar | str | | x column to be plotted. |
| yvar | str | | y column to be plotted. |
| left_idx | str | | The value in column xvar that is on the left side of each sankey diagram. |
| right_idx | str | | The value in column xvar that is on the right side of each sankey diagram. If len(left_idx) == 1, left_idx is broadcast to the same length as right_idx; otherwise it must have the same length as right_idx. |
| leftLabels | list | None | Labels for the left side of the diagram. The diagram will be sorted by these labels. |
| rightLabels | list | None | Labels for the right side of the diagram. The diagram will be sorted by these labels. |
| palette | Union[str, dict] | None | |
| ax | NoneType | None | matplotlib axes to be drawn on. |
| one_sankey | bool | False | Set by the driver function in plotter.py. If True, draw the sankey diagram across the whole raw data axes. |
| width | float | 0.4 | The width of each sankey diagram. |
| rightColor | bool | False | If True, each strip of the diagram will be colored according to the corresponding left labels. |
| align | str | 'center' | The alignment of each sankey diagram, can be 'center' or 'left'. |
| alpha | float | 0.65 | The transparency of each strip. |
| kwargs | | | |
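The selection of left and right groups described above can be sketched in plain Python. This is only an illustration, not dabest's implementation: `split_for_sankey` is a hypothetical helper, the rows are invented sample data, and the sketch assumes `left_idx` and `right_idx` are passed as lists of group names.

```python
def split_for_sankey(rows, xvar, yvar, left_idx, right_idx):
    """Return one (left_values, right_values) pair per diagram (hypothetical helper)."""
    left_idx, right_idx = list(left_idx), list(right_idx)
    # A single left group is reused for every right group.
    if len(left_idx) == 1:
        left_idx = left_idx * len(right_idx)
    if len(left_idx) != len(right_idx):
        raise ValueError("left_idx and right_idx must have the same length")

    def values_for(group):
        # Collect the yvar values of all rows whose xvar equals this group.
        return [row[yvar] for row in rows if row[xvar] == group]

    return [(values_for(l), values_for(r)) for l, r in zip(left_idx, right_idx)]


# Invented long-format data: one row per observation.
rows = [
    {"group": "Control", "outcome": 0},
    {"group": "Control", "outcome": 1},
    {"group": "Test 1", "outcome": 1},
    {"group": "Test 1", "outcome": 1},
    {"group": "Test 2", "outcome": 0},
    {"group": "Test 2", "outcome": 1},
]
pairs = split_for_sankey(rows, "group", "outcome", ["Control"], ["Test 1", "Test 2"])
```

Each element of `pairs` holds the data for one diagram: the values of the left group and the values of the right group.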
single_sankey
single_sankey (left:<built-in function array>, right:<built-in function array>, xpos:float=0, leftWeight:<built-in function array>=None, rightWeight:<built-in function array>=None, colorDict:dict=None, leftLabels:list=None, rightLabels:list=None, ax=None, width=0.5, alpha=0.65, bar_width=0.2, rightColor:bool=False, align:bool='center')
Make a single Sankey diagram showing proportion flow from left to right. Original code from https://github.com/anazalea/pySankey, with changes added to normalize each diagram's height to 1.
| Parameter | Type | Default | Details |
|---|---|---|---|
| left | np.array | | Data on the left of the diagram. |
| right | np.array | | Data on the right of the diagram, len(left) == len(right). |
| xpos | float | 0 | The starting point on the x-axis. |
| leftWeight | np.array | None | Weights for the left labels. If None, all weights are 1. |
| rightWeight | np.array | None | Weights for the right labels. If None, the corresponding leftWeight values are used. |
| colorDict | dict | None | Input format: {'label': 'color'}. |
| leftLabels | list | None | Labels for the left side of the diagram. The diagram will be sorted by these labels. |
| rightLabels | list | None | Labels for the right side of the diagram. The diagram will be sorted by these labels. |
| ax | NoneType | None | matplotlib axes to be drawn on. |
| width | float | 0.5 | |
| alpha | float | 0.65 | |
| bar_width | float | 0.2 | |
| rightColor | bool | False | If True, each strip of the diagram will be colored according to the corresponding left labels. |
| align | str | 'center' | If 'center', the diagram will be centered on each xtick. If 'edge', the diagram will be aligned with the left edge of each xtick. |
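To illustrate what normalizing a diagram's height to 1 means, the sketch below computes the relative height of each left-to-right strip from paired data. It is a simplified stand-in rather than the library's code: `strip_heights` is a hypothetical name, and only the left weights are modelled (defaulting to 1 when omitted, as in the table above).

```python
from collections import Counter


def strip_heights(left, right, left_weight=None):
    """Return {(left_label, right_label): height}, with heights summing to 1."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    if left_weight is None:
        # Default: every observation carries a weight of 1.
        left_weight = [1] * len(left)
    flows = Counter()
    for l, r, w in zip(left, right, left_weight):
        flows[(l, r)] += w
    total = sum(flows.values())
    # Dividing by the total weight normalizes the diagram's height to 1.
    return {pair: weight / total for pair, weight in flows.items()}


heights = strip_heights([0, 0, 1, 1], [0, 1, 1, 1])
```

Here half of the observations flow from 1 to 1, and a quarter each from 0 to 0 and from 0 to 1, so the three strips together fill a diagram of height 1.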
normalize_dict
normalize_dict (nested_dict, target)
check_data_matches_labels
check_data_matches_labels (labels, data, side:str)
Check that the labels and data match in the sankey diagram, and enforce that labels and data are lists. Raises an exception if the labels and data do not match.
| Parameter | Type | Details |
|---|---|---|
| labels | | List of input labels. |
| data | | Pandas Series of input data. |
| side | str | 'left' or 'right' on the sankey diagram. |
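A check of this kind might look like the following sketch. It is an assumption-laden illustration, not the library's implementation: `check_labels_match` is a hypothetical name, "match" is interpreted as the set of labels being equal to the set of distinct data values, and `ValueError` is chosen here because the source only says that an exception is raised.

```python
def check_labels_match(labels, data, side):
    """Return labels and data as lists, or raise if they disagree (hypothetical helper)."""
    labels = list(labels)
    data = list(data)
    # Assumed meaning of "match": same set of distinct values on both sides.
    if set(labels) != set(data):
        raise ValueError(
            f"{side} labels {set(labels)!r} do not match the data values {set(data)!r}"
        )
    return labels, data


labels, data = check_labels_match(("yes", "no"), ["no", "yes", "yes"], side="left")
```

Passing a tuple of labels shows the list enforcement: both return values are plain lists.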
error_bar
error_bar (data:pandas.core.frame.DataFrame, x:str, y:str, type:str='mean_sd', offset:float=0.2, ax=None, line_color='black', gap_width_percent=1, pos:list=[0, 1], method:str='gapped_lines', **kwargs:dict)
Function to plot the standard deviations as vertical errorbars. The mean is a gap defined by negative space.
This function combines the functionality of gapped_lines(), proportional_error_bar(), and sankey_error_bar().
| Parameter | Type | Default | Details |
|---|---|---|---|
| data | pd.DataFrame | | This DataFrame should be in 'long' format. |
| x | str | | x column to be plotted. |
| y | str | | y column to be plotted. |
| type | str | 'mean_sd' | Choose from ['mean_sd', 'median_quartiles']. Plots the summary statistics for each group. If 'mean_sd', the mean and standard deviation of each group are plotted as a gapped line. If 'median_quartiles', the median and the 25th and 75th percentiles of each group are plotted instead. |
| offset | float | 0.2 | Give a single float (that will be used as the x-offset of all gapped lines), or an iterable containing the list of x-offsets. |
| ax | NoneType | None | If a matplotlib Axes object is specified, the gapped lines will be plotted in order on this axes. If None, the current axes (plt.gca()) is used. |
| line_color | str | 'black' | |
| gap_width_percent | int | 1 | |
| pos | list | [0, 1] | The positions of the error bars for the sankey_error_bar method. |
| method | str | 'gapped_lines' | The method to use for drawing the error bars. Options are: 'gapped_lines', 'proportional_error_bar', and 'sankey_error_bar'. |
| kwargs | dict | | |
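The geometry of a gapped line can be sketched without any plotting. The example below is a minimal illustration under stated assumptions, not the library's code: `gapped_segments` is a hypothetical helper, the gap is given directly in data units (how `gap_width_percent` maps to data units is not described here), the sample standard deviation is used, and quartiles are computed with linear interpolation.

```python
import statistics


def gapped_segments(values, kind="mean_sd", gap=0.1):
    """Return the (lower, upper) vertical segments of one gapped line."""
    if kind == "mean_sd":
        centre = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation (assumption)
        low, high = centre - sd, centre + sd
    elif kind == "median_quartiles":
        # 25th, 50th and 75th percentiles with linear interpolation (assumption).
        low, centre, high = statistics.quantiles(values, n=4, method="inclusive")
    else:
        raise ValueError("kind must be 'mean_sd' or 'median_quartiles'")
    half = gap / 2
    # The central statistic is not drawn: it is the empty space between the segments.
    return (low, centre - half), (centre + half, high)


lower, upper = gapped_segments([1, 2, 3, 4, 5])
q_lower, q_upper = gapped_segments([1, 2, 3, 4, 5], kind="median_quartiles")
```

Drawing `lower` and `upper` as two vertical lines at the same x position leaves the mean (or median) visible as the gap between them.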
get_swarm_spans
get_swarm_spans (coll)
Given a matplotlib Collection, obtain the x and y spans for the collection. Returns None if this fails.
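The idea can be sketched on a plain list of (x, y) points, such as the offsets a matplotlib scatter collection exposes through `get_offsets()`. This is a hedged stand-in, not the library's code: `spans_from_offsets` is a hypothetical name, and the return order (x min, x max, y min, y max) is an assumption.

```python
def spans_from_offsets(offsets):
    """Return (x_min, x_max, y_min, y_max), or None if there are no points."""
    try:
        xs, ys = zip(*offsets)
        return min(xs), max(xs), min(ys), max(ys)
    except ValueError:
        # Raised when offsets is empty, so there is nothing to measure.
        return None


spans = spans_from_offsets([(0.9, 1.0), (1.1, 2.5), (1.0, 1.8)])
empty = spans_from_offsets([])
```

Returning None instead of raising lets the caller skip empty collections with a simple check.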
halfviolin
halfviolin (v, half='right', fill_color='k', alpha=1, line_color='k', line_width=0)