plotting (sa.plotting)

This module contains plotting functions for strain analysis, including histograms, PCA projections, MDS and silhouette plots.

sa.plotting.plot_hist_WT_WT_distance(wt_wt, out_dir)

Plot histogram of distances between WT strains and save the histogram to directory :param:`out_dir`.

Parameters: wt_wt (list) – Distances between all pairs of WT strains.
sa.plotting.plot_hist_coll(colls, colls_labels, out_dir)

Plot overlaid histograms of distances between collections :param:`colls` and save them to directory :param:`out_dir`.

Parameters: colls_labels (list) – Names of the collections, in the same order as in :param:`colls`.
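Overlaid histograms are easiest to compare when every collection uses the same bin edges. The following is a minimal sketch of that idea, not the code behind plot_hist_coll; the helper names shared_bin_edges and plot_overlaid_hists, the bins argument and the out_name file path are hypothetical.

```python
import numpy as np


def shared_bin_edges(colls, bins=30):
    """Bin edges spanning the pooled range of all collections."""
    pooled = np.concatenate([np.asarray(c, dtype=float) for c in colls])
    return np.histogram_bin_edges(pooled, bins=bins)


def plot_overlaid_hists(colls, labels, out_name, bins=30):
    """Draw one semi-transparent histogram per collection on shared bins."""
    # Imported here so the binning helper works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    edges = shared_bin_edges(colls, bins)
    fig, ax = plt.subplots()
    for values, label in zip(colls, labels):
        ax.hist(values, bins=edges, alpha=0.5, label=label)
    ax.set_xlabel("distance")
    ax.set_ylabel("count")
    ax.legend()
    fig.savefig(out_name)
    plt.close(fig)
```

With shared edges, bar heights in the same bin are directly comparable across collections.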
sa.plotting.plot_hist_mean_MT_WT__WT_WT_distance(m_mt_wt, m_wt_wt, out_dir)

Plot overlaid histograms of (i) mean distances of WT strains to other WT strains and (ii) mean distances of mutant strains to WT strains. Save the plot to directory :param:`out_dir` as hist_WT-WT_MT-MT_distances.pdf.

Parameters:
  • m_mt_wt (list) – Mean distances of MT strains to WT strains.
  • m_wt_wt (list) – Mean distances between each WT strain and the other WT strains.
sa.plotting.plot_hist_mean_MT_WT_distance(m_mt_wt, out_dir)

Plot histogram of mean distances of mutant strains to WT strains and save the histogram to directory :param:`out_dir`.

Parameters: m_mt_wt (list) – Mean distance of each mutant strain to WT strains, averaged over replicates if any exist.
sa.plotting.plot_hist_signif_MT_WT(mut, plates_mt, dnp_mtc, tot_mean, out_dir)

For each mutant, compute and plot the observed distance and the mean distances to randomly sampled sets of mutant strains. Save a histogram showing the null distribution of distances and the observed distance for each mutant to directory :param:`out_dir`.

The observed distance of a mutant strain to WT strains is compared with its distance to sampled sets of mutant strains:
  1. Construct N (100) random sets, each containing S (100) mutant strains.
  2. Compute the mean distance between the mutant strain and each of the constructed sets.
  3. Plot a histogram of the distances from the previous step together with the observed mean distance.
  4. Repeat for each mutant strain. Distances are averaged over replicates.
Parameters:
  • mut (list) – Observed distances of mutants and some metadata in the format [ (orf, mean_dist, [(pn1, date1, r1, c1, dist1), (pn2, date2, r2, c2, dist2), ...]) ]
  • plates_mt (list) – Complete profiles of mutant strains.
  • dnp_mtc (numpy.array) – Computational profiles of mutant strains.
  • tot_mean (float) – Total mean distance between MT and WT strains.
  • out_dir (str) – Full path to output directory where plot will be saved.
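The sampling procedure in steps 1 to 3 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of plot_hist_signif_MT_WT: it assumes Euclidean distance, sampling without replacement, and a pool array with one mutant profile per row. The names null_mean_distances, plot_null_vs_observed, pool, seed and out_name are hypothetical.

```python
import numpy as np


def null_mean_distances(profile, pool, n_sets=100, set_size=100, seed=0):
    """Mean distance from one mutant profile to each of n_sets random sets.

    Assumptions (not taken from sa.plotting): Euclidean distance and
    sampling without replacement from `pool` (one mutant profile per row).
    """
    rng = np.random.default_rng(seed)
    pool = np.asarray(pool, dtype=float)
    profile = np.asarray(profile, dtype=float)
    # Distance from the mutant to every strain in the pool, computed once.
    dists = np.linalg.norm(pool - profile, axis=1)
    size = min(set_size, len(pool))
    means = np.empty(n_sets)
    for i in range(n_sets):
        # Step 1: draw a random set; step 2: average the distances to it.
        idx = rng.choice(len(pool), size=size, replace=False)
        means[i] = dists[idx].mean()
    return means


def plot_null_vs_observed(means, observed, out_name):
    """Step 3: histogram of the null distribution with the observed value."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(means, bins=20, color="0.7")
    ax.axvline(observed, color="red", label="observed")
    ax.set_xlabel("mean distance")
    ax.set_ylabel("count")
    ax.legend()
    fig.savefig(out_name)
    plt.close(fig)
```

Step 4 would then amount to calling both functions once per mutant strain, after averaging replicate distances.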
sa.plotting.plot_hist_with_norm_fit(X, attr, title, out_name)

Plot a histogram of the input data and overlay the analytic normal PDF. Save the plot to the file :param:`out_name`.

Parameters: title (str) – Plate identifier used as part of title in the plot.
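As a rough illustration of overlaying an analytic normal PDF on a histogram, the sketch below estimates the mean and standard deviation from the data and draws the matching density over a normalized histogram. It is not the code of plot_hist_with_norm_fit; the helper names normal_pdf and plot_hist_with_fit and the bins argument are hypothetical, and the role of the attr parameter is not modelled here.

```python
import numpy as np


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (np.asarray(x, dtype=float) - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def plot_hist_with_fit(values, title, out_name, bins=30):
    """Normalized histogram of `values` with the fitted normal PDF on top."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    values = np.asarray(values, dtype=float)
    mu, sigma = values.mean(), values.std()
    xs = np.linspace(values.min(), values.max(), 200)

    fig, ax = plt.subplots()
    # density=True puts the histogram on the same scale as the PDF.
    ax.hist(values, bins=bins, density=True, alpha=0.6)
    ax.plot(xs, normal_pdf(xs, mu, sigma), color="black")
    ax.set_title(title)
    fig.savefig(out_name)
    plt.close(fig)
    return mu, sigma
```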
sa.plotting.plot_pca_projection(X, plate, title, out_name)

Plot a 3D projection onto the PCA components with the most variance and save it to the file :param:`out_name`.

Parameters:
  • X (numpy.array) – Transformed data set of computational features (dimensionality reduction).
  • plate (list) – Plate data as returned from utilities.read.
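plot_pca_projection expects data that has already been reduced. For orientation, here is one way such a three-component projection could be computed and drawn. It is a sketch that assumes plain PCA via SVD on mean-centred features; the names pca_project and plot_projection are hypothetical and the plate argument is not used.

```python
import numpy as np


def pca_project(X, n_components=3):
    """Project rows of X onto the leading principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each feature
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def plot_projection(Z, title, out_name):
    """3D scatter plot of the first three columns of Z."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.scatter(Z[:, 0], Z[:, 1], Z[:, 2])
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    ax.set_zlabel("PC3")
    ax.set_title(title)
    fig.savefig(out_name)
    plt.close(fig)
```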
sa.plotting.plot_plate_by_mean_well_distance(dist_matrix, title, out_name, plate)

Produce a matrix plot of wells laid out as on the plate. Each entry is the mean distance, taken from :param:`dist_matrix`, between that observation and the other observations on the plate. Save the plot to the file named :param:`out_name`.

White entries in the plot represent missing strains or WT strains removed by outlier detection.

Parameters: title (str) – Plate identifier used as part of title in the plot.

Returns the matrix of mean well distances.
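The mapping from a distance matrix to a plate-shaped grid can be sketched as below. This is an assumption-laden illustration rather than the library's code: positions is a hypothetical list of zero-based (row, column) well coordinates, one per observation in the order of the distance matrix, and n_rows and n_cols are hypothetical plate dimensions.

```python
import numpy as np


def mean_well_distance_matrix(dist_matrix, positions, n_rows, n_cols):
    """Place each observation's mean distance to the others at its well."""
    D = np.asarray(dist_matrix, dtype=float)
    n = D.shape[0]
    # Mean over the other n - 1 observations, excluding the diagonal.
    mean_d = (D.sum(axis=1) - np.diag(D)) / (n - 1)
    grid = np.full((n_rows, n_cols), np.nan)  # NaN marks empty wells
    for (r, c), m in zip(positions, mean_d):
        grid[r, c] = m
    return grid


def plot_plate(grid, title, out_name):
    """Matrix plot of the plate; NaN wells stay unpainted (white)."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    image = ax.imshow(grid, interpolation="nearest")
    fig.colorbar(image, ax=ax, label="mean distance")
    ax.set_title(title)
    fig.savefig(out_name)
    plt.close(fig)
```

Wells without an observation keep their NaN value, which is how missing or removed strains would show up as white cells.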

sa.plotting.plot_silhouette(scores, c_idx, K, out_name)

Save a silhouette plot to the file named :param:`out_name`, showing the distribution of silhouette scores :param:`scores` within each cluster.

Parameters:
  • scores (list) – Silhouette score for each observation.
  • c_idx (list) – Cluster index for each observation.
  • K (int) – The number of clusters.
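A silhouette plot usually groups the scores by cluster, sorts them within each group and stacks the groups vertically. The sketch below shows that layout step and a simple rendering. It assumes cluster indices run from 0 to K - 1, which the source does not state; the names silhouette_layout and plot_silhouette_bars and the gap argument are hypothetical.

```python
import numpy as np


def silhouette_layout(scores, c_idx, K, gap=1):
    """Per cluster: (cluster, y positions, scores sorted in descending order)."""
    scores = np.asarray(scores, dtype=float)
    c_idx = np.asarray(c_idx)
    layout, y0 = [], 0
    for k in range(K):  # assumes cluster indices 0 .. K - 1
        s = np.sort(scores[c_idx == k])[::-1]
        ys = np.arange(y0, y0 + len(s))
        layout.append((k, ys, s))
        y0 += len(s) + gap  # leave empty rows between clusters
    return layout


def plot_silhouette_bars(scores, c_idx, K, out_name):
    """Horizontal bars, one per observation, grouped by cluster."""
    import matplotlib
    matplotlib.use("Agg")  # write to file, no display required
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for k, ys, s in silhouette_layout(scores, c_idx, K):
        ax.barh(ys, s, height=1.0, label="cluster %d" % k)
    ax.axvline(float(np.mean(scores)), color="black", linestyle="--")
    ax.set_xlabel("silhouette score")
    ax.set_yticks([])
    ax.legend()
    fig.savefig(out_name)
    plt.close(fig)
```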
