utilities (sa.utilities)

This script reads CSV files with extracted data.

Assumed file format:
  • sequence number
  • plate
  • date
  • row
  • column
  • ... (computational features – those that can be interpreted as numeric values)
  • ORF
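As a minimal sketch of the layout above, the following reads a small in-memory CSV and separates the five leading non-computational columns and the trailing ORF column from the numeric features. The helper `split_row`, the sample data, and the feature names `area` and `circularity` are assumptions made for illustration and are not part of sa.utilities.

```python
import csv
import io

# Hypothetical sample in the assumed column order; the feature names
# "area" and "circularity" are invented for this illustration.
SAMPLE = """seq,plate,date,row,column,area,circularity,ORF
1,P1,2012-01-01,1,1,120.5,0.91,YOR202W
2,P1,2012-01-01,1,2,98.0,0.85,YAL001C
"""


def split_row(row, skip_first=5, skip_last=1):
    """Split one CSV row into (leading metadata, numeric features, trailing metadata)."""
    end = len(row) - skip_last
    head = row[:skip_first]
    features = [float(value) for value in row[skip_first:end]]
    tail = row[end:]
    return head, features, tail


reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)                      # first line holds the attribute names
rows = [split_row(row) for row in reader]  # one (head, features, tail) tuple per observation
```

The defaults of 5 and 1 mirror the `skip_first` and `skip_last` defaults of `data2np` documented below.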

sa.utilities.combine(meta, dataL)[source]

Combine strain data from different plates into one set and add an attribute recording which plate each observation belongs to.

sa.utilities.data2np(data, skip_first=5, skip_last=1)[source]

Convert data to a numpy array for further analysis, skipping the non-computational features (by default the first five columns and the last column).

sa.utilities.filter_attribute(attrs, data, attr_name='ORF', attr_values=['YOR202W'])[source]

Filter data by selecting only rows whose value in the column specified by :param:`attr_name` matches one of the values in :param:`attr_values`.
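The filtering rule can be sketched in plain Python as follows. This is an illustration under assumptions, not the library's implementation: the function name `filter_rows` is hypothetical, and rows are assumed to be lists aligned with the attribute names.

```python
def filter_rows(attrs, data, attr_name="ORF", attr_values=("YOR202W",)):
    """Keep only the rows whose value in column `attr_name` is in `attr_values`."""
    idx = attrs.index(attr_name)   # position of the column to test
    wanted = set(attr_values)      # set membership is cheap to check
    return [row for row in data if row[idx] in wanted]


# Hypothetical data: three observations on two plates.
attrs = ["plate", "row", "column", "ORF"]
data = [
    ["P1", 1, 1, "YOR202W"],
    ["P1", 1, 2, "YAL001C"],
    ["P2", 1, 1, "YOR202W"],
]
selected = filter_rows(attrs, data)
```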

sa.utilities.plates2dict(plate, dnp)[source]

Return a dictionary built from :param:`plate`, indexed by plate identifier, row number and column number.
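One way to picture such an index is a dictionary keyed by a (plate, row, column) tuple. The sketch below assumes that key shape and a hypothetical helper `index_by_position`; the structure actually returned by `plates2dict` may differ, for example by nesting dictionaries.

```python
def index_by_position(observations):
    """Map (plate, row, column) to the feature vector observed at that position."""
    return {
        (plate, row, column): features
        for plate, row, column, features in observations
    }


# Hypothetical observations: (plate id, row, column, numeric features).
observations = [
    ("P1", 1, 1, [0.2, 1.5]),
    ("P1", 1, 2, [0.4, 1.1]),
    ("P2", 1, 1, [0.3, 0.9]),
]
lookup = index_by_position(observations)
features_p1_r1_c2 = lookup[("P1", 1, 2)]
```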

sa.utilities.pp_plate_info(meta, plate)[source]

sa.utilities.read(dir_path=None, *files)[source]

Read files and return a list of data for each file. The header row is included.

Return a list with entries describing the files. Each entry has the format ((file_name, attr_names), plate_data).
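The nested entry format can be unpacked directly in a loop. The following sketch builds one hand-written entry of that shape rather than calling `read`, so the file name, attribute names and values are all hypothetical.

```python
# One hypothetical entry in the documented shape:
# ((file_name, attr_names), plate_data)
entries = [
    (
        ("plate_01.csv", ["seq", "plate", "date", "row", "column", "area", "ORF"]),
        [["1", "P1", "2012-01-01", "1", "1", "120.5", "YOR202W"]],
    ),
]

summary = []
for (file_name, attr_names), plate_data in entries:
    # Pair each attribute name with the value in the first data row.
    first_row = dict(zip(attr_names, plate_data[0]))
    summary.append((file_name, len(plate_data), first_row["ORF"]))
```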

sa.utilities.read_repl(file_path, keys=['RT', '37'])[source]

Read a CSV file of repeating mutants with attributes [ORF, plate, row, column]. The header row is included.

Parameters:
  • file_path (str) – Full file path to CSV file with information on replicates.
  • keys (list) – Extensions of the TS (temperature-sensitive mutant) plate names. By default, these are RT and 37.

Return a list where each entry is one repeating mutant.

sa.utilities.split_WT_MT(meta, data, wt_mt_name='ORF', wt_name=['YOR202W'])[source]

Split plate data into two groups: (i) wild-type, (ii) mutants. Wild-type strains occupy the entire border of the plate.
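A minimal sketch of such a split, assuming rows are lists aligned with the attribute names and that wild-type rows are recognised by their ORF value. The helper name `split_by_orf` is hypothetical and this is not the library's implementation.

```python
def split_by_orf(attrs, data, wt_mt_name="ORF", wt_name=("YOR202W",)):
    """Return (wild_type_rows, mutant_rows) according to the value in `wt_mt_name`."""
    idx = attrs.index(wt_mt_name)
    wt_ids = set(wt_name)
    wild_type = [row for row in data if row[idx] in wt_ids]
    mutants = [row for row in data if row[idx] not in wt_ids]
    return wild_type, mutants


# Hypothetical plate fragment: two wild-type rows and one mutant.
attrs = ["plate", "row", "column", "ORF"]
data = [
    ["P1", 1, 1, "YOR202W"],
    ["P1", 2, 2, "YAL001C"],
    ["P1", 1, 2, "YOR202W"],
]
wild_type, mutants = split_by_orf(attrs, data)
```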

sa.utilities.std_prep(data_del, data_ts, data_sg, out_dir, wt_attr_name='ORF', wt_name=['YOR202W'])[source]

Standard preprocessing: (1) standardize WT strains in each plate and remove outliers, (2) standardize mutant strains, (3) combine computational and non-computational features from all plates.
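The exact standardization is performed by sa.methods.standardize() and is not described on this page. As one common reading of "standardize", the sketch below scales a single feature to z-scores using the mean and standard deviation of the wild-type values on the same plate. The helper `zscore_against_wt` and the numbers are assumptions for illustration only.

```python
import statistics


def zscore_against_wt(wt_values, values):
    """Scale `values` by the mean and sample standard deviation of `wt_values`."""
    mu = statistics.mean(wt_values)
    sigma = statistics.stdev(wt_values)  # sample standard deviation; needs >= 2 values
    return [(value - mu) / sigma for value in values]


# Hypothetical measurements of one feature on one plate.
wt = [10.0, 12.0, 14.0]   # wild-type strains
mt = [12.0, 16.0]         # mutant strains
scaled = zscore_against_wt(wt, mt)
```

A mutant value equal to the wild-type mean maps to zero, and values further from the wild-type mean map to larger magnitudes.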

Save the preprocessed data to directory :param:`out_dir` in Orange format as preprocessed_del_ts_sg.tab and in CSV format as preprocessed_del_ts_sg.csv.

Parameters:
  • data_del (tuple (meta_data, plates_data)) – Deletion collection plates data as returned from utilities.read.
  • data_ts (tuple (meta_data, plates_data)) – TS collection plates data as returned from utilities.read.
  • data_sg (tuple (meta_data, plates_data)) – SG collection plates data as returned from utilities.read.
  • out_dir (str) – Full path to the directory where the preprocessed data is saved in Orange and CSV formats.
  • wt_attr_name (str) – Identifier of attribute that contains ORFs.
  • wt_name (list) – Names of the wild-type ORFs.

Return preprocessed computational profiles and plates data.

See also

Functions sa.methods.standardize() and sa.methods.detect_outliers().

sa.utilities.to_csv(names, data_org, dnp, out_name, noncomp_first=5, noncomp_last=1)[source]

Save data in CSV format.

sa.utilities.to_orange(names, data_org, dnp, out_name, noncomp_first=5, noncomp_last=1)[source]

Save data in Orange format. Non-computational features are stored as meta attributes.
