Restructure excel_ui.run() to process Excel files lacking Beads or Samples sheets. #339

Closed
wants to merge 3 commits into from
176 changes: 108 additions & 68 deletions FlowCal/excel_ui.py
@@ -20,9 +20,9 @@
- **Time Channel**: Name of the time channel, as specified by the
``$PnN`` keyword in the associated FCS files.

- **Beads**: Describes the calibration beads samples that will be used
to calibrate cell samples in the **Samples** table. The following
information should be available for each beads sample:
- **Beads** (optional): Describes the calibration beads samples that will
be used to calibrate cell samples in the **Samples** table. The
following information should be available for each beads sample:

- **ID**: Short string identifying the beads sample. Will be
referenced by cell samples in the **Samples** table.
@@ -42,8 +42,9 @@
- **Clustering Channels**: The fluorescence channels used to identify
the different bead subpopulations.

- **Samples**: Describes the biological samples to be processed. The
following information should be available for each sample:
- **Samples** (optional): Describes the biological samples to be
processed. The following information should be available for each
sample:

- **ID**: Short string identifying the sample. Will be used as part
of the plot's filenames and in the **Histograms** table in the
@@ -97,6 +98,10 @@
import numpy as np
import pandas as pd
import openpyxl
try:
import xlrd
except ImportError:
pass

import FlowCal.io
import FlowCal.plot
@@ -780,7 +785,9 @@ def process_samples_table(samples_table,
elif units.lower() == 'mef':
units_label = "Molecules of Equivalent Fluorophore, MEF"
# Check if transformation function is available
if mef_transform_fxns[sample_row['Beads ID']] is None:
if mef_transform_fxns is None or \
mef_transform_fxns[sample_row['Beads ID']] \
is None:
raise ExcelUIException("MEF transformation "
"function not available")

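The guard added in this hunk relies on `or` short-circuiting: when no Beads sheet was read, `mef_transform_fxns` is `None` and the dictionary lookup is never attempted. A minimal sketch of that pattern follows. The function name `require_mef_transform` and the local `ExcelUIException` class are hypothetical stand-ins for illustration, not FlowCal's actual code.

```python
class ExcelUIException(Exception):
    """Stand-in for the exception class raised in the diff."""


def require_mef_transform(mef_transform_fxns, beads_id):
    """Return the MEF transform for `beads_id`, or raise if unavailable.

    `mef_transform_fxns` is assumed to be either None (no beads were
    processed) or a dict mapping beads IDs to a callable or None.
    """
    # `or` short-circuits: the lookup is skipped when the dict is None,
    # so a missing Beads sheet yields a clear error, not a TypeError.
    if mef_transform_fxns is None or mef_transform_fxns[beads_id] is None:
        raise ExcelUIException("MEF transformation function not available")
    return mef_transform_fxns[beads_id]
```

As in the diff, a beads ID that is absent from a non-`None` dictionary would still raise `KeyError`; this sketch does not change that behavior.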
@@ -1523,17 +1530,15 @@ def run(input_path=None,

1. If `input_path` is not specified, show a dialog to choose an input
Excel file.
2. Extract data from the Instruments, Beads, and Samples tables.
3. Process all the bead samples specified in the Beads table.
4. Generate statistics for each bead sample.
5. Process all the cell samples in the Samples table.
6. Generate statistics for each sample.
7. If requested, generate a histogram table for each fluorescent
channel specified for each sample.
8. Generate a table with run time, date, FlowCal version, among
others.
9. Save statistics and (if requested) histograms in an output Excel
file.
2. Read the Instruments table from the Instruments sheet.
3. If a Beads sheet is specified, read the Beads table, process beads
samples, and calculate statistics.
4. If a Samples sheet is specified, read the Samples table, process
samples, calculate statistics, and (if requested) generate a
histogram table describing each fluorescence channel of each sample.
5. Generate a table describing the run (e.g. run time, date, and
FlowCal version).
6. Save tables and calculated statistics to an output Excel file.

Parameters
----------
@@ -1551,10 +1556,9 @@ def run(input_path=None,
sample, and each beads sample.
hist_sheet : bool, optional
Whether to generate a sheet in the output Excel file specifying
histogram bin information.
histogram bin information. Ignored if Samples sheet is not specified.

"""

# If input file has not been specified, show open file dialog
if input_path is None:
input_path = show_open_file_dialog(filetypes=[('Excel files',
@@ -1563,72 +1567,108 @@ def run(input_path=None,
if verbose:
print("No input file selected.")
return

# Extract directory, filename, and filename with no extension from path
input_dir, input_filename = os.path.split(input_path)
input_filename_no_ext, __ = os.path.splitext(input_filename)

# Read relevant tables from workbook
# Process tables
if verbose:
print("Reading {}...".format(input_filename))
table_list = [] # List of (str, DataFrame) tuples, one for each
# sheet to be written to the output Excel file.

instruments_table = read_table(input_path,
sheetname='Instruments',
index_col='ID')
beads_table = read_table(input_path,
sheetname='Beads',
index_col='ID')
samples_table = read_table(input_path,
sheetname='Samples',
index_col='ID')
table_list.append(('Instruments', instruments_table))

# Process beads samples
beads_samples, mef_transform_fxns, mef_outputs = process_beads_table(
beads_table,
instruments_table,
base_dir=input_dir,
verbose=verbose,
plot=plot,
plot_dir='plot_beads',
full_output=True)

# Add stats to beads table
if verbose:
print("")
print("Calculating statistics for beads...")
add_beads_stats(beads_table, beads_samples, mef_outputs)
try:
beads_table = read_table(input_path,
sheetname='Beads',
index_col='ID')
except KeyError as e:
if 'Beads' in str(e):
# no Beads tab (openpyxl)
beads_table = None
else:
raise
except Exception as e:
if 'xlrd' in sys.modules and isinstance(e, xlrd.biffh.XLRDError) \
and 'Beads' in str(e):
# no Beads tab (xlrd)
beads_table = None
else:
raise

if beads_table is not None:
beads_samples, mef_transform_fxns, mef_outputs = process_beads_table(
beads_table,
instruments_table,
base_dir=input_dir,
verbose=verbose,
plot=plot,
plot_dir='plot_beads',
full_output=True)

# Add stats to beads table
if verbose:
print("")
print("Calculating statistics for beads...")
add_beads_stats(beads_table, beads_samples, mef_outputs)
table_list.append(('Beads', beads_table))
else:
beads_samples = mef_transform_fxns = mef_outputs = None

# Process samples
samples = process_samples_table(
samples_table,
instruments_table,
mef_transform_fxns=mef_transform_fxns,
beads_table=beads_table,
base_dir=input_dir,
verbose=verbose,
plot=plot,
plot_dir='plot_samples')

# Add stats to samples table
if verbose:
print("")
print("Calculating statistics for all samples...")
add_samples_stats(samples_table, samples)

# Generate histograms
if hist_sheet:
try:
samples_table = read_table(input_path,
sheetname='Samples',
index_col='ID')
except KeyError as e:
if 'Samples' in str(e):
# no Samples tab (openpyxl)
samples_table = None
else:
raise
except Exception as e:
if 'xlrd' in sys.modules and isinstance(e, xlrd.biffh.XLRDError) \
and 'Samples' in str(e):
# no Samples tab (xlrd)
samples_table = None
else:
raise

if samples_table is not None:
samples = process_samples_table(
samples_table,
instruments_table,
mef_transform_fxns=mef_transform_fxns,
beads_table=beads_table,
base_dir=input_dir,
verbose=verbose,
plot=plot,
plot_dir='plot_samples')

# Add stats to samples table
if verbose:
print("Generating histograms table...")
histograms_table = generate_histograms_table(samples_table, samples)
print("")
print("Calculating statistics for all samples...")
add_samples_stats(samples_table, samples)
table_list.append(('Samples', samples_table))

# Generate histograms
if hist_sheet:
if verbose:
print("Generating histograms table...")
histograms_table = generate_histograms_table(samples_table, samples)
table_list.append(('Histograms', histograms_table))
else:
samples = None

# Generate about table
about_table = generate_about_table({'Input file path': input_path})

# Generate list of tables to save
table_list = []
table_list.append(('Instruments', instruments_table))
table_list.append(('Beads', beads_table))
table_list.append(('Samples', samples_table))
if hist_sheet:
table_list.append(('Histograms', histograms_table))
table_list.append(('About Analysis', about_table))

# Write output excel file
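The Beads and Samples blocks above repeat the same "read the sheet, treat a missing sheet as `None`" logic. One way to express that idea once is sketched below. The helper `read_optional_table`, its `missing_excs` parameter, and the toy reader `fake_read_table` are hypothetical names introduced for illustration. The sketch assumes, as the diff does, that the reader's exception message mentions the missing sheet's name.

```python
def read_optional_table(read_fn, path, sheetname, missing_excs=(KeyError,)):
    """Return the table in `sheetname`, or None if the sheet is absent.

    `read_fn` stands in for a reader such as `read_table`; `missing_excs`
    lists the exception types the reader raises for a missing sheet.
    """
    try:
        return read_fn(path, sheetname=sheetname, index_col='ID')
    except missing_excs as e:
        # Only swallow the error if it names the sheet we asked for.
        if sheetname in str(e):
            return None
        raise


def fake_read_table(path, sheetname, index_col):
    """Toy reader mimicking a workbook with only an Instruments sheet."""
    sheets = {'Instruments': {'ID': ['FC001']}}
    if sheetname not in sheets:
        raise KeyError("Worksheet {} does not exist.".format(sheetname))
    return sheets[sheetname]


instruments = read_optional_table(fake_read_table, 'input.xlsx', 'Instruments')
beads = read_optional_table(fake_read_table, 'input.xlsx', 'Beads')

# Build the output list incrementally, adding only tables that exist.
table_list = [('Instruments', instruments)]
if beads is not None:
    table_list.append(('Beads', beads))
```

With a reader backed by xlrd, the exception type used in the diff (`xlrd.biffh.XLRDError`) could be passed through `missing_excs`, which keeps the optional import out of the helper itself.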
6 changes: 3 additions & 3 deletions doc/excel_ui/analysis.rst
@@ -5,7 +5,7 @@ The analysis that FlowCal's Excel UI performs is divided roughly in two phases:

Processing of Calibration Beads
-------------------------------
The following steps are performed for each calibration beads sample specified in the **Beads** sheet of the :doc:`input Excel file<input_format>`:
If a **Beads** sheet is specified, the following steps are performed for each calibration beads sample:

1. :doc:`Density gating</fundamentals/density_gate>` is applied in the forward/side scatter channels. This is an automated procedure that eliminates microbead aggregates and debris.
2. The individual microbead subpopulations are identified using automated clustering.
@@ -19,10 +19,10 @@ For an introductory discussion of flow cytometry calibration, go to the :doc:`fu

Processing of Cell Samples
--------------------------
The following steps are performed for each cell sample specified in the **Samples** sheet of the :doc:`input Excel file<input_format>`:
If a **Samples** sheet is specified, the following steps are performed for each sample:

1. :doc:`Density gating</fundamentals/density_gate>` is applied in the forward/side scatter channels.
2. Fluorescence data for each specified fluorescence channel is transformed to the units specified in the **Units** column of the :doc:`input Excel file<input_format>`.
3. :ref:`Statistics<excel-ui-outputs-excel>` of the specified fluorescence channels are calculated, including mean, standard deviation, and others. A histogram of each fluorescence channel is also generated.

Statistics and histograms are saved to the :ref:`output Excel file<excel-ui-outputs-excel>`.
Statistics and histograms are saved to the :ref:`output Excel file<excel-ui-outputs-excel>`.
2 changes: 1 addition & 1 deletion doc/excel_ui/input_format.rst
@@ -1,7 +1,7 @@
Format of the Input Excel File
==============================

``FlowCal``'s Excel interface requires a properly formatted Excel file that depicts the samples to be analyzed and the data processing parameters. The Excel input file must have at least three sheets, named **Instruments**, **Beads**, and **Samples**. Other sheets may be present, but ``FlowCal`` will ignore them.
``FlowCal``'s Excel interface requires a properly formatted Excel file that depicts the samples to be analyzed and the data processing parameters. The Excel input file must have an **Instruments** sheet and typically also has **Beads** and **Samples** sheets. Other sheets may also be present, but ``FlowCal`` will ignore them.

.. warning:: Sheet and column names are case-sensitive.
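
The sheet requirements described here (one required sheet, two optional ones, case-sensitive names) can be checked before any processing starts. The sketch below is illustrative only: `classify_sheets` and the two constants are hypothetical names, and the list of sheet names is assumed to come from the workbook (for example, openpyxl exposes it as `Workbook.sheetnames`).

```python
REQUIRED_SHEETS = ('Instruments',)
OPTIONAL_SHEETS = ('Beads', 'Samples')


def classify_sheets(sheetnames):
    """Return (missing required sheets, optional sheets that are present)."""
    present = set(sheetnames)  # exact, case-sensitive comparison
    missing_required = [s for s in REQUIRED_SHEETS if s not in present]
    optional_present = [s for s in OPTIONAL_SHEETS if s in present]
    return missing_required, optional_present


# 'samples' does not match 'Samples', so it is treated like any other
# unrecognized sheet and ignored.
missing, optional = classify_sheets(['Instruments', 'samples', 'Notes'])
```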

4 changes: 3 additions & 1 deletion doc/excel_ui/outputs.rst
@@ -1,13 +1,15 @@
Outputs of the Excel UI
=======================

During processing of the calibration beads and cell samples, ``FlowCal`` creates two folders with images and an output Excel file in the same location as the :doc:`input Excel file<input_format>`. Here we describe these. In what follows, <ID> refers to the value specified in the ID column of the input Excel file.
The ``FlowCal`` Excel UI creates analysis plots and an output Excel file. If a Beads sheet is specified in the input Excel file, a ``plot_beads`` folder is created containing relevant plots. Similarly, if a Samples sheet is specified, a ``plot_samples`` folder is created with relevant plots. Beads and Samples sheets are also populated in the output Excel file if specified in the input Excel file.

.. _excel-ui-outputs-plots:

Plots
-----

Note: ``<ID>`` refers to the unique ID of a sample as labeled in the ID column of the input Excel file.

1. The folder ``plot_beads`` contains plots of the individual steps of processing of the calibration particle samples:

a. ``density_hist_<ID>.png``: A forward/side scatter 2D density diagram of the calibration particle sample, and a histogram for each relevant fluorescence channel.