diff --git a/docs/source/creating_an_image_processing_pipeline/index.rst b/docs/source/creating_an_image_processing_pipeline/index.rst index 8f765024e..954128261 100644 --- a/docs/source/creating_an_image_processing_pipeline/index.rst +++ b/docs/source/creating_an_image_processing_pipeline/index.rst @@ -31,8 +31,8 @@ and running it on an experiment. Loading Data ------------ -* :ref:`Formatting your data ` -* :ref:`Using formatted example data ` +* Tutorial: :ref:`Formatting your data ` +* Tutorial: :ref:`Using formatted example data ` .. _section_manipulating_images: @@ -43,8 +43,8 @@ Sometimes it can be useful to subset the images by, for example, excluding out-of-focus images or cropping out edge effects. For sparse data, it can be useful to project the z-volume into a single image, as this produces a much faster processing routine. -* :ref:`Cropping ` -* :ref:`Projecting ` +* Tutorial: :ref:`Cropping ` +* Tutorial: :ref:`Projecting ` .. _section_correcting_images: @@ -56,11 +56,11 @@ handling or microfluidics that are involved in capturing the images. These steps are *independent* of the assay. *Starfish* enables the user to design a pipeline that matches their imaging system. -* :ref:`Illumination Correction ` -* :ref:`Chromatic Aberration ` -* :ref:`Deconvolution ` -* :ref:`Image Registration ` -* :ref:`Image Correction Pipeline ` +* Tutorial: :ref:`Illumination Correction ` +* Tutorial: :ref:`Chromatic Aberration ` +* Tutorial: :ref:`Deconvolution ` +* Tutorial: :ref:`Image Registration ` +* Tutorial: :ref:`Image Correction Pipeline ` .. _section_improving_snr: @@ -72,7 +72,7 @@ some level of autofluorescence which causes cellular compartments to have more background than intracellular regions. This can confound spot finders, which look for local intensity differences. These approaches ameliorate these problems. -* :ref:`Removing Autofluorescence ` +* Tutorial: :ref:`Removing Autofluorescence ` ..
_section_normalizing_intensities: @@ -116,7 +116,8 @@ How to normalize How to normalize depends on your data and a key assumption. There are two approaches for normalizing images in starfish: -:ref:`Normalizing Intensity Distributions` +Normalizing Intensity Distributions +""""""""""""""""""""""""""""""""""" If you know a priori that image volumes acquired for every channel and/or every round should have the same distribution of intensities then the intensity *distributions* of image volumes can be @@ -124,7 +125,10 @@ normalized with :py:class:`.MatchHistograms`. Typically this means the number of spots and amount of background autofluorescence in every image volume is approximately uniform across channels and/or rounds. -:ref:`Normalizing Intensity Values ` +* Tutorial: :ref:`Normalizing Intensity Distributions` + +Normalizing Intensity Values +"""""""""""""""""""""""""""" In most data sets the differences in gene expression lead to too much variation in the number of spots between channels and rounds. Normalizing intensity distributions would incorrectly skew the @@ -132,6 +136,8 @@ intensities. Instead you can use :py:class:`.Clip`, :py:class:`.ClipPercentileToZero`, and :py:class:`.ClipValueToZero` to normalize intensity *values* by clipping extreme values and rescaling. +* Tutorial: :ref:`Normalizing Intensity Values ` + .. _section_finding_and_decoding: Finding and Decoding Spots -------------------------- @@ -142,6 +148,88 @@ Segmenting Cells ---------------- +Unlike single-cell RNA sequencing, image-based transcriptomics methods do not physically separate +cells before acquiring RNA information. Therefore, in order to characterize cells, the RNA must be +assigned to single cells by partitioning the image volume. Accurate unsupervised cell segmentation +is an `open problem `_ for all biomedical imaging +disciplines ranging from digital pathology to neuroscience. + +The challenge of segmenting cells depends on the structural complexity of the sample and quality +of images available.
For example, a sparse cell mono-layer with a strong cytosol stain would be +trivial to segment, but a dense heterogeneous population of cells in 3D tissue with only a DAPI stain +can be impossible to segment perfectly. On the experimental side, selecting good cell stains and +acquiring images with low background will make segmenting a more tractable task. + +There are many approaches for segmenting cells from image-based transcriptomics assays. Below are +a few methods that are implemented or integrated with starfish to output a +:py:class:`.BinaryMaskCollection`, which represents a collection of labeled objects. If you do not +know which segmentation method to use, a safe bet is to start with thresholding and watershed. On +the other hand, if you can afford to manually define ROI masks, there is no better way to +guarantee accurate segmentation. + +.. note:: + While there is no "ground truth" for cell segmentation, the closest approximation is manual + segmentation by an expert in the tissue of interest. + +Thresholding and Watershed +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The traditional method for segmenting cells in fluorescence microscopy images is to threshold the +image into foreground pixels and background pixels and then label connected foreground as +individual cells. Common issues that affect thresholding, such as background noise, can be corrected +by preprocessing images before thresholding and by filtering connected components afterward. There are +`many automated image thresholding algorithms `_ but currently +starfish requires manually selecting a global threshold value in :py:class:`.ThresholdBinarize`. + +When overlapping cells are labeled as one connected component, they are typically segmented by +using a `distance transformation followed by the watershed algorithm `_. Watershed is a classic image processing algorithm for separating objects in images and +can be applied to all types of images.
Pairing it with a distance transform is particularly +useful for segmenting convex shapes like cells. + +A segmentation pipeline that consists of thresholding, connected component analysis, and watershed +is simple and fast to implement, but its accuracy is highly dependent on image quality. +The signal-to-noise ratio of the cell stain must be high enough for minimal errors after +thresholding and binary operations, and the nuclei or cell shapes must be convex to meet the +assumptions of the distance transform or else watershed will over-segment. Starfish includes the basic +functions to build a watershed segmentation pipeline and a predefined :py:class:`.Watershed` +segmentation class that uses the :term:`primary images` as the cell stain. + +* Tutorial: :ref:`Ways to segment by thresholding and watershed` + +Manually Defining Cells +^^^^^^^^^^^^^^^^^^^^^^^ + +The most accurate but time-consuming approach is to manually segment images using a tool such as +`ROI manager `_ in FIJI (ImageJ). It +is a straightforward process that starfish supports by allowing ROI set binaries to be imported as a +:py:class:`.BinaryMaskCollection`. These masks can then be integrated into the pipeline for +visualization and assigning spots to cells. + +* Tutorial: :ref:`Loading ImageJ ROI set` + +Machine-Learning Methods +^^^^^^^^^^^^^^^^^^^^^^^^ + +Besides the two classic cell segmentation approaches mentioned above, there are machine-learning +methods that aim to replicate the accuracy of manual cell segmentation while reducing the labor +required. Machine-learning algorithms for segmentation are continually improving, but there is no +perfect solution for all image types yet. These methods require training data (e.g. stained +images with manually defined labels) to train a model to predict cell or nuclei locations in test +data.
There are `exceptions that don't require training on your specific data `_ but generally training the model is something to consider when evaluating how much time +each segmentation approach will require. + +Starfish currently has built-in functionality to support `ilastik `_, a +segmentation toolkit that leverages machine-learning. Ilastik has a Pixel Classification +workflow that performs semantic segmentation of the image, returning probability maps for each +label such as cells and background. To transform the images of pixel probabilities to binary +masks, you can use the same thresholding and watershed methods in starfish that are used for +segmenting images of stained cells. + +* Tutorial: :ref:`Using ilastik in starfish` + .. _section_assigning_spots: Assigning Spots to Cells @@ -170,4 +258,3 @@ origin. At this point, it's trivial to create a cell x gene matrix. * :ref:`Spot Decoding ` * :ref:`Segmenting Cells ` * :ref:`Assigning Spots to Cells ` - diff --git a/examples/RoiSet.zip b/examples/RoiSet.zip new file mode 100644 index 000000000..5bf468473 Binary files /dev/null and b/examples/RoiSet.zip differ diff --git a/examples/dapi_Probabilities.h5 b/examples/dapi_Probabilities.h5 new file mode 100644 index 000000000..d70709103 Binary files /dev/null and b/examples/dapi_Probabilities.h5 differ diff --git a/examples/ilastik_segmentation.py b/examples/ilastik_segmentation.py new file mode 100644 index 000000000..7dee22e9b --- /dev/null +++ b/examples/ilastik_segmentation.py @@ -0,0 +1,146 @@ +""" +.. _tutorial_ilastik_segmentation: + +Using ilastik in starfish +========================= + +In order to create a cell by gene expression matrix from image-based transcriptomics data, RNA +spots must be assigned to cells. Segmenting a microscopy image into single cells is either +time-consuming when done manually or inaccurate when done with thresholding and watershed. 
The +advent of deep learning models for image segmentation promises a solution that is both fast and +accurate. + +Starfish currently has built-in functionality to support `ilastik `_, +a segmentation toolkit that leverages machine-learning. Ilastik has a `Pixel Classification +workflow `_ +that performs semantic segmentation of the image, returning probability maps for each +label (e.g. cells). To use ilastik in starfish, first install ilastik locally and follow the +Pixel Classification workflow using the GUI to train a classifier, which is saved in an ilastik +project file. + +There are two options for using ilastik in starfish. The first is running the trained pixel +classifier on images outside of starfish to generate a probability map that can then be loaded in +starfish as an :py:class:`.ImageStack`. The second is to run the pixel classifier from within +starfish by calling out to ilastik and passing it the trained classifier. The results will be the +same. + +The probability map is an nd-array with the same size as the input :py:class:`.ImageStack`. The +value of each element in the nd-array equals the probability that the corresponding pixel belongs to +a label. In ilastik, you can train the pixel classifier to predict the probability for multiple +labels such as nuclei, cells, and background. However, starfish will only load the probability +map of the first label, so you should use the first label for cells or nuclei only. + +This tutorial will first describe how you can run a trained ilastik classifier from within a +starfish process. In the second section this tutorial will demonstrate how to load an ilastik +probability map hdf5 file as an :py:class:`.ImageStack`. The last section transforms the +:py:class:`.ImageStack` into a :py:class:`.BinaryMaskCollection` that can be used by +:py:class:`.Label` to assign spots to cells.
+""" + +################################################################################################### +# Calling Out to ilastik Trained Pixel Classifier +# =============================================== +# +# Running a trained ilastik pixel classifier from within starfish allows a pipeline to take +# advantage of ilastik's machine-learning model without having to process images from every field +# of view, switching to ilastik to run the classifier, exporting all the probability maps, and then +# loading the files back into a starfish pipeline. +# +# The requirements for calling out to ilastik from a starfish pipeline are having ilastik +# installed locally and having an ilastik project that contains a trained classifier. The +# classifier can only be run on an :py:class:`.ImageStack` with a single round and channel. +# Obviously, the :py:class:`.ImageStack` should be an image of the same type that the classifier +# was trained on for expected probability results. + +# Load MERFISH data to get dapi ImageStack +import os +import matplotlib +import matplotlib.pyplot as plt + +import starfish.data +from starfish.image import Filter +from starfish.types import Axes, Coordinates, Levels + +matplotlib.rcParams["figure.dpi"] = 150 + +experiment = starfish.data.MERFISH() +fov = experiment["fov_000"] +dapi = fov.get_image("nuclei") + +# Process dapi images by blurring and clipping +def preprocess(dapi): + blur = Filter.GaussianLowPass(sigma=5) + blurred = blur.run(dapi) + clip = Filter.Clip(p_min=1, p_max=95, level_method=Levels.SCALE_BY_CHUNK) + clipped = clip.run(blurred) + return clipped + +dapi = preprocess(dapi) + +# Need ilastik script and ilastik project with trained classifier to instantiate filter +ilastik_exe_path = os.path.join(os.path.dirname(__file__), 'run_ilastik.sh') +ilastik_proj_path = os.path.join(os.path.dirname(__file__), 'merfish_dapi.ilp') +ipp = Filter.IlastikPretrainedProbability(ilastik_executable=ilastik_exe_path,
ilastik_project=ilastik_proj_path) + +# Run IlastikPretrainedProbability filter to get probabilities of 'Label 1' as ImageStack +# probabilities = ipp.run(stack=dapi) + +################################################################################################### +# Loading ilastik Probability Map +# =============================== +# +# If you already have probability maps of your images from ilastik or prefer to use the ilastik +# GUI, it is possible to load the exported hdf5 files into starfish. When using the 'Prediction +# Export' panel in ilastik, be sure to use ``Source: Probabilities`` and ``Format: hdf5``. If you +# edit the Dataset name, then it must also be passed as ``dataset_name`` to +# :py:meth:`.import_ilastik_probabilities`. The example below loads the same probability map that +# would have been generated in the first section of this tutorial. + +# Need to instantiate Filter even though it is not run +ilastik_exe_path = os.path.join(os.path.dirname(__file__), 'run_ilastik.sh') +ilastik_proj_path = os.path.join(os.path.dirname(__file__), 'merfish_dapi.ilp') +ipp = Filter.IlastikPretrainedProbability(ilastik_executable=ilastik_exe_path, ilastik_project=ilastik_proj_path) + +# Load hdf5 file as ImageStack +h5_file_path = os.path.join(os.path.dirname(__file__), 'dapi_Probabilities.h5') +imported_probabilities = ipp.import_ilastik_probabilities(path_to_h5_file=h5_file_path) + +# View probability map next to dapi image +f, (ax1, ax2) = plt.subplots(ncols=2) +ax1.imshow(dapi.xarray.values.squeeze()) +ax1.set_title("Dapi") +ax2.imshow(imported_probabilities.xarray.values.squeeze()) +ax2.set_title("Probabilities") +f.tight_layout() + +################################################################################################### +# Transforming Probability Map to Masks +# ===================================== +# +# In order to use ilastik semantic segmentation probabilities to assign spots to cells in +# starfish, they must be transformed into
binary cell masks. A straightforward approach to +# transforming the :py:class:`.ImageStack` to a :py:class:`.BinaryMaskCollection` is to threshold +# the probabilities, run connected component analysis, and watershed. This process is analogous to +# segmenting a stained cell image. + +# Scale probability map +clip = Filter.Clip(p_min=0, p_max=100, level_method=Levels.SCALE_BY_CHUNK) +clipped_probabilities = clip.run(imported_probabilities) + +# Threshold, connected components, watershed, and filter by area +# (note: this import shadows starfish.image.Filter, which is not needed below) +from starfish.morphology import Binarize, Filter +prob_thresh = 0.05 +min_dist = 100 +min_allowed_size = 5000 +max_allowed_size = 100000 +binarized_probabilities = Binarize.ThresholdBinarize(prob_thresh).run(clipped_probabilities) +labeled_masks = Filter.MinDistanceLabel(min_dist, 1).run(binarized_probabilities) +masks = Filter.AreaFilter(min_area=min_allowed_size, max_area=max_allowed_size).run(labeled_masks) + +# Show probability map along with area-filtered cell masks +f, (ax1, ax2) = plt.subplots(ncols=2) +ax1.imshow(imported_probabilities.xarray.values.squeeze()) +ax1.set_title("Probabilities") +ax2.imshow(masks.to_label_image().xarray.values.squeeze(), cmap=plt.cm.nipy_spectral) +ax2.set_title("Masks") +f.tight_layout() diff --git a/examples/manual_segmentation.py b/examples/manual_segmentation.py new file mode 100644 index 000000000..29eac961e --- /dev/null +++ b/examples/manual_segmentation.py @@ -0,0 +1,72 @@ +""" +.. _tutorial_manual_segmentation: + +Loading ImageJ ROI Sets +======================= + +.. note:: + Starfish only supports importing 2D ROIs. + +In order to create a cell by gene expression matrix from image-based transcriptomics data, RNA +spots must be assigned to cells by segmenting an image. The best quality cell segmentation +annotations are manually drawn by experts. If you have ROI sets exported with `ROI manager +`_ in ImageJ or FIJI, they can be +loaded into starfish as a :py:class:`.BinaryMaskCollection`.
The ROI set for each field of view +must be passed with the corresponding :py:class:`.ImageStack` to :py:meth:`.from_fiji_roi_set` in +order to assign accurate ``pixel ticks`` and ``physical ticks`` to the +:py:class:`.BinaryMaskCollection`. + +This tutorial demonstrates how to load an ROI set that was exported from ROI manager as a .zip +and how to plot the resulting :py:class:`.BinaryMaskCollection`. + +Brief segmentation workflow in ImageJ ROI Manager: + +* Tools > ROI Manager +* Click "polygon selection" (third button from left on GUI) +* Create a polygon, then click the "+" button to finalize it +* Repeat until segmented +* Click "more >>>" +* Click "save" +* Save the ROISet +""" + +import matplotlib +import matplotlib.pyplot as plt +import os + +import starfish.data +from starfish import BinaryMaskCollection +from starfish.image import Filter +from starfish.types import Levels + +matplotlib.rcParams["figure.dpi"] = 150 + +# Load MERFISH data to get dapi ImageStack +experiment = starfish.data.MERFISH() +fov = experiment["fov_000"] +dapi = fov.get_image("nuclei") # nuclei + + +# Preprocess MERFISH dapi imagestack +def preprocess(dapi): + blur = Filter.GaussianLowPass(sigma=5) + blurred = blur.run(dapi) + + clip = Filter.Clip(p_min=1, p_max=95, level_method=Levels.SCALE_BY_CHUNK) + clipped = clip.run(blurred) + return clipped + + +dapi = preprocess(dapi) + +# Import RoiSet.zip as BinaryMaskCollection +roi_path = os.path.join(os.path.dirname(__file__), 'RoiSet.zip') +masks = BinaryMaskCollection.from_fiji_roi_set(path_to_roi_set_zip=roi_path, original_image=dapi) + +# display BinaryMaskCollection next to original dapi image +f, (ax1, ax2) = plt.subplots(ncols=2) +ax1.imshow(dapi.xarray.values.squeeze()) +ax1.set_title("Dapi") +ax2.imshow(masks.to_label_image().xarray.values.squeeze(), cmap=plt.cm.nipy_spectral) +ax2.set_title("Imported ROI") +f.tight_layout() diff --git a/examples/merfish_dapi.ilp b/examples/merfish_dapi.ilp new file mode 100644 index
000000000..1a10c25fb Binary files /dev/null and b/examples/merfish_dapi.ilp differ diff --git a/examples/run_ilastik.sh b/examples/run_ilastik.sh new file mode 100644 index 000000000..94655da53 --- /dev/null +++ b/examples/run_ilastik.sh @@ -0,0 +1 @@ +THIS IS A FAKE SHELL SCRIPT INTENDED TO TRICK :py:class:`.IlastikPretrainedProbability` diff --git a/examples/watershed_segmentation.py b/examples/watershed_segmentation.py new file mode 100644 index 000000000..7acbae5f2 --- /dev/null +++ b/examples/watershed_segmentation.py @@ -0,0 +1,247 @@ +""" +.. _tutorial_watershed_segmentation: + +Watershed Segmentation +====================== + +In order to create a cell by gene expression matrix from image-based transcriptomics data, RNA +spots must be assigned to cells. One approach is to binarize images of stained cellular structures +to define cell boundaries. A problem arises when there are contiguous or partially overlapping +cells, as is often the case in tissue. These clumped cells are mislabeled as one cell by connected +component labeling. + +Watershed segmentation can be used to divide connected objects like clumped cells by finding +watershed lines that separate pixel intensity basins. The classic method for computing pixel +intensity values from a binary image is applying a distance transform, which labels foreground +pixels furthest from the background with the highest values, and then inverting it so that the +centers of objects become the lowest-valued basins. The overall effect after watershed is to segment objects into convex shapes, +which is generally a good approximation for a cell or nucleus. More details about the watershed +algorithm can be found +`here `_. + +Starfish allows users to incorporate watershed segmentation into their pipelines in two ways. The +first is to use :py:class:`.WatershedSegment` to explicitly define a +segmentation pipeline.
The second is to use :py:class:`.Watershed`, a predefined watershed +segmentation pipeline that uses watershed segmented nuclei as seeds to run +watershed segmentation on cell stain images. + +The predefined watershed segmentation pipeline will not work for all data, so this tutorial +will first show you how you can replicate the predefined watershed segmentation pipeline using the +classes and methods provided in :py:mod:`.morphology`. Then this tutorial will cover how to run +the predefined segmentation pipeline. + +Inputs for this tutorial are :py:class:`.ImageStack`\s: + +* registered primary images to mimic a cell stain +* registered nuclei images to seed the watershed segmentation of cells + +Output is a :py:class:`.BinaryMaskCollection`: + +* each binary mask in the collection is a labeled cell +""" + +# First we need to create our inputs by filtering and registering in situ sequencing (ISS) data. + +import matplotlib +import matplotlib.pyplot as plt +import numpy as np +from showit import image +from starfish.image import ApplyTransform, LearnTransform, Filter +from starfish.types import Axes, Levels +from starfish import data, FieldOfView + +matplotlib.rcParams["figure.dpi"] = 150 + +experiment = data.ISS() +fov = experiment.fov() + +imgs = fov.get_image(FieldOfView.PRIMARY_IMAGES) # primary images +dots = fov.get_image("dots") # reference round for image registration +nuclei = fov.get_image("nuclei") # nuclei + +# filter raw data +masking_radius = 15 +filt = Filter.WhiteTophat(masking_radius, is_volume=False) +filt.run(imgs, in_place=True) +filt.run(dots, in_place=True) +filt.run(nuclei, in_place=True) + +# register primary images to reference round +learn_translation = LearnTransform.Translation(reference_stack=dots, axes=Axes.ROUND, upsampling=1000) +transforms_list = learn_translation.run(imgs.reduce({Axes.CH, Axes.ZPLANE}, func="max")) +warp = ApplyTransform.Warp() +warp.run(imgs, transforms_list=transforms_list, in_place=True) +
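The distance-transform-plus-watershed idea described above can be sketched in isolation with scikit-image and scipy (both starfish dependencies). The two overlapping disks below are synthetic stand-ins for clumped nuclei, not part of the ISS data:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

# synthetic binary image: two overlapping disks that connected-component
# labeling would merge into a single object
x, y = np.indices((80, 80))
disks = ((x - 28) ** 2 + (y - 28) ** 2 < 16 ** 2) | ((x - 44) ** 2 + (y - 52) ** 2 < 20 ** 2)

# distance transform: pixels far from the background get high values
distance = ndi.distance_transform_edt(disks)

# peaks of the distance map become the watershed seeds (one per disk)
coords = peak_local_max(distance, min_distance=10, labels=disks.astype(int))
seeds = np.zeros(distance.shape, dtype=bool)
seeds[tuple(coords.T)] = True
markers, _ = ndi.label(seeds)

# flooding the inverted distance map separates the two disks
labels = watershed(-distance, markers, mask=disks)
```

Connected component labeling alone would report one object here; seeding from the distance-map peaks lets watershed recover both disks.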
+################################################################################################### +# Example of Custom Watershed Pipeline +# ==================================== +# +# Starfish allows the user to build a custom watershed segmentation pipeline. This tutorial +# will demonstrate how to construct a pipeline by replicating the algorithm behind starfish's +# :py:class:`.Watershed` segmentation. +# +# In order to use this algorithm, the :term:`primary images` must have higher +# background intensity in intracellular regions than extracellular regions such that the +# intracellular regions can be labeled foreground by thresholding. This is not always the case +# since it is usually beneficial to tune the microscope parameters to minimize background and +# increase the SNR of spots in the acquired primary images. :term:`Auxiliary images` with cell stains would be ideal for this purpose and should be used instead if +# experimentally feasible. +# +# :term:`Auxiliary images ` with nuclear stains (e.g. DAPI) are also required in +# this algorithm to seed watershed segmentation of the cells. While a distance transformation of +# the binarized cell stain images could also be used to seed watershed, nuclear stained images +# are almost always included in an experiment. The advantage of using nuclei is that it will +# result in a more accurate number of cells since it is not as prone to over-segmentation +# artifacts as using the distance transform, and nuclear stains are usually more robust than +# cellular stains. +# +# There are a number of parameters that need to be tuned for optimal segmentation. Generally, +# the same parameters can be used across an experiment unless there is variation in microscope +# settings or tissue characteristics (e.g. autofluorescence).
+ +# import methods for transformations on or to morphological data +from starfish.morphology import Binarize, Filter, Merge, Segment + +# set parameters +dapi_thresh = .18 # global threshold value for nuclei images +stain_thresh = .22 # global threshold value for primary images +min_dist = 57 # minimum distance (pixels) between nuclei distance transformed peaks +min_allowed_size = 10 # minimum size (in pixels) of nuclei +max_allowed_size = 10000 # maximum size (in pixels) of nuclei + +################################################################################################### +# This segmentation is done on 2D images so :py:class:`.reduce` is used to maximum intensity project +# and scale the primary and nuclei image volumes. Displaying the projected and scaled images can +# be useful for choosing thresholds. + +mp = imgs.reduce({Axes.CH, Axes.ZPLANE}, func="max") +stain = mp.reduce( + {Axes.ROUND}, + func="mean", + level_method=Levels.SCALE_BY_IMAGE) + +nuclei_mp_scaled = nuclei.reduce( + {Axes.ROUND, Axes.CH, Axes.ZPLANE}, + func="max", + level_method=Levels.SCALE_BY_IMAGE) + +f = plt.figure(figsize=(12,5)) +ax1 = f.add_subplot(121) +nuclei_numpy = nuclei_mp_scaled._squeezed_numpy(Axes.ROUND, Axes.CH, Axes.ZPLANE) +image(nuclei_numpy, ax=ax1, size=20, bar=True) +plt.title('Nuclei') + +ax2 = f.add_subplot(122) +image( + stain._squeezed_numpy(Axes.ROUND, Axes.CH, Axes.ZPLANE), + ax=ax2, size=20, bar=True) +plt.title('Stain') + +#################################################################################################### +# In order to obtain the seeds to watershed segment cells, the nuclei are labeled first. +# :py:class:`.ThresholdBinarize` thresholds and :py:class:`.MinDistanceLabel` labels the +# binarized nuclei by using a distance transform followed by watershed. :py:class:`.AreaFilter` +# then filters nuclei by area. 
The ``min_dist`` parameter that limits how close two +# peaks in the distance transformed image can be is key to preventing over-segmentation of nuclei +# with non-circular morphology but may also incorrectly merge two nuclei in close proximity. +# +# The binarized and segmented nuclei can be viewed to determine whether parameters need to be +# adjusted. + +binarized_nuclei = Binarize.ThresholdBinarize(dapi_thresh).run(nuclei_mp_scaled) +labeled_masks = Filter.MinDistanceLabel(min_dist, 1).run(binarized_nuclei) +watershed_markers = Filter.AreaFilter(min_area=min_allowed_size, max_area=max_allowed_size).run(labeled_masks) + +plt.subplot(121) +image( + binarized_nuclei.uncropped_mask(0).squeeze(Axes.ZPLANE.value).values, + bar=False, + ax=plt.gca(), +) +plt.title('Nuclei Thresholded') + +plt.subplot(122) +image( + watershed_markers.to_label_image().xarray.squeeze(Axes.ZPLANE.value).values, + size=20, + cmap=plt.cm.nipy_spectral, + ax=plt.gca(), +) +plt.title('Found: {} cells'.format(len(watershed_markers))) + +#################################################################################################### +# The cell stain image (i.e. primary image) is binarized and then the union of the binary cell +# image and binary nuclei image is used to create a mask for watershed. This ensures that the +# nuclei markers that seed watershed segmentation of cells are all within cells and that no area +# outside of cells is labeled. 
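The rationale for the union step can be shown with a toy numpy sketch (the 5x5 arrays are hypothetical, not the tutorial's data): a nucleus seed that extends past the thresholded stain would otherwise sit on background, where watershed could not grow it.

```python
import numpy as np

# toy binary planes standing in for the thresholded stain and the nuclei seeds
stain_fg = np.zeros((5, 5), dtype=bool)
stain_fg[1:4, 1:4] = True    # foreground from thresholding the cell stain
nuclei_fg = np.zeros((5, 5), dtype=bool)
nuclei_fg[0:2, 0:2] = True   # a nucleus that pokes past the stain threshold

# union of the two binary images: the watershed mask contains every nucleus
# pixel, so no seed can fall outside the region watershed is allowed to label
watershed_mask = np.logical_or(stain_fg, nuclei_fg)
```

This mirrors what `Filter.Reduce("logical_or", ...)` computes over the merged mask collection in the tutorial code below it, just on plain arrays.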
+ +thresholded_stain = Binarize.ThresholdBinarize(stain_thresh).run(stain) +markers_and_stain = Merge.SimpleMerge().run([thresholded_stain, watershed_markers]) +watershed_mask = Filter.Reduce( + "logical_or", + lambda shape: np.zeros(shape=shape, dtype=bool) +).run(markers_and_stain) + +image( + watershed_mask.to_label_image().xarray.squeeze(Axes.ZPLANE.value).values, + bar=False, + ax=plt.gca(), +) +plt.title('Watershed Mask') + +#################################################################################################### +# The final step is to run seeded watershed segmentation. The pixel values of the cell stain +# define the topology of the watershed basins. The basins are filled starting from the seeds, +# which are the nuclei markers, and the basins are bounded by the watershed mask. + +segmenter = Segment.WatershedSegment(connectivity=np.ones((1, 3, 3), dtype=bool)) + +# masks is BinaryMaskCollection for downstream steps +masks = segmenter.run( + stain, + watershed_markers, + watershed_mask, +) + +# display result +image( + masks.to_label_image().xarray.squeeze(Axes.ZPLANE.value).values, + size=20, + cmap=plt.cm.nipy_spectral, + ax=plt.gca(), +) +plt.title('Segmented Cells') + +################################################################################################### +# Pre-defined Watershed Segmentation Pipeline +# =========================================== +# +# Running :py:class:`.Watershed` from the :py:mod:`.image` module (not to be confused with +# :py:class:`.WatershedSegment` from the :py:mod:`.morphology` module) is a convenient method to +# apply the same segmentation algorithm that was built in the previous section of this tutorial. +# It sets the ``min_allowed_size`` and ``max_allowed_size`` of the nuclei to 10 +# pixels and 1,000 pixels, respectively, but accepts user-defined values for +# ``nuclei_threshold``, ``input_threshold``, and ``min_distance``. +# +# Here is an example of how to run :py:class:`.Watershed` on the same set of images as the +# previous section.
The intermediate results are saved as attributes of the +# :py:class:`.Watershed` instance and can be displayed to assess performance. + +from starfish.image import Segment + +# set parameters +dapi_thresh = .18 # global threshold value for nuclei images +stain_thresh = .22 # global threshold value for primary images +min_dist = 57 # minimum distance (pixels) between nuclei distance transformed peaks + +seg = Segment.Watershed( + nuclei_threshold=dapi_thresh, + input_threshold=stain_thresh, + min_distance=min_dist +) + +# masks is BinaryMaskCollection for downstream steps +masks = seg.run(imgs, nuclei) + +# display intermediate images and result +seg.show() diff --git a/expression b/expression index 5a0e9f0b7..f71b255fa 100644 Binary files a/expression and b/expression differ diff --git a/starfish/core/image/Filter/ilastik_pre_trained_probability.py b/starfish/core/image/Filter/ilastik_pre_trained_probability.py index 704c4f15a..193cfa19c 100644 --- a/starfish/core/image/Filter/ilastik_pre_trained_probability.py +++ b/starfish/core/image/Filter/ilastik_pre_trained_probability.py @@ -20,8 +20,9 @@ class IlastikPretrainedProbability(FilterAlgorithm): Parameters ---------- ilastik_executable: Union[Path, str] - Path to run_ilastik.sh needed for running ilastik in headless mode. Typically the script is - located: "/Applications/{ILASTIK_VERSION}/Contents/ilastik-release/run_ilastik.sh" + Path to run_ilastik.sh (Linux and Mac) or ilastik.bat (Windows) needed for running + ilastik in headless mode. Typically the script is located: "/Applications/{ + ILASTIK_VERSION}/Contents/ilastik-release/run_ilastik.sh" ilastik_project: Union[Path, str] path to ilastik project .ilp file @@ -31,9 +32,9 @@ class IlastikPretrainedProbability(FilterAlgorithm): def __init__(self, ilastik_executable: Union[Path, str], ilastik_project: Union[Path, str]): ilastik_executable = Path(ilastik_executable) if not ilastik_executable.exists(): - raise EnvironmentError("Can not find run_ilastik.sh. 
Make sure you've provided the " - "correct location. If you need to download ilastik please" - " visit: https://www.ilastik.org/download.html") + raise EnvironmentError("Can not find run_ilastik.sh or ilastik.bat. Make sure you've " + "provided the correct location. If you need to download ilastik " + "please visit: https://www.ilastik.org/download.html") self.ilastik_executable = ilastik_executable self.ilastik_project = ilastik_project @@ -46,7 +47,7 @@ def run( *args, ) -> Optional[ImageStack]: """ - Use a pre trained probability pixel classification model to generate probabilites + Use a pre trained probability pixel classification model to generate probabilities for a dapi image Parameters @@ -94,8 +95,11 @@ def run( return self.import_ilastik_probabilities(output_file) @classmethod - def import_ilastik_probabilities(cls, path_to_h5_file: Union[str, Path] - ) -> ImageStack: + def import_ilastik_probabilities( + cls, + path_to_h5_file: Union[str, Path], + dataset_name: str = "exported_data" + ) -> ImageStack: """ Import cell probabilities provided by ilastik as an ImageStack. 
@@ -103,7 +107,8 @@ def import_ilastik_probabilities(cls, path_to_h5_file: Union[str, Path] ---------- path_to_h5_file : Union[str, Path] Path to the .h5 file outputted by ilastik - + dataset_name : str + Name of dataset in ilastik Export Image Settings Returns ------- ImageStack : @@ -111,7 +116,7 @@ def import_ilastik_probabilities(cls, path_to_h5_file: Union[str, Path] """ h5 = h5py.File(path_to_h5_file) - probability_images = h5["exported_data"][:] + probability_images = h5[dataset_name][:] h5.close() cell_probabilities, _ = probability_images[:, :, 0], probability_images[:, :, 1] label_array = ndi.label(cell_probabilities)[0] diff --git a/starfish/core/morphology/Filter/min_distance_label.py b/starfish/core/morphology/Filter/min_distance_label.py index 1afc0caaf..1320301d9 100644 --- a/starfish/core/morphology/Filter/min_distance_label.py +++ b/starfish/core/morphology/Filter/min_distance_label.py @@ -21,7 +21,7 @@ class MinDistanceLabel(FilterAlgorithm): minimum_distance_xy : int The minimum distance between the peaks along the x or y axis. (default: 1) minimum_distance_z : int - The minimum distance between the peaks along the x or y axis. (default: 1) + The minimum distance between the peaks along the z axis. (default: 1) exclude_border : bool Exclude the borders for consideration for peaks. (default: False) """ diff --git a/starfish/core/morphology/binary_mask/binary_mask.py b/starfish/core/morphology/binary_mask/binary_mask.py index 8514d815c..5970799ea 100644 --- a/starfish/core/morphology/binary_mask/binary_mask.py +++ b/starfish/core/morphology/binary_mask/binary_mask.py @@ -281,7 +281,7 @@ def from_fiji_roi_set( path_to_roi_set_zip : Union[str, Path] Path to an external fiji roi file original_image : ImageStack - Dapi image used in fiji segmentation workflow + Image from the same FOV used in the fiji segmentation workflow Returns --------