Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recipe runner CLI entry point #1160

Merged
merged 1 commit into from
Apr 23, 2019
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions starfish/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@
from .experiment.experiment import Experiment, FieldOfView
from .imagestack.imagestack import ImageStack
from .intensity_table.intensity_table import IntensityTable
from .recipe import cli
from .starfish import starfish


Expand Down
28 changes: 28 additions & 0 deletions starfish/recipe/cli.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
from slicedimage.io import resolve_path_or_url

from starfish.config import StarfishConfig
from starfish.starfish import starfish
from starfish.util import click
from .recipe import Recipe


@starfish.command("recipe")
@click.option("--recipe", required=True, type=str, metavar="RECIPE_PATH_OR_URL")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why we have a public run method now that we can run recipes and you've shown this to be faster.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For API users.

@click.option(
"--input", type=str, multiple=True, metavar="INPUT_FILE_PATH_OR_URL",
help="input file paths or urls to map to the recipe input parameters")
@click.option(
"--output", type=str, multiple=True, metavar="OUTPUT_FILE_PATH",
help="output file paths to write recipe outputs to")
@click.pass_context
def run_recipe(ctx, recipe, input, output):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs usage examples. It's also not clear what any of the parameters must contain for the run to be valid. That should probably be documented here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is recipe dependent. Please let me know if #1245 is adequate.

"""Runs a recipe with a given set of inputs and outputs."""
config = StarfishConfig()

backend, relativeurl, _ = resolve_path_or_url(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General style question: We have a bunch of custom classes, and in this case I needed to follow 3 wrapper functions in slicedimage.io to find out that this backend is a subclass of slicedimage.backends.Backend.

Since we have type hints, I would find it helpful to use them in the cases of our custom classes, so that developers have a better sense of what the code is doing.

In this case:

    backend: slicedimage.io.Backend
    relativeurl: str
    backend, relativeurl, _ = resolve_path_or_url(
        recipe, backend_config=config.slicedimage)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can do this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: Backend was never exposed.

I will correct this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recipe, backend_config=config.slicedimage)
with backend.read_contextmanager(relativeurl) as fh:
recipe_str = fh.read()

recipe_obj = Recipe(recipe_str, input, output)
recipe_obj.run_and_save()
Empty file.
45 changes: 45 additions & 0 deletions starfish/test/full_pipelines/recipe/_base_recipe_test.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
import os
import shutil
from pathlib import Path
from typing import Iterable, Optional

from starfish.util import exec


class RecipeTest:
"""This is a base class for testing recipes. Each recipe test should define its recipe file,
the input files, the output files, and a test method that verifies the correctness of the
results.
"""
@property
def recipe(self) -> Path:
raise NotImplementedError()

@property
def input_url_or_paths(self) -> Iterable[str]:
raise NotImplementedError()

@property
def output_paths(self) -> Iterable[Path]:
raise NotImplementedError()

def verify_results(self, tempdir: Path):
raise NotImplementedError()

def test_run_recipe(self):
cmdline = ["starfish", "recipe", "--recipe", self.recipe]
for input_url_or_path in self.input_url_or_paths:
cmdline.extend(["--input", input_url_or_path])
for output_path in self.output_paths:
cmdline.extend([
"--output",
lambda tempdir, *args, **kwargs: os.path.join(tempdir, os.fspath(output_path))])

tempdir: Optional[str] = None
try:
tempdir = exec.stages([cmdline], keep_data=True)

self.verify_results(Path(tempdir))
finally:
if tempdir is not None and os.getenv("TEST_KEEP_DATA") is None:
shutil.rmtree(tempdir)
55 changes: 55 additions & 0 deletions starfish/test/full_pipelines/recipe/iss_recipe.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
primary_image = file_inputs[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. We need to doc how this works and what your mini-language can do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dots = file_inputs[1]
nuclei = file_inputs[2]
codebook = file_inputs[3]

max_proj = compute(
Filter.MaxProject,
primary_image,
dims=['c', 'z'])

transformation_list = compute(
LearnTransform.Translation,
max_proj,
reference_stack=dots, upsampling=1000, axes=Axes.ROUND)

transformed = compute(
ApplyTransform.Warp,
primary_image,
transformation_list)

filtered_primary = compute(
Filter.WhiteTophat,
transformed,
masking_radius=15)

filtered_nuclei = compute(
Filter.WhiteTophat,
nuclei,
masking_radius=15)

filtered_dots = compute(
Filter.WhiteTophat,
dots,
masking_radius=15)

spots = compute(
DetectSpots.BlobDetector,
filtered_primary, filtered_dots, {Axes.ROUND, Axes.CH},
min_sigma=4, max_sigma=6, num_sigma=20, threshold=0.01)

segmentation = compute(
Segment.Watershed,
filtered_primary, filtered_nuclei,
nuclei_threshold=.16, input_threshold=.22, min_distance=57)

target_assignment = compute(
AssignTargets.Label,
segmentation, spots)

decoded = compute(
Decode.PerRoundMaxChannel,
target_assignment,
codebook=codebook)

file_outputs[0] = decoded
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean you only output once at the end of the pipeline?
That would definitely explain why this process is way faster the exporting and importing after each step

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is true. The speed improvements (IMO) is mostly from not starting up the starfish process repeatedly.

53 changes: 53 additions & 0 deletions starfish/test/full_pipelines/recipe/test_iss.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
"""
Notes
-----
This test and docs/source/usage/iss/iss_cli.sh test the same code paths and should be updated
together
"""
import os
import unittest
from pathlib import Path
from typing import Iterable

import numpy as np
import pandas as pd
import pytest

from starfish.intensity_table.intensity_table import IntensityTable
from starfish.types import Features
from ._base_recipe_test import RecipeTest


URL = "https://d2nhj9g34unfro.cloudfront.net/20181005/ISS-TEST/experiment.json"


@pytest.mark.slow
class TestWithIssData(RecipeTest, unittest.TestCase):
@property
def recipe(self) -> Path:
test_file_path = Path(__file__)
recipe = test_file_path.parent / "iss_recipe.txt"
return recipe

@property
def input_url_or_paths(self) -> Iterable[str]:
return [
f"@{URL}[fov_001][primary]", # primary image
f"@{URL}[fov_001][dots]", # dots image
f"@{URL}[fov_001][nuclei]", # nuclei image
f"@{URL}", # codebook
]

@property
def output_paths(self) -> Iterable[Path]:
return [
Path("decoded_spots.nc")
]

def verify_results(self, tempdir: Path):
intensities = IntensityTable.open_netcdf(os.fspath(tempdir / "decoded_spots.nc"))
genes, counts = np.unique(
intensities.coords[Features.TARGET], return_counts=True)
gene_counts = pd.Series(counts, genes)
assert gene_counts['ACTB'] == 9
assert gene_counts['GAPDH'] == 9