Merge pull request #584 from ewels/stand-alone-schema
Stand alone schema
drpatelh authored Mar 17, 2020
2 parents 97ab1cb + cb5fafb commit a84c619
Showing 9 changed files with 891 additions and 14 deletions.
112 changes: 112 additions & 0 deletions README.md
@@ -18,6 +18,7 @@ A python package with helper tools for the nf-core community.
* [`nf-core licences` - List software licences in a pipeline](#pipeline-software-licences)
* [`nf-core create` - Create a new workflow from the nf-core template](#creating-a-new-workflow)
* [`nf-core lint` - Check pipeline code against nf-core guidelines](#linting-a-workflow)
* [`nf-core schema` - Work with pipeline schema files](#working-with-pipeline-schema)
* [`nf-core bump-version` - Update nf-core pipeline version number](#bumping-a-pipeline-version-number)
* [`nf-core sync` - Synchronise pipeline TEMPLATE branches](#sync-a-pipeline-with-the-template)
* [Citation](#citation)
@@ -439,6 +440,117 @@ WARNING: Test Warnings:

You can find extensive documentation about each of the lint tests in the [lint errors documentation](https://nf-co.re/errors).

## Working with pipeline schema

nf-core pipelines have a `nextflow_schema.json` file in their root which describes the different parameters used by the workflow.
These files allow automated validation of inputs when running the pipeline, are used to generate command line help and can be used to build interfaces to launch pipelines.
Pipeline schema files are built according to the [JSONSchema specification](https://json-schema.org/) (Draft 7).

To help developers working with pipeline schema, nf-core tools has three `schema` sub-commands:

* `nf-core schema validate`
* `nf-core schema build`
* `nf-core schema lint`

### nf-core schema validate

Nextflow can take input parameters in a JSON or YAML file when running a pipeline using the `-params-file` option.
This command validates such a file against the pipeline schema.

Usage is `nf-core schema validate <pipeline> --params <parameter file>`, eg:

```console
$ nf-core schema validate my_pipeline --params my_inputs.json

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


INFO: [✓] Pipeline schema looks valid

ERROR: [✗] Input parameters are invalid: 'reads' is a required property
```

The `pipeline` option can be a directory containing a pipeline, a path to a schema file or the name of an nf-core pipeline (which will be downloaded using `nextflow pull`).
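
As a rough illustration of the kind of check behind the error above, the sketch below looks for parameters that a schema marks as `required` but that are missing from a parameter file. It uses only the Python standard library and is not the validation code used by nf-core tools; the function name `find_missing_required` and the schema and parameter contents are assumptions made up for this example.

```python
import json


def find_missing_required(schema, params):
    """Return names listed under the schema's 'required' key that are absent from params."""
    return [name for name in schema.get("required", []) if name not in params]


# Hypothetical schema and parameter file contents, for illustration only
schema = json.loads("""
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "reads": {"type": "string"},
        "outdir": {"type": "string", "default": "./results"}
    },
    "required": ["reads"]
}
""")
params = json.loads('{"outdir": "my_results"}')

for name in find_missing_required(schema, params):
    print("'{}' is a required property".format(name))
```

A full JSONSchema validator checks much more than this (types, enums, patterns and so on); the sketch only covers the `required` keyword.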

### nf-core schema build

Manually building JSONSchema documents is not trivial and can be very error prone.
Instead, the `nf-core schema build` command collects your pipeline parameters and gives interactive prompts about any missing or unexpected params.
If no existing schema is found it will create one for you.

Once built, the tool can send the schema to the nf-core website so that you can use a graphical interface to organise and fill in the schema.
The tool checks the status of your schema on the website and once complete, saves your changes locally.

Usage is `nf-core schema build <pipeline_directory>`, eg:

```console
$ nf-core schema build nf-core-testpipeline

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


INFO: Loaded existing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json

Unrecognised 'params.old_param' found in schema but not in Nextflow config. Remove it? [Y/n]:
Unrecognised 'params.we_removed_this_too' found in schema but not in Nextflow config. Remove it? [Y/n]:

INFO: Removed 2 params from existing JSON Schema that were not found with `nextflow config`:
old_param, we_removed_this_too

Found 'params.reads' in Nextflow config. Add to JSON Schema? [Y/n]:
Found 'params.outdir' in Nextflow config. Add to JSON Schema? [Y/n]:

INFO: Added 2 params to JSON Schema that were found with `nextflow config`:
reads, outdir

INFO: Writing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json

Launch web builder for customisation and editing? [Y/n]:

INFO: Opening URL: http://localhost:8888/json_schema_build?id=1584441828_b990ac785cd6

INFO: Waiting for form to be completed in the browser. Use ctrl+c to stop waiting and force exit.
..........
INFO: Found saved status from nf-core JSON Schema builder

INFO: Writing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json
```

There are three flags that you can use with this command:

* `--no-prompts`: Make changes without prompting for confirmation each time. Does not launch web tool.
* `--web-only`: Skips comparison of the schema against the pipeline parameters and only launches the web tool.
* `--url <web_address>`: Supply a custom URL for the online tool. Useful when testing locally.

### nf-core schema lint

The pipeline schema is linted as part of the main `nf-core lint` command.
However, it can sometimes be useful to quickly check the syntax of the JSONSchema without a full lint run.

Usage is `nf-core schema lint <schema>`, eg:

```console
$ nf-core schema lint nextflow_schema.json

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


ERROR: [✗] JSON Schema does not follow nf-core specs:
Schema should have 'properties' section
```
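
The structural check reported above could look something like the following minimal sketch. Only the message `Schema should have 'properties' section` comes from the example output; the function name `lint_schema_sketch` and the other assertions are assumptions for illustration, not the actual nf-core implementation.

```python
def lint_schema_sketch(schema):
    """Raise AssertionError if the schema lacks the basic structure described above."""
    assert isinstance(schema, dict), "Schema should be a JSON object"
    assert schema.get("type") == "object", "Top-level schema should have type 'object'"
    assert "properties" in schema, "Schema should have 'properties' section"
    assert isinstance(schema["properties"], dict), "'properties' should be an object"


# Hypothetical schema that is missing its 'properties' section
try:
    lint_schema_sketch({"type": "object"})
    print("Schema looks valid")
except AssertionError as e:
    print("JSON Schema does not follow nf-core specs: {}".format(e))
```

Raising `AssertionError` lets a caller collect the message as a lint failure, in the same way that a lint test can wrap the call in `try` / `except AssertionError`.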

## Bumping a pipeline version number

When releasing a new version of an nf-core pipeline, version numbers have to be updated in several different places. The helper command `nf-core bump-version` automates this for you to avoid manual errors (and frustration!).
28 changes: 24 additions & 4 deletions docs/lint_errors.md
@@ -30,9 +30,15 @@ The following files are suggested but not a hard requirement. If they are missin
* `conf/base.config`
* A `conf` directory with at least one config called `base.config`

Additionally, the following files must not be present:
The following files will cause a failure if they _are_ present (to fix, delete them):

* `Singularity`
* As we are relying on [Docker Hub](https://hub.docker.com/) instead of Singularity
and all containers are automatically pulled from there, repositories should not
have a `Singularity` file present.
* `parameters.settings.json`
* The syntax for pipeline schema has changed - old `parameters.settings.json` should be
deleted and new `nextflow_schema.json` files created instead.
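
A check of this kind can be sketched as below. The two file names come from the list above; the function name `find_forbidden_files` is an assumption for this example and the temporary directory simply stands in for a pipeline directory.

```python
import os
import tempfile

# File names taken from the list above
FILES_FAIL_IFEXISTS = ["Singularity", "parameters.settings.json"]


def find_forbidden_files(pipeline_dir, forbidden=FILES_FAIL_IFEXISTS):
    """Return the forbidden file names that exist in pipeline_dir."""
    return [f for f in forbidden if os.path.isfile(os.path.join(pipeline_dir, f))]


# Stand-in pipeline directory containing one outdated file
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "parameters.settings.json"), "w").close()
    print(find_forbidden_files(tmp))
```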

## Error #2 - Docker file check failed ## {#2}

@@ -306,16 +312,30 @@ The nf-core workflow template contains a number of comment lines with the follow

This lint test runs through all files in the pipeline and searches for these lines.

## Error #11 - Singularity file found ##{#11}
## Error #11 - Pipeline name ## {#11}

As we are relying on [Docker Hub](https://hub.docker.com/) instead of Singularity and all containers are automatically pulled from there, repositories should not have a `Singularity` file present.
_..removed.._

## Error #12 - Pipeline name ## {#12}

In order to ensure consistent naming, pipeline names should contain only lower case, alphabetical characters. Otherwise a warning is displayed.
In order to ensure consistent naming, pipeline names should contain only lower case, alphanumeric characters. Otherwise a warning is displayed.
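
The two string tests behind this check can be tried out with a short sketch. The helper name `name_problems` and the example pipeline names are assumptions for illustration; the tests themselves are Python's built-in `str.islower()` and `str.isalnum()`.

```python
def name_problems(pipeline_name):
    """Return a list of reasons why a name does not follow the convention."""
    problems = []
    if not pipeline_name.islower():
        problems.append("Contains uppercase letters")
    if not pipeline_name.isalnum():
        problems.append("Contains non alphanumeric characters")
    return problems


# Hypothetical pipeline names
for name in ["rnaseq", "chipseq2", "RNAseq", "rna-seq"]:
    print(name, name_problems(name))
```

Note that `str.islower()` returns `False` for a string with no cased characters at all, so a purely numeric name such as `123` would also be reported as containing uppercase letters.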

## Error #13 - Cookiecutter template strings ## {#13}

The `nf-core create` pipeline template uses [cookiecutter](https://github.com/cookiecutter/cookiecutter) behind the scenes.
This check fails if any cookiecutter template variables such as `{{ cookiecutter.pipeline_name }}` are found in your pipeline code.
Finding a placeholder like this means that something was probably copied and pasted from the template without being properly rendered for your pipeline.
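
A search for leftover placeholders could be sketched with a regular expression, as below. The pattern and the function name `find_cookiecutter_strings` are assumptions for this example and may differ from what the lint test actually uses.

```python
import re

# Matches unrendered placeholders such as {{ cookiecutter.pipeline_name }}
COOKIECUTTER_RE = re.compile(r"\{\{\s*cookiecutter[^}]*\}\}")


def find_cookiecutter_strings(text):
    """Return all unrendered cookiecutter placeholders found in text."""
    return COOKIECUTTER_RE.findall(text)


# Hypothetical line copied from the template without being rendered
line = "name = '{{ cookiecutter.pipeline_name }}'"
print(find_cookiecutter_strings(line))
```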

## Error #14 - Pipeline schema syntax ## {#14}

Pipelines should have a `nextflow_schema.json` file that describes the different pipeline parameters (eg. `params.something`, `--something`).

Schema files should be valid JSON and adhere to [JSONSchema](https://json-schema.org/), Draft 7.
The top-level schema should be an `object`, where each of the `properties` corresponds to a pipeline parameter.

## Error #15 - Schema config check ## {#15}

The `nextflow_schema.json` pipeline schema should describe every flat parameter returned from the `nextflow config` command (params that are objects or more complex structures are ignored).
Missing parameters result in a lint failure.

If any parameters are found in the schema that were not returned from `nextflow config` a warning is given.
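
The comparison described here can be sketched as two set differences over the flat parameters. The function names `flat_params` and `compare_schema_to_config`, and the example schema and config values, are assumptions for illustration rather than the nf-core implementation.

```python
def flat_params(config_params):
    """Keep only flat params, ignoring objects and other complex structures."""
    return {k: v for k, v in config_params.items() if not isinstance(v, (dict, list))}


def compare_schema_to_config(schema, config_params):
    """Return (params missing from the schema, schema params missing from the config)."""
    schema_names = set(schema.get("properties", {}))
    config_names = set(flat_params(config_params))
    missing_from_schema = sorted(config_names - schema_names)  # would be a lint failure
    missing_from_config = sorted(schema_names - config_names)  # would be a lint warning
    return missing_from_schema, missing_from_config


# Hypothetical schema and config contents
schema = {"type": "object", "properties": {"reads": {}, "old_param": {}}}
config = {"reads": "data/*.fastq.gz", "outdir": "./results", "genomes": {"GRCh37": {}}}

print(compare_schema_to_config(schema, config))
```

In this example `genomes` is ignored because it is an object, `outdir` would be reported as a failure and `old_param` as a warning.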
4 changes: 2 additions & 2 deletions nf_core/launch.py
@@ -279,10 +279,10 @@ def prompt_param_flags(self):
click.style('Parameter group: ', bold=True, underline=True),
click.style(group_label, bold=True, underline=True, fg='red')
))
use_defaults = click.confirm(
no_prompts = click.confirm(
"Do you want to change the group's defaults? "+click.style('[y/N]', fg='green'),
default=False, show_default=False)
if not use_defaults:
if not no_prompts:
continue
for parameter in params:
# Skip this option if the render mode is none
59 changes: 54 additions & 5 deletions nf_core/lint.py
@@ -16,6 +16,7 @@
import yaml

import nf_core.utils
import nf_core.schema

# Set up local caching for requests to speed up remote queries
nf_core.utils.setup_requests_cachedir()
@@ -129,6 +130,7 @@ def __init__(self, path):
self.dockerfile = []
self.conda_config = {}
self.conda_package_info = {}
self.schema_obj = None
self.passed = []
self.warned = []
self.failed = []
@@ -174,7 +176,9 @@ def lint_pipeline(self, release_mode=False):
'check_conda_dockerfile',
'check_pipeline_todos',
'check_pipeline_name',
'check_cookiecutter_strings'
'check_cookiecutter_strings',
'check_schema_lint',
'check_schema_params'
]
if release_mode:
self.release_mode = True
@@ -248,7 +252,8 @@ def check_files_exist(self):

# List of strings. Fails / warns if any of the strings exist.
files_fail_ifexists = [
'Singularity'
'Singularity',
'parameters.settings.json'
]
files_warn_ifexists = [
'.travis.yml'
@@ -908,12 +913,12 @@ def check_pipeline_todos(self):
def check_pipeline_name(self):
"""Check whether pipeline name adheres to lower case/no hyphen naming convention"""

if self.pipeline_name.islower() and self.pipeline_name.isalpha():
if self.pipeline_name.islower() and self.pipeline_name.isalnum():
self.passed.append((12, "Name adheres to nf-core convention"))
if not self.pipeline_name.islower():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains uppercase letters"))
if not self.pipeline_name.isalpha():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains non alphabetical characters"))
if not self.pipeline_name.isalnum():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains non alphanumeric characters"))

def check_cookiecutter_strings(self):
"""
@@ -950,6 +955,50 @@ def check_cookiecutter_strings(self):
self.passed.append((13, "Did not find any cookiecutter template strings ({} files)".format(num_files)))


    def check_schema_lint(self):
        """ Lint the pipeline JSON schema file """
        # Suppress log messages
        logger = logging.getLogger()
        logger.disabled = True

        # Lint the schema
        self.schema_obj = nf_core.schema.PipelineSchema()
        schema_path = os.path.join(self.path, 'nextflow_schema.json')
        try:
            self.schema_obj.lint_schema(schema_path)
            self.passed.append((14, "Schema lint passed"))
        except AssertionError as e:
            self.failed.append((14, "Schema lint failed: {}".format(e)))

        # Reset logger
        logger.disabled = False

    def check_schema_params(self):
        """ Check that the schema describes all flat params in the pipeline """

        # First, get the top-level config options for the pipeline
        # Schema object already created in the previous test
        self.schema_obj.get_wf_params(self.path)
        self.schema_obj.no_prompts = True

        # Remove any schema params not found in the config
        removed_params = self.schema_obj.remove_schema_notfound_configs()

        # Add schema params found in the config but not the schema
        added_params = self.schema_obj.add_schema_found_configs()

        if len(removed_params) > 0:
            for param in removed_params:
                self.warned.append((15, "Schema param '{}' not found from nextflow config".format(param)))

        if len(added_params) > 0:
            for param in added_params:
                self.failed.append((15, "Param '{}' from `nextflow config` not found in nextflow_schema.json".format(param)))

        if len(removed_params) == 0 and len(added_params) == 0:
            self.passed.append((15, "Schema matched params returned from nextflow config"))


def print_results(self):
# Print results
rl = "\n Using --release mode linting tests" if self.release_mode else ''