Stand alone schema #584

Merged
merged 27 commits into from
Mar 17, 2020
71477bd
First draft of a parameters.settings.json file for the template
ewels Nov 28, 2019
fd758a0
Fix JSON for template JSONschema file
ewels Mar 6, 2020
d0f20f8
Start writing new nf-core schema commands.
ewels Mar 6, 2020
7280782
Start building 'nf-core schema build' functionality
ewels Mar 6, 2020
1dce31d
Testing and refining code for building JSON Schema
ewels Mar 12, 2020
80bb973
Wrote code to handle interaction with nf-core website schema builder
ewels Mar 12, 2020
a2bbd9f
Encode JSON schema in POST request
ewels Mar 12, 2020
9857ff3
Final testing and tweaks for schema builder command
ewels Mar 13, 2020
030b07b
Rename parameters.settings.json > nextflow_schema.json
ewels Mar 14, 2020
4953503
Schema: Rename Input to params
ewels Mar 14, 2020
28a8e5b
Add option --web_only for nf-core schema build
ewels Mar 15, 2020
336e454
Hyphens for cli flags, not underscores
ewels Mar 15, 2020
3a06c2b
Update pipeline template schema
ewels Mar 15, 2020
064621f
Schema: Handle groups when checking for missing or incorrect params
ewels Mar 15, 2020
60ff7e2
Template schema: Add some help text.
ewels Mar 17, 2020
44df2c6
Remove top-level params object
ewels Mar 17, 2020
b80fdf1
Added nf-core schema validate functionality
ewels Mar 17, 2020
7340d2c
Raise exceptions instead of sys.exit(1)
ewels Mar 17, 2020
f3f0a31
Add schema linting to main lint call
ewels Mar 17, 2020
bfa4ea3
Added new params from the template update
ewels Mar 17, 2020
6dc837e
Schema: If no existing schema found, create a new one from scratch
ewels Mar 17, 2020
5813ad8
Added schema lint test to look for missing or unexpected params in sc…
ewels Mar 17, 2020
fa76bdf
Reordered nf-core schema subcommands in help
ewels Mar 17, 2020
778fee1
Wrote documentation for nf-core schema
ewels Mar 17, 2020
a2acd38
Rename schema use_defaults to no_prompts
ewels Mar 17, 2020
f4df69c
Fix lint pytests
ewels Mar 17, 2020
cb5fafb
Remove dup blank line in readme
ewels Mar 17, 2020
112 changes: 112 additions & 0 deletions README.md
@@ -18,6 +18,7 @@ A python package with helper tools for the nf-core community.
* [`nf-core licences` - List software licences in a pipeline](#pipeline-software-licences)
* [`nf-core create` - Create a new workflow from the nf-core template](#creating-a-new-workflow)
* [`nf-core lint` - Check pipeline code against nf-core guidelines](#linting-a-workflow)
* [`nf-core schema` - Work with pipeline schema files](#working-with-pipeline-schema)
* [`nf-core bump-version` - Update nf-core pipeline version number](#bumping-a-pipeline-version-number)
* [`nf-core sync` - Synchronise pipeline TEMPLATE branches](#sync-a-pipeline-with-the-template)
* [Citation](#citation)
@@ -439,6 +440,117 @@ WARNING: Test Warnings:

You can find extensive documentation about each of the lint tests in the [lint errors documentation](https://nf-co.re/errors).

## Working with pipeline schema

nf-core pipelines have a `nextflow_schema.json` file in their root which describes the different parameters used by the workflow.
These files allow automated validation of inputs when running the pipeline, are used to generate command line help and can be used to build interfaces to launch pipelines.
Pipeline schema files are built according to the [JSONSchema specification](https://json-schema.org/) (Draft 7).
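
As a rough illustration of what such a file might contain, the sketch below builds a small Draft 7 style schema in Python and serialises it to JSON. The parameter names (`reads`, `outdir`), the title, descriptions and defaults are assumptions made for this example, not the contents of any real pipeline schema.

```python
import json

# Hypothetical schema: a top-level object whose properties are pipeline params
example_schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Example pipeline parameters",
    "type": "object",
    "properties": {
        "reads": {"type": "string", "description": "Path to input FastQ files"},
        "outdir": {"type": "string", "default": "./results"},
    },
    "required": ["reads"],
}

# Serialise in the form that might be written to nextflow_schema.json
schema_text = json.dumps(example_schema, indent=4)
print(schema_text)
```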

To help developers working with pipeline schema, nf-core tools has three `schema` sub-commands:

* `nf-core schema validate`
* `nf-core schema build`
* `nf-core schema lint`

### nf-core schema validate

Nextflow can take input parameters in a JSON or YAML file when running a pipeline using the `-params-file` option.
This command validates such a file against the pipeline schema.

Usage is `nf-core schema validate <pipeline> --params <parameter file>`, eg:

```console
$ nf-core schema validate my_pipeline --params my_inputs.json

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


INFO: [✓] Pipeline schema looks valid

ERROR: [✗] Input parameters are invalid: 'reads' is a required property
```

The `pipeline` option can be a directory containing a pipeline, a path to a schema file or the name of an nf-core pipeline (which will be downloaded using `nextflow pull`).
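
The kind of check behind the "required property" error above can be sketched with the standard library alone. This is a simplified stand-in built around a hypothetical `find_missing_required` helper: it only inspects the `required` list and is not how nf-core tools performs full JSON Schema validation.

```python
import json


def find_missing_required(schema, params):
    """Return required schema properties that are absent from params.

    Hypothetical helper for illustration only.
    """
    return [key for key in schema.get("required", []) if key not in params]


# Hypothetical schema and parameter file contents
schema = {
    "type": "object",
    "properties": {"reads": {"type": "string"}},
    "required": ["reads"],
}
params = json.loads('{"outdir": "./results"}')

for key in find_missing_required(schema, params):
    print("'{}' is a required property".format(key))
```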

### nf-core schema build

Manually building JSONSchema documents is not trivial and can be very error prone.
Instead, the `nf-core schema build` command collects your pipeline parameters and gives interactive prompts about any missing or unexpected params.
If no existing schema is found it will create one for you.

Once built, the tool can send the schema to the nf-core website so that you can use a graphical interface to organise and fill in the schema.
The tool checks the status of your schema on the website and once complete, saves your changes locally.
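
The waiting step can be pictured as a simple polling loop. The sketch below is only an assumption about the general shape of such a loop: `wait_for_status` and the stand-in `check_status` callable are hypothetical and do not reproduce the real website API.

```python
import time


def wait_for_status(check_status, interval=0.0, max_checks=10):
    """Poll check_status() until it returns a result or the checks run out.

    Hypothetical helper; check_status is any callable returning None
    while the web form is still being edited.
    """
    for _ in range(max_checks):
        result = check_status()
        if result is not None:
            return result
        time.sleep(interval)
    return None


# Stand-in for the website: reports nothing twice, then a saved schema
responses = iter([None, None, {"type": "object", "properties": {}}])
saved_schema = wait_for_status(lambda: next(responses))
print(saved_schema)
```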

Usage is `nf-core schema build <pipeline_directory>`, eg:

```console
$ nf-core schema build nf-core-testpipeline

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


INFO: Loaded existing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json

Unrecognised 'params.old_param' found in schema but not in Nextflow config. Remove it? [Y/n]:
Unrecognised 'params.we_removed_this_too' found in schema but not in Nextflow config. Remove it? [Y/n]:

INFO: Removed 2 params from existing JSON Schema that were not found with `nextflow config`:
old_param, we_removed_this_too

Found 'params.reads' in Nextflow config. Add to JSON Schema? [Y/n]:
Found 'params.outdir' in Nextflow config. Add to JSON Schema? [Y/n]:

INFO: Added 2 params to JSON Schema that were found with `nextflow config`:
reads, outdir

INFO: Writing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json

Launch web builder for customisation and editing? [Y/n]:

INFO: Opening URL: http://localhost:8888/json_schema_build?id=1584441828_b990ac785cd6

INFO: Waiting for form to be completed in the browser. Use ctrl+c to stop waiting and force exit.
..........
INFO: Found saved status from nf-core JSON Schema builder

INFO: Writing JSON schema with 18 params: nf-core-testpipeline/nextflow_schema.json
```

There are three flags that you can use with this command:

* `--no-prompts`: Make changes without prompting for confirmation each time. Does not launch web tool.
* `--web-only`: Skips comparison of the schema against the pipeline parameters and only launches the web tool.
* `--url <web_address>`: Supply a custom URL for the online tool. Useful when testing locally.
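
The comparison that drives the prompts in the example output above amounts to two set differences. The sketch below assumes a hypothetical `compare_params` helper and made-up parameter lists; the real tool also handles grouped parameters, which this ignores.

```python
def compare_params(schema_params, config_params):
    """Return (to_remove, to_add) as sorted lists of parameter names.

    Hypothetical helper for illustration only.
    """
    to_remove = sorted(set(schema_params) - set(config_params))
    to_add = sorted(set(config_params) - set(schema_params))
    return to_remove, to_add


# Made-up inputs: params in the schema vs. params in the Nextflow config
schema_params = ["old_param", "we_removed_this_too", "genome"]
config_params = ["genome", "reads", "outdir"]

to_remove, to_add = compare_params(schema_params, config_params)
print("Remove from schema:", to_remove)
print("Add to schema:", to_add)
```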

### nf-core schema lint

The pipeline schema is linted as part of the main `nf-core lint` command.
However, it can sometimes be useful to quickly check the syntax of the JSONSchema without running a full lint run.

Usage is `nf-core schema lint <schema>`, eg:

```console
$ nf-core schema lint nextflow_schema.json

                                          ,--./,-.
          ___     __   __   __   ___     /,-._.--~\
    |\ | |__  __ /  ` /  \ |__) |__         }  {
    | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                          `._,._,'


ERROR: [✗] JSON Schema does not follow nf-core specs:
Schema should have 'properties' section
```

## Bumping a pipeline version number

When releasing a new version of an nf-core pipeline, version numbers have to be updated in several different places. The helper command `nf-core bump-version` automates this for you to avoid manual errors (and frustration!).
28 changes: 24 additions & 4 deletions docs/lint_errors.md
@@ -30,9 +30,15 @@ The following files are suggested but not a hard requirement. If they are missin
* `conf/base.config`
* A `conf` directory with at least one config called `base.config`

Additionally, the following files must not be present:
The following files will cause a failure if they _are_ present (to fix, delete them):

* `Singularity`
* As we are relying on [Docker Hub](https://hub.docker.com/) instead of Singularity
and all containers are automatically pulled from there, repositories should not
have a `Singularity` file present.
* `parameters.settings.json`
* The syntax for pipeline schema has changed - old `parameters.settings.json` should be
deleted and new `nextflow_schema.json` files created instead.

## Error #2 - Docker file check failed ## {#2}

@@ -306,16 +312,30 @@ The nf-core workflow template contains a number of comment lines with the follow

This lint test runs through all files in the pipeline and searches for these lines.

## Error #11 - Singularity file found ##{#11}
## Error #11 - Pipeline name ## {#11}

As we are relying on [Docker Hub](https://hub.docker.com/) instead of Singularity and all containers are automatically pulled from there, repositories should not have a `Singularity` file present.
_..removed.._

## Error #12 - Pipeline name ## {#12}

In order to ensure consistent naming, pipeline names should contain only lower case, alphabetical characters. Otherwise a warning is displayed.
In order to ensure consistent naming, pipeline names should contain only lower case, alphanumeric characters. Otherwise a warning is displayed.
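
The naming rule can be tried out in isolation with Python's built-in string methods. The `name_warnings` helper below is a hypothetical sketch that mirrors the two conditions described here; the example pipeline names are made up.

```python
def name_warnings(pipeline_name):
    """Return the naming warnings for a pipeline name (hypothetical helper)."""
    warnings = []
    if not pipeline_name.islower():
        warnings.append("Contains uppercase letters")
    if not pipeline_name.isalnum():
        warnings.append("Contains non alphanumeric characters")
    return warnings


# Made-up names: one acceptable, one with capitals, one with a hyphen
for name in ["rnaseq2", "RNAseq", "rna-seq"]:
    print(name, name_warnings(name))
```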

## Error #13 - Cookiecutter template strings ## {#13}

The `nf-core create` pipeline template uses [cookiecutter](https://github.com/cookiecutter/cookiecutter) behind the scenes.
This check fails if any cookiecutter template variables such as `{{ cookiecutter.pipeline_name }}` are found in your pipeline code.
Finding a placeholder like this means that something was probably copied and pasted from the template without being properly rendered for your pipeline.
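
A search of this kind can be sketched with a regular expression. The pattern and the `find_template_strings` helper below are assumptions for illustration and may differ from what the lint test actually uses.

```python
import re

# Assumed pattern: "{{ cookiecutter.<anything> }}" with optional whitespace
COOKIECUTTER_PATTERN = re.compile(r"\{\{\s*cookiecutter\.[^}]*\}\}")


def find_template_strings(text):
    """Return all unrendered cookiecutter placeholders found in text."""
    return COOKIECUTTER_PATTERN.findall(text)


# Made-up file contents: one rendered line and one unrendered line
rendered = "name = 'nf-core/testpipeline'"
unrendered = "name = '{{ cookiecutter.pipeline_name }}'"

print(find_template_strings(rendered))
print(find_template_strings(unrendered))
```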

## Error #14 - Pipeline schema syntax ## {#14}

Pipelines should have a `nextflow_schema.json` file that describes the different pipeline parameters (eg. `params.something`, `--something`).

Schema files should be valid JSON and adhere to [JSONSchema](https://json-schema.org/), Draft 7.
The top-level schema should be an `object`, where each of the `properties` corresponds to a pipeline parameter.
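
The structural part of this check can be sketched as a few assertions. The `lint_schema_sketch` function is hypothetical, and apart from the `'properties'` message shown in the README example, the assertion messages are made up for this illustration.

```python
def lint_schema_sketch(schema):
    """Raise AssertionError if the schema lacks the expected top-level shape.

    Hypothetical stand-in, not the nf-core tools implementation.
    """
    assert isinstance(schema, dict), "Schema should be a JSON object"
    assert schema.get("type") == "object", "Schema 'type' should be 'object'"
    assert "properties" in schema, "Schema should have 'properties' section"


# A schema with no 'properties' section fails the last assertion
try:
    lint_schema_sketch({"type": "object"})
    message = "Schema lint passed"
except AssertionError as e:
    message = "Schema lint failed: {}".format(e)
print(message)
```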

## Error #15 - Schema config check ## {#15}

The `nextflow_schema.json` pipeline schema should describe every flat parameter returned from the `nextflow config` command (params that are objects or more complex structures are ignored).
Missing parameters result in a lint failure.

If any parameters are found in the schema that were not returned from `nextflow config` a warning is given.
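
The pass, warn and fail outcomes described above can be sketched as follows. The `flat_params` helper and the sample data are hypothetical, and treating lists as "more complex structures" is an assumption of this sketch.

```python
def flat_params(config_params):
    """Keep only params whose values are not dicts or lists (assumed rule)."""
    return {
        key: value
        for key, value in config_params.items()
        if not isinstance(value, (dict, list))
    }


# Made-up config params and schema properties
config = {"reads": "data/*.fastq.gz", "outdir": "./results", "genomes": {"GRCh37": {}}}
schema_properties = {"reads": {}, "old_param": {}}

flat = flat_params(config)
failed = sorted(set(flat) - set(schema_properties))  # in config, missing from schema
warned = sorted(set(schema_properties) - set(flat))  # in schema, not in config

print("Failures:", failed)
print("Warnings:", warned)
```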
4 changes: 2 additions & 2 deletions nf_core/launch.py
@@ -279,10 +279,10 @@ def prompt_param_flags(self):
click.style('Parameter group: ', bold=True, underline=True),
click.style(group_label, bold=True, underline=True, fg='red')
))
use_defaults = click.confirm(
no_prompts = click.confirm(
"Do you want to change the group's defaults? "+click.style('[y/N]', fg='green'),
default=False, show_default=False)
if not use_defaults:
if not no_prompts:
continue
for parameter in params:
# Skip this option if the render mode is none
59 changes: 54 additions & 5 deletions nf_core/lint.py
@@ -16,6 +16,7 @@
import yaml

import nf_core.utils
import nf_core.schema

# Set up local caching for requests to speed up remote queries
nf_core.utils.setup_requests_cachedir()
@@ -129,6 +130,7 @@ def __init__(self, path):
self.dockerfile = []
self.conda_config = {}
self.conda_package_info = {}
self.schema_obj = None
self.passed = []
self.warned = []
self.failed = []
@@ -174,7 +176,9 @@ def lint_pipeline(self, release_mode=False):
'check_conda_dockerfile',
'check_pipeline_todos',
'check_pipeline_name',
'check_cookiecutter_strings'
'check_cookiecutter_strings',
'check_schema_lint',
'check_schema_params'
]
if release_mode:
self.release_mode = True
@@ -248,7 +252,8 @@ def check_files_exist(self):

# List of strings. Fails / warns if any of the strings exist.
files_fail_ifexists = [
'Singularity'
'Singularity',
'parameters.settings.json'
]
files_warn_ifexists = [
'.travis.yml'
@@ -908,12 +913,12 @@ def check_pipeline_todos(self):
def check_pipeline_name(self):
"""Check whether pipeline name adheres to lower case/no hyphen naming convention"""

if self.pipeline_name.islower() and self.pipeline_name.isalpha():
if self.pipeline_name.islower() and self.pipeline_name.isalnum():
self.passed.append((12, "Name adheres to nf-core convention"))
if not self.pipeline_name.islower():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains uppercase letters"))
if not self.pipeline_name.isalpha():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains non alphabetical characters"))
if not self.pipeline_name.isalnum():
self.warned.append((12, "Naming does not adhere to nf-core conventions: Contains non alphanumeric characters"))

def check_cookiecutter_strings(self):
"""
@@ -950,6 +955,50 @@ def check_cookiecutter_strings(self):
self.passed.append((13, "Did not find any cookiecutter template strings ({} files)".format(num_files)))


    def check_schema_lint(self):
        """ Lint the pipeline JSON schema file """
        # Suppress log messages
        logger = logging.getLogger()
        logger.disabled = True

        # Lint the schema
        self.schema_obj = nf_core.schema.PipelineSchema()
        schema_path = os.path.join(self.path, 'nextflow_schema.json')
        try:
            self.schema_obj.lint_schema(schema_path)
            self.passed.append((14, "Schema lint passed"))
        except AssertionError as e:
            self.failed.append((14, "Schema lint failed: {}".format(e)))

        # Reset logger
        logger.disabled = False

    def check_schema_params(self):
        """ Check that the schema describes all flat params in the pipeline """

        # First, get the top-level config options for the pipeline
        # Schema object already created in the previous test
        self.schema_obj.get_wf_params(self.path)
        self.schema_obj.no_prompts = True

        # Remove any schema params not found in the config
        removed_params = self.schema_obj.remove_schema_notfound_configs()

        # Add schema params found in the config but not the schema
        added_params = self.schema_obj.add_schema_found_configs()

        if len(removed_params) > 0:
            for param in removed_params:
                self.warned.append((15, "Schema param '{}' not found from nextflow config".format(param)))

        if len(added_params) > 0:
            for param in added_params:
                self.failed.append((15, "Param '{}' from `nextflow config` not found in nextflow_schema.json".format(param)))

        if len(removed_params) == 0 and len(added_params) == 0:
            self.passed.append((15, "Schema matched params returned from nextflow config"))


def print_results(self):
# Print results
rl = "\n Using --release mode linting tests" if self.release_mode else ''