Jbrowse indexer - track3 - current progress #66

Open
wants to merge 15 commits into
base: main

Conversation

vikasguptaebi
Contributor

The following are addressed for now:

  1. Create a subworkflow that generates the required GFF and FASTA files for JBrowse2. This module should not include the jbrowse-import specifics.
  2. Include tests for the new subworkflow.
  3. Merge the subworkflow into nf-modules.

Member

@mberacochea mberacochea left a comment

Nice stuff, folks. I think it needs a bit of tidying up before merging:

  • Publishing directories is the responsibility of the pipelines, so remove the bespoke processes that do so.
  • Don't use tab as the input name if the input file is a GFF; it makes the code trickier to read for no benefit. For the output, use _gff to make it clear that it is a modified version of the file.
  • I would consider merging GFF_TRIM_FASTA and the indexing process (under a flag). It will be faster because Nextflow will do less file copying (ask me if there are questions about this).
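
As a rough illustration of the first point, publishing can be configured by the consuming pipeline instead of by a dedicated process. The fragment below is a sketch only: the config file location, the process name GFF_SORT, and the params.outdir parameter are hypothetical names chosen for illustration and do not come from this PR.

```groovy
// Pipeline-side configuration (hypothetical file: conf/modules.config).
// GFF_SORT and params.outdir are illustrative names, not taken from this PR.
process {
    withName: 'GFF_SORT' {
        publishDir = [
            path: { "${params.outdir}/gff" },
            mode: 'copy',
            // Skip versions.yml; publish every other output under its own name
            saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
        ]
    }
}
```

With a block like this in the pipeline, the module itself only needs to emit its outputs, and each pipeline decides where (or whether) they are published.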

Member

This module shouldn't be under jbrowse; it is a generic GFF sorting tool.

tuple val(meta), path(tab)

output:
tuple val(meta), path("${meta.id}_sorted.gff"), optional: true, emit: gff
Member

The output shouldn't be optional. Why is it optional?

container 'quay.io/biocontainers/coreutils:8.25--0'

input:
tuple val(meta), path(tab)
Member

Suggested change
- tuple val(meta), path(tab)
+ tuple val(meta), path(gff)

tuple val(meta), path(tab)

output:
tuple val(meta), path("${meta.id}_sorted.gff"), optional: true, emit: gff
Member

Suggested change
- tuple val(meta), path("${meta.id}_sorted.gff"), optional: true, emit: gff
+ tuple val(meta), path("${meta.id}_sorted.gff"), emit: sorted_gff

label 'process_single'

conda "${moduleDir}/environment.yml"
container 'quay.io/biocontainers/coreutils:8.25--0'
Member

This needs to consider the singularity image too, like this example:

    conda "bioconda::blast=2.14.1"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/blast:2.14.1--pl5321h6f7f691_0':
        'biocontainers/blast:2.14.1--pl5321h6f7f691_0' }"

Was this module created with the nf-core tools?
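
For this module, the same pattern might look roughly like the sketch below. The Docker image is the one already used in the PR; the Singularity URL is an assumption modelled on the blast example above and would need to be checked against the Galaxy depot before use.

```groovy
// Sketch only: the Singularity URL below is assumed by analogy with the
// blast example and has not been verified to exist.
conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
    'https://depot.galaxyproject.org/singularity/coreutils:8.25--0' :
    'quay.io/biocontainers/coreutils:8.25--0' }"
```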

}

// PUBLISH_OUTPUT_FILES process to save the output files
process PUBLISH_OUTPUT_FILES {
Member

I would remove this process

@@ -0,0 +1,36 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json
name: "index_fasta"
description: Generate fasta indices
Member

Suggested change
- description: Generate fasta indices
+ description: Generate fasta indices using BGZIP and FAIDX

versions = ch_versions // Channel: [ versions.yml ]

// Call the process to publish files
PUBLISH_OUTPUT_FILES(gff_gz, tbi_files, output_dir)
Member

As with the index one, this should not be here.

@@ -0,0 +1,37 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/subworkflows/yaml-schema.json
name: "index_gff"
description: Generate gff indices
Member

Suggested change
- description: Generate gff indices
+ description: Create an indexed GFF without the FASTA sequence

Member

What is this file for?

3 participants