Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

working commit of the fmx-assign-to-gt functionality #44

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

erflynn
Copy link
Collaborator

@erflynn erflynn commented Nov 1, 2023

Addresses #23

Adds optional function to run gtcheck to compare reference vcf and freemuxlet output.

@erflynn erflynn requested a review from amadeovezz November 1, 2023 21:14
@erflynn
Copy link
Collaborator Author

erflynn commented Nov 1, 2023

@amadeovezz -- can you check that these additions to the pre_qc file follow conventions you have set up for channels? would love feedback on this.

@@ -153,6 +155,13 @@ workflow {
.map{it -> [it[0], it[6], it[2], it[3], it[4], it[5]]} // [lib, num_of_samples, plp_files]] (excluding .tsv)
FREEMUXLET_POOL(ch_multi_lib_transformed)

if (params.settings.fmx_assign_to_gt){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As intuitive as it may be to add this logic under the freemuxlet conditionals, I think conceptually and for readability it is better to put this after the de-multiplexing stage. That way we can leave the intention of block entirely for de-multiplexing and the outputs of de-multiplexing flow into the FMX_ASSIGN_TO_GT process. How do you feel about adding it on line 263 (after de-multiplexing). Something like:

if (params.settings.demux_method == "freemuxlet" && params.settings.fmx_assign_to_gt) {
    if (params.settings.merge) {
        FMX_ASSIGN_TO_GT(
            Channel.from(get_pool_vcf())
                .join(FREEMUXLET_POOL.out.vcf) // [pool, ref_vcf, fmx_vcf]
        );
    }

    ch_gt_input = Channel.from(get_pool_vcf()) // [pool, vcf]
        .combine(
            Channel.from(get_library_by_pool()).map { it -> [it[1], it[0]] }, 
            by: 0
        ) // [pool, vcf, lib]
        .map { it -> [it[2], it[1]] } // [lib, vcf]
        .join(FREEMUXLET_LIBRARY.out.vcf); // [lib, ref_vcf, fmx_vcf]

    // TODO: add checks that the reference file exists
    FMX_ASSIGN_TO_GT_LIBRARY(ch_gt_input);
}

This also de-duplicates some of the repeated code for the single library processing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooo good call. I'll move it around, thanks!

}

// TODO: unify the two processes
process SEPARATE_FMX_PRE {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy you unified these processes!

I remember now why I duplicated it as a short term solution. There are other pipelines that reference SEPARATE_FMX and I wanted to make sure I didnt break them by being explicit about the input:

tuple val(library), path(vcf_file), path(sample_file), path(lmix_file)

Might be worth running those pipelines to be sure?

@amadeovezz
Copy link
Collaborator

amadeovezz commented Dec 8, 2023

@erflynn granted my hotfix testing PR works (and the subsequently merged in). Do you want to try and adding a test cases for this? Seems like a good candidate.

@@ -187,7 +194,8 @@ workflow {
FREEMUXLET_LIBRARY(ch_single_lib_transformed)

ch_sample_map = FREEMUXLET_LIBRARY.out.sample_map
}
ch_lib_vcf = FREEMUXLET_LIBRARY.out.vcf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I think you can take this line and lines 180-186 and move them to the conditional below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the channels FREEMUXLET_LIBRARY.out.vcf and SEPARATE_FMX will continue to exist. And that way we encapsulate everything into the conditional

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! just updated

@amadeovezz
Copy link
Collaborator

@amadeovezz -- can you check that these additions to the pre_qc file follow conventions you have set up for channels? would love feedback on this.

ya i was actually looking at the channel logic, i think we can refactor it to make it simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants