Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch Optimus inputs to public bucket #86

Merged
merged 1 commit into from
Feb 23, 2018
Merged

Conversation

mckinsel
Copy link
Contributor

No description provided.

@@ -4,20 +4,20 @@
"TestOptimusPR.expected_matrix_summary_hash": "dd513351d4e7688c97f7bf902ba2876f",
"TestOptimusPR.expected_picard_metrics_hash": "7b7be5c9a2236920ca09f05811dca6d5",
"TestOptimusPR.r1": [
"gs://broad-dsde-mint-dev-teststorage/10x/demo/fastqs/pbmc8k_S1_L007_R1_001.fastq.gz",
"gs://broad-dsde-mint-dev-teststorage/10x/demo/fastqs/pbmc8k_S1_L007_R1_001.fastq.gz"
"gs://hca-dcp-mint-test-data/optimus/fastqs/pbmc8k_S1_L007_R1_001.fastq.gz",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test inputs should be the same as the inputs here for consistency: https://github.com/HumanCellAtlas/skylab/pull/84/files#diff-7787cfa08c74e3392fe187d33f67dca1R4, although I'm not sure whether we want to use the data in the Optimus or 10x folder (I didn't realize there were both)!

Copy link
Member

@ambrosejcarr ambrosejcarr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once all the tests pass.

Regarding Saman's comment, I think the 10x count and the Optimus inputs are the same files.

If we replace the 10x/demo files with these, this test will definitely pass, and the 10x/count pipeline can then reference the same file set.

@mckinsel
Copy link
Contributor Author

Hey thanks for the comments. So there are files in two places right now, 10x/demo and optimus. It seems like they should only be in one place, but which one?

@ambrosejcarr
Copy link
Member

My opinion is that the files that are the current targets of the PR tests should reside in 10x/demo and be used as inputs for all of our 10x v2 testing.

@TimothyTickle
Copy link

Often in my experience, "demo" files in software repos are toy data sets that are super fast and help people get up and running, minimal data sets that demonstrate install and functionality pass. I am not sure that is the kind of file you want a PR based on. Last time we spoke it seemed the team was advocating inputs that were closer to reality. These may be two different data.

@ambrosejcarr
Copy link
Member

I understood that we were talking about a "PR test" which would run rapidly to test that results are identical, and a "scientific test" which would run frequently but not on PR, which would provide a baseline against which we could compare the outputs.

These data files are appropriate for the PR test, as defined above, but not the scientific test.

@mckinsel mckinsel force-pushed the mk-optimus-public-bucket branch from a566092 to 64c981e Compare February 21, 2018 16:37
@mckinsel
Copy link
Contributor Author

jenkins retest

@mckinsel mckinsel force-pushed the mk-optimus-public-bucket branch from 64c981e to f5bcb73 Compare February 23, 2018 19:47
"TestOptimusPR.sample_id": "pbmc8k_test",
"TestOptimusPR.annotations_gtf": "gs://broad-dsde-mint-dev-teststorage/reference/hg19_ds/GSM1629193_hg19_ERCC.gtf.gz",
"TestOptimusPR.ref_genome_fasta": "gs://broad-dsde-mint-dev-teststorage/demo/chr21.fa"
"TestOptimusPR.annotations_gtf": "gs://hca-dcp-mint-test-data/optimus/GSM1629193_hg19_ERCC.gtf.gz",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mckinsel is this the same file as gs://hca-dcp-mint-test-data/reference/demo/hg19_ds/GSM1629193_hg19_ERCC.gtf.gz? That's what is listed in the example inputs for the adapter pipeline, and I'm not sure if they're supposed to be the same: https://github.com/HumanCellAtlas/skylab/blob/master/pipelines/optimus/example_test_inputs.json#L17

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure it's the same, I'll change the path.

"TestOptimusPR.annotations_gtf": "gs://broad-dsde-mint-dev-teststorage/reference/hg19_ds/GSM1629193_hg19_ERCC.gtf.gz",
"TestOptimusPR.ref_genome_fasta": "gs://broad-dsde-mint-dev-teststorage/demo/chr21.fa"
"TestOptimusPR.annotations_gtf": "gs://hca-dcp-mint-test-data/optimus/GSM1629193_hg19_ERCC.gtf.gz",
"TestOptimusPR.ref_genome_fasta": "gs://hca-dcp-mint-test-data/optimus/chr21.fa"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, the adapter input file is using "gs://hca-dcp-mint-test-data/reference/demo/chr21.fa". Just wondering if that's a duplicate file, or if they are supposed to be different here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also should be the same, as well as the star.tar. I'll change both paths.

Copy link
Contributor

@samanehsan samanehsan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making these changes @mckinsel!

@mckinsel mckinsel force-pushed the mk-optimus-public-bucket branch from a42e8c6 to 248e273 Compare February 23, 2018 21:55
@mckinsel mckinsel merged commit dbda12d into master Feb 23, 2018
@mckinsel mckinsel deleted the mk-optimus-public-bucket branch February 23, 2018 22:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants