VS-222 dont hard code the dataset name! #7704

Merged: RoriCremer merged 15 commits into ah_var_store from rc-vs-222-dataset-id on Mar 9, 2022

@RoriCremer (Contributor) commented Mar 2, 2022

GvsCreateFilterSet.wdl failed recently for Morgan because of this bug. When run in a brand-new project, filter model creation fails because we expect the project to have a hard-coded dataset named "temp_tables", which it likely does not have. The workaround is simply to create one manually. This ticket removes the need for this dataset altogether.

The hard-coded name is removed; instead, the default dataset is used (the same one that the many other tables created in this pipeline use by default).
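As a rough illustration of the idea above (not the actual GATK change), the sketch below builds a fully qualified table name from a caller-supplied dataset instead of a fixed "temp_tables". The class `TempTableNames`, the method `qualify`, and the example project, dataset, and table values are all hypothetical names chosen for this example.

```java
// Hypothetical helper for illustration only; not the GATK implementation.
final class TempTableNames {
    private TempTableNames() {}

    /**
     * Builds "project.dataset.table" from caller-supplied parts so that no
     * dataset name (such as "temp_tables") is baked into the code.
     */
    static String qualify(String projectId, String datasetId, String tableName) {
        requireNonBlank(projectId, "project-id");
        requireNonBlank(datasetId, "dataset-id");
        requireNonBlank(tableName, "table name");
        return String.join(".", projectId, datasetId, tableName);
    }

    // Fails fast with a clear message instead of failing later inside a query.
    private static void requireNonBlank(String value, String label) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException(label + " must be provided");
        }
    }

    public static void main(String[] args) {
        // Example values are made up for this sketch.
        System.out.println(qualify("my-project", "my_dataset", "filter_tmp"));
    }
}
```

Failing fast on a missing dataset is one possible design choice here: it surfaces the problem as a clear argument error rather than as a failed query against a dataset that does not exist.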

Able to reproduce with a dummy dataset name:
[Screenshot: Screen Shot 2022-03-03 at 10 44 39 PM]

tested here:
https://app.terra.bio/#workspaces/broad-dsp-spec-ops-fc/gvs_testing_ingest/job_history/1dd27d90-82c4-44e6-8172-15c10c8a9c7f

@RoriCremer changed the title from "update wdl too" to "dont hard code the dataset name!" Mar 4, 2022
@RoriCremer changed the title from "dont hard code the dataset name!" to "VS-222 dont hard code the dataset name!" Mar 4, 2022

@Argument(
        fullName = "dataset-id",
        doc = "ID of the Google Cloud dataset to use when executing queries",
        optional = true // I guess, but wont it break otherwise or require that a dataset be created with the name temp_tables?

@RoriCremer (Contributor, Author) commented Mar 4, 2022

I'm going to delete this comment, but why is projectId above optional?

@rsasch left a comment

Is there a GATK jar available with these updates to test with?

@RoriCremer (Contributor, Author) commented

New jar here: gs://broad-dsp-spec-ops/scratch/bigquery-jointcalling/jars/rc_testing_dataset_id_20220303/gatk-package-4.2.0.0-478-g29bb3da-SNAPSHOT-local.jar

I suppose I should add it to the WDL as the default.

@RoriCremer RoriCremer force-pushed the rc-vs-222-dataset-id branch from 23df84d to 4941fc9 Compare March 7, 2022 19:47

@rsasch left a comment

LGTM 👍🏻

@RoriCremer RoriCremer merged commit 38fa733 into ah_var_store Mar 9, 2022
@RoriCremer RoriCremer deleted the rc-vs-222-dataset-id branch March 9, 2022 23:05
This was referenced Mar 17, 2023