rework biosample_load to not use today's date in ftp directory path #366

Merged 2 commits on Sep 24, 2021

@dpark01 (Member) commented Sep 22, 2021

The biosample_load workflow had been using today's date (ISO format) to construct a target directory path on NCBI's FTP server for BioSample submissions. It did so because the directory space on the FTP server never seems to get cleaned up (and we don't have delete permissions), so we need to ensure a unique destination for our inputs.

Using today's date, however, results in Cromwell call-cache misses for the FTP BioSample registration task when the sarscov2_illumina_full workflow fails overnight and you need to re-run it the next day. This PR removes today's date from the inputs of that task and instead constructs the target directory from the name of the input TSV file plus its MD5 hash, both of which should remain stable for the same inputs.
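
To illustrate the idea, here is a minimal Python sketch of deriving a directory name from the input file's name and MD5 hash. It is not the code in this PR (the workflow itself is not shown here); the function name `stable_ftp_dir` and the `<stem>-<md5>` layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def stable_ftp_dir(tsv_path):
    """Return a directory name built from the file's base name and MD5 hash.

    Hypothetical helper: the name and the '<stem>-<md5>' format are
    illustrative assumptions, not the naming used by the actual workflow.
    """
    path = Path(tsv_path)
    md5 = hashlib.md5()
    # Hash the file in chunks so large inputs are not read into memory at once.
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            md5.update(chunk)
    # No date component: identical inputs always map to the same name.
    return f"{path.stem}-{md5.hexdigest()}"
```

Because the result depends only on the file name and contents, re-running on a later day yields the same value, so a call cache keyed on task inputs can still hit. Any change to the TSV contents changes the hash and therefore the destination.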

@dpark01 merged commit 360cf33 into master on Sep 24, 2021
@dpark01 deleted the dp-biosample branch on September 24, 2021, 12:56