vm shape bump on downstream nextstrain steps #232

Merged
merged 4 commits into from
Mar 12, 2021

dpark01 commented Mar 12, 2021

I'm not quite sure what conditions cause this, but recent runs of nextstrain builds seem to need more than 3 GB of RAM in the ancestral_traits, tip_frequencies, and export_auspice_json tasks. That is a little odd, because the size of the inputs (number of genomes) should not grow at all as GISAID grows. Perhaps memory use in these steps scales purely with the size of the metadata table, including all the entries that are irrelevant to the build.

We could potentially filter the metadata TSV down to only the entries selected by the subsampling (and make the filtered table an output of the subsample task), so that downstream augur steps only see the smaller table.
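For reference, a rough sketch of what that filtering could look like. This is not what this PR does, and the details are assumptions: the function name `filter_metadata` is hypothetical, the ID column is assumed to be named `strain`, and `keep_ids` is assumed to be the set of sequence names that survived subsampling, however the subsample task ends up producing it. It streams the table line by line, so the filter itself never holds the full metadata in memory.

```python
import io


def filter_metadata(lines_in, out, keep_ids, id_column="strain"):
    """Copy the header and only the rows whose ID is in keep_ids.

    lines_in: iterable of text lines from a tab-separated metadata table.
    out: writable text stream for the filtered table.
    keep_ids: set of IDs to keep (assumed to come from subsampling).
    id_column: name of the ID column (assumed to be "strain").
    Returns the number of data rows written.
    """
    lines = iter(lines_in)
    header = next(lines)
    columns = header.rstrip("\r\n").split("\t")
    idx = columns.index(id_column)  # raises ValueError if the column is missing
    out.write(header)

    kept = 0
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Skip short or malformed rows instead of failing on them.
        if len(fields) > idx and fields[idx] in keep_ids:
            out.write(line)  # write the row unchanged
            kept += 1
    return kept


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass open file handles.
    src = io.StringIO(
        "strain\tdate\n"
        "A\t2020-01-01\n"
        "B\t2020-02-01\n"
        "C\t2020-03-01\n"
    )
    dst = io.StringIO()
    n = filter_metadata(src, dst, {"A", "C"})
    print(n, "rows kept")
    print(dst.getvalue(), end="")
```

Rows are written back unchanged, so quoting and column order in the filtered table match the original. Whether this actually reduces memory in the downstream augur steps is untested; it only helps if their memory growth really does come from the metadata table size.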

Or we could just use more RAM. This PR does the latter.

dpark01 merged commit 07868f4 into master Mar 12, 2021
dpark01 deleted the dp-nextstrain branch March 12, 2021 20:15