Gencode refactor to remove gcs #934
raise ValueError(
    'Unexpected number of fields on line in ensemble_to_refseq mapping',
    msg,
extreme nitpick but you can put the ValueError and msg on the same line here
response = requests.get(url, stream=True, timeout=10)
gene_symbol_to_gene_id = {}
for line in gzip.GzipFile(fileobj=response.raw):
    line = line.decode('ascii')  # noqa: PLW2901
nice, this became very simple. is there a downside to taking the pickle file part out of the download process? do you know why it was there in the first place?
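For reference, the streaming approach in the snippet above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `parse_gene_symbol_to_gene_id`, the restriction to `gene` records, and the stripping of the version suffix from `gene_id` are all assumptions. The only thing relied on is the standard GTF layout (9 tab-separated columns, with `key "value";` pairs in the last one).

```python
import gzip
import io


def parse_gene_symbol_to_gene_id(raw_gzip_stream):
    """Build {gene_name: gene_id} from a gzipped GTF byte stream.

    Hypothetical helper: the name and the exact filtering rules are
    assumptions made for illustration, not taken from the PR.
    """
    gene_symbol_to_gene_id = {}
    # GzipFile decompresses lazily, so the file is never fully held in memory
    # and no intermediate cache (e.g. a pickle) is needed.
    for raw_line in gzip.GzipFile(fileobj=raw_gzip_stream):
        line = raw_line.decode('ascii')
        # Skip blank lines and '#'-prefixed header/comment lines.
        if not line.strip() or line.startswith('#'):
            continue
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 9:
            raise ValueError(f'Unexpected number of fields on line: {fields}')
        # Assumption: only 'gene' records contribute to the mapping.
        if fields[2] != 'gene':
            continue
        # Column 9 holds attributes formatted as: key "value"; key "value";
        attrs = {}
        for attr in fields[8].split(';'):
            attr = attr.strip()
            if not attr:
                continue
            key, _, value = attr.partition(' ')
            attrs[key] = value.strip('"')
        # Assumption: drop the version suffix (ENSG00000223972.5 -> ENSG00000223972).
        gene_symbol_to_gene_id[attrs['gene_name']] = attrs['gene_id'].split('.')[0]
    return gene_symbol_to_gene_id
```

With `requests`, the same function could be fed `response.raw` from `requests.get(url, stream=True, timeout=10)`, as in the diff above. In tests, an `io.BytesIO` wrapping gzip-compressed bytes works as a stand-in for the network stream.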
…s into benb/gencode_refactor
* add task to write relatedness check to tsv (#930)
* add task to write relatedness check to tsv
* fix requirements
* relatedness_check_table_path
* add relatedness check file path to metadata.json
* Benb/use metadata as source of family table load (#936)
* use run metadata as source of family table load
* ruff
* Support gcs dirs in rsync (#932)
* Support gcs dirs in rsync
* ws
* Gencode refactor to remove gcs (#934)
* Gencode refactor to remove gcs
* Fix
* additional semi join (#947)
* metadata parameters refactor (#946)
* metadata parameters refactor
* fix missing param
* tweak
* missed one
* last one
* fix test
* last few bugfixes
* fix
* bump
* missed one
* change parameter type due to confusing bug
* push
* enum
* Parse clinvar version from header (#949)
* Parse clinvar version from header
* responses activate
* fix test
* Dependency reordering so that `ValidateCallsetTask` runs before updating the reference data. (#950)
* Parse clinvar version from header
* Dependency reordering for reference data updates and validation
* ruff
* missed one
* Revert relatedness changes
* push
* Fix import issue
* Fix sample type
* ruff
* Fix import mocking
* imports
* responses activate
* fix test
* Tweaks
* comment
* Benb/check parsed clinvar version in complete (#951)
* Parse clinvar version from header
* First pass
* Bump hail tables to https
* correct dataset/dataset types
* Fix clinvar mito
* Fix combined
* Dependency reordering for reference data updates and validation
* ruff
* missed one
* Revert relatedness changes
* push
* Fix import issue
* Fix sample type
* ruff
* Fix import mocking
* imports
* Missed one
* First mocking pass
* Finish mocks in reference data
* responses activate
* ruff
* commas
* fix test
* Update compare_globals.py
* import
---------
Co-authored-by: Julia Klugherz <[email protected]>
I'm running into weird dependency issues with `dataproc`, decided to remove our `gcs` dep!
Resolves #609