Automatically download model weights #68
We have this information on the Releases page now.
As per discussion on Slack (https://noblelab.slack.com/archives/C01MXN4NWMP/p1659803053573279).
This simplifies the examples that most users will want to use.
This looks like it solves the problem to me. It's a bit complicated and I think it'd be worthwhile to create unit tests for it, perhaps by mocking a response from GitHub.
Also, do we have a way to verify that the architecture (number of layers, layer dim, etc) matches the downloaded weights, or to automatically set it?
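One way to test the download logic without touching the network is to patch the HTTP call and return a canned payload. The following is a minimal sketch, not the code from this PR: `list_release_assets`, `fake_urlopen`, the `owner/project` repository name, and the `example.org` URL are all hypothetical, and it assumes the releases are read through `urllib.request.urlopen`.

```python
import io
import json
import urllib.request
from unittest import mock


def list_release_assets(repo):
    """Return {tag_name: [asset download URLs]} for a GitHub repository.

    Hypothetical helper for illustration; it queries the public
    "list releases" endpoint of the GitHub REST API.
    """
    url = f"https://api.github.com/repos/{repo}/releases"
    with urllib.request.urlopen(url) as response:
        releases = json.load(response)
    return {
        release["tag_name"]: [
            asset["browser_download_url"] for asset in release["assets"]
        ]
        for release in releases
    }


def fake_urlopen(url, *args, **kwargs):
    """Stand-in for urlopen that returns a canned releases payload."""
    payload = [
        {
            "tag_name": "v3.0.0",
            "assets": [
                {"browser_download_url": "https://example.org/v3.0.0.ckpt"}
            ],
        }
    ]
    # BytesIO is file-like and a context manager, like a real response.
    return io.BytesIO(json.dumps(payload).encode("utf-8"))


def test_list_release_assets():
    # Patch the module attribute so no real HTTP request is made.
    with mock.patch("urllib.request.urlopen", side_effect=fake_urlopen):
        assets = list_release_assets("owner/project")
    assert assets == {"v3.0.0": ["https://example.org/v3.0.0.ckpt"]}
```

The same pattern extends to error cases such as a rate-limited response: have the fake raise the exception that the real call would raise and assert on how the caller handles it.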
Simplify config
The transformer tests only deal with depthcharge functionality and just seem copied from its repository.
I.e. the config YAML file.
It looks like |
I see you removed the |
Codecov Report
@@             Coverage Diff             @@
##             main      #68       +/-   ##
===========================================
+ Coverage   14.31%   68.32%   +54.01%
===========================================
  Files           9       10        +1
  Lines         559      644       +85
===========================================
+ Hits           80      440      +360
+ Misses        479      204      -275
Now we're getting a multiprocessing error on Windows 🤦♂️
It crashes the DataLoader: pytorch/pytorch#70344
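The platform workaround can be sketched roughly as follows. This is an assumption-laden illustration rather than the function added in this PR: the name `n_workers` follows the commit message, but the signature, the `system` parameter, and the use of `os.cpu_count()` on other platforms are hypothetical. Passing `num_workers=0` to a PyTorch `DataLoader` makes it load data in the main process.

```python
import os
import platform


def n_workers(system=None):
    """Pick a DataLoader worker count for the current platform.

    Hypothetical sketch: on Windows and macOS, return 0 so that data is
    loaded in the main process and worker processes are never spawned.
    Elsewhere, use one worker per available CPU (an assumption).
    """
    system = platform.system() if system is None else system
    if system in ("Windows", "Darwin"):
        return 0
    return os.cpu_count() or 1
```

The optional `system` argument only exists to make the platform branch easy to test without monkeypatching `platform.system`.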
- Non-matching version - GitHub rate limit exceeded
* Download model weights from GitHub release
* Include dependencies
* Update model usage documentation
* Reformat with black
* Download weights to the OS-specific app dir
* Don't download weights if already in cache dir
* Update model file instructions
* Remove release notes from the README
  We have this information on the Releases page now.
* Remove explicit model specification from example commands
* Harmonize default parameters and config values
  As per discussion on Slack (https://noblelab.slack.com/archives/C01MXN4NWMP/p1659803053573279).
* No need to specify config file by default
  This simplifies the examples that most users will want to use.
* Simplify version matching regex
* Remove depthcharge related tests
  The transformer tests only deal with depthcharge functionality and just seem copied from its repository.
* Make sure that package data is included
  I.e. the config YAML file.
* Remove obsolete (ppx) tests
* Update integration test
* Add MacOS support and support for Apple's MPS chips
* Fail test but print version
* Added n_worker fn and tests
* Create split_version fn and add unit tests
* Fix debugging unit test
* Explicitly set version
* Monkeypatch loaded version
* Add device selector, so that on CPU-only runs the devices > 0
* Add windows patch
* Fix typo
* Revert
* Use main process for data loading on Windows
* Fix typo
* Fix unit test
* Fix devices for when num_workers == 0
* Fix devices for when num_workers == 0
* Minor README updates
* Import reordering
* Minor code and docstring reformatting
* Test model weights retrieval
* Fix getting the number of devices
* Disable excessive Tensorboard deprecation warnings
* Don't use worker threads on MacOS
  It crashes the DataLoader: pytorch/pytorch#70344
* Warnings need to be ignored before import
* Additional weights tests
  - Non-matching version
  - GitHub rate limit exceeded
* Disable tests on MacOS
* Include Python 3.10 as supported version
Co-authored-by: William Fondrie
Co-authored-by: Wout Bittremieux
* Add beam search
* Delete print statements
* Automatically download model weights (#68) (#88)
* Automatically download model weights (#68) (#89)
* Break beam search to testable subfunctions
* Fix precursor m/z termination and filtering
* Add unit testing for beam search
* Add beamsearch comments and fix formatting
* Address requested changes and minor fixes
* Add more unit tests for beam search
* Check NH3 loss for early stopping
* Consistent parameter order
* Update docstrings
* Remove unused precursors parameter
* Update beam matching mask in a level higher
* Minor refactoring to avoid code duplication
* Update imports
* Simplification refactoring
* Fix unit tests
* Simplify predicted peptide caching
* Simplify predicted peptide caching
* Simplify predicted peptide caching
* Unify predicted peptide caching
* Restrict tensor reshape to subfunction and minor fixes
* Finish beams when all isotopes exceed the precursor m/z tolerance
* Generalize look-ahead for tokens with negative mass
* Remove greedy decoding functionality
* Handle case with unfinished beams and add test
* Upgrade required depthcharge version
* Use detokenize function
* Add test for negative mass-aware termination
* Fix negative mass-aware beam termination
* Minor refactoring
* Add test for dummy output at max length
* Fixed and refactored peptide and score mzTab outputs
* Add tests for peptide and score output formatting
* Small fixes
* Update changelog
* Fix changelog update
Co-authored-by: Wout Bittremieux
Co-authored-by: William Fondrie
Fixes #17. Fixes #75.
Use cached model weights or download them from GitHub.
If no weights file (extension: .ckpt) is available in the cache directory, it will be downloaded from a release asset on GitHub. Model weights are retrieved by matching the release version. If no model weights are available for an identical release (major, minor, patch), alternative releases with matching (i) major and minor, or (ii) major versions will be used. If no matching release can be found, no model weights will be downloaded.
Note that the GitHub API is limited to 60 requests from the same IP per hour. A log message provides instructions to explicitly specify the model file for subsequent uses.
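The version fallback described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in this PR: the name `split_version` appears in the commit messages, but its regex and signature here are guesses, `pick_release` is hypothetical, and picking the newest release within a tier is an assumption that the description does not specify.

```python
import re


def split_version(version):
    """Split a version string such as "v3.1.2" into (major, minor, patch)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version: {version!r}")
    return tuple(int(part) for part in match.groups())


def pick_release(current, available):
    """Return the best-matching release tag, or None if nothing matches.

    Tiers are tried in order: exact (major, minor, patch), then
    (major, minor), then major only. Within a tier, the highest version
    wins (an assumption made for this sketch).
    """
    major, minor, patch = split_version(current)
    parsed = [(split_version(tag), tag) for tag in available]
    tiers = [
        lambda v: v == (major, minor, patch),
        lambda v: v[:2] == (major, minor),
        lambda v: v[0] == major,
    ]
    for matches in tiers:
        candidates = [(v, tag) for v, tag in parsed if matches(v)]
        if candidates:
            return max(candidates)[1]
    return None


print(pick_release("3.1.2", ["v3.1.0", "v3.0.0", "v2.9.0"]))  # v3.1.0
print(pick_release("4.0.0", ["v3.1.0"]))  # None
```

Returning `None` when no tier matches mirrors the last sentence of the description: nothing is downloaded and the user has to specify a model file explicitly.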
Review: @wsnoble to check whether this is the desired behavior for the users and the documentation, @wfondrie to check the code.