Development November 2022 #64

Merged — merged 16 commits into sacdallago:main from dev-11-22 on Jan 2, 2023
Conversation

@SebieF (Collaborator) commented Dec 5, 2022

05.12.2022 - Version 0.2.1

Bug fixes

Features

  • The device in use is now logged
  • Added a sanity_checker.py that checks the test results for obvious problems (such as only predicting a single class) (WIP; see the sketch after this list)
  • Added a limited_sample_size flag to train the model on a subset of all training ids. This makes it easy to check whether the model architecture is able to overfit on the training data
  • Added the metrics from the best training iteration to the out.yml file (to compare with the test set performance)
  • Applied _validate_targets to all protocols in TargetManager
  • Added a Changelog file
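As a rough illustration of what such a sanity check can look like, here is a minimal sketch; the function name, signature, and warning format are illustrative assumptions, not the actual sanity_checker.py API:

```python
# Minimal sketch of the kind of check sanity_checker.py performs; the function
# name and the warning format are assumptions, not the actual API.
from typing import Any, Dict, List


def check_test_predictions(predictions: List[Any]) -> Dict[str, str]:
    """Flag obvious problems, e.g. the model only ever predicting one class."""
    warnings = {}
    unique_predictions = set(predictions)
    if len(unique_predictions) <= 1:
        warnings["single_class"] = (
            f"Model predicted only a single class: {unique_predictions}"
        )
    return warnings


# A degenerate model that always predicts class 0 gets flagged:
print(check_test_predictions([0, 0, 0, 0]))
```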

Maintenance

  • Moved the dataset -> torch.tensor conversion to embeddings.py
  • Replaced storing the training/validation/test ids with the number of samples in the respective sets
  • Stored start and end times in a reproducible, readable format (see the sketch after this list)
  • Exported ConfigurationException via the __init__.py file for consistency
  • Removed the unnecessary double-loading of the checkpoint for test evaluation
  • Added typing to the split lists in TargetManager
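For the start and end times, one reproducible and readable choice is ISO 8601 in UTC; this is a sketch of the idea, not necessarily the exact format the PR adopts:

```python
# Sketch: storing start and end times in a reproducible, readable format.
# ISO 8601 in UTC is one such format; whether the PR uses exactly this
# representation is an assumption.
from datetime import datetime, timezone

start_time = datetime.now(timezone.utc)
# ... training happens here ...
end_time = datetime.now(timezone.utc)

output_vars = {
    "start_time": start_time.isoformat(),  # e.g. "2022-12-05T11:34:00+00:00"
    "end_time": end_time.isoformat(),
}
print(output_vars)
```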

@SebieF added the bug (Something isn't working), enhancement (New feature or request), and refactoring (Code or standardization refactorings) labels Dec 5, 2022
@SebieF requested a review from sacdallago December 5, 2022 11:34
@SebieF self-assigned this Dec 5, 2022
Improves dataset creation time (likely) and makes embedding handling easier
Numbers are easy to review for sanity checks; ids are not, and they are currently contained in the fasta file anyway.
Fixup from interaction branch (29.11.22)
Check output_vars after a run for obvious problems
This limits the training data to a user-defined number of samples, enabling a quick check of the architecture and of whether the model is able to overfit (see the sketch below).
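A possible shape for such a flag, where the names (limit_training_ids, limited_sample_size) and the -1 sentinel are illustrative assumptions rather than the PR's actual implementation:

```python
# Hypothetical sketch of a limited_sample_size flag: draw a fixed-size random
# subset of the training ids so overfitting checks run quickly.
import random
from typing import List


def limit_training_ids(training_ids: List[str],
                       limited_sample_size: int = -1,
                       seed: int = 42) -> List[str]:
    """Return a random subset of the training ids if a positive limit is set."""
    if limited_sample_size <= 0 or limited_sample_size >= len(training_ids):
        return training_ids
    random.seed(seed)
    return random.sample(training_ids, limited_sample_size)


ids = [f"seq_{i}" for i in range(1000)]
print(len(limit_training_ids(ids, limited_sample_size=50)))  # -> 50
```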
The best checkpoint was loaded in both the trainer and the Solver; keeping it in the trainer removes the side effect and improves the logging order.
Makes it easier to see the difference between training, validation, and test for the best epoch.
The file inconsistencies apply in almost the same way to all protocols (missing pre-computed embeddings, for example). The length check is, of course, only done for residue_to_x protocols (see the sketch below). Also fixes an old flag name.
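A hedged sketch of the kind of validation described here: every id must have a pre-computed embedding for all protocols, while the per-residue length check only applies to residue_to_x protocols. All names, signatures, and the protocol-string check are illustrative assumptions, not the actual TargetManager._validate_targets:

```python
# Illustrative sketch of a target validation step; names and signatures are
# assumptions, not the actual TargetManager._validate_targets implementation.
from typing import Dict, Sequence


def validate_targets(ids: Sequence[str],
                     embeddings: Dict[str, Sequence],
                     targets: Dict[str, Sequence],
                     protocol: str) -> None:
    for seq_id in ids:
        # Applies to all protocols, e.g. a missing pre-computed embedding:
        if seq_id not in embeddings:
            raise ValueError(f"Missing pre-computed embedding for id: {seq_id}")
        # The length check only makes sense for residue_to_x protocols, where
        # one target is expected per residue:
        if protocol.startswith("residue_to") and \
                len(targets[seq_id]) != len(embeddings[seq_id]):
            raise ValueError(f"Target/sequence length mismatch for id: {seq_id}")
```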
@sacdallago merged commit 8a9ded5 into sacdallago:main Jan 2, 2023
@SebieF deleted the dev-11-22 branch January 2, 2023 12:34