Dev #60
* Changed custom models location to the new module 'malpolon.models.custom_models'. This includes the glc24 pre_extracted MME model and multi_modal.py. For MME: the classification system and nn module have been split into 2 files to allow calling MME from model_builder without triggering a circular import through check_model. Updated examples accordingly.
* Fix: state_dict altered during training.
  - state_dict contains a loss parameter pos_weight under the key loss.pos_weight. This key is created when the loss is instantiated by GenericPredictionSystem. However, this loss parameter was accessed and modified during the _step() process, which also alters the state_dict. Consequently, when loading the model from its checkpoint, there would be a value mismatch and the model would not load to resume training. This has been fixed by restoring the initial value of the loss parameter within the _step() function before the return statement.
  - The 'positive_weigh_factor' model hyperparameter has been deleted and replaced by the loss parameter 'pos_weight', which achieves the same purpose. In the config file, the 'positive_weigh_factor' model key has been replaced by the subkey 'pos_weight', nested under 'loss_kwargs' in the optimizer section.
* Cleaned remnants of previous commit testing
* Added download weight option for all classification systems and updated checkpoint_path call for MME example
* Fixed wrong checkpoint_path path initialization behavior.
  - glc24_cnn_multimodal_ensemble: updated example config file and main script to new checkpoint_path behavior, in both training and inference runs
  - standard_prediction_systems.py: Fixed wrong checkpoint_path path initialization behavior
  - glc2024_pre_extracted_prediction_system.py: added missing checkpoint_path argument and removed checkpoint_path setter as it is carried out by GenericPredictionSystem
* Updated example cnn_on_rgbnir_torchgeo following checkpoint_path update
* Updated example cnn_on_rgbnir_concat following checkpoint_path update
* Updated example cnn_on_rgbnir_glc23_patches following checkpoint_path update
* Reset yaml file glc23 example
* Fixed wrong variable assignment in examples micro_geolifeclef2022/cnn_on_rgb_nir_patches and micro_geolifeclef2022/cnn_on_rgb_patches
* Added predict run part in example geolifeclef2022/cnn_on_rgb_patches and updated main script following checkpoint_path update.
  - data_module: Added more flexibility for predictions without targets
  - geolifeclef2022 dataset: Added default -1 value for targets in predict mode to comply with standard_prediction_system predict() method
* Updated glc22 and microglc22 examples following checkpoint_path update, and added an inference part in the run section for those which didn't have one. To that end, added an input argument in custom GLC22 datamodules and model output in prediction mode.
* Updated CIFAR-10 example following checkpoint_path update
* Updated all inference examples following checkpoint_path update
* Removed duplicate import
* Updated code docstrings
* Fixed task value from binary to multilabel (doesn't change behavior)
* Added 'malpolon' as model provider.
  - model_builder: Added provider method and created new dictionary with model names as keys, and local imports of models as values
  - data_module: Added possibility of applying no activation function when running inference, so as to output the model's logits. Enhanced CSV export method's info prints.
  - glc2024_multimodal_ensemble_model: Added new init argument and class attribute 'pretrained', which the datamodule uses to determine whether to download pretrained weights (formerly, a standalone 'weights_download' variable was used by the datamodule). Added docstrings.
  - glc2024_pre_extracted_prediction_system: Changed how the model's loss is handled during '_step()' to prevent overwriting the loss parameter during training, which de-synchronized the state_dict() before and after running the model (since loss parameters are automatically added as learnable parameters)
  - glc24_cnn_multimodal_ensemble.yaml: Updated config file accordingly. Cleaned config file with correct values.
  - glc24_cnn_multimodal_ensemble.py: Updated MME main script accordingly. Changed activation function of inference run from softmax() to sigmoid()
* Updated glc22 tests following class getter changes
* Removed commented dict
* Corrected docstring
* Improved the script splitting CSV obs by species frequency: added callable arguments, reduced computation time, added comments, made it more generic. Renamed the script to split_obs_per_column_frequency.py
* Fixed unwanted behavior and further improved split_obs_per_column_frequency.py
* Fixed output test name syntax being different from the other splits
* Renamed or deleted files
* Added inference metrics evaluation scripts and output files for GLC24 MME model. Added the top25 predictions files as they are not heavy.
* Added specific .gitignore for MME inference and evaluation folder to opt out heavy files only
* Updated values of previously created .gitignore
* Moved and updated previous specific .gitignore
* Added entries to root .gitignore
* Added task selection (multilabel or other) in malpolon.data.datasets.geolifeclef2024_pre_extracted.GLC24Datamodule
* WiP: glc24 mme habitat integration
* Corrected typos
* Changed multiclass prediction filtering to keep all predictions and probas out of predict_logits_to_class()
* Fixed GLC24 mme habitat download method
* Reset glc24 mme habitat config file
* Added GLC24 MME habitat model dataset as new Malpolon dataset within malpolon.data.datasets.geolifeclef2024_pre_extracted
* Renamed inference evaluation script for GLC24 pre-extracted examples. Added docstrings to said examples.
* Fixed habitat dataset folder not being created before calling symbolic links
* Added docstrings and linting
* Removed unnecessary files
* Added glc24 pre-extracted species unit test
* Added glc24_pre_extracted examples
* Updated test_examples pytest run skips and cleaned file.
* Linting
* Docstrings glc24_pre_extracted
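The config change described in the commit notes above (the 'positive_weigh_factor' model key replaced by 'pos_weight' under 'loss_kwargs' in the optimizer section) can be sketched as a small migration over a config loaded as a nested dict. This is only an illustration: the section names `model` and `optimizer`, the helper `migrate_pos_weight`, and the sample values are assumptions; only the keys `positive_weigh_factor`, `loss_kwargs` and `pos_weight` come from the notes.

```python
from copy import deepcopy


def migrate_pos_weight(config: dict) -> dict:
    """Return a copy of `config` with the old model key moved into the loss kwargs.

    Hypothetical helper: the section names "model" and "optimizer" are assumed.
    """
    new_config = deepcopy(config)  # leave the caller's config untouched
    model_section = new_config.get("model", {})
    if "positive_weigh_factor" not in model_section:
        return new_config  # nothing to migrate

    value = model_section.pop("positive_weigh_factor")
    loss_kwargs = new_config.setdefault("optimizer", {}).setdefault("loss_kwargs", {})
    # Keep an explicitly configured pos_weight if one is already present.
    loss_kwargs.setdefault("pos_weight", value)
    return new_config


# Illustrative old-style config (values are placeholders).
old_config = {
    "model": {"positive_weigh_factor": 1.0},
    "optimizer": {"lr": 1e-4},
}
new_config = migrate_pos_weight(old_config)
print(new_config)
```

The helper copies the config rather than editing it in place, so the old layout stays available for comparison while examples are being updated.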
…predict_point': Changed checkpoint state_dict loading from model to LightningModule (breaking changes). Added iterable data type compatibility.
…er learning examples
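The breaking change in checkpoint loading (from the model's state_dict to the LightningModule's) can be illustrated with plain dictionaries standing in for real state dicts. This is a rough sketch, not the code used in the examples: the layer names `fc.weight` and `fc.bias`, the helper `strip_model_prefix`, and the values are hypothetical; the "model." prefix and the `loss.pos_weight` key are the ones mentioned in this PR.

```python
def strip_model_prefix(state_dict: dict) -> dict:
    """Old-style workaround: drop a leading "model." so keys match the bare model."""
    prefix = "model."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Keys as a LightningModule holding `model` and `loss` attributes would save them
# (hypothetical layer names, placeholder values).
checkpoint_state = {
    "model.fc.weight": [[0.1, 0.2]],
    "model.fc.bias": [0.0],
    "loss.pos_weight": [1.0],
}

bare_model_keys = {"fc.weight", "fc.bias"}
module_keys = {"model.fc.weight", "model.fc.bias", "loss.pos_weight"}

# Before: keys were rewritten to fit the bare model, and entries that do not
# belong to the model (here the loss entry) were left over.
stripped = strip_model_prefix(checkpoint_state)
leftover = set(stripped) - bare_model_keys

# After: the checkpoint keys already match the LightningModule, so no rewriting is needed.
matches_module = set(checkpoint_state) == module_keys
print(leftover, matches_module)
```

Loading at the LightningModule level keeps entries such as `loss.pos_weight` alongside the model weights, which is why the key rewriting could be removed from the examples.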
📝 Changelog
Major
- Added the GLC24 pre_extracted habitat dataset and example (see PR 58 in the Links section).
- Changed how checkpoints are loaded: instead of loading the `state_dict` of the model object, the `state_dict` of the LightningModule is now loaded. This is a breaking change: examples had to be updated by removing the replacement of the "model." string in the loaded `state_dict`.
- Added the possibility to download model weights for any Malpolon model, given a URL and a few file paths.
- Updated the way `checkpoint_path` is passed on to models. Added a `checkpoint_path` attribute to all Malpolon models.
- Added Malpolon as a (local) model provider.
- Added the module `malpolon.models.custom_models`, which will host custom models proposed by Malpolon.
- Moved and split `geolifeclef2024_multimodal_ensemble.py` into `glc2024_multimodal_ensemble_model.py` and `glc2024_pre_extracted_prediction_system.py` in `custom_models`, to prevent a circular import from `malpolon.models.model_builder` after adding Malpolon as a (local) provider.
Minor
- Updated `malpolon.data.data_module.export_predict_csv` to enable more flexibility when outputting the prediction CSV for a single data point.
Examples
- … (`data.download_data`)
- … `model.model_kwargs.pretrained` key in the config file. The weights enable users to run our MME model directly on our GLC24_pre_extracted test set and reach ~30% micro F1-score, with ~26% micro precision and ~36% micro recall, as well as ~96% micro AUC.
Tests
🔗 Links
✅ Checklist