Optimizers / Schedulers flexibility rework (#67)
* Model weights download (#56)
* Changed custom models location to new module 'malpolon.models.custom_models'. This includes the glc24 pre_extracted MME model and multi_modal.py. For MME: the classification system and the nn module have been split into 2 files to allow calling MME from model_builder without triggering a circular import through check_model. Updated examples accordingly.
* Fix: state_dict altered during training.
  - state_dict contains a loss parameter pos_weight under the key loss.pos_weight. This key is created when the loss is instantiated by GenericPredictionSystem. However, this loss parameter was accessed and modified during the _step() process, which also alters the state_dict. Consequently, when loading the model from its checkpoint, there would be a value mismatch and the model would not load to resume training. This has been fixed by restoring the initial value of the loss parameter within the _step() function before the return statement.
  - The 'positive_weigh_factor' model hyperparameter has been deleted and replaced by the loss parameter 'pos_weight', which achieves the same purpose. In the config file, the 'positive_weigh_factor' model key has been replaced by the subkey 'pos_weight', nested under 'loss_kwargs' in the optimizer section.
* Cleaned remnants of previous commit testing
* Added download weight option for all classification systems and updated checkpoint_path call for MME example
* Fixed wrong checkpoint_path path initialization behavior.
  - glc24_cnn_multimodal_ensemble: updated example config file and main script to new checkpoint_path behavior, in both training and inference runs
  - standard_prediction_systems.py: Fixed wrong checkpoint_path path initialization behavior
  - glc2024_pre_extracted_prediction_system.py: added missing checkpoint_path argument and removed checkpoint_path setter as it is carried out by GenericPredictionSystem
* Updated example cnn_on_rgbnir_torchgeo following checkpoint_path update
* Updated example cnn_on_rgbnir_concat following checkpoint_path update
* Updated example cnn_on_rgbnir_glc23_patches following checkpoint_path update
* Reset yaml file glc23 example
* Fixed wrong variable assignment in examples micro_geolifeclef2022/cnn_on_rgb_nir_patches and micro_geolifeclef2022/cnn_on_rgb_patches
* Added predict run part in example geolifeclef2022/cnn_on_rgb_patches and updated main script following checkpoint_path update.
  - data_module: Added more flexibility for predictions without targets
  - geolifeclef2022 dataset: Added default -1 value for targets in predict mode to comply with standard_prediction_system predict() method
* Updated glc22 and microglc22 examples following checkpoint_path update, and added inference part in the run section for those which didn't have one. To that end, added input argument in custom GLC22 datamodules + model output in prediction mode.
* Updated CIFAR-10 example following checkpoint_path update
* Updated all inference examples following checkpoint_path update
* Removed duplicate import
* Updated code docstrings
* Fixed task value from binary to multilabel (doesn't change behavior)
* Added 'malpolon' as model provider.
  - model_builder: Added provider method and created new dictionary with model names as keys, and local imports of models as values
  - data_module: Added possibility of applying no activation function when running inference, so as to output the model's logits. Enhanced CSV export method's info prints.
  - glc2024_multimodal_ensemble_model: Added new init argument and class attribute 'pretrained' which the datamodule uses to determine whether to download pretrained weights (formerly: a standalone 'weights_download' variable was used by the datamodule). Added docstrings.
  - glc2024_pre_extracted_prediction_system: Changed handling behavior of the model's loss during '_step()' to prevent overwriting the loss parameter during training, which resulted in a de-synchronization of the state_dict() before and after running the model (since loss parameters are automatically added as learnable parameters)
  - glc24_cnn_multimodal_ensemble.yaml: Updated config file accordingly. Cleaned config file with correct values.
  - glc24_cnn_multimodal_ensemble.py: Updated MME main script accordingly. Changed activation function of inference run from softmax() to sigmoid()
* Updated glc22 tests following class getter changes
* Removed commented dict
* Updated setup.py for v1.3.0
* Updated sklearn version
* Added optimizer and scheduler selection via config file. Applied changes to sentinel-2a-rgbnir_bioclim example yaml config file
* Docstrings
* Optimizer / scheduler rework [backward compatible].
  - malpolon.models.utils: Changed behavior of check_optimizer() and added check_scheduler() to allow users to input one or several optimizers (and optionally 1 scheduler per optimizer, possibly with a lr_scheduler_config descriptor) via their config files.
  - malpolon.models.standard_prediction_systems: changed instantiation of optimizer(s) and scheduler(s) in class GenericPredictionSystem. The class attributes are now lists of instantiated optimizers (respectively, of lr_scheduler_config dictionaries). Updated behavior of method configure_optimizers() to return a dictionary containing all the optimizers and schedulers (cf. https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.configure_optimizers).
  - malpolon.tests.test_models.utils: Added all corresponding unit tests, testing both valid scenarios and edge cases of incorrect user inputs in the config file.
  - sentinel-2a-rgbnir_bioclim example: updated the config file to fit previously described changes.
* Updated test_examples skip rules.
* WiP: Updated MME example config file and ClassificationSystemGLC24() class because the recent rework of optimizer(s) and scheduler(s) is not compatible with the classification system calls.
* WiP: updated glc24_pre_extracted example following optimizers and schedulers update. Updated test_examples accordingly
* Cleaned files and updated docstrings
* Restored default test_examples pytest skip values
* Linting
* WiP: updating documentation and creating a new example-wide README with instructions on how to create a custom example
* Added new README in _malpolon/examples_ explaining how to create and run examples in a generic way, for each scenario (WiP)
* Updated examples/ README
* Updated examples/ README
* Updated root README to link to new examples/ README
* Updated examples README with hyperparameters and updated /sentinel-2a-rgbnir_bioclim examples accordingly with new 'optim' key
* Updated examples/ README with info about config parameters
* Changed Conda source following licensing changes which can make its use non-free
* Updated READMEs
* Fixed examples/README info
* Updated sentinel-2a-rgbnir & sentinel-2a-rgbnir_bioclim examples config files and scripts following optimizers update
* Updated all examples following optimizers update
* Updated config files of test_examples following optimizers update. All test_examples ran: all passed.
* Updated READMEs with Troubleshooting and Contribution sections
* Added linting section to root README, and added bash script to run linters and tests
* Added instructions relative to the checking script in root README
* Linting
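
To illustrate the state_dict fix described above, here is a minimal sketch, not the actual malpolon code. It assumes the loss is `torch.nn.BCEWithLogitsLoss`, whose `pos_weight` tensor is registered on the loss module and therefore saved under `loss.pos_weight` in the parent's `state_dict()`. The class `TinySystem`, its `step()` method and the `batch_weight` argument are hypothetical names used only for this example.

```python
import torch
from torch import nn


class TinySystem(nn.Module):
    """Hypothetical stand-in for a prediction system that owns its loss."""

    def __init__(self):
        super().__init__()
        self.model = nn.Linear(4, 3)
        # pos_weight lives on the loss module, so the parent's state_dict
        # stores it under the key "loss.pos_weight".
        self.loss = nn.BCEWithLogitsLoss(pos_weight=torch.ones(3))

    def step(self, x, y, batch_weight):
        # Remember the value that was present when the loss was created.
        initial = self.loss.pos_weight.clone()
        # Temporarily use a per-batch value (assumed use case).
        self.loss.pos_weight = initial * batch_weight
        try:
            return self.loss(self.model(x), y)
        finally:
            # Restore the initial value before returning, so that the
            # state_dict is identical before and after the step.
            self.loss.pos_weight = initial


system = TinySystem()
before = {k: v.clone() for k, v in system.state_dict().items()}
loss = system.step(torch.randn(2, 4), torch.ones(2, 3), batch_weight=5.0)
after = system.state_dict()
unchanged = all(torch.equal(before[k], after[k]) for k in before)
```

With the restore step in place, `unchanged` is true and a checkpoint saved after training holds the same `loss.pos_weight` as a freshly instantiated system.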
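
The optimizer / scheduler rework above can be pictured with the following sketch. It is an assumption-laden illustration, not malpolon's `check_optimizer()` or `check_scheduler()`: the config layout in `OPTIM_CFG` and the helper `build_optimizers()` are hypothetical. Each returned entry follows the single-optimizer dictionary shape documented for Lightning's `configure_optimizers()` (an `"optimizer"` key and an optional `"lr_scheduler"` key); how malpolon combines several entries is not shown here.

```python
import torch
from torch import nn

# Hypothetical config layout: one entry per optimizer, each with an
# optional scheduler and an optional lr_scheduler_config descriptor.
OPTIM_CFG = {
    "SGD": {
        "kwargs": {"lr": 0.1, "momentum": 0.9},
        "scheduler": {
            "StepLR": {
                "kwargs": {"step_size": 10},
                "lr_scheduler_config": {"interval": "epoch"},
            }
        },
    },
    "Adam": {"kwargs": {"lr": 1e-3}},
}


def build_optimizers(params, cfg):
    """Instantiate optimizers (and optional schedulers) named in cfg."""
    params = list(params)  # reuse the same parameters for every optimizer
    entries = []
    for opt_name, opt_cfg in cfg.items():
        opt_cls = getattr(torch.optim, opt_name, None)
        if opt_cls is None:
            raise ValueError(f"Unknown optimizer: {opt_name}")
        optimizer = opt_cls(params, **opt_cfg.get("kwargs", {}))
        entry = {"optimizer": optimizer}
        # At most one scheduler per optimizer is expected in this sketch.
        for sch_name, sch_cfg in (opt_cfg.get("scheduler") or {}).items():
            sch_cls = getattr(torch.optim.lr_scheduler, sch_name, None)
            if sch_cls is None:
                raise ValueError(f"Unknown scheduler: {sch_name}")
            scheduler = sch_cls(optimizer, **sch_cfg.get("kwargs", {}))
            entry["lr_scheduler"] = {
                "scheduler": scheduler,
                **sch_cfg.get("lr_scheduler_config", {}),
            }
        entries.append(entry)
    return entries


model = nn.Linear(4, 2)
entries = build_optimizers(model.parameters(), OPTIM_CFG)
```

Resolving class names with `getattr` keeps the config file as the single place where optimizers and schedulers are chosen; unknown names fail early with a `ValueError` instead of surfacing later during training.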