Best Model info #42

Hi Fuzail,
could you please share some information about the best model you got? I would like to know:

Thank you,
Teresa

Comments

Information on which model are you interested in out of the two?
- RefineNet
- DlinkNet

Both if you can; otherwise RefineNet is enough.

Hi,
yes, it's the best model. It could be because I also used test-time augmentation during inference.

@teresalisanti are you running the RefineNet model on custom aerial imagery data or on the Inria data? And are the results from a fine-tuned RefineNet model, or just from the model weights that are shared in the repo?

I trained the RefineNet model from scratch on custom aerial imagery data plus the Inria dataset, so I didn't use the model weights that you shared in the repository. I tested my best model configuration on my own images without test-time augmentation. Why do you use test-time augmentation during inference? I don't see the benefit.

Unfortunately, I don't have the script that I used for training; it was on my old laptop, which I no longer have, and on top of that I did not commit the file to git :(

TTA was helpful for IoU: in the case of buildings, larger buildings get cut across different tiles, and scaling during prediction helps with that problem, although the gain is not that significant. Rotation helps to increase the confidence of the prediction; it mainly suppresses pixels that don't have high confidence (false positives). Buildings at the boundary of an image can also get low confidence, so mirroring, cropping, and scaling are applied to make sure buildings at the boundaries are still detected.

What I can suggest is to try fine-tuning the model; if your image quality differs a lot, then reuse only the weights of the earlier layers. If you plan to train the model from scratch, then use the ImageNet weights for ResNet.
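
For reference, a minimal sketch of the kind of test-time augmentation described above: flips plus an extra scale, with each augmented prediction mapped back to the original frame and averaged. This is a generic illustration, not the repo's actual inference code; the model is assumed to take a normalised (1, 3, H, W) tensor and return single-channel logits.

```python
import torch
import torch.nn.functional as F

def tta_predict(model, image, scales=(1.0, 1.25)):
    """Average sigmoid predictions over flips and scales (generic TTA sketch).

    image: normalised float tensor of shape (1, 3, H, W).
    Returns a (1, 1, H, W) probability map at the original resolution.
    """
    model.eval()
    _, _, h, w = image.shape
    preds = []
    with torch.no_grad():
        for s in scales:
            x_s = image if s == 1.0 else F.interpolate(
                image, scale_factor=s, mode="bilinear", align_corners=False)
            for flip_dims in (None, (-1,), (-2,)):    # identity, horizontal, vertical
                x = x_s if flip_dims is None else torch.flip(x_s, flip_dims)
                y = torch.sigmoid(model(x))
                if flip_dims is not None:
                    y = torch.flip(y, flip_dims)       # undo the flip on the mask
                y = F.interpolate(y, size=(h, w), mode="bilinear",
                                  align_corners=False)
                preds.append(y)
    return torch.stack(preds).mean(dim=0)              # averaged probability map
```

Thresholding the averaged map (e.g. at 0.5) then drops the low-confidence pixels mentioned above.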

Thank you!! Did you train your model from scratch, or just the top layers of the network by applying transfer learning with ImageNet weights? How can I modify the scripts (refinenet.py, segmentation.py) to freeze all but the top layers, which I would like to retrain using the weights you shared in this repo? Thank you :)

I used ImageNet weights to initialise the ResNet module and the default PyTorch initialisation for the rest; that's how I set up the training. In the library the default is to use ImageNet weights for training, i.e. it does what I described above. However, the library doesn't have functionality to take pre-trained weights that are not present in torchvision, split them, and make part of them non-trainable; you will have to write a custom function that sets the chosen layers to non-trainable after you have loaded the model. What I would rather suggest is to load the entire weights file (the file shared in the repo) and fine-tune it on your own images. You can start with lr=1e-04, see how the training progresses, and adjust accordingly. You can also set this param == False; that way only the weights in the decoder part will be updated.
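
As an illustration of the custom freezing step mentioned above, here is a minimal sketch, assuming a plain PyTorch model whose ResNet backbone has parameter names starting with "resnet" (the actual attribute name in refinenet.py may differ) and a checkpoint stored as a plain state_dict:

```python
import torch

def load_and_freeze_encoder(model, weights_path, encoder_prefix="resnet", lr=1e-4):
    """Load the full pre-trained weights, freeze the encoder, and return an
    optimiser over the remaining (decoder) parameters.

    encoder_prefix is an assumption about how the backbone's parameters are
    named inside the model; inspect model.named_parameters() to confirm.
    """
    state_dict = torch.load(weights_path, map_location="cpu")
    model.load_state_dict(state_dict)

    # Mark every encoder parameter as non-trainable.
    for name, param in model.named_parameters():
        if name.startswith(encoder_prefix):
            param.requires_grad = False

    # Hand only the still-trainable (decoder) parameters to the optimiser.
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.Adam(trainable, lr=lr)
    return model, optimizer
```

Starting from lr=1e-04 as suggested above and watching the validation IoU is then a reasonable way to steer the fine-tuning run.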

One common reason for this is that, when augmentation is applied, the model gets some hard examples to learn from, which causes the validation metric to be lower than the training metric. Eventually the model should be able to adapt to those hard examples and produce a more typical gap; if the issue still persists, you should either make the model more complex or reduce the augmentation.

If the metrics on the validation set are lower than those on the training set, then everything is working fine; that should be the normal outcome of every training run, as you said above. In our case the metrics on the validation set are always higher than those on the training set.

If the validation metric is lower than the training metric for the initial epochs, that's not a problem; however, if the training metric is always higher than the validation metric throughout training, that could be considered a problem, since this behaviour is not desired. To rule it out, check the model's performance on the training data without augmentation to verify that the model is not under-fitting. If that's not the case, then the model complexity could be increased so that the model is able to adapt to the varied examples in the training set. One other sanity check would be to train a higher-capacity model (perhaps ResNet-152 or deeper as the backbone) and observe the behaviour.
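
To make that sanity check concrete, here is a small sketch (not from the repo) that computes mean IoU over a DataLoader built without any augmentation; batches are assumed to be (image, mask) pairs with masks of shape (N, 1, H, W) in {0, 1}. If this number is also low, the model is likely under-fitting rather than just struggling with hard augmented examples.

```python
import torch

@torch.no_grad()
def mean_iou(model, loader, threshold=0.5, eps=1e-7, device="cpu"):
    """Mean IoU over a non-augmented loader (under-fitting sanity check)."""
    model.eval().to(device)
    batch_ious = []
    for images, masks in loader:
        images, masks = images.to(device), masks.to(device).float()
        preds = (torch.sigmoid(model(images)) > threshold).float()
        intersection = (preds * masks).sum(dim=(1, 2, 3))
        union = ((preds + masks) > 0).float().sum(dim=(1, 2, 3))
        batch_ious.append(((intersection + eps) / (union + eps)).mean().item())
    return sum(batch_ious) / max(len(batch_ious), 1)
```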