Update vilbert.py #1065
base: main
Optional ITM loss in pre-training added.
Thanks for contributing to MMF. Requesting some changes before this is ready to merge.
In addition, have you tried running this model? How have you tested this change?
mmf/models/vilbert.py
Outdated
@@ -1226,6 +1230,14 @@ def forward(
        prediction_scores_t.view(-1, self.vocab_size), masked_lm_labels.view(-1)
    )
    output["masked_lm_loss"] = masked_lm_loss.unsqueeze(0)

    if itm_loss is not False:
        itm_head = ITM({"type": "itm", "hidden_size": self.vocab_size})
This should be initialized in the init not in the forward pass.
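To illustrate the point, here is a minimal sketch of the pattern, assuming a plain `nn.Linear` as a stand-in for the ITM head. `ToyPretrainingModel`, `use_itm_loss`, and the tensor sizes are hypothetical names and values chosen for illustration; they are not MMF's actual classes or config keys. A module built inside `forward` gets fresh random weights on every call and is never registered with the model, so the optimizer cannot train it and it is not saved in checkpoints.

```python
import torch
from torch import nn


class ToyPretrainingModel(nn.Module):
    """Hypothetical stand-in for a pretraining model with an optional ITM head."""

    def __init__(self, hidden_size: int = 16, use_itm_loss: bool = True):
        super().__init__()
        self.use_itm_loss = use_itm_loss
        # Built once here, so the weights are registered as parameters,
        # visible to the optimizer, and included in state_dict().
        # nn.Linear is an assumption standing in for the real ITM head.
        self.itm_head = nn.Linear(hidden_size, 2) if use_itm_loss else None
        self.itm_loss_fn = nn.CrossEntropyLoss()

    def forward(self, pooled_output, itm_labels=None):
        output = {}
        # The forward pass only uses the head; it never constructs it.
        if self.use_itm_loss and itm_labels is not None:
            logits = self.itm_head(pooled_output)
            output["itm_loss"] = self.itm_loss_fn(logits, itm_labels).unsqueeze(0)
        return output


model = ToyPretrainingModel()
param_names = [name for name, _ in model.named_parameters()]

batch = torch.randn(4, 16)           # 4 examples, hidden size 16
labels = torch.randint(0, 2, (4,))   # binary match / no-match labels
out = model(batch, labels)
```

Because the head is an attribute assigned in `__init__`, its weights show up in `named_parameters()`, which is what lets an optimizer update them.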
Hi! I corrected the initialization. I tested the snippet I added separately, in a manner similar to the "ITM head test", but I was not able to test ViLBERTForPretraining: after loading the YAML file, it throws 'dict' object has no attribute 'bert_model_name', even without any changes to the original code.
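One plausible cause of that error, not confirmed in this thread, is that the loaded YAML was passed to the model as a plain Python dict, while the model reads config values as attributes. The sketch below reproduces the message with the standard library only. The value "bert-base-uncased" is a placeholder, and `SimpleNamespace` is just an illustrative attribute-style container, not what MMF itself uses for configs.

```python
from types import SimpleNamespace

# A plain dict, as returned by a typical YAML loader (placeholder value).
config_dict = {"bert_model_name": "bert-base-uncased"}

# Attribute access on a dict raises AttributeError.
try:
    config_dict.bert_model_name
except AttributeError as err:
    message = str(err)

# Wrapping the same data in an attribute-style container avoids the error.
config_ns = SimpleNamespace(**config_dict)
name = config_ns.bert_model_name
```

If this is indeed the cause, converting the dict into the config object type the model expects before constructing it should resolve the error.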
ITM head initialization under init instead of forward pass.
Optional ITM loss in pre-training added for ViLBERT.
Addresses #466