
[SegGPT] Fix loss calculation #30421

Merged: 6 commits into huggingface:main on Apr 24, 2024
Conversation

EduardoPach (Contributor)

What does this PR do?

This PR fixes #30419 and ensures that the loss is being correctly calculated.

While working on this PR I noticed not only that SegGptLoss was broken, but also that it was being incorrectly calculated (shame on the SegGpt contributor 😞). The proposed solution includes:

  • Passing labels to SegGptModel so the forward pass is performed correctly when training in the in-context painting style
  • Changing the SegGptLoss forward method (and its docstrings accordingly) so that the output matches the one obtained from the original implementation; see the sketch below
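
For reference, here is a minimal sketch of the masked smooth-L1 loss the fix is aiming at. The shapes, the `seggpt_style_loss` name, and the toy inputs are illustrative assumptions, not the exact merged code; only the general form (prompt target stacked above the query target, loss averaged over the masked region) follows the original SegGPT formulation:

```python
import torch
import torch.nn.functional as F

def seggpt_style_loss(pred_masks, prompt_masks, labels, mask, beta=0.01):
    """Masked smooth-L1 loss in the spirit of the original SegGPT.

    The ground truth stacks the prompt's target on top of the query's
    label (the in-context painting layout); the loss is averaged only
    over the masked (to-be-predicted) region.
    """
    # (batch, channels, 2 * height, width): prompt target above query target
    ground_truth = torch.cat((prompt_masks, labels), dim=2)
    loss = F.smooth_l1_loss(pred_masks, ground_truth, reduction="none", beta=beta)
    return (loss * mask).sum() / mask.sum()  # mean over masked pixels only

# Toy shapes, purely for illustration
pred = torch.rand(1, 3, 8, 4)       # model output over the stacked canvas
prompt = torch.rand(1, 3, 4, 4)     # prompt target (top half)
label = torch.rand(1, 3, 4, 4)      # query target (bottom half)
mask = torch.zeros(1, 3, 8, 4)
mask[:, :, 4:, :] = 1.0             # only the query half is predicted
print(seggpt_style_loss(pred, prompt, label, mask))
```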

Note

While running test_modeling_seggpt with is_training = True, I found that gradient_checkpointing is also not working, due to the type_token_semantic parameter, which is never used in the forward pass. Which type token is used is controlled by the embedding_type argument of the model's forward, and by default we use type_token_instance, just like the original implementation. Hence, we could probably move embedding_type to the config to allow gradient_checkpointing, or remove type_token_semantic entirely, as it is not clear from the original implementation what the use case for type_token_semantic is.
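
To illustrate the issue, here is a toy sketch (the module name and shapes are made up; only the instance/semantic branching mirrors modeling_seggpt.py). With the default embedding_type, type_token_semantic never enters the graph and therefore never receives a gradient:

```python
import torch
import torch.nn as nn

class ToyEmbeddings(nn.Module):
    # Invented module; only the instance/semantic branching mirrors
    # the embedding_type handling in modeling_seggpt.py.
    def __init__(self, hidden_size=16):
        super().__init__()
        self.type_token_instance = nn.Parameter(torch.zeros(1, 1, hidden_size))
        self.type_token_semantic = nn.Parameter(torch.zeros(1, 1, hidden_size))

    def forward(self, embeddings, embedding_type="instance"):
        if embedding_type == "instance":
            type_embedding = self.type_token_instance
        elif embedding_type == "semantic":
            type_embedding = self.type_token_semantic
        else:
            raise ValueError(f"Unknown embedding_type: {embedding_type}")
        return embeddings + type_embedding

module = ToyEmbeddings()
out = module(torch.randn(1, 4, 16))  # default path never touches type_token_semantic
out.sum().backward()
print(module.type_token_instance.grad is not None)  # True
print(module.type_token_semantic.grad)              # None: unused parameter
```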

c.c. @amyeroberts

@amyeroberts (Collaborator) left a comment

Thanks for working on this!

Have we verified the loss value with that of the original model?

Review comment on src/transformers/models/seggpt/modeling_seggpt.py (outdated, resolved)
@EduardoPach (Contributor, Author)

> Thanks for working on this!
>
> Have we verified the loss value with that of the original model?

Yep, and I also added some tests to make sure this won't be an issue again 🙂.
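
As a rough illustration of the kind of value-pinning test added (everything here is a hypothetical stand-in; the real tests compare against numbers recorded from the original SegGPT implementation):

```python
import torch
import torch.nn.functional as F

def test_masked_loss_matches_reference():
    torch.manual_seed(0)
    pred = torch.rand(2, 3, 8, 4)
    prompt = torch.rand(2, 3, 4, 4)
    label = torch.rand(2, 3, 4, 4)
    mask = torch.zeros(2, 3, 8, 4)
    mask[:, :, 4:, :] = 1.0  # only the query half contributes to the loss
    target = torch.cat((prompt, label), dim=2)

    got = (F.smooth_l1_loss(pred, target, reduction="none", beta=0.01) * mask).sum() / mask.sum()

    # Naive elementwise reference implementation of masked smooth L1
    diff = (pred - target).abs()
    elementwise = torch.where(diff < 0.01, 0.5 * diff**2 / 0.01, diff - 0.005)
    want = elementwise[mask.bool()].mean()

    torch.testing.assert_close(got, want)

test_masked_loss_matches_reference()
```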

Regarding gradient_checkpointing while training, should this be resolved in this PR or a new one? It will probably involve re-uploading the checkpoint to the hub, as we would need to remove the type_token_semantic parameter from the weights.

@amyeroberts (Collaborator)

> Regarding gradient_checkpointing while training, should this be resolved in this PR or a new one? It will probably involve re-uploading the checkpoint to the hub, as we would need to remove the type_token_semantic parameter from the weights.

This should be done in a separate PR.

@amyeroberts (Collaborator) left a comment

Thanks for fixing!

@amyeroberts merged commit d26c141 into huggingface:main on Apr 24, 2024
17 checks passed
Successfully merging this pull request may close these issues: [SegGPT] Loss calculation is broken (#30419)