Pyramid Vision Transformer #17596
Hey @danielhoshizaki, can I join you?
@danielhoshizaki, I would love to contribute to this new model.
Thanks for offering to help and sorry about the late response.
Sure @danielhoshizaki
Is anyone working on this?
Model description
I would like to add the Pyramid Vision Transformer model.
Paper Abstract
Pyramid Vision Transformer (PVT) has several merits compared to prior art. (1) Unlike ViT, which typically has low-resolution outputs and high computational and memory cost, PVT can not only be trained on dense partitions of the image to achieve high output resolution, which is important for dense prediction, but also uses a progressive shrinking pyramid to reduce the computation of large feature maps. (2) PVT inherits the advantages of both CNNs and Transformers, making it a unified, convolution-free backbone for various vision tasks that can simply replace CNN backbones. (3) We validate PVT by conducting extensive experiments, showing that it boosts the performance of many downstream tasks.
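To make the "progressive shrinking pyramid" concrete, here is a rough sketch of how the token count falls from stage to stage. The stage strides (4, 8, 16, 32), the 224x224 input size, and the helper name `pyramid_token_counts` are assumptions for illustration and are not taken from this issue; check the linked implementation for the actual configuration.

```python
def pyramid_token_counts(height, width, strides=(4, 8, 16, 32)):
    """Return (stride, feature_h, feature_w, num_tokens) for each stage.

    The strides are assumed for illustration: each stage downsamples the
    input by its stride, so later stages attend over far fewer tokens.
    """
    stages = []
    for stride in strides:
        feature_h, feature_w = height // stride, width // stride
        stages.append((stride, feature_h, feature_w, feature_h * feature_w))
    return stages


if __name__ == "__main__":
    # Assumed 224x224 input, a common ImageNet resolution.
    for stride, h, w, tokens in pyramid_token_counts(224, 224):
        print(f"stride {stride:>2}: {h}x{w} feature map, {tokens} tokens")
```

Under these assumptions the first stage keeps a 56x56 map (3136 tokens) for dense prediction, while the last stage shrinks to 7x7 (49 tokens). A single-scale ViT with 16x16 patches would stay at 14x14 (196 tokens) throughout.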
Open source status
Provide useful links for the implementation
Model Implementation: https://github.com/whai362/PVT
Pretrained model weights for semantic segmentation: https://github.com/whai362/PVT/tree/v2/segmentation (based on ADE20K)