Hi! Dear authors:
After reading this paper, I am very excited and convinced by your approach.
The insights of your paper are very similar to those of our work, Video K-Net:
https://github.com/lxtGH/Video-K-Net
The difference is that you use the query (called a kernel in our paper) directly for temporal association, while we learn the association embedding with a sparse triplet loss.
Would you consider citing our work? Thanks a lot!
I would also like to ask a few questions.
1. Would the conclusion still hold if you used a weaker instance segmentation model (e.g., DETR, as in VisTR)?
I ask because I applied K-Net for online learning, but the performance on YT-VIS-2019 was not good.
2. I do not understand why the improvement on OVIS is so much larger than on YT-VIS.
Thanks again!
Best regards!