- Add `sigmoid()` on the reference box before positional embedding.
- Normalize theta $\in [-\frac{\pi}{2}, \frac{\pi}{2})$ to [0, 1).
- Change the smooth L1 loss on polys (eight-parameter format) to an L1 loss on oboxes (five-parameter format).
- Detach the gradient on the reference box of every decoder layer except the first.
- Disable the Gaussian co-attention.
- Go back to SAM-DETR without RoI Align.
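
The theta normalization and the five-parameter L1 loss listed above can be illustrated with a minimal, framework-free sketch. It is not the repository's code: the function names, the linear mapping of theta, the `(cx, cy, w, h, theta)` parameter order, and the mean reduction are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function; squashes an unbounded value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def normalize_theta(theta):
    """Map an angle in [-pi/2, pi/2) linearly onto [0, 1) (assumed mapping)."""
    return (theta + math.pi / 2) / math.pi


def denormalize_theta(t):
    """Inverse of normalize_theta: map [0, 1) back to [-pi/2, pi/2)."""
    return t * math.pi - math.pi / 2


def obox_l1_loss(pred, target):
    """Mean absolute error over five obox parameters.

    Assumed order: (cx, cy, w, h, theta), all already normalized to [0, 1).
    """
    if len(pred) != 5 or len(target) != 5:
        raise ValueError("an obox is expected to have five parameters")
    return sum(abs(p - t) for p, t in zip(pred, target)) / 5


# Hypothetical example: the two boxes differ only in their angle.
pred = [0.5, 0.5, 0.2, 0.1, normalize_theta(0.0)]
target = [0.5, 0.5, 0.2, 0.1, normalize_theta(math.pi / 4)]
loss = obox_l1_loss(pred, target)
```

With theta normalized this way, all five regression targets share the same [0, 1) range, so a single L1 term can cover position, size, and angle without per-parameter rescaling.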