Thanks for such an interesting paper 👍
In the paper's equation (4), asymmetric probability shifting is defined as `p_m = max(p - m, 0)`, but in the implementation it is called asymmetric clipping and is computed as `xs_neg = (xs_neg + self.clip).clamp(max=1)`, which is probably `p_m = min(p + m, 1)`.

Is there a reason for this difference?
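
One way to check whether the two forms actually disagree is the small sketch below. It rests on an assumption the quoted line does not confirm: that `xs_neg` holds the complement `1 - p` of the predicted probability rather than `p` itself. The helper names `shifted_prob_paper` and `shifted_neg_prob_impl` are hypothetical and use plain floats instead of tensors. Under that assumption, clamping `xs_neg + clip` at 1 would be the same operation as equation (4), just written in terms of `1 - p`.

```python
import math


def shifted_prob_paper(p: float, m: float) -> float:
    """Equation (4) as quoted in the question: p_m = max(p - m, 0)."""
    return max(p - m, 0.0)


def shifted_neg_prob_impl(xs_neg: float, clip: float) -> float:
    """Scalar analogue of (xs_neg + self.clip).clamp(max=1)."""
    return min(xs_neg + clip, 1.0)


m = 0.05  # illustrative margin, not taken from the source
for p in [0.0, 0.02, 0.05, 0.3, 0.9, 1.0]:
    # ASSUMPTION: xs_neg is 1 - p; the quoted snippet does not show this.
    xs_neg = 1.0 - p
    from_impl = 1.0 - shifted_neg_prob_impl(xs_neg, m)
    from_paper = shifted_prob_paper(p, m)
    # The two expressions agree up to floating-point rounding.
    assert math.isclose(from_impl, from_paper, abs_tol=1e-12)
```

If `xs_neg` instead held `p` directly, the same check would fail, and the implementation would indeed correspond to `min(p + m, 1)` as suggested above.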