Hi, I wonder why you use a double softmax in your paper. The probability pi is already the output of a softmax function, as the paper mentions, so applying a second softmax to pi flattens the distribution.
For instance, given the probability tensor `tensor([0.1000, 0.8000, 0.1000])`, the output of the double softmax is `tensor([0.2491, 0.5017, 0.2491])`: the high pi of 0.8 drops to 0.5017, which appears to lower the confidence.
So why use a double softmax? Please help me understand this design. Thanks!
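The flattening effect described above is easy to reproduce. Below is a minimal sketch in plain Python (no PyTorch dependency, using a hand-rolled `softmax` helper for illustration) showing that re-applying softmax to an existing probability distribution pulls the values toward uniform:

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = [0.1, 0.8, 0.1]      # already a valid probability distribution
double = softmax(probs)      # applying softmax a second time

print([round(p, 4) for p in double])  # [0.2491, 0.5017, 0.2491]
```

Because the inputs to the second softmax all lie in [0, 1], their exponentials are close together, so the peak value 0.8 shrinks to about 0.5017 and the tails grow, exactly the confidence-lowering effect described in the question.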