[Bug]: Swapped alpha and beta in tversky loss? #1993
One easy way to check is to compare against implementations in other libraries. Are they consistent with the current behaviour, the proposed change, or neither?
Thanks for the suggestion, @ToucheSir! Rechecked the implementations.

Paper:

```python
from keras import backend as K

smooth = 1  # not defined in the quoted snippet; 1 matches the `+ 1` in the Flux version

def tversky(y_true, y_pred):
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1 - y_pred_pos))  # y .* (1 .- ŷ) in FluxML
    false_pos = K.sum((1 - y_true_pos) * y_pred_pos)  # (1 .- y) .* ŷ in FluxML
    alpha = 0.3
    return (true_pos + smooth) / (true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth)

def tversky_loss(y_true, y_pred):
    return 1 - tversky(y_true, y_pred)
```

Flux:

```julia
function tversky_loss(ŷ, y; β = ofeltype(ŷ, 0.7))
    _check_sizes(ŷ, y)
    #TODO add agg
    num = sum(y .* ŷ) + 1
    den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1
    1 - num / den
end
```

The implementation looks different because of the swapped $\alpha$ and $\beta$.
So just to clarify, using the same alpha/beta gives you the correct answer in both implementations? If there are any values of either hyperparam where they differ, I think the issue is still valid.
Using the same $\alpha$/$\beta$:

Paper:

```python
alpha = 0.3
tversky_loss = 1 - (true_pos + smooth) / (true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth)
```

Flux:

```python
beta = 0.7
tversky_loss = 1 - (true_pos + smooth) / (true_pos + (1 - beta) * false_neg + beta * false_pos + smooth)
```

The denominators agree, since with $\alpha = 1 - \beta$ the coefficients on `false_neg` and `false_pos` are identical. I was looking at the wrong mathematical expression (probably wrong in the blog that I was reading):

- Wrong (what I was referring to): (image)
- Correct: (image)
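As a numerical sanity check, the two one-liners can be evaluated side by side. This NumPy sketch (the arrays are my own illustrative choices, not taken from the thread) shows they produce the same value whenever $\beta = 1 - \alpha$:

```python
import numpy as np

# Illustrative ground truth and prediction (my own values, not from the thread).
y_true = np.array([1.0, 1.0, 0.0, 0.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.3, 0.1, 0.8])

smooth = 1.0
true_pos  = np.sum(y_true * y_pred)
false_neg = np.sum(y_true * (1 - y_pred))  # y .* (1 .- ŷ)
false_pos = np.sum((1 - y_true) * y_pred)  # (1 .- y) .* ŷ

alpha = 0.3  # paper/Keras convention: alpha weights false negatives
paper_loss = 1 - (true_pos + smooth) / (
    true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth)

beta = 0.7   # Flux convention: beta weights false positives
flux_loss = 1 - (true_pos + smooth) / (
    true_pos + (1 - beta) * false_neg + beta * false_pos + smooth)

assert abs(paper_loss - flux_loss) < 1e-12  # identical when beta == 1 - alpha
```

Since `alpha * FN + (1 - alpha) * FP` and `(1 - beta) * FN + beta * FP` are the same expression under $\beta = 1 - \alpha$, any choice of input arrays gives the same agreement.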
The tversky loss has 2 parameters, $\alpha$ and $\beta$, and Flux internally calculates the value of 1 - tversky index, with $\alpha$ as $1 - \beta$. The loss is defined as

$$\text{tversky loss} = 1 - \frac{TP}{TP + \alpha \cdot FP + \beta \cdot FN}$$

Flux implements it as shown in the `tversky_loss` code quoted earlier in this thread.

Notice how the term `(1 .- y) .* ŷ` (False Positives, I hope I am not wrong) is multiplied by $\beta$, whereas it should be multiplied with $\alpha$ (which is $1 - \beta$). Similarly, the term `y .* (1 .- ŷ)` is multiplied with $\alpha$ (that is $1 - \beta$), whereas it should be multiplied with $\beta$.

This makes the loss function behave in a manner opposite to its documentation. For example, the loss for `ŷ_fnp, y` should have been larger than the loss for `ŷ_fp, y`, as the loss should give more weight to, or penalize, the False Negatives (default $\beta$ is 0.7; hence it should give more weight to FN), but the exact opposite happens.

Changing the implementation of the loss (swapping the $\beta$ and $1 - \beta$ coefficients) gives results which look right.
Is this a bug, or am I missing something? Would be happy to create a PR if it is a bug!
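The reported behaviour can be reproduced outside Flux with a direct NumPy transcription of the same formula (the array values here are my own illustrative choices, not taken from the issue):

```python
import numpy as np

def flux_tversky_loss(y_hat, y, beta=0.7):
    # NumPy transcription of Flux's formula:
    # den = sum(y .* ŷ + β*(1 .- y).*ŷ + (1-β)*y.*(1 .- ŷ)) + 1
    num = np.sum(y * y_hat) + 1
    den = np.sum(y * y_hat + beta * (1 - y) * y_hat
                 + (1 - beta) * y * (1 - y_hat)) + 1
    return 1 - num / den

y        = np.array([1.0, 1.0, 0.0, 0.0])  # ground truth
y_hat_fn = np.array([1.0, 0.0, 0.0, 0.0])  # one false negative
y_hat_fp = np.array([1.0, 1.0, 1.0, 0.0])  # one false positive

loss_fn = flux_tversky_loss(y_hat_fn, y)  # 1 - 2/2.3 ≈ 0.130
loss_fp = flux_tversky_loss(y_hat_fp, y)  # 1 - 3/3.7 ≈ 0.189
```

With the default $\beta = 0.7$, the prediction containing a false positive receives the larger loss, which matches the observation in the report that the documented weighting appears reversed.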