
[Bug]: Swapped alpha and beta in tversky loss? #1993

Closed · Saransh-cpp opened this issue Jun 8, 2022 · 4 comments · Fixed by #1998

@Saransh-cpp (Member)

The Tversky loss has two parameters, $\alpha$ and $\beta$, and Flux internally calculates $\alpha$ as $1 - \beta$. The loss is defined as 1 minus the Tversky index -

$$1 - \frac{\text{True Positives}}{\text{True Positives} + \alpha \cdot \text{False Positives} + \beta \cdot \text{False Negatives}}$$

where the Tversky index is defined in the paper (its eq. 2) as

$$S(P, G; \alpha, \beta) = \frac{|P \cap G|}{|P \cap G| + \alpha |P \setminus G| + \beta |G \setminus P|}$$

where α and β control the magnitude of penalties for FPs and FNs, respectively.
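(As a quick sanity check of the definition - a sketch of my own, with made-up counts and a hypothetical tversky_index helper, not library code:)

# Sketch: Tversky index from TP/FP/FN counts (α weights FPs, β weights FNs,
# per the definition above). Counts are made up for illustration.
tversky_index(tp, fp, fn; α = 0.3, β = 0.7) = tp / (tp + α * fp + β * fn)

tversky_index(3, 2, 1)                    # 3 / (3 + 0.6 + 0.7) ≈ 0.6977
tversky_index(3, 2, 1; α = 0.5, β = 0.5)  # reduces to Dice: 3 / 4.5 ≈ 0.6667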

Flux implements it as -

1 - (sum(y .* ŷ) + 1) / (sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1)

Code -

num = sum(y .* ŷ) + 1  # TP + smoothing
den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1  # TP + β·FP + (1-β)·FN + smoothing
1 - num / den

Notice how the term `(1 .- y) .* ŷ` (false positives, I hope I am not wrong) is multiplied by $\beta$, whereas it should be multiplied by $\alpha$ (which is $1 - \beta$). Similarly, the term `y .* (1 .- ŷ)` (false negatives) is multiplied by $\alpha$ (that is, $1 - \beta$), whereas it should be multiplied by $\beta$.
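(A quick check of this identification for binary 0/1 vectors - a sketch of my own, using the same vectors as the example further down:)

y = [0, 1, 0, 1, 1, 1]   # ground truth
ŷ = [1, 1, 0, 1, 1, 0]   # one false positive (index 1), one false negative (index 6)

sum(y .* ŷ)          # 3 -- true positives: both are 1
sum((1 .- y) .* ŷ)   # 1 -- false positives: predicted 1 where truth is 0
sum(y .* (1 .- ŷ))   # 1 -- false negatives: predicted 0 where truth is 1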

This makes the loss function behave in a manner opposite to its documentation. For example -

julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1];  # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0];  # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y)  # should be smaller than tversky_loss(ŷ_fnp, y), as FN is given more weight
0.21875

Here the loss for (ŷ_fnp, y) should have been larger than the loss for (ŷ_fp, y): the default $\beta$ is 0.7, so the loss should give more weight to (penalize) false negatives, but the exact opposite happens.
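Working through the arithmetic by hand (a sketch; the counts are read off the vectors above, with the current implementation's weighting) -

# ŷ_fp:  TP = 4, FP = 2, FN = 0
1 - (4 + 1) / (4 + 0.7 * 2 + 0.3 * 0 + 1)   # = 1 - 5/6.4 = 0.21875
# ŷ_fnp: TP = 3, FP = 1, FN = 1
1 - (3 + 1) / (3 + 0.7 * 1 + 0.3 * 1 + 1)   # = 1 - 4/5   = 0.2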

Changing the implementation of the loss -

julia> y = [0, 1, 0, 1, 1, 1];

julia> ŷ_fp = [1, 1, 1, 1, 1, 1];  # 2 false positives -> 2 wrong predictions

julia> ŷ_fnp = [1, 1, 0, 1, 1, 0];  # 1 false negative, 1 false positive -> 2 wrong predictions

julia> Flux.tversky_loss(ŷ_fnp, y)
0.19999999999999996

julia> Flux.tversky_loss(ŷ_fp, y)  # now smaller than tversky_loss(ŷ_fnp, y), as expected when FN is given more weight
0.1071428571428571

which looks right.
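For reference, the change tried above simply swaps the two weights - a sketch under a hypothetical name, omitting Flux's internal _check_sizes and ofeltype helpers; not the final PR:

function tversky_loss_swapped(ŷ, y; β = 0.7)
    num = sum(y .* ŷ) + 1
    # β now weights the false negatives and (1 - β) the false positives,
    # matching the documented behaviour
    den = sum(y .* ŷ + (1 - β) * (1 .- y) .* ŷ + β * y .* (1 .- ŷ)) + 1
    return 1 - num / den
end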

Is this a bug, or am I missing something? Would be happy to create a PR if it is a bug!

@ToucheSir (Member)

One easy way to check is to compare against implementations in other libraries. Are they consistent with the current behaviour, the proposed change or neither?

@Saransh-cpp (Member, Author)

Thanks for the suggestion, @ToucheSir! I rechecked Flux's implementation of tversky_loss against the one in the paper -

Paper

from keras import backend as K

smooth = 1  # smoothing constant; corresponds to the `+ 1` in Flux's version

def tversky(y_true, y_pred):
    y_true_pos = K.flatten(y_true)
    y_pred_pos = K.flatten(y_pred)
    true_pos = K.sum(y_true_pos * y_pred_pos)
    false_neg = K.sum(y_true_pos * (1 - y_pred_pos))  # y .* (1 .- ŷ) -> FluxML
    false_pos = K.sum((1 - y_true_pos) * y_pred_pos)  # (1 .- y) .* ŷ -> FluxML
    alpha = 0.3
    return (true_pos + smooth) / (true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth)

def tversky_loss(y_true, y_pred):
    return 1 - tversky(y_true, y_pred)

Flux

function tversky_loss(ŷ, y; β = ofeltype(ŷ, 0.7))
    _check_sizes(ŷ, y)
    #TODO add agg
    num = sum(y .* ŷ) + 1
    den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1
    1 - num / den
end

The implementations look different because of the swapped alpha and beta values, but they act the same!
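(A quick numerical check of this, a sketch of my own - paper_loss and flux_loss below are standalone re-implementations of the two formulas for comparison only, not library code:)

# Paper's parameterization: α weights FN, (1 - α) weights FP
paper_loss(ŷ, y; α = 0.3, smooth = 1) = begin
    tp = sum(y .* ŷ)
    fn = sum(y .* (1 .- ŷ))
    fp = sum((1 .- y) .* ŷ)
    1 - (tp + smooth) / (tp + α * fn + (1 - α) * fp + smooth)
end

# Flux's parameterization: β weights FP, (1 - β) weights FN
flux_loss(ŷ, y; β = 0.7) = begin
    num = sum(y .* ŷ) + 1
    den = sum(y .* ŷ + β * (1 .- y) .* ŷ + (1 - β) * y .* (1 .- ŷ)) + 1
    1 - num / den
end

y = rand(0:1, 100); ŷ = rand(100);
paper_loss(ŷ, y) ≈ flux_loss(ŷ, y)   # true, since α = 1 - β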

@ToucheSir (Member)

So just to clarify, using the same alpha/beta gives you the correct answer in both implementations? If there are any values of either hyperparam where they differ, I think the issue is still valid.

@Saransh-cpp (Member, Author)

Using alpha = 0.3 in the paper's implementation and beta = 0.7 in Flux's gives identical results. Simplified expressions -

Paper

alpha = 0.3
tversky_loss = 1 - (true_pos + smooth)/(true_pos + alpha*false_neg + (1-alpha)*false_pos + smooth)

Flux

beta = 0.7
tversky_loss = 1 - (true_pos + smooth)/(true_pos + (1-beta)*false_neg + beta*false_pos + smooth)
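Substituting $\beta = 1 - \alpha$ makes the equivalence of the two denominators explicit -

$$\text{true\_pos} + \beta \cdot \text{false\_pos} + (1 - \beta) \cdot \text{false\_neg} \;\overset{\beta = 1 - \alpha}{=}\; \text{true\_pos} + (1 - \alpha) \cdot \text{false\_pos} + \alpha \cdot \text{false\_neg}$$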

I was looking at the wrong mathematical expression (probably a mistake in the blog post I was reading) -

Wrong (what I was referring to)

$$1 - \frac{\text{True Positives}}{\text{True Positives} + \alpha \cdot \text{False Positives} + \beta \cdot \text{False Negatives}}$$

Correct

$$1 - \frac{\text{True Positives}}{\text{True Positives} + \alpha \cdot \text{False Negatives} + \beta \cdot \text{False Positives}}$$
