Multitarget - same metrics for several different algorithms #71

Open
thiagonazareth opened this issue Apr 6, 2021 · 4 comments

@thiagonazareth

Good evening, and apologies for my English; I am using a translator.
I am using MEKA for my master's thesis, which applies machine learning to predict student retention in higher education, and I came across the following situation. When I use the Meka Explorer GUI to test the multi-target algorithms, the results for Hamming score and Accuracy (per label) are identical across several different algorithms. I used two multi-target datasets shipped in MEKA's data folder, thyroid-L7.arff and solar_flare.arff, and the same behavior of equal metrics for different algorithms occurs with both.

Running meka.classifiers.multitarget.CC, meka.classifiers.multitarget.BCC, meka.classifiers.multitarget.CCp and meka.classifiers.multitarget.CR, each with J48 and NaiveBayes as the base classifier and default parameters, all produce the same values for Hamming score, Exact match, Hamming loss, ZeroOne loss, Levenshtein distance and Accuracy (per label).

I ran the experiments on both Mac OSX and Ubuntu.

This is the result for all of the algorithms and variations mentioned above, using the thyroid-L7.arff dataset:

N (test) 3119
L 7
Hamming score 0.281
Exact match 0
Hamming loss 0.719
ZeroOne loss 1
Levenshtein distance 0.719
Label indices [0 1 2 3 4 5 6]
Accuracy (per label) [0.002 0.023 0.006 0.939 0.013 0.001 0.980]

@thiagonazareth
Author

thiagonazareth commented Apr 14, 2021

I found the problem and created a pull request to fix it.

@jmread
Contributor

jmread commented Apr 28, 2021

It is not necessarily a problem to get the same results for different algorithms. But if I understand correctly, based on your proposed change, it looks like this may be a result of the posterior distribution information not being copied into the right place where it is later accessed by the evaluation metrics. Is that correct?

@thiagonazareth
Author

I agree that it is not necessarily a problem to get the same results for different algorithms, but it caught my attention that very different algorithms, run with several different input parameters, all produce the same result. The problem I found is the following: the array that stores the results (returned by the distributionForInstance method) is doubled in size, so that position i stores the predicted value for label i and position i + L stores the probability information for label i. Probability information has not yet been implemented for the MT classifiers, so Arrays.copyOfRange(y, L, L * 2) always picks up values of 1. The correct call is Arrays.copyOfRange(y, 0, L).
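
To make the layout concrete, here is a minimal standalone sketch (not MEKA's actual code; L, the predictions and the 1.0 placeholder are illustrative assumptions) of why slicing the upper half of the doubled array gives the same output no matter which classifier produced the predictions:

    import java.util.Arrays;

    public class CopyRangeDemo {
        public static void main(String[] args) {
            int L = 3;                       // number of target labels
            double[] y = new double[2 * L];  // layout: [predictions | probability slots]
            // hypothetical predicted values for labels 0..L-1
            y[0] = 2.0; y[1] = 0.0; y[2] = 1.0;
            // probability slots are not yet populated for MT classifiers,
            // so in this illustration they just hold the placeholder 1.0
            Arrays.fill(y, L, 2 * L, 1.0);

            double[] buggy = Arrays.copyOfRange(y, L, 2 * L); // always [1.0, 1.0, 1.0]
            double[] fixed = Arrays.copyOfRange(y, 0, L);     // the actual predictions

            System.out.println("buggy slice: " + Arrays.toString(buggy));
            System.out.println("fixed slice: " + Arrays.toString(fixed));
        }
    }

Because the buggy slice is constant, every classifier feeds the same vector into the evaluation metrics, which is exactly the symptom reported above.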

@jmread
Contributor

jmread commented Apr 30, 2021

The reason for doubling the array is to make space to store the probability information from the posterior, P(y[j] = y_max[j] | x), where y_max[j] is the most likely value. The first part of the array (up to L) is used to store y_max[j] directly for each label. This is not needed in the standard multi-label case; there we just store P(y[j] = 1), because y_max[j] can be inferred directly (there are only two possible values, 0 or 1). In the multi-target case this was included mainly for display/debug purposes, and it does not represent the full distribution anyway. I guess this is what you mean by "probability information not yet implemented for MT classifiers".

I agree that the fix you propose makes sense. It seems that in this part of the code the information from 0...L is missing altogether, which shouldn't be the case. The fix should probably be accompanied by a unit test, for example on thyroid-L7.arff (as you used above to demonstrate the issue). Are you able to put your experiment into a small unit test?
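
This is not the full regression test on thyroid-L7.arff requested above (that would go through MEKA's own evaluation code), but a minimal JUnit 4 sketch of the slicing behaviour in question; the class name, array layout and values are assumptions for illustration:

    import static org.junit.Assert.assertArrayEquals;
    import static org.junit.Assert.assertFalse;

    import java.util.Arrays;
    import org.junit.Test;

    public class CopyRangeSliceTest {

        // Pack hypothetical predictions into the doubled layout
        // [predictions | probability slots], with the probability slots left at 1.0.
        private static double[] packed(double... preds) {
            int L = preds.length;
            double[] y = new double[2 * L];
            System.arraycopy(preds, 0, y, 0, L);
            Arrays.fill(y, L, 2 * L, 1.0);
            return y;
        }

        @Test
        public void correctedSliceDistinguishesClassifiers() {
            int L = 3;
            double[] a = packed(2.0, 0.0, 1.0); // output of "classifier A"
            double[] b = packed(1.0, 1.0, 0.0); // output of "classifier B"

            // Buggy slice: both classifiers look identical (all 1.0s),
            // which is why all algorithms report the same metrics.
            assertArrayEquals(Arrays.copyOfRange(a, L, 2 * L),
                              Arrays.copyOfRange(b, L, 2 * L), 1e-9);

            // Corrected slice: the actual predictions differ.
            assertFalse(Arrays.equals(Arrays.copyOfRange(a, 0, L),
                                      Arrays.copyOfRange(b, 0, L)));
        }
    }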
