Adding French team contribution points #302
Conversation
I would argue this counts as one dataset, so 2 points, but then you get the 2 bonus points per language not previously covered within bitext mining (Tatoeba). That should still give quite a few points for a valuable dataset. Note that the bonus points have been updated slightly to accommodate exactly these cases. Let me know what you think.
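As a concrete reading of that scheme, here is a hypothetical helper (a sketch only; the 2-point values are taken from this comment, and the actual rules live in points.md):

```python
# Hypothetical helper illustrating the scheme described above: 2 points per
# dataset plus 2 bonus points per language not previously covered by the
# task category (e.g. bitext mining / Tatoeba). The point values are
# assumptions based on this thread, not read from points.md.
def dataset_points(n_languages: int, n_already_covered: int,
                   base: int = 2, bonus_per_new_language: int = 2) -> int:
    return base + bonus_per_new_language * (n_languages - n_already_covered)
```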
Definitely add the bonus points. Generally, it seems like you are missing quite a few bonus points (e.g. for retrieval x French).
In relation to MMTEB, I don't believe we should add points for this, as it is only machine-translated.
Will you add these as well?
@KennethEnevoldsen here is an update; I still have some questions about dataset points. For PR reviews I just gave 1 point per review. Datasets:
Evaluated models:
Total for the French contribution: 486
PR reviews:
Shouldn't this be 2 * 125 = 250?
I believe the best approach is not to count the bonus twice.
I would not include these, as models evaluated counts across the whole of MMTEB, which will change as new datasets are added.
Sorry, Flores is 204 languages. So it's 204 - 75 = 129.
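Plugging that into the hypothetical `dataset_points` helper sketched earlier: `dataset_points(204, 75)` gives 2 + 2 × 129 = 260 points, assuming the 2-bonus-points-per-new-language rule from the first comment applies here as well.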
OK, we'll keep them on only one of the datasets.
You don't count the language-specific models? Some of the 42 models were French-only models. Also, I don't think contributors will be able to run models on the whole benchmark; it's really huge. I expect them to run models on their proposed datasets 🤔
Yeah, so I had the idea that all models would be run on everything, but that seems problematic/wasteful. We should probably make it clearer what we mean by running a model. @imenelydiaker do you (or someone from your team) have the time to outline that segment? For now I would remove the models and get this PR merged (then we can always add them in a new PR).
Yeah, I believe the compute cost of MMTEB is something that we have to bring down. One solution might be to limit very large datasets. Another option is to estimate the performance on unseen datasets in a smart way (e.g. estimating the latent factors of a model); see the sketch below.
But to keep models comparable (evaluated on all datasets), everyone would need to run >20 models on their dataset.
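The latent-factor idea mentioned above can be made concrete as low-rank matrix completion over the partially observed model × dataset score matrix. This is a minimal sketch under assumed dimensions, simulated data, and an alternating-least-squares fit; none of it is MMTEB code, and all names and numbers are illustrative.

```python
import numpy as np

# Sketch: predict scores for unseen (model, dataset) pairs via a low-rank
# latent-factor model. All dimensions and values are illustrative assumptions.
rng = np.random.default_rng(0)
n_models, n_datasets, rank = 20, 30, 4

# Simulated partially observed score matrix (NaN = model not run on dataset).
true_scores = rng.random((n_models, rank)) @ rng.random((rank, n_datasets))
scores = true_scores.copy()
scores[rng.random(scores.shape) < 0.5] = np.nan  # ~50% of runs missing
observed = ~np.isnan(scores)

# Alternating least squares on the observed entries only.
U = rng.normal(size=(n_models, rank))
V = rng.normal(size=(n_datasets, rank))
lam = 1e-2  # small ridge term keeps the per-row solves well-conditioned

for _ in range(50):
    for i in range(n_models):
        mask = observed[i]
        A = V[mask].T @ V[mask] + lam * np.eye(rank)
        U[i] = np.linalg.solve(A, V[mask].T @ scores[i, mask])
    for j in range(n_datasets):
        mask = observed[:, j]
        A = U[mask].T @ U[mask] + lam * np.eye(rank)
        V[j] = np.linalg.solve(A, U[mask].T @ scores[mask, j])

predicted = U @ V.T  # fills in the unobserved (model, dataset) scores
rmse = np.sqrt(np.mean((predicted[~observed] - true_scores[~observed]) ** 2))
print(f"RMSE on held-out entries: {rmse:.3f}")
```

In principle, something like this would let contributors run only a handful of anchor models on a new dataset and impute the rest, which speaks to the comparability concern above; whether the score matrix is actually low-rank enough for this is an open question.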
I'll take care of that 🙂
Yep, agreed. We may not need huge datasets since we have a lot of them. We should discuss this. 🤔
Yes!
Wonderful, thanks @imenelydiaker! I think this is very reasonable; feel free to merge it in.
* Update points.md
* Update docs/mmteb/points.md
* Update points.md
* Update points.md
After discussing it with the team, we agreed on splitting the total number of points for French by 5, since we consider that we all contributed equally. I suggest the following, and have some questions before adding the points to the points.md file:
Datasets:
Evaluated models:
I didn't add PR review points for @wissam-sib and me; I lost track of them 😅 but I'll try to do that quickly!