-
There is a new recsys evaluation framework called Elliot which seems to be doing many things well. Some of the things it does are notable improvements (IMO) over RecBole:
The single-file thing is very appealing to me in comparison to the way RecBole does it. Consider if I want to compare 10 algorithms. With RecBole I need 10 separate configuration files. It seems like many of these things could be adopted here.
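For reference, here is a minimal sketch of what comparing several models from one script could look like today, assuming RecBole's `run_recbole` quick-start entry point; the model list and parameter values are purely illustrative:

```python
# Illustrative only: comparing several models today typically means one run
# (and one config file) per model. A single driver script can stand in for
# that, assuming RecBole's quick-start entry point.
from recbole.quick_start import run_recbole

models_to_compare = ["BPR", "NeuMF", "LightGCN"]  # hypothetical comparison set

for model in models_to_compare:
    # Shared settings inlined here instead of ten separate YAML files.
    run_recbole(model=model, dataset="ml-100k",
                config_dict={"epochs": 50, "topk": [10]})
```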
-
@deklanw Thanks a lot for the nice pointer. We will definitely consider incorporating these improvements. In fact, we have been considering some of the suggested features for a while; we admit that, if included, they would be useful in some scenarios. We have delayed the implementation for the following reasons:

(1) More metrics. Although bias, diversity, and other beyond-accuracy metrics have been introduced, there is still no widely accepted way to measure them in the literature. Moreover, many datasets cannot be evaluated with these beyond-accuracy metrics: e.g., if a dataset has no category or label information for its items, it is difficult to compute diversity metrics. So the question is whether existing beyond-accuracy metrics are recognized by the research community and widely applicable to various datasets.

(2) Single-file configuration. This is a feature that should be included. However, a direct problem is that if we compare a number of neural models, all of which require considerable effort to tune, how do we implement parallel optimization (e.g., hyper-parameter search) on one server or machine? It might involve computation scheduling. In addition, how do we feed the models with the suggested parameter settings through the configuration file (if the number of configured parameters is large, the configuration file would also look messy)? Another option is to implement the interface as a simple wrapper, so that it just plays the role of combining multiple configuration files.

We will discuss your suggestions and quickly make some decisions on our future plan.
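A minimal sketch of the wrapper idea in (2), assuming RecBole's `run_recbole` quick-start entry point and a hypothetical combined YAML layout with `shared` and `models` sections; neither the layout nor the wrapper is an existing RecBole feature:

```python
# Sketch only: split one combined YAML file into per-model config dicts and
# run each model sequentially (no parallel hyper-parameter search).
#
# Hypothetical combined layout (not an existing RecBole format):
#
#   shared:
#     dataset: ml-100k
#     epochs: 50
#   models:
#     BPR: {embedding_size: 64}
#     LightGCN: {n_layers: 2}
import yaml  # assumes PyYAML is installed
from recbole.quick_start import run_recbole


def run_from_combined_config(path):
    """Run every model listed in a hypothetical combined config file."""
    with open(path) as f:
        combined = yaml.safe_load(f)

    shared = combined.get("shared", {})  # settings common to all models
    for model_name, overrides in combined.get("models", {}).items():
        config_dict = {**shared, **(overrides or {})}
        # One RecBole run per model; results could be collected for comparison.
        run_recbole(model=model_name,
                    dataset=config_dict.pop("dataset", None),
                    config_dict=config_dict)


if __name__ == "__main__":
    run_from_combined_config("combined_experiment.yaml")
```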
-
@deklanw Thanks for the clarifications. Sounds good! So far, we haven't touched beyond-accuracy metrics; as you suggested, they are definitely worth a try, and we are thinking of a larger extension in this direction. We found that in RecBole the model is somewhat tied to the evaluation part (e.g., CTR prediction and ranking have different evaluation procedures in our implementation). We will try to decouple the two parts and make it flexible to implement various evaluation metrics. This is scheduled for our school's summer vacation (around August). If this proves feasible, adding new evaluation metrics will become much easier.

For running multiple models, your suggestion has been adopted by our team as the first priority for the next few months. We will first try to release a version that can wrap and simplify YAML files for running multiple models (without considering parallel optimization at this first stage). If this turns out to be easy for users to work with, we will consider incorporating optimization techniques to accelerate running multiple models simultaneously. We will update the information on this page. Thanks again for contributing new ideas for improving RecBole.
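A rough sketch of what decoupling evaluation from the model could look like: metrics only consume ranked item lists and ground truth, so new beyond-accuracy metrics plug in without touching model code. This is not RecBole's actual design, just an illustration; all class and function names below are hypothetical.

```python
# Sketch of a model-agnostic evaluation layer: the model produces ranked
# lists, and each metric is an independent plug-in over those lists.
from abc import ABC, abstractmethod


class Metric(ABC):
    @abstractmethod
    def compute(self, ranked_items, ground_truth):
        """ranked_items: per-user lists of item ids; ground_truth: per-user sets."""


class Recall(Metric):
    """Standard accuracy metric: average fraction of relevant items in the top-k."""
    def __init__(self, k):
        self.k = k

    def compute(self, ranked_items, ground_truth):
        hits = sum(len(set(r[:self.k]) & g) / max(len(g), 1)
                   for r, g in zip(ranked_items, ground_truth))
        return hits / len(ranked_items)


class ItemCoverage(Metric):
    """A simple beyond-accuracy metric: fraction of the catalog ever recommended."""
    def __init__(self, k, num_items):
        self.k, self.num_items = k, num_items

    def compute(self, ranked_items, ground_truth):
        recommended = {i for r in ranked_items for i in r[:self.k]}
        return len(recommended) / self.num_items


def evaluate(metrics, ranked_items, ground_truth):
    # The model never sees the metrics; adding a new metric means adding a class.
    return {type(m).__name__: m.compute(ranked_items, ground_truth) for m in metrics}
```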