-
There is a new recsys evaluation framework called Elliot which seems to be doing many things well. Some of the things it does are notable improvements (IMO) over RecBole:
The single-file thing is very appealing to me in comparison to the way RecBole does it. Consider if I want to compare 10 algorithms. With RecBole I need 10 separate configuration files. It seems like many of these things could be adopted here.
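For reference, here is a minimal sketch of what comparing several models from one script could look like today, assuming RecBole's `run_recbole` quick-start entry point; the model list and parameter values are purely illustrative:

```python
# Illustrative only: comparing several models today typically means one run
# (and one config file) per model. A single driver script can stand in for
# that, assuming RecBole's quick-start entry point.
from recbole.quick_start import run_recbole

models_to_compare = ["BPR", "NeuMF", "LightGCN"]  # hypothetical comparison set

for model in models_to_compare:
    # Shared settings inlined here instead of ten separate YAML files.
    run_recbole(model=model, dataset="ml-100k",
                config_dict={"epochs": 50, "topk": [10]})
```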
-
@deklanw Thanks a lot for the nice pointer. We will definitely consider incorporating these improvements. In fact, we have been considering some of the suggested features for a while; we admit that, if included, they would be useful in some scenarios. We have delayed the implementation for the following reasons:

(1) More metrics. Although bias, diversity, and other beyond-accuracy metrics have been introduced, there is still no widely accepted way to measure them in the literature. Moreover, many datasets cannot be evaluated with these beyond-accuracy metrics: e.g., if a dataset has no category or label information for its items, it is difficult to compute diversity metrics. So the question is whether existing beyond-accuracy metrics are recognized by the research community and widely applicable to various datasets.

(2) Single-file configuration. This is a feature that should be included. However, a direct problem is that if we compare a number of neural models, all of which require considerable effort to tune, how do we implement parallel optimization (e.g., hyper-parameter search) on one server or machine? It might involve computation scheduling. In addition, how do we feed the models with the suggested parameter settings through the configuration file (if the number of configured parameters is large, the configuration file would also look messy)? Another option is to implement the interface as a simple wrapper, so that it just plays the role of combining multiple configuration files.

We will discuss your suggestions and quickly make some decisions on our future plan.
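A minimal sketch of the wrapper idea in (2), assuming RecBole's `run_recbole` quick-start entry point and a hypothetical combined YAML layout with `shared` and `models` sections; neither the layout nor the wrapper is an existing RecBole feature:

```python
# Sketch only: split one combined YAML file into per-model config dicts and
# run each model sequentially (no parallel hyper-parameter search).
#
# Hypothetical combined layout (not an existing RecBole format):
#
#   shared:
#     dataset: ml-100k
#     epochs: 50
#   models:
#     BPR: {embedding_size: 64}
#     LightGCN: {n_layers: 2}
import yaml  # assumes PyYAML is installed
from recbole.quick_start import run_recbole


def run_from_combined_config(path):
    """Run every model listed in a hypothetical combined config file."""
    with open(path) as f:
        combined = yaml.safe_load(f)

    shared = combined.get("shared", {})  # settings common to all models
    for model_name, overrides in combined.get("models", {}).items():
        config_dict = {**shared, **(overrides or {})}
        # One RecBole run per model; results could be collected for comparison.
        run_recbole(model=model_name,
                    dataset=config_dict.pop("dataset", None),
                    config_dict=config_dict)


if __name__ == "__main__":
    run_from_combined_config("combined_experiment.yaml")
```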
-
@deklanw Thanks for the clarifications. Sounds good! So far, we haven't touched beyond-accuracy metrics; as you suggested, they are definitely worth a try, and we are thinking of a larger extension in this direction. We found that in RecBole the model is somewhat tied to the evaluation part (e.g., CTR prediction and ranking have different evaluation procedures in our implementation). We will try to decouple the two parts and make it flexible to implement various evaluation metrics. This is scheduled for our school's summer vacation (around August). If this proves feasible, adding new evaluation metrics will become much easier.

For running multiple models, your suggestion has been adopted by our team as the first priority for the next few months. We will first try to release a version that can wrap and simplify YAML files for running multiple models (without considering parallel optimization at this first stage). If this turns out to be easy for users to work with, we will consider incorporating optimization techniques to accelerate running multiple models simultaneously. We will update the information on this page. Thanks again for contributing new ideas for improving RecBole.
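A rough sketch of what decoupling evaluation from the model could look like: metrics only consume ranked item lists and ground truth, so new beyond-accuracy metrics plug in without touching model code. This is not RecBole's actual design, just an illustration; all class and function names below are hypothetical.

```python
# Sketch of a model-agnostic evaluation layer: the model produces ranked
# lists, and each metric is an independent plug-in over those lists.
from abc import ABC, abstractmethod


class Metric(ABC):
    @abstractmethod
    def compute(self, ranked_items, ground_truth):
        """ranked_items: per-user lists of item ids; ground_truth: per-user sets."""


class Recall(Metric):
    """Standard accuracy metric: average fraction of relevant items in the top-k."""
    def __init__(self, k):
        self.k = k

    def compute(self, ranked_items, ground_truth):
        hits = sum(len(set(r[:self.k]) & g) / max(len(g), 1)
                   for r, g in zip(ranked_items, ground_truth))
        return hits / len(ranked_items)


class ItemCoverage(Metric):
    """A simple beyond-accuracy metric: fraction of the catalog ever recommended."""
    def __init__(self, k, num_items):
        self.k, self.num_items = k, num_items

    def compute(self, ranked_items, ground_truth):
        recommended = {i for r in ranked_items for i in r[:self.k]}
        return len(recommended) / self.num_items


def evaluate(metrics, ranked_items, ground_truth):
    # The model never sees the metrics; adding a new metric means adding a class.
    return {type(m).__name__: m.compute(ranked_items, ground_truth) for m in metrics}
```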