[💡SUG] Do you have training and evaluation speed reference benchmarks? #518

Closed
tszumowski opened this issue Nov 20, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@tszumowski

tszumowski commented Nov 20, 2020

Is your feature request related to a problem? Please describe.
Would you be able to post how long it typically takes to train and evaluate each model for one epoch? Even results for a single large dataset (e.g., MovieLens-1M) would be helpful for the community.

I notice #484 and #485. I understand from that PR there are no plans to keep a scoreboard.

However, it's a bit difficult to determine whether or not it is worth benchmarking an algorithm, because any given algorithm may take hours to run a single epoch on a GPU.

For example, in a private dataset comparable to MovieLens-10M, I'm seeing drastically different training times across the general recommenders using a P100 GPU, from a few seconds/epoch to several minutes/epoch.

Having preliminary train/evaluation times would help a user understand accuracy vs. speed tradeoffs. It would also help users and developers benchmark speeds against other open-source implementations.

Describe the solution you'd like
A preliminary list of training time per-epoch and evaluation time per-epoch using default configurations for each recommender, using MovieLens-1M dataset.
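A minimal sketch of how such per-epoch timings could be collected. The callables `train_one_epoch` and `evaluate` are hypothetical placeholders for whatever runs one training pass and one evaluation pass; they are not part of any library API.

```python
import time
from statistics import mean


def time_epochs(train_one_epoch, evaluate, n_epochs=3):
    """Return the mean wall-clock seconds per epoch for training and evaluation.

    train_one_epoch and evaluate are hypothetical zero-argument callables
    supplied by the caller.
    """
    train_times, eval_times = [], []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        train_times.append(time.perf_counter() - start)

        start = time.perf_counter()
        evaluate()
        eval_times.append(time.perf_counter() - start)

    return {
        "train_sec_per_epoch": mean(train_times),
        "eval_sec_per_epoch": mean(eval_times),
    }
```

When timing GPU code with PyTorch, CUDA work is launched asynchronously, so the callables would need to call `torch.cuda.synchronize()` before returning for the measured times to be meaningful.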

Describe alternatives you've considered
N/A

Additional context
N/A

@tszumowski tszumowski added the enhancement New feature or request label Nov 20, 2020
@batmanfly
Member

Thanks for this nice suggestion. We will discuss this point and follow up soon.

-Wayne Xin Zhao

@tszumowski
Author

To add a bit: the intent is not to create a top-performing benchmark in speed or accuracy. Rather, it would be a rough guide for users that provides parameters that work on a common platform (e.g. Colab K80) and an example of what to expect for runtimes.

Thank you for the consideration.

@batmanfly
Member

Our team just had a discussion on this issue. We will arrange the tests and give rough time estimates for the implemented algorithms on some selected datasets of varying sizes. Hopefully, we will post these efficiency results on the main page or elsewhere before next Wednesday.

We will also notify you on this issue page.

BTW, the LightGCN issue you mentioned is also important. I think if such a speed board were available, that issue might become clearer. Our team has also asked the implementer to locate the lines that are likely to throw the memory exception, and we will get back to you with the answer soon. A practical hint is that different algorithms scale differently with dataset size. Graph-based algorithms are likely to take up more memory than other kinds of algorithms, so they are more likely to throw memory exceptions on large-scale datasets (e.g., the Gowalla dataset). That is why we provide a series of data preprocessing functions in the library, e.g., K-core filtering. In the future, we will consider accelerating some competitive but slow algorithms (that will take some time, probably in 2021 =) ).
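As a rough illustration of what K-core filtering does, here is a generic sketch. It is not the library's implementation; the function name `k_core_filter` and the `(user, item)` tuple format are assumptions made for this example.

```python
from collections import Counter


def k_core_filter(interactions, k):
    """Repeatedly drop users and items that have fewer than k interactions.

    interactions is an iterable of (user, item) pairs (an assumed format).
    Removing one user can push an item below k, so the filter loops until
    nothing more is removed.
    """
    current = list(interactions)
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item)
            for user, item in current
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        if len(kept) == len(current):
            return kept
        current = kept
```

Shrinking the interaction set this way reduces the number of users and items, and therefore the size of the graph that graph-based models have to hold in memory.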

Thanks again for your efforts with these suggestions!

@tszumowski
Author

tszumowski commented Nov 25, 2020

@batmanfly (and @ShanleiMu) I saw this post today, which provides links to time and memory costs for general recommenders and sequential recommenders. Thank you.

I had a few questions/requests for those lists and figured this is a good Issue thread to post.

  1. I believe the times here are in seconds per epoch (sec/epoch), correct? If so, stating that will help clarify for new users.
  2. I believe the memory is the GPU memory, correct? If so, stating that will help clarify.
  3. Would it be possible to run these on the context-aware recommenders too? I tried some of those yesterday and realized that adding side features can dramatically slow down training in some cases (depending on the number of features, feature structure, etc.).

Thank you again!

@batmanfly
Member

@tszumowski Nice suggestions. We will add these details to clarify our results.

For context- and knowledge-aware algorithms, the results are on the way =) We found that some context-aware algorithms run more slowly than general recommendation algorithms, so we have not obtained their results yet. Based on the current intermediate results, they are expected to be ready this weekend.

@tszumowski
Author

@batmanfly great! You're all so fast and responsive!

@ShanleiMu
Member

@tszumowski We have added more details to clarify our results and updated the time and memory costs of context-aware recommenders and knowledge-based recommenders.

@tszumowski
Author

@ShanleiMu this is great! Thank you. I'll close this issue given all the great docs!
