Qcactus/add lightgcn #1123

Merged (9 commits) on Jun 25, 2020

Conversation

Qcactus
Contributor

@Qcactus Qcactus commented Jun 17, 2020

Description

Add LightGCN algorithm and lightgcn_deep_dive notebook.

Related Issues

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging and not master.

@review-notebook-app

Check out this pull request on ReviewNB

Review Jupyter notebook visual diffs & provide feedback on notebooks.


Powered by ReviewNB

@ghost

ghost commented Jun 17, 2020

CLA assistant check
All CLA requirements met.

@Leavingseason
Collaborator

Hi all, this is a contributor from a university. They are adding the very recent SIGIR 2020 paper https://arxiv.org/abs/2002.02126 to our codebase. I will go through the code first and will let the authors correct any issues (if there are any). After that I will ping you in Teams so that you can start the review process.

@miguelgfierro
Collaborator

this is awesome! super good work

notebooks/02_model/lightgcn_deep_dive.ipynb
@@ -0,0 +1,807 @@
{
Collaborator

for all the deprecation warnings:

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:158: The name tf.sparse_tensor_dense_matmul is deprecated. Please use tf.sparse.sparse_dense_matmul instead. 

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:116: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:117: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:119: The name tf.GPUOptions is deprecated. Please use tf.compat.v1.GPUOptions instead.

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:120: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:121: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From ../../reco_utils/recommender/deeprec/graphrec/lightgcn.py:123: The name tf.global_variables_initializer is deprecated. Please use tf.compat.v1.global_variables_initializer instead.

would you mind changing the code to tf.compat.v1? it would help if/when we change to TF2
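
The renames requested here are the ones named in the warnings above, so they can be applied mechanically. The following is a minimal sketch and not part of the PR: the `RENAMES` table is copied from the warning text, while `modernize` and its regular expression are hypothetical helpers introduced only for illustration.

```python
import re

# Old name -> replacement, copied from the deprecation warnings quoted above.
RENAMES = {
    "tf.sparse_tensor_dense_matmul": "tf.sparse.sparse_dense_matmul",
    "tf.train.AdamOptimizer": "tf.compat.v1.train.AdamOptimizer",
    "tf.train.Saver": "tf.compat.v1.train.Saver",
    "tf.GPUOptions": "tf.compat.v1.GPUOptions",
    "tf.Session": "tf.compat.v1.Session",
    "tf.ConfigProto": "tf.compat.v1.ConfigProto",
    "tf.global_variables_initializer": "tf.compat.v1.global_variables_initializer",
}

# Match whole dotted names only: not preceded by a word character or a dot,
# and not followed by a word character.
_PATTERN = re.compile(
    r"(?<![\w.])(?:"
    + "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
    + r")(?!\w)"
)


def modernize(source: str) -> str:
    """Return `source` with each deprecated name replaced (hypothetical helper)."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(0)], source)


line = "sess = tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions()))"
print(modernize(line))
```

Because already-converted names such as `tf.compat.v1.Session` no longer contain the old dotted name, running the helper twice leaves the text unchanged.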




Collaborator

Do we have a plan to switch to TF2?
If not, I would suggest keeping the old style, because as far as I know, most people in industry are not willing to use TF2, because it would require a lot of platform refactoring in their codebases.

Collaborator

so far we are not planning to; there was a discussion about this in #953.

However, I guess that at some point we will change, but I don't expect us to rewrite the algos for TF2. We are doing a large refactor in PR #1086, and one of the things we are doing, slowly, is adding tf.compat.v1: https://github.com/microsoft/recommenders/blob/24b6ba9664b808abb41f118c9adefb983b56be1d/reco_utils/recommender/ncf/ncf_singlenode.py#L58. The reason to do this now is to save work in the future: if at some point we change to TF2 and everything already uses compat.v1, we won't break the repo.

Collaborator

@anargyri anargyri left a comment

Great work, thanks for contributing this method!

Collaborator

@miguelgfierro miguelgfierro left a comment

the code is really nice, but there are several parts that I think we should change before merging. I think it is important for the fit method to have a single responsibility, and also to apply DRY to the metrics
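
To make the two requests concrete, here is a minimal sketch of what "single responsibility in fit" and "DRY in the metrics" could look like. It is not the code from this PR: `ToyRecommender`, `evaluate`, and `hits_at_k` are hypothetical names, and the toy model returns a fixed ranking instead of learned scores.

```python
from typing import Dict, Sequence, Set


def hits_at_k(ranked: Sequence[int], relevant: Set[int], k: int) -> int:
    """Shared helper: number of relevant items in the top-k of a ranked list."""
    return sum(1 for item in ranked[:k] if item in relevant)


def precision_at_k(ranked: Sequence[int], relevant: Set[int], k: int) -> float:
    """Fraction of the top-k slots filled with relevant items."""
    return hits_at_k(ranked, relevant, k) / k


def recall_at_k(ranked: Sequence[int], relevant: Set[int], k: int) -> float:
    """Fraction of relevant items found in the top-k (0.0 if none are relevant)."""
    return hits_at_k(ranked, relevant, k) / len(relevant) if relevant else 0.0


class ToyRecommender:
    """Hypothetical stand-in for a model class; it is not the LightGCN code."""

    def __init__(self, ranking: Sequence[int]):
        self.ranking = list(ranking)  # fixed ranking used in place of real scores
        self.epochs_trained = 0

    def fit(self, epochs: int) -> "ToyRecommender":
        """Training only: no metric computation happens in here."""
        self.epochs_trained += epochs
        return self

    def recommend(self, k: int) -> Sequence[int]:
        return self.ranking[:k]


def evaluate(model: ToyRecommender, relevant: Set[int], k: int) -> Dict[str, float]:
    """Evaluation lives outside fit() and reuses the shared metric functions."""
    ranked = model.recommend(k)
    return {
        "precision": precision_at_k(ranked, relevant, k),
        "recall": recall_at_k(ranked, relevant, k),
    }


model = ToyRecommender(ranking=[10, 20, 30, 40]).fit(epochs=3)
scores = evaluate(model, relevant={20, 40, 90}, k=2)
print(scores)
```

Both metrics go through `hits_at_k`, so the top-k counting logic exists in one place, and callers decide when to evaluate rather than having `fit` do it as a side effect.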

reco_utils/recommender/deeprec/graphrec/lightgcn.py (outdated)
reco_utils/recommender/deeprec/graphrec/ranking_metrics.py (outdated)
@Qcactus Qcactus requested a review from anargyri June 20, 2020 08:46
Collaborator

@miguelgfierro miguelgfierro left a comment

this is really good

@miguelgfierro miguelgfierro merged commit 089f246 into recommenders-team:staging Jun 25, 2020
@miguelgfierro
Collaborator

hey @Qcactus, this is really nice. If you want to, feel free to add your name to https://github.com/microsoft/recommenders/blob/master/AUTHORS.md

@miguelgfierro
Collaborator

hey @Qcactus, on which machine did you compute the stats in the notebook? I'm trying to replicate them

FYI @Leavingseason

@Qcactus
Contributor Author

Qcactus commented Jul 7, 2020

@miguelgfierro GeForce GTX 1080Ti. Anything wrong with the notebook?

@miguelgfierro
Collaborator

> @miguelgfierro GeForce GTX 1080Ti. Anything wrong with the notebook?

I was testing the notebook on a K80 GPU with different batch sizes. Interestingly, the GPU memory reported by nvidia-smi doesn't change. These are the results I got:

EPOCHS = 5
# BATCH_SIZE = 1024   # with ML1m: Epoch 2 (train) 169.7s, gpu memory 56Mb
# BATCH_SIZE = 4096   # with ML1m: Epoch 2 (train) 47.2s, gpu memory 56Mb
# BATCH_SIZE = 16384  # (=1024*4*4), with ML1m: Epoch 2 (train) 16.2s, gpu memory 56Mb
# BATCH_SIZE = 65536  # (=1024*4*4*4), with ML1m: Epoch 2 (train) 9.0s, gpu memory 56Mb

I tried another machine, this time with 4 K80 GPUs, and got similar results: BATCH_SIZE = 65536 # (=1024*4*4*4), with ML1m: Epoch 2 (train) 8.8s, gpu memory 56Mb

Looking at the code, it seems to be using the GPU: https://github.com/microsoft/recommenders/blob/staging/reco_utils/recommender/deeprec/models/graphrec/lightgcn.py#L107, so I don't understand why the GPU memory is so low. Do you know what could be happening?
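
One way to record GPU memory next to each run, rather than reading it off nvidia-smi by hand, is to query it from Python. This is a minimal sketch and was not used in this thread: `parse_memory_used` and `gpu_memory_used_mib` are hypothetical helper names, and the query call assumes an NVIDIA driver with `nvidia-smi` on the PATH.

```python
import subprocess
from typing import List


def parse_memory_used(csv_text: str) -> List[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi's headerless CSV output."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]


def gpu_memory_used_mib() -> List[int]:
    """Ask nvidia-smi for the memory in use on each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_memory_used(result.stdout)


# Example with canned output for a two-GPU machine (no GPU needed to run this).
print(parse_memory_used("56\n6713\n"))
```

Calling `gpu_memory_used_mib()` while an epoch is running, rather than before or after training, gives a reading that reflects the memory held during training.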

@Qcactus
Contributor Author

Qcactus commented Jul 7, 2020

@miguelgfierro
I tested the notebook on GeForce GTX 1080Ti. Here is the result:

MovieLens 1m
BATCH_SIZE=1024, Epoch 2 (train)28.0s, GPU Memory 396Mb
BATCH_SIZE=4096, Epoch 2 (train)13.3s, GPU Memory 396Mb
BATCH_SIZE=16384, Epoch 2 (train)9.7s, GPU Memory 396Mb
BATCH_SIZE=65536, Epoch 2 (train)7.5s, GPU Memory 457Mb

It seems reasonable on my machine. But I don't have access to other kinds of GPU, so I might not be able to find the problem. Have you used tensorflow 1.15.2? (I noticed that some code in this repo is tested with tf 1.11.) Or maybe you can try testing the notebook on a GeForce to see whether the result is similar to mine.
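
The two sets of timings reported in this thread are easier to compare as speedups relative to the smallest batch size. The sketch below only does that arithmetic: the numbers are copied from the two comments above, and `speedup_vs_smallest` is a hypothetical helper name.

```python
from typing import Dict

# Epoch-2 training times in seconds on MovieLens 1M, keyed by batch size,
# copied from the K80 and GTX 1080Ti results reported above.
K80 = {1024: 169.7, 4096: 47.2, 16384: 16.2, 65536: 9.0}
GTX_1080TI = {1024: 28.0, 4096: 13.3, 16384: 9.7, 65536: 7.5}


def speedup_vs_smallest(times: Dict[int, float]) -> Dict[int, float]:
    """Speedup of each batch size relative to the smallest batch size."""
    baseline = times[min(times)]
    return {batch: baseline / seconds for batch, seconds in sorted(times.items())}


for name, times in (("K80", K80), ("GTX 1080Ti", GTX_1080TI)):
    for batch, speedup in speedup_vs_smallest(times).items():
        print(f"{name:>10}  batch={batch:<6} speedup={speedup:.2f}x")
```

On these numbers, going from a batch size of 1024 to 65536 is roughly a 19x speedup on the K80 but under 4x on the 1080Ti, so the K80 run is far more sensitive to batch size.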

@miguelgfierro
Collaborator

I found the issue: the low memory consumption we were seeing was because the datasets were small. When I used ML10M or ML20M, I was getting 6713MiB.

5 participants