[High-Level-API] Rewrite Chapter 5 Personalized Recommendation in Book to use new Flui… #526
05.recommender_system/README.md
Outdated
Our program starts with importing necessary packages and initializes some global variables:
starts with importing necessary packages and initializing
05.recommender_system/README.md
Outdated
Movie title, a sequence of words represented by an integer word index sequence, will be feed into a `sequence_conv_pool` layer, which will apply convolution and pooling on time dimension. Because pooling is done on time dimension, the output will be a fixed-length vector regardless the length of the input sequence.
Movie title, which is a sequence of words represented by an integer word index sequence, will be feed into a `sequence_conv_pool` layer, which will apply convolution and pooling on time dimension. Because pooling is done on time dimension, the output will be a fixed-length vector regardless the length of the input sequence.
will be fed
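To illustrate why pooling over the time dimension gives a fixed-length output, here is a minimal plain-Python sketch. It does not call `sequence_conv_pool`; the function name `max_pool_over_time`, the feature values, and the choice of max pooling are assumptions made for illustration, since the pooling type is not shown in this thread.

```python
def max_pool_over_time(features):
    """Take the element-wise max across time steps.

    `features` is a list of per-word feature vectors (one per time step).
    The result has one value per feature channel, whatever the sequence length.
    """
    return [max(channel) for channel in zip(*features)]


# Hypothetical per-word features (3 channels) for titles of different lengths.
short_title = [[0.1, 0.5, -0.2], [0.4, 0.0, 0.3]]
long_title = [[0.2, 0.1, 0.0], [0.9, -0.3, 0.1], [0.0, 0.7, 0.6], [0.3, 0.2, 0.5]]

print(max_pool_over_time(short_title))  # 3 values from 2 time steps
print(max_pool_over_time(long_title))   # 3 values from 4 time steps
```

Both calls return a vector of length 3, which is the property the sentence under review describes.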
05.recommender_system/README.md
Outdated
Finally, we can use cosine similarity to calculate the similarity between user characteristics and movie features.
Finally, we can define a `inference_program` that use cosine similarity to calculate the similarity between user characteristics and movie features.
an inference_program
that uses
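For readers unfamiliar with the measure named in this sentence, cosine similarity between two feature vectors can be sketched in plain Python as follows. This is only an illustration of the formula, not the `inference_program` from the chapter; the function name and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical user and movie feature vectors of the same dimension.
user_features = [0.2, 0.8, 0.1]
movie_features = [0.3, 0.7, 0.0]
print(cosine_similarity(user_features, movie_features))
```

The result lies in [-1, 1], with values near 1 meaning the two vectors point in nearly the same direction.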
05.recommender_system/README.md
Outdated
Before jumping into creating a training module, algorithm setting is also necessary. Here we specified Adam optimization algorithm via `paddle.optimizer`.
Next we define data feeders for test and train. The feeder reads a `BATCH_SIZE` of data each time and feed them to the training/testing process.
`paddle.dataset.movielens.train` will yield records during each pass, after shuffling, a batch input of `buf_size` is generated for training.
The sentence is not clear. Also, `buf_size` is larger than `BATCH_SIZE`, so I think the logic is reversed...
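To make the distinction concrete, here is a plain-Python sketch of the two stages. It is a stand-in written for illustration, not Paddle's implementation; the helper names `shuffle` and `batch`, the `seed` parameter, and the sizes used are all assumptions. Records are first shuffled within a buffer of `buf_size`, and only then grouped into batches of `BATCH_SIZE`.

```python
import random


def shuffle(reader, buf_size, seed=None):
    """Wrap a reader so records are shuffled within buffers of `buf_size`."""
    rng = random.Random(seed)

    def shuffled_reader():
        buf = []
        for record in reader():
            buf.append(record)
            if len(buf) >= buf_size:
                rng.shuffle(buf)
                yield from buf
                buf = []
        if buf:  # flush the final, partially filled buffer
            rng.shuffle(buf)
            yield from buf

    return shuffled_reader


def batch(reader, batch_size):
    """Wrap a reader so it yields lists of up to `batch_size` records."""

    def batch_reader():
        current = []
        for record in reader():
            current.append(record)
            if len(current) == batch_size:
                yield current
                current = []
        if current:  # last batch may be smaller
            yield current

    return batch_reader


def records():
    """Hypothetical stand-in for a dataset reader: yields 10 records."""
    return iter(range(10))


BATCH_SIZE = 4  # hypothetical value
BUF_SIZE = 8    # hypothetical value, larger than BATCH_SIZE

train_reader = batch(shuffle(records, buf_size=BUF_SIZE, seed=0), batch_size=BATCH_SIZE)
for mini_batch in train_reader():
    print(mini_batch)
```

Under this reading, `buf_size` controls how well the data is mixed, while `BATCH_SIZE` controls how many records reach the trainer per step.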
05.recommender_system/README.md
Outdated
`paddle.dataset.movielens.train` will yield records during each pass, after shuffling, a batch input is generated for training.
Create a trainer that takes `train_program` as input and specifies optimizer.
Create ... and specify
05.recommender_system/README.md
Outdated
if step % 100 == 0:  # every 100 batches, update cost plot
    cost_ploter.plot()
Use create_lod_tensor(data, lod, place) API to generate LoD Tensor, where `data` is a list of sequences of index numbers, `lod` is the level of detail (lod) info associated with `data`.
For example, data = [[10, 2, 3], [2, 3]] means that it contains two sequences of indexes, of length 3 and 2, respectively.
indexes => indices
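The example in this paragraph can be made concrete with a small plain-Python sketch. It does not call `create_lod_tensor`; the helper name `flatten_with_lod` is hypothetical, and whether the API expects sequence lengths or cumulative offsets is not shown in this thread, so the sketch computes both.

```python
def flatten_with_lod(sequences):
    """Flatten variable-length sequences and record where each one starts and ends.

    Returns (flat, lengths, offsets), where `lengths` holds each sequence's
    length and `offsets` holds the cumulative boundaries in `flat`.
    """
    flat = [index for seq in sequences for index in seq]
    lengths = [len(seq) for seq in sequences]
    offsets = [0]
    for length in lengths:
        offsets.append(offsets[-1] + length)
    return flat, lengths, offsets


data = [[10, 2, 3], [2, 3]]
flat, lengths, offsets = flatten_with_lod(data)
print(flat)     # all indices in one flat list
print(lengths)  # one length per sequence
print(offsets)  # boundaries of each sequence inside `flat`
```

Sequence `i` can be recovered as `flat[offsets[i]:offsets[i + 1]]`, which is the information the level-of-detail data carries alongside the flat values.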
05.recommender_system/README.md
Outdated
Finally, we can invoke `trainer.train` to start training:
### Infer

Now we can infer with inputs that matched with the yield records that we provide in `feed_order` during training.
matched => match
This sentence is not clear. Maybe break it into two?
@@ -98,13 +98,13 @@ Figure 4. A hybrid recommendation model.

We use the [MovieLens ml-1m](http://files.grouplens.org/datasets/movielens/ml-1m.zip) to train our model. This dataset includes 10,000 ratings of 4,000 movies from 6,000 users to 4,000 movies. Each rate is in the range of 1~5. Thanks to GroupLens Research for collecting, processing and publishing the dataset.

`paddle.v2.datasets` package encapsulates multiple public datasets, including `cifar`, `imdb`, `mnist`, `moivelens` and `wmt14`, etc. There's no need for us to manually download and preprocess `MovieLens` dataset.
good catch!
I told Nicki the same. He really has a keen sight 👀
Finally, we can invoke `trainer.train` to start training:
### Infer
Inference
},
return_numpy=False)

print("infer results: ", np.array(results[0]))
Can we show a comparison between the prediction and the real data? For example, user `23::M::35::0::90049` rated movie `2278::Ronin (1998)::Action|Crime|Thriller` a 4.0 score. Our prediction is 3.458.
sure
Good suggestion, I think it would be helpful.
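One possible shape for the suggested comparison, sketched in plain Python using the records and scores quoted in the comment above. The helper names are hypothetical, and the field names follow the `::`-separated layout of the ml-1m `users.dat` and `movies.dat` files. The predicted value is copied from the comment rather than produced by a model here.

```python
def parse_user(record):
    """Parse a 'UserID::Gender::Age::Occupation::Zip-code' record."""
    user_id, gender, age, occupation, zip_code = record.split("::")
    return {
        "user_id": int(user_id),
        "gender": gender,
        "age": int(age),
        "occupation": int(occupation),
        "zip_code": zip_code,
    }


def parse_movie(record):
    """Parse a 'MovieID::Title::Genres' record; genres are '|'-separated."""
    movie_id, title, genres = record.split("::")
    return {"movie_id": int(movie_id), "title": title, "genres": genres.split("|")}


def comparison_line(user, movie, actual, predicted):
    """Format the actual rating next to the predicted one."""
    return "User {} rated '{}' {:.1f}; predicted {:.3f} (absolute error {:.3f})".format(
        user["user_id"], movie["title"], actual, predicted, abs(actual - predicted)
    )


user = parse_user("23::M::35::0::90049")
movie = parse_movie("2278::Ronin (1998)::Action|Crime|Thriller")
line = comparison_line(user, movie, actual=4.0, predicted=3.458)
print(line)
```

Printing a line like this after the `infer results` output would let readers judge the prediction against a known rating.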
…d API
I will add plot in next PR