FIX: fix the format of docs #942

Merged 4 commits on Aug 24, 2021
Changes from all commits
22 changes: 13 additions & 9 deletions docs/source/user_guide/config/evaluation_settings.rst
@@ -12,19 +12,23 @@ Evaluation settings are designed to set parameters about model evaluation.

- ``order (str)``: decides how we sort the data in `.inter`. Now we support two kinds of ordering strategies: ``['RO', 'TO']``, which denote random ordering and temporal ordering. For ``RO``, we shuffle the data and then split it in this order. For ``TO``, we sort the data by the column of `TIME_FIELD` in ascending order and then split it in this order. The default value is ``RO``.

- ``split (dict)``: decides how we split the data in `.inter`. Now we support two kinds of splitting strategies: ``['RS', 'LS']``, which denote ratio-based data splitting and leave-one-out data splitting. If the key of ``split`` is ``RS``, you need to set the splitting ratio like ``[0.8,0.1,0.1]``, ``[7,2,1]`` or ``[8,0,2]``, which denote the ratios of the training set, validation set and testing set respectively. If the key of ``split`` is ``LS``, we support three ``LS`` modes: ``['valid_and_test', 'valid_only', 'test_only']``, and you should choose one of them as the value of ``LS``. The default value of ``split`` is ``{'RS': [0.8,0.1,0.1]}``.

- ``mode (str)``: decides the data range on which we evaluate the model. Now we support four kinds of evaluation modes: ``['full', 'unixxx', 'popxxx', 'labeled']``. ``full``, ``unixxx`` and ``popxxx`` are designed for evaluation on implicit feedback (data without labels). For implicit feedback, we regard items with observed interactions as positive items and those without observed interactions as negative items. ``full`` means evaluating the model on the set of all items. ``unixxx``, for example ``uni100``, means uniformly sampling 100 negative items for each positive item in the testing set and evaluating the model on these positive items together with their sampled negative items. ``popxxx``, for example ``pop100``, means sampling 100 negative items for each positive item in the testing set according to item popularity (:obj:`Counter(item)` in the `.inter` file) and evaluating the model in the same way. Here ``xxx`` must be an integer. For explicit feedback (data with labels), you should set the mode to ``labeled`` and we will evaluate the model based on your labels. The default value is ``full``.

- ``repeatable (bool)``: Whether to evaluate the result under a repeatable recommendation scenario. Note that it is disabled for sequential models, as their recommendation is already repeatable. For other models, defaults to ``False``.
- ``metrics (list or str)``: Evaluation metrics. Defaults to
``['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']``. Range in the following table:

============== =================================================
Type           Metrics
============== =================================================
Ranking-based  Recall, MRR, NDCG, Hit, MAP, Precision, GAUC, ItemCoverage, AveragePopularity, GiniIndex, ShannonEntropy, TailPercentage
Value-based    AUC, MAE, RMSE, LogLoss
============== =================================================

Note that value-based metrics and ranking-based metrics cannot be used together.

- ``topk (list or int or None)``: The value of k for topk evaluation metrics.
Defaults to ``10``.
- ``valid_metric (str)``: The evaluation metrics for early stopping.
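To make these settings concrete, here is a minimal sketch of how they might be passed to RecBole as a config dict. It assumes the settings above live under the ``eval_args`` config key and uses RecBole's quick-start entry point ``run_recbole``; the model ``BPR`` and the dataset ``ml-100k`` are placeholder choices for illustration, not part of the documentation above.

.. code-block:: python

    # A minimal sketch, not a definitive recipe: the 'eval_args' key and the
    # run_recbole entry point are assumptions based on RecBole's quick-start
    # API; 'BPR' and 'ml-100k' are placeholder model/dataset choices.
    from recbole.quick_start import run_recbole

    config_dict = {
        'eval_args': {
            'order': 'TO',                      # sort by TIME_FIELD before splitting
            'split': {'RS': [0.8, 0.1, 0.1]},   # 8:1:1 train/valid/test ratio split
            'mode': 'uni100',                   # 100 uniformly sampled negatives per positive
        },
        'repeatable': False,
        'metrics': ['Recall', 'MRR', 'NDCG', 'Hit', 'Precision'],
        'topk': 10,
        'valid_metric': 'MRR@10',               # metric@k used for early stopping
    }

    run_recbole(model='BPR', dataset='ml-100k', config_dict=config_dict)

With ``'order': 'TO'`` the split respects time order, whereas the default ``RO`` applies the same ratios after shuffling.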
21 changes: 12 additions & 9 deletions docs/source/user_guide/train_eval_intro.rst
@@ -62,9 +62,7 @@ The parameters used to control the evaluation method are as follows:
- ``mode (str)``: Controls the candidate item set used for ranking.
Range in ``['labeled', 'full', 'unixxx', 'popxxx']`` and defaults to ``full``.

- ``repeatable (bool)``: Whether to evaluate the result under a repeatable recommendation scenario. Note that it is disabled for sequential models, as their recommendation is already repeatable. For other models, defaults to ``False``.

Evaluation metrics
>>>>>>>>>>>>>>>>>>>>>>>>>>
@@ -86,12 +84,17 @@ More details about metrics can be found in :doc:`/recbole/recbole.evaluator.metrics`
The parameters used to control the evaluation metrics are as follows:

- ``metrics (list or str)``: Evaluation metrics. Defaults to
``['Recall', 'MRR', 'NDCG', 'Hit', 'Precision']``. Range in the following table:

============== =================================================
Type           Metrics
============== =================================================
Ranking-based  Recall, MRR, NDCG, Hit, MAP, Precision, GAUC, ItemCoverage, AveragePopularity, GiniIndex, ShannonEntropy, TailPercentage
Value-based    AUC, MAE, RMSE, LogLoss
============== =================================================

Note that value-based metrics and ranking-based metrics cannot be used together.

- ``topk (list or int or None)``: The value of k for topk evaluation metrics.
Defaults to ``10``.
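Complementing the table above, here is a hedged sketch of a value-based configuration: since value-based metrics score predictions against explicit labels, ``mode`` is set to ``labeled`` and no ranking metrics or ``topk`` are configured. As before, the ``eval_args`` key and the ``run_recbole`` entry point are assumptions based on RecBole's quick-start API, and the model ``FM`` and dataset ``ml-100k`` are placeholders.

.. code-block:: python

    # Hedged sketch: value-based metrics (AUC, MAE, RMSE, LogLoss) require
    # label-based evaluation, so they are not mixed with ranking metrics.
    from recbole.quick_start import run_recbole

    config_dict = {
        'eval_args': {'mode': 'labeled'},   # evaluate against explicit labels
        'metrics': ['AUC', 'LogLoss'],      # value-based metrics only
        'valid_metric': 'AUC',              # early stopping on a value-based metric
    }

    # Placeholder model/dataset; labeled evaluation assumes the dataset
    # provides (or derives) an explicit label field.
    run_recbole(model='FM', dataset='ml-100k', config_dict=config_dict)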

For more details about evaluation settings, please read :doc:`config/evaluation_settings`.