fix rank_cut setting and minor problems #2
Hi Qingyao,
I fixed some problems; here are the details:
- Added the missing `num_layers` setting for the bug that appeared in `DLA/main.py`:

  ```python
  print("Created %d layers of %d units." % (model.hparams.num_layers, model.embed_size))
  ```

- Added the missing flag `self_test`.
- Imported `six` for the use of `xrange` under Python 3 (see the sketch after this list).
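For the `xrange` fix, `six.moves` provides a name that resolves to the built-in `xrange` on Python 2 and to `range` on Python 3, so loop code can stay unchanged; the fix is likely something like:

```python
# six.moves.xrange is xrange on Python 2 and range on Python 3.
from six.moves import xrange  # pylint: disable=redefined-builtin

for i in xrange(5):
    print(i)
```

For the `num_layers` setting, this is not the repo's exact plumbing, just a generic TF 1.x sketch of declaring the missing hyperparameter with an assumed default:

```python
# Hedged sketch, not the exact DLA/main.py code: the fix adds a
# num_layers entry to the model's hyperparameters. The defaults here
# are assumed, not the repository's actual values.
import tensorflow as tf  # TF 1.x

hparams = tf.contrib.training.HParams(
    num_layers=1,    # the previously missing setting
    embed_size=512,  # illustrative value
)
print("Created %d layers of %d units." % (hparams.num_layers, hparams.embed_size))
```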
The most important problem: the `rank_cut` setting does not work correctly at evaluation time. Because `rank_cut` stays fixed at 10, the generated `test.ranklist` file does not cover all candidate documents.
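Roughly, the fix is to stop truncating at `rank_cut` when writing `test.ranklist` at evaluation time, so that every candidate document is ranked and written. A minimal sketch of the idea, with illustrative names rather than the exact identifiers in the repo:

```python
# Hedged sketch of the rank_cut fix; names are illustrative. The old
# behavior cut every query's ranklist at rank_cut (10), so trec_eval
# never saw the remaining candidates; at evaluation we write them all.

def write_ranklist(fout, qid, doc_ids, scores, rank_cut=None):
    """Write one query's ranked documents in TREC run format."""
    ranked = sorted(zip(doc_ids, scores), key=lambda x: x[1], reverse=True)
    if rank_cut is not None:  # keep the cut for training-time output
        ranked = ranked[:rank_cut]
    for rank, (doc_id, score) in enumerate(ranked, start=1):
        fout.write('%s Q0 %s %d %f RankLSTM\n' % (qid, doc_id, rank, score))

# At evaluation time, pass rank_cut=None so the full list is written.
```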
With the code modified along these lines, I trained for 2k epochs and first generated `test.ranklist` with the original code; here is the result from trec_eval:
```
ndcg_cut_1   all  0.6971
ndcg_cut_3   all  0.6950
ndcg_cut_5   all  0.7076
ndcg_cut_10  all  0.7373
map          all  0.5383
runid        all  RankLSTM
num_q        all  6983
num_ret      all  63481
num_rel      all  123035
num_rel_ret  all  50891
```
Then I used the modified code to generate a new `test.ranklist` with the same model; here is the updated trec_eval result:
```
ndcg_cut_1   all  0.7126
ndcg_cut_3   all  0.7101
ndcg_cut_5   all  0.7243
ndcg_cut_10  all  0.7639
map          all  0.8443
runid        all  RankLSTM
num_q        all  6983
num_ret      all  165660
num_rel      all  123035
num_rel_ret  all  123035
```
You can clearly see the difference between the two `test.ranklist` files: the document counts differ (`num_ret` is 63481 vs. 165660).
The performance might be higher than the results in the paper due to the use of a different dataset. But I am wondering whether the results in the original paper were produced with this version of the code; if so, they may have been underestimated.