Hi! Thanks a lot for the quick release of 1.7.2 to hotfix #8491; SparkXGBRanker now works much more reliably. On 1.7.2, I have encountered two further issues with SparkXGBRanker's evaluation metrics.
Issue 1: inaccurate eval metric output for different numbers of workers
When I train the model with the same parameter set but with num_workers=1 versus num_workers=4, the values of model._xgb_sklearn_model.best_score differ substantially. I also wrote a manual eval function, and its results do not align with model._xgb_sklearn_model.best_score in either case. The eval metric I use for early stopping is "ndcg@10".
(Below, "internal ndcg@10 on validation data" denotes model._xgb_sklearn_model.best_score.)
num_workers = 1
manual train ndcg@10 -- 0.5931561486501443, internal ndcg@10 on validation data -- 0.46678797961901586, manual valid ndcg@10 -- 0.43402427434921265
num_workers = 4
manual train ndcg@10 -- 0.5499394648169217, internal ndcg@10 on validation data -- 0.6978460945534216, manual valid ndcg@10 -- 0.43996799528598785
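The manual cross-check can be done along these lines (a minimal sketch, not the exact script used above; it uses the exponential-gain DCG formulation, and the qid-based grouping is illustrative):

```python
import numpy as np

def dcg_at_k(rels, k):
    """Exponential-gain DCG: sum((2^rel - 1) / log2(rank + 1)) over the top k."""
    rels = np.asarray(rels, dtype=float)[:k]
    if rels.size == 0:
        return 0.0
    discounts = np.log2(np.arange(2, rels.size + 2))
    return float(np.sum((2.0 ** rels - 1.0) / discounts))

def ndcg_at_k(labels, scores, k=10):
    """NDCG@k for a single query group: DCG of the predicted order over ideal DCG."""
    labels = np.asarray(labels, dtype=float)
    order = np.argsort(-np.asarray(scores))          # rank docs by descending score
    ideal = dcg_at_k(np.sort(labels)[::-1], k)       # DCG of the perfect ordering
    return dcg_at_k(labels[order], k) / ideal if ideal > 0 else 0.0

def mean_ndcg_at_k(labels, scores, qids, k=10):
    """Average NDCG@k over query groups identified by qids."""
    labels, scores, qids = map(np.asarray, (labels, scores, qids))
    return float(np.mean([ndcg_at_k(labels[qids == q], scores[qids == q], k)
                          for q in np.unique(qids)]))
```

A perfectly ordered group scores 1.0; averaging this over all query groups of the train and validation splits gives the "manual" numbers above.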
Could you take a look at the eval metric logic and fix it? Here is the code to reproduce:
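The original reproduction script is not attached here, so the following is a hypothetical minimal sketch of the setup (the parquet path, column names, and the is_val validation split column are assumptions); the only parameter that varies between the two runs is num_workers:

```python
def ranker_params(num_workers):
    # Shared parameter set for both runs; only num_workers differs.
    return dict(
        features_col="features",
        label_col="label",
        qid_col="qid",
        eval_metric="ndcg@10",
        early_stopping_rounds=10,
        validation_indicator_col="is_val",
        num_workers=num_workers,
    )

def run_repro():
    """Not executed here: requires a live Spark session and the dataset."""
    from pyspark.sql import SparkSession
    from xgboost.spark import SparkXGBRanker

    spark = SparkSession.builder.master("local[4]").getOrCreate()
    # Assumed layout: columns features / label / qid / is_val.
    df = spark.read.parquet("train.parquet")

    for n in (1, 4):
        model = SparkXGBRanker(**ranker_params(n)).fit(df)
        print(n, model._xgb_sklearn_model.best_score)
```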
Issue 2: SparkXGBRanker cannot pass multiple eval metrics
SparkXGBRanker does not support passing multiple eval metrics yet; could you add this feature to the roadmap?
A sample dataset is attached for reproducing the results:
https://github.com/lezzhov/learning_to_rank/tree/main/learning_to_rank/data/train.txt