Test FIL probabilities with absolute error thresholds in python (#3582)
Probabilities are bounded to [0.0, 1.0], and we generally care more about the large probabilities, which are `O(1/n_classes)`.
The largest relative probability errors usually come from a small ground-truth probability (e.g. 1e-3) rather than from a large absolute error.
Hence, relative probability error is not the best metric; absolute probability error is more relevant.
Absolute probability error is also more stable, since relative errors have a long tail. When training or inferring on many rows, the chance of encountering a ground-truth probability around 1e-3 or 1e-4 grows, and in some cases no reasonable, reliable relative threshold exists. Finally, as the number of predicted probabilities (clipped values) per input row grows, so does the long tail of relative probability errors, due to less undersampling. This makes comparisons unfair: binary classification against regression, and multiclass classification against binary classification.
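As a minimal sketch of this point (not part of this change, with made-up values rather than FIL or XGBoost output): the previous calls passed `1e-3` positionally to `np.allclose`, i.e. as `rtol`, which is fragile exactly when the ground-truth probability is small.

```python
import numpy as np

# np.allclose checks |pred - truth| <= atol + rtol * |truth|, so with a
# relative tolerance the allowed error shrinks along with the ground-truth
# probability. The values below are illustrative only.
truth = np.array([1e-4, 0.5, 0.4999])  # one tiny ground-truth probability
pred = truth + 2e-7                    # the same small absolute error everywhere

print(np.allclose(pred, truth, 1e-3))       # rtol=1e-3 -> False (2e-7 > 1e-8 + 1e-3 * 1e-4)
print(np.allclose(pred, truth, atol=3e-7))  # absolute threshold -> True
```

Note that passing only `atol` leaves numpy's default `rtol=1e-5` in place, which adds at most about `1e-5` of extra slack when the reference probabilities are bounded by 1.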

The changes below are based on collecting absolute errors under `--run_unit`, `--run_quality`, and `--run_stress`. The new thresholds are violated at most a couple of times per million samples, and in most cases never.
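A hedged sketch of how such a threshold could be chosen from collected errors follows; the arrays and the noise model are placeholders, not the instrumentation actually used for this commit.

```python
import numpy as np

# Placeholder data: ref_proba stands in for the reference (XGBoost / sklearn /
# LightGBM) probabilities and fil_proba for FIL's output; the small Gaussian
# noise is only a stand-in for real rounding differences between the two.
rng = np.random.default_rng(0)
ref_proba = rng.random((1_000_000, 2))
fil_proba = ref_proba + rng.normal(scale=5e-8, size=ref_proba.shape)

abs_err = np.abs(fil_proba - ref_proba).ravel()
print("max abs error:", abs_err.max())
print("violations of 3e-7 per million samples:", 1e6 * np.mean(abs_err > 3e-7))
# The threshold is then set just above the bulk of the observed errors, so
# that it is violated at most a couple of times per million samples.
```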

Authors:
  - @levsnv

Approvers:
  - John Zedlewski (@JohnZed)
  - Andy Adinets (@canonizer)

URL: #3582
levsnv authored Mar 9, 2021
1 parent fd9ec89 commit cd220fc
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions python/cuml/test/test_fil.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2019, NVIDIA CORPORATION.
+# Copyright (c) 2019-2021, NVIDIA CORPORATION.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@@ -138,7 +138,7 @@ def test_fil_classification(n_rows, n_columns, num_rounds,
assert array_equal(fil_preds, xgb_preds_int)
xgb_proba = np.stack([1-xgb_preds, xgb_preds], axis=1)
fil_proba = np.asarray(fm.predict_proba(X_validation))
-assert np.allclose(fil_proba, xgb_proba, 1e-3)
+assert np.allclose(fil_proba, xgb_proba, atol=3e-7)


@pytest.mark.parametrize('n_rows', [unit_param(1000), quality_param(10000),
@@ -262,7 +262,7 @@ def test_fil_skl_classification(n_rows, n_columns, n_estimators, max_depth,
assert array_equal(fil_preds, skl_preds_int)
fil_proba = np.asarray(fm.predict_proba(X_validation))
fil_proba = np.reshape(fil_proba, np.shape(skl_proba))
-assert np.allclose(fil_proba, skl_proba, 1e-3)
+assert np.allclose(fil_proba, skl_proba, atol=3e-7)


@pytest.mark.parametrize('n_rows', [1000])
@@ -486,8 +486,8 @@ def test_lightgbm(tmp_path, num_classes):
algo='TREE_REORG',
output_class=True,
model_type="lightgbm")
-fil_proba = fm.predict_proba(X)
-assert np.allclose(gbm_proba, fil_proba[:, 1], 1e-2)
+fil_proba = fm.predict_proba(X)[:, 1]
+assert np.allclose(gbm_proba, fil_proba, atol=3e-7)
gbm_preds = (gbm_proba > 0.5)
fil_preds = fm.predict(X)
assert array_equal(gbm_preds, fil_preds)
