Regression Tests and gitlab CI support #4228

Willian-Zhang · 2021-04-26T08:06:04Z

This PR depends on #4089

"Regression" here refers to the software testing method (regression testing), as described here. It should not be confused with regression tasks in the ML sense.

This PR adds:

Basic regression test support
- creates a report for all tests
- many-to-many relationship between testing scripts and verified data
- multiple regression tests
- SHA-1 and colored report
GitLab CI support
- builds LightGBM on the fly
- runs regression tests

Example Result: on gitlab

Caveats/Future Plans:

GitLab CI support depends on an image I created, willianz/lightgbm-build-env:latest. This won't affect the current upstream (Microsoft) CI, which runs entirely on GitHub.
Regression tests are not yet added to GitHub CI.
The current check implementation relies on local data generation for the reference data
- as regression logic (caching and moving data around) should depend more on the CI tool side than on the checking side
- and there is a pending decision on how Add-Remove-Change affects CI reports (are they strictly prohibited?)
This PR depends on [python-package] Create Dataset from multiple data files #4089
- since some local regression tests against the Sequence feature are still under development
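
To make the pending Add-Remove-Change decision concrete, here is a minimal sketch of how a checker could classify differences between a reference set and the current set of output digests. The function names and the `strict` policy switch are hypothetical and only illustrate the two options (prohibit every change, or tolerate newly added outputs).

```python
def classify_changes(reference: dict, current: dict) -> dict:
    """Split output names into added, removed, changed and unchanged.

    Both arguments map an output name to its digest (or any comparable value).
    """
    ref_keys, cur_keys = set(reference), set(current)
    common = ref_keys & cur_keys
    return {
        "added": sorted(cur_keys - ref_keys),
        "removed": sorted(ref_keys - cur_keys),
        "changed": sorted(k for k in common if reference[k] != current[k]),
        "unchanged": sorted(k for k in common if reference[k] == current[k]),
    }


def is_failure(changes: dict, strict: bool = True) -> bool:
    """Decide whether the CI job should fail.

    strict=True  : any addition, removal or change fails the job.
    strict=False : additions are tolerated, removals and changes still fail.
    """
    if strict:
        return bool(changes["added"] or changes["removed"] or changes["changed"])
    return bool(changes["removed"] or changes["changed"])
```

With `strict=False`, a PR that only adds new reference outputs would pass, while one that alters or drops an existing output would still be flagged.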


ghost · 2021-04-26T08:06:17Z

All CLA requirements met.

jameslamb

Thanks for your interest in LightGBM!

But I don't understand this pull request, sorry. Is there some previous discussion that you've had with maintainers here that describes this work?

My initial observations:

I'm -1 on supporting new CI pipelines on GitLab. This project already has extensive testing on GitHub Actions, AppVeyor, and Azure DevOps, and I'd like to understand why it's necessary to add GitLab.
It's not clear to me why the new tests added in this PR need to be in their own folder called regression/ instead of being included in LightGBM's existing testing infrastructure. I'd like to understand why that is being proposed.

Willian-Zhang · 2021-04-30T07:22:33Z

@jameslamb
Sorry for not making this clear.

The intention of this PR is to propose what LightGBM seems to be lacking: regression tests that ensure LightGBM produces the same outputs (intermediate results, models, predictions, etc.) across commits or releases.
This is especially useful for a project of this magnitude, particularly when building and testing with large datasets takes a huge amount of time or resources; CI can offload those heavy tasks and run them asynchronously during the development cycle.

GitLab CI here serves as a demo. We have been testing legally protected datasets (e.g. Yahoo) as well as large ones (e.g. Higgs) on a self-hosted server. GitLab CI integrates well with our current testing infrastructure and does not affect the current public GitHub CI.

One consideration, however: regression tests for an ML framework can be very different from traditional ones, which require every input and output to stay strictly unchanged. Restrictions could be added where needed, or the user could instead be prompted about a change when it is not fatal.
The running procedure can be arbitrary (C++, Python, R, or any combination of those routines), and the reporting mechanism is tightly tied to the CI tool adopted, so it is really up to your design decision where to put restrictions or suggestions on regression changes.
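
The idea of prompting instead of failing when a change is not fatal could be sketched as a two-threshold comparison. This is only an illustration under assumed names and tolerances (`warn_rel_tol`, `fail_rel_tol`); none of it comes from the PR.

```python
import math


def compare_value(expected: float, actual: float,
                  warn_rel_tol: float = 1e-9,
                  fail_rel_tol: float = 1e-6,
                  abs_tol: float = 0.0) -> str:
    """Return 'ok', 'warn' or 'fail' for one numeric output.

    'ok'   : within the tight tolerance, treated as unchanged.
    'warn' : drifted slightly, report it but do not fail the job.
    'fail' : drifted beyond the loose tolerance.
    """
    if math.isclose(expected, actual, rel_tol=warn_rel_tol, abs_tol=abs_tol):
        return "ok"
    if math.isclose(expected, actual, rel_tol=fail_rel_tol, abs_tol=abs_tol):
        return "warn"
    return "fail"


def worst_status(statuses: list) -> str:
    """Collapse many per-value statuses into the most severe one."""
    order = {"ok": 0, "warn": 1, "fail": 2}
    return max(statuses, key=order.__getitem__, default="ok")
```

When expected values can be exactly zero, a non-zero `abs_tol` is needed, because a purely relative tolerance never accepts a difference from zero.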

TL;DR

GitLab is just a proposal; it does not affect the current GitHub CI procedure.
Please consider adding Regression tests.

jameslamb · 2021-05-01T23:07:47Z

The intention of this PR is to propose what LightGBM seems to be lacking: regression tests that ensure LightGBM produces the same outputs (intermediate results, models, predictions, etc.) across commits or releases.

Thanks for the clarification.

Even though there isn't a separate directory in this project called regression_tests, many of the project's tests meet that definition...they are intended to catch breaking changes. For example, consider these tests that check for exact measures of metrics from training on a fixed dataset:

LightGBM/R-package/tests/testthat/test_basic.R

Lines 172 to 173 in 26cde5f

    
expect_true(abs(bst$lower_bound() - 0.1513859) < TOLERANCE)
expect_true(abs(bst$upper_bound() - 0.9080349) < TOLERANCE)

LightGBM/R-package/tests/testthat/test_basic.R

Lines 572 to 573 in 26cde5f

expected_error <- 0.6931268

expect_true(abs(bst$eval_train()[[1L]][["value"]] - expected_error) < TOLERANCE)

We'd welcome a discussion about specific types of tests that the project is missing today, and then small, focused pull requests to add them one at a time.

Could you please close this pull request and open a feature request at https://github.com/microsoft/LightGBM/issues describing the types of tests that you believe LightGBM is missing?

Please be as specific as possible, for example:

tests that a model file written in version X can be used in version Y
tests that a Dataset file written by version X can be read by version Y
etc.
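
As one possible shape for the first kind of test (a model file written by version X still works in version Y), here is a minimal sketch. It assumes a model file and its predictions were stored by the older version and are re-checked by the newer one. The harness takes the loader and predictor as parameters; the JSON linear model below is only a stand-in so the example runs on its own, not LightGBM's model format or API.

```python
import json
import math
from pathlib import Path


def check_model_file_compat(model_path, load_model, predict,
                            inputs, expected, rel_tol=1e-9):
    """Load a stored model file and compare predictions with stored ones.

    Returns a list of (row_index, expected, actual) for every mismatch,
    so an empty list means the old file still behaves the same.
    """
    if len(inputs) != len(expected):
        raise ValueError("inputs and expected must have the same length")
    model = load_model(model_path)
    mismatches = []
    for i, (row, want) in enumerate(zip(inputs, expected)):
        got = predict(model, row)
        if not math.isclose(want, got, rel_tol=rel_tol):
            mismatches.append((i, want, got))
    return mismatches


# Stand-in model format, for illustration only: a JSON linear model.
def load_json_linear_model(path):
    return json.loads(Path(path).read_text())


def predict_linear(model, row):
    return model["bias"] + sum(w * x for w, x in zip(model["weights"], row))
```

In a real test the two stand-in functions would be replaced by the library's own loading and prediction calls, with the model file and expected predictions checked in as fixtures.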

This pull request in its current state needs significant changes before it is likely to be accepted by maintainers, so I think this topic would benefit from some discussion first.

StrikerRUS · 2021-06-16T20:44:14Z

I agree with @jameslamb. There is no doubt that regression tests are very important. However, according to the provided definition,

Regression testing (rarely non-regression testing) is re-running functional and non-functional tests to ensure that previously developed and tested software still performs after a change.

such tests can be written in quite different formats, not necessarily as special GitLab jobs.

I believe we already have regression tests for
metrics,

LightGBM/tests/python_package_test/test_engine.py

Line 169 in 4530ded

assert ret < 0.005

LightGBM/tests/python_package_test/test_engine.py

Line 2662 in 4530ded

assert log_loss(y_train, y_pred) < 0.661

internal model structure,

LightGBM/tests/python_package_test/test_basic.py

Lines 47 to 48 in 4530ded

    
assert bst.lower_bound() == pytest.approx(-2.9040190126976606)
assert bst.upper_bound() == pytest.approx(3.3182142872462883)

printed outputs

LightGBM/tests/python_package_test/test_utilities.py

Line 97 in 4530ded

assert "\n".join(actual_log_wo_gpu_stuff) == expected_log

All these asserts check that LightGBM performs the same way it did in the previous commit. I guess this is what regression tests are about.
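
For the printed-output kind of check, the log usually has to be normalized before an exact comparison. The sketch below is not the code in test_utilities.py; the `[GPU]` marker and the `N seconds` timing format are hypothetical examples of lines that vary between machines or runs.

```python
import re

# Hypothetical pattern for run-dependent timings such as "0.123 seconds".
_TIMING = re.compile(r"\d+(?:\.\d+)? seconds")


def normalize_log(raw_log: str, drop_markers: tuple = ("[GPU]",)) -> str:
    """Drop environment-specific lines and mask timings in a captured log."""
    kept = []
    for line in raw_log.splitlines():
        if any(marker in line for marker in drop_markers):
            continue  # skip lines that only appear on some machines
        kept.append(_TIMING.sub("<time> seconds", line.rstrip()))
    return "\n".join(kept)
```

The normalized text can then be compared with a stored expected log using a plain equality assert.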

I believe we should add more such tests with hardcoded expected results, make them stricter (for metrics on test datasets, for example), and pay more attention to PRs which change those values (#4349). Contributions are very welcome 😉 !

Sorry, but I think this PR in its current state is not going to be merged.

jameslamb · 2021-06-25T05:25:36Z

I've created two issues to track specific pieces of work that I think are relevant to the discussion in this pull request.

[ci] regression tests: binary Dataset format #4406: regression tests on binary dataset files
[ci] regression tests: model files #4407: regression tests on model files

Since it has been almost two months since #4228 (comment) with no comment from non-maintainers on this pull request, I am closing this pull request and locking the conversation.

For anyone arriving at this discussion who is interested in contributing such regression tests, please comment on #4406 / #4407 or open a new issue.

cyfdecyf and others added 30 commits March 21, 2021 17:57

[python-package] create Dataset from sampled data.

250b882

[python-package] create Dataset from List[Sequence].

43c0a20

1. Use random access for data sampling 2. Support read data from multiple input files 3. Read data in batch so no need to hold all data in memory

[python-package] example: create Dataset from multiple HDF5 file.

265ae97

fix: revert is_class implementation for seq

a8bc7d9

fix: unwanted memory view reference for seq

1c31b64

fix: seq is_class accepts sklearn matrices

2b1bd95

fix: requirements for example

744fee3

fix: pycode

8c38451

feat: print static code linting stage

10bd79f

fix: linting: avoid shell str regex conversion

8663b71

code style: doc style

27544d3

code style: isort

fcd637f

fix ci dependency: h5py on windows

771198b

[py] remove rm files in test seq

ae7b18d

microsoft#4089 (comment)

docs(python): init_from_sample summary

1990980

microsoft#4089 (comment)

ci(regression-test): and gitlab ci

ecb6a10

ci(gitlab-ci): change build docker image for gitlab ci

49c5075

ci(gitlab-ci): change docker image

d714757

ci(gitlab): use custom docker image

cb3d8a8

docs(regression): rename README

2ea720d

ci(gitlab): try fix build missing submodule

6fa3278

ci(regression): fix color on gitlab ci

66ec3ca

docs(test): Add tips for regression

60a3f7b

ci(regression): no remove data file option

2646b6c

ci(regression): add sample test script

569b95c

ci(regression): unignore data for test

05b398a

ci(gitlab): fix ansi color output

7906fc9

ci(regression): add bin example data test

f1fbcf8

validation dataset ignored

ci(gitlab): make install python

695f960

try remove git submodules

ci(gitlab): fix python install

af7c340

Willian-Zhang and others added 7 commits April 23, 2021 16:59

ci(gitlab): try fix cannot find lgbm after build

a6e2064

ci(gitlab): try fix artifacts not found err

eec226c

ci(gitlab): try fix artifacts not found err 3

b7819bd

ci(gitlab): try fix regression build keep conda

96704db

ci(gitlab): try fix test after build dep by extend

b822cf3

ci(gitlab): show sha1 for diff result

5f63feb

ci(regression): fix regression data precision

839ee4e

Willian-Zhang requested review from btrotta, chivee, guolinke, henry0312, jameslamb, shiyu1994, StrikerRUS and wxchan as code owners April 26, 2021 08:06

jameslamb requested changes Apr 26, 2021


jameslamb added in progress awaiting response labels Apr 26, 2021

This was referenced Jun 25, 2021

[ci] regression tests: binary Dataset format #4406

Closed

[ci] regression tests: model files #4407

Closed

jameslamb closed this Jun 25, 2021

microsoft locked as resolved and limited conversation to collaborators Jun 25, 2021

jameslamb removed the in progress label Aug 13, 2023
