Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ci] introduce CI jobs that mimic CRAN gcc-ASAN and clang-ASAN tests (fixes #4674) #4678

Merged
merged 15 commits into from
Oct 26, 2021
Merged
40 changes: 27 additions & 13 deletions .github/workflows/r_package.yml
Original file line number Diff line number Diff line change
Expand Up @@ -173,16 +173,29 @@ jobs:
$env:TASK = "${{ matrix.task }}"
& "$env:GITHUB_WORKSPACE/.ci/test_windows.ps1"
test-r-sanitizers:
name: r-package (ubuntu-latest, R-devel, GCC ASAN/UBSAN)
timeout-minutes: 60
name: r-package (ubuntu-latest, R-devel, ${{ matrix.compiler }} ASAN/UBSAN)
runs-on: ubuntu-latest
container: rhub/rocker-gcc-san
container: wch1/r-debug
timeout-minutes: 60
strategy:
fail-fast: false
matrix:
include:
################
# CMake builds #
################
- r_customization: san
compiler: gcc
- r_customization: csan
compiler: clang
env:
# env variables from CRAN's configuration: https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt
ASAN_OPTIONS: "detect_leaks=0:detect_odr_violation=0"
UBSAN_OPTIONS: "print_stacktrace=1"
RJAVA_JVM_STACK_WORKAROUND: 0
RGL_USE_NULL: true
R_DONT_USE_TK: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh great! I did also just see this conversation in that project's issues about keeping these images' configuration in sync with CRAN: wch/r-debug#21

Since it looks like the the current image is up to date with CRAN, I'll remove this configuration and check that these jobs still reproduce the issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed 6389d2a, which removes these env variables.

I intentionally did not merge the latest master into this branch, since it include a fix for the issues this test is supposed to catch (#4673).

If the next round of CI builds reproduces that issue, I'll update this PR's branch to the latest master.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha sorry, and cec4267. Realized I included an inaccurate comment copy-pasting from another part of the template.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally did not merge the latest master

Ahhhh, I was wondering why after pushing these changes, I couldn't replicate the test errors! But now I understand...the version of the code tested at CI will be the state of this branch merged into master, so just having #4673 on master is enough to make the tests in this PR now pass.

From the checkout task on https://github.com/microsoft/LightGBM/runs/3971480759?check_suite_focus=true

HEAD is now at d3763ec Merge eddfffd into d88b445

I'll update this to latest master and temporarily add a revert commit reverting #4673, just so we can test that the CI jobs are doing what they should.

I am still travelling though, so might not be able to return to this for for 2-3 days.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok yeah, seems to be working! As of the most recent state of this branch, these new CI jobs are reproducing the issues found in CRAN's clang-ASAN and gcc-ASAN checks.

clang-ASAN

link to failing job: https://github.com/microsoft/LightGBM/runs/3971678264?check_suite_focus=true

evidence from logs that ASAN and UBSAN are both being used.

clang++ -fsanitize=address,undefined -fno-sanitize=float-divide-by-zero -fno-sanitize=alignment -fno-omit-frame-pointer -frtti -std=gnu++11 -I"/usr/local/RDcsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -pthread -fpic -g -O0 -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o

gcc-ASAN

link to failing job: https://github.com/microsoft/LightGBM/runs/3971678238?check_suite_focus=true

evidence from logs that ASAN and UBSAN are both being used

g++ -fsanitize=address,undefined,bounds-strict -fno-omit-frame-pointer -std=gnu++11 -I"/usr/local/RDsan/lib/R/include" -DNDEBUG -I./include -DEIGEN_MPL2_ONLY -DMM_PREFETCH=1 -DMM_MALLOC=1 -DUSE_SOCKET -DLGB_R_BUILD -I/usr/local/include -DSWITCH_TO_REFCNT -pthread -fpic -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -O0 -Wall -Wall -pedantic -c lightgbm_R.cpp -o lightgbm_R.o

I'll add the changes from #4673 back in and mark this ready for review.

steps:
- name: Install Git before checkout
shell: bash
run: |
apt-get update --allow-releaseinfo-change
apt-get install --no-install-recommends -y git
- name: Checkout repository
uses: actions/[email protected]
with:
Expand All @@ -191,16 +204,17 @@ jobs:
- name: Install packages
shell: bash
run: |
Rscript -e "install.packages(c('R6', 'data.table', 'jsonlite', 'Matrix', 'testthat'), repos = 'https://cran.r-project.org', Ncpus = parallel::detectCores())"
RDscript${{ matrix.r_customization }} -e "install.packages(c('R6', 'data.table', 'jsonlite', 'Matrix', 'testthat'), repos = 'https://cran.r-project.org', Ncpus = parallel::detectCores())"
sh build-cran-package.sh
Rdevel CMD INSTALL lightgbm_*.tar.gz || exit -1
RD${{ matrix.r_customization }} CMD INSTALL lightgbm_*.tar.gz || exit -1
- name: Run tests with sanitizers
shell: bash
run: |
cd R-package/tests
Rscriptdevel testthat.R 2>&1 > ubsan-tests.log
cat ubsan-tests.log
exit $(cat ubsan-tests.log | grep --count "runtime error")
exit_code=0
RDscript${{ matrix.r_customization }} testthat.R >> tests.log 2>&1 || exit_code=1
cat ./tests.log
exit ${exit_code}
test-r-debian-clang:
name: r-package (debian, R-devel, clang)
timeout-minutes: 60
Expand Down