Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SANITIZE CMake option to check for memory leaks and undefined behavior #24

Merged
merged 7 commits into from
Nov 18, 2022

Conversation

yantosca
Copy link
Contributor

The GNU C and Fortran compilers allow you to build code sanitation functionality into an executable. This will add several library function calls to an executable that check for memory leaks and other such types of errors.

In CMakeScripts/GC-ConfigureClassic.cmake, we have added SANITIZE as a new cache variable. When configuring with -DSANITIZE=y, this will add -fsantize=address -fsantize=leak -fsanitize=undefined to both the compiler and linker flags. This will compile the code sanitation functionality into the GEOS-Chem Classic executable.

Using -DSANITIZE=y with Intel compilers will cause CMake to throw an error, as this functionality is not supported in Intel compilers (at least I think not).

NOTE: The code sanitation causes GEOS-Chem Classic to run much more slowly. Therefore it is best to only use this new option on short debugging runs.

With -DSANITIZE=y, you will get output similar to this to the stderr stream:

=================================================================
==26387==ERROR: LeakSanitizer: detected memory leaks
. . . etc . . .

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x2b77724e48ff in __interceptor_malloc /tmp/ryantosca/spack-stage/spack-stage-gcc-11.2.0-hbazyfn6aayq2jlpog7uck7w5xa5nmm4/spack-src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x3db68fd in __hcox_seasalt_mod_MOD_hcox_seasalt_init /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/HEMCO/src/Extensions/hcox_seasalt_mod.F90:1083
    #2 0x36dc202 in __hcox_driver_mod_MOD_hcox_init /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/HEMCO/src/Extensions/hcox_driver_mod.F90:278
    #3 0xbdb561 in __hco_interface_gc_mod_MOD_hcoi_gc_init /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/GeosCore/hco_interface_gc_mod.F90:657
    #4 0x81daa1 in __emissions_mod_MOD_emissions_init /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/GeosCore/emissions_mod.F90:117
    #5 0x40b719 in geos_chem /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:732
    #6 0x415874 in main /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:32
    #7 0x2b77751c1554 in __libc_start_main (/lib64/libc.so.6+0x22554)

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x2b77724e48ff in __interceptor_malloc /tmp/ryantosca/spack-stage/spack-stage-gcc-11.2.0-hbazyfn6aayq2jlpog7uck7w5xa5nmm4/spack-src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x32aca51 in __input_opt_mod_MOD_set_input_opt /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/Headers/input_opt_mod.F90:725
    #2 0x408365 in geos_chem /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:313
    #3 0x415874 in main /local/ryantosca/GC/rundirs/epa-kpp/san_dev/CodeDir/src/GEOS-Chem/Interfaces/GCClassic/main.F90:32
    #4 0x2b77751c1554 in __libc_start_main (/lib64/libc.so.6+0x22554)

SUMMARY: AddressSanitizer: 23642181 byte(s) leaked in 3693 allocation(s).

you can then go to the line numbers that are cited and attempt to fix the memory leaks or other issues that are brought up.

…m debug

We have added -g to the list of default compiler options for both
Intel Fortran and GNU Fortran.  This will help users to run or view the
gcclassic executable in a debugger without having to recompile with
-DCMAKE_BUILD_TYPE=Debug.

Also removed the -gdwarf-2 -gstrict-dwarf from the list of gfortran
debugging compiler options.  This forced using the DWARF-2 standard,
which has now been replaced by more modern versions.

Signed-off-by: Bob Yantosca <[email protected]>
CMakeScripts/GC-ConfigureClassic.cmake
- Add new SANITIZE cache variable for GNU Fortran only.  If activated
  it will set -fsanitize=address -fsanitize=leak -fsanitize=undefined
  compiler and linker flags.  This will enable checking for memory
  leaks, memory address faults, and code producing undefined behavior
  according to the Fortran standard.  (NOTE: This will make code run
  more slowly so it should not be used in production runs).
- SANITIZE will throw an error when used with compilers other than
  GNU Fortran.  This is because the sanitization options are only
  included in GNU Fortran but not in Intel Fortran.

CMakeLists.txt
- Updated CACHE STRING values to reflect the proper CMAKE_BUILD_TYPE
  values: Release, RelWithDebInfo, Debug.  (Uses proper capitalization)

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca added category: Feature Request New feature or request topic: Build Related to CMake and/or the build sequence labels Aug 16, 2022
@yantosca yantosca added this to the 14.1.0 milestone Aug 16, 2022
@yantosca yantosca self-assigned this Aug 16, 2022
@yantosca
Copy link
Contributor Author

NOTE: I will open a new branch in geoschem/geos-chem where we attempt to fix the several memory leaks that were identified. This can all go into 14.1.0.

@yantosca
Copy link
Contributor Author

Also tagging @kilicomu, from whom I found out about this feature.

@yantosca
Copy link
Contributor Author

The sanitation will also flag runtime issues such as:

/path/to/GCClassic/src/GEOS-Chem/Headers/dictionary_m.F90:79:31: runtime error: signed integer overflow: 193452388 * 33 cannot be represented in type 'integer(kind=4)'

which can be fixed by using an integer(kind=8) (8-byte integer, aka integer*8).

CMakeScripts/GC-ConfigClassic.cmake
- We now use target_link_libraries instead of link_libraries
  and target_compile_options instead of add_compile_options
  with the SANITIZE option.  The target_* commands only apply
  to the target whereas the other commands will be applied to
  all directories in the build process (which might lead to
  side-effects).

Signed-off-by: Bob Yantosca <[email protected]>
@msulprizio msulprizio changed the base branch from dev to main October 18, 2022 17:07
Copy link
Contributor

@lizziel lizziel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good except that I'm confused by the change in location of -g. See my comments on the relevant lines of code.

CMakeLists.txt Outdated Show resolved Hide resolved
CMakeLists.txt Outdated Show resolved Hide resolved
@yantosca yantosca removed the request for review from LiamBindle November 4, 2022 20:20
@yantosca
Copy link
Contributor Author

yantosca commented Nov 4, 2022

I am currently running GCC and GCHP integration tests with the sanitize updates merged up to GC 14.0.1 and HEMCO 3.5.1. I can create a new PR, as @lizziel has requested that we merge these fixes into GC 14.0.2, if they prove to be zero-diff (which I believe they are).

@msulprizio
Copy link
Contributor

@yantosca In addition to doing diff tests to ensure zero diffs, could you also see if these updates have an impact on model performance? I'm wondering if the -g flag above slows down the model at all. A simple comparison of the timers printed out at the end of the log files should be sufficient.

@kilicomu
Copy link

kilicomu commented Nov 7, 2022

In theory, -g shouldn't make a difference to model performance, assuming we still use the same optimization flags that we would otherwise use. The use of optimization flags can make the debugging experience a little weird, though. See the GCC docs:

GCC allows you to use -g with -O. The shortcuts taken by optimized code may occasionally be surprising: some variables you declared may not exist at all; flow of control may briefly move where you did not expect it; some statements may not be executed because they compute constant results or their values are already at hand; some statements may execute in different places because they have been moved out of loops.

Definitely worth a quick timing test!

Also, @yantosca it might be worth switching from -g -O0 to -g -Og in the debug releases! You will get at least some optimisation and also get full debugging capabilities:

Optimize debugging experience. -Og should be the optimization level of choice for the standard edit-compile-debug cycle, offering a reasonable level of optimization while maintaining fast compilation and a good debugging experience. It is a better choice than -O0 for producing debuggable code because some compiler passes that collect debug information are disabled at -O0.

Like -O0, -Og completely disables a number of optimization passes so that individual options controlling them have no effect. Otherwise -Og enables all -O1 optimization flags except for those that may interfere with debugging:

-fbranch-count-reg -fdelayed-branch
-fdse -fif-conversion -fif-conversion2
-finline-functions-called-once
-fmove-loop-invariants -fmove-loop-stores -fssa-phiopt
-ftree-bit-ccp -ftree-dse -ftree-pta -ftree-sra

@yantosca yantosca changed the base branch from main to dev/14.0.2 November 7, 2022 14:44
@yantosca
Copy link
Contributor Author

yantosca commented Nov 7, 2022

I will revert the compilation flag updates. We can add them into a separate PR which will allow for more in-depth testing. I'll create a new feature request.

CMakeLists.txt
- Remove -g from GEOSChem_Fortran_FLAGS_{GNU,Intel}
- Add -g to GEOSChem_Fortran_FLAGS_DEBUG_{GNU,Intel}

Signed-off-by: Bob Yantosca <[email protected]>
This commit informs the GCClassic superproject about the following
commits that were pushed to the Github geoschem/HEMCO repository:

5be2f32 Bug fix: Nullify Inst%GFED_EMFAC instead of deallocating it
2d56ee9 Merge branch 'main' into feature/sanitize+3.5.1
7ed469e Prevent memory leaks in routine ConfigList_AddCont (hco_config_mod.F90)
04f3278 Remove deallocation statement in hco_datacont_mod.F90
d53a5b0 Remove memory leak in hco_datacont_mod.F90 and hco_filecont_mod.F90
a9a14fc Fix memory leaks in HEMCO core routines
5120eae Add -DSANITIZE=y option to CMakeLists.txt for HEMCO standalone
7e1e171 Clean up memory leaks in extension InstRemove routines #2
1efde62 Clean up memory leaks in extension "InstRemove" routines
fd6f78f Free pointer fields of HcoState%Clock in routine HcoClock_Cleanup

Signed-off-by: Bob Yantosca <[email protected]>
This commit informs the GCClassic superproject about the following
commits that were pushed to the GitHub geoschem/geos-chem repository:

29ae8daf0 Merge branch 'main' into feature/sanitize+14.0.1
97eecb1d0 Remove extraneous USE statement reference to HistItem_Destroy
bd28bafe5 Fix memory leaks in Registry and History routines
ce9ea8118 Set TimeForEmis = .FALSE. for Emissions_Run phase 0 call
36547b875 Remove memory leaks in input_opt_mod.F90

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca added the no-diff-to-benchmark This update will not change the results of fullchem benchmark simulations label Nov 7, 2022
@yantosca
Copy link
Contributor Author

yantosca commented Nov 7, 2022

I have done a parallelization test (GEOS-Chem Classic) with 8 cores vs. 5 cores. The results show that the sanitize updates are zero-diff:

Using configuration file rst.yml
... Printing totals and differences
... Only showing variables with |absolute difference| > 0.0
Variable                 Ref=5cores             Dev=8cores                Dev - Ref
```

@yantosca
Copy link
Contributor Author

yantosca commented Nov 7, 2022

@lizziel @msulprizio I have reverted the compilation option update... -g is only used with -DCMAKE_BUILD_TYPE=Debug. We can investigate the impact of changing compilation options in a separate PR.

@yantosca yantosca removed this from the 14.1.0 milestone Nov 7, 2022
@yantosca yantosca added this to the 14.0.2 milestone Nov 7, 2022
Copy link
Contributor Author

yantosca commented Nov 9, 2022

I've ported the -g back to debug-only. We can check this out in a separate PR.

@yantosca yantosca modified the milestones: 14.0.2, 14.1.0 Nov 10, 2022
@yantosca
Copy link
Contributor Author

NOTE: These updates to add the -DSANITIZE=y configuration option (and compiler switches) can go into 14.1.0 as this is a new feature. The sanitized code can go into GEOS-Chem 14.0.2 and HEMCO 3.5.2.

@msulprizio msulprizio changed the base branch from dev/14.0.2 to dev/14.1.0 November 10, 2022 18:58
@msulprizio msulprizio merged commit 54a0055 into dev/14.1.0 Nov 18, 2022
@msulprizio msulprizio deleted the feature/sanitize branch November 18, 2022 19:41
yantosca added a commit that referenced this pull request Sep 3, 2024
This commit informs the GCClassic superproject about the following
commits that were pushed to the GitHub geoschem/geos-chem repository:

1951ea96e Declare PPN as photolyzing in the species database
40a078514 Merge PR #2424 (Support for Cloud-J_v8.0.0)
647c9f579 Merge PR #2426 (Store aerosol-specific optics files in new directory)

and to the geoschem/Cloud-J repository:

291a161 Update Cloud-J standalone settings to use UV absorption by H2O by default
6a027b1 Merge pull request #24 from geoschem/feature/cloudj_v8

These PRs move aerosol optics configuration files to a separate folder
(out of the CHEM_INPUTS/FAST-JX folder structure) and update Cloud-J
to version 8.0.0.

Signed-off-by: Bob Yantosca <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Feature Request New feature or request no-diff-to-benchmark This update will not change the results of fullchem benchmark simulations topic: Build Related to CMake and/or the build sequence
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants