Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leaks and other issues identified by the code sanitizer #159

Merged
merged 14 commits into from
Nov 10, 2022

Conversation

yantosca
Copy link
Contributor

Using the -DSANITIZE=y CMake option (which invokes code sanitization) has identified several memory leaks and similar issues in HEMCO code. These issues can be caused by:

  • Arrays (or array fields of objects) that are allocated but not deallocated
  • Attempting to store a number in a variable that cannot accommodate it (i.e. trying to store 1e150 in a REAL*4 variable)
  • Attempting to store a value of the wrong type into a variable
  • etc.

This PR should be merged along with geoschem/GCClassic#24 (which adds the proper compiler flags for the -DSANITIZE=y option) and geoschem/geos-chem#1353 (which removes memory leaks etc. in the GEOS-Chem code).

For now, tagging @lizziel @msulprizio @Jourdan-He @jimmielin

src/Core/hco_clock_mod.F90
- Several pointer fields were added to the HEMCO Clock object but
  were never deallocated.  These include
  - Clock%ThisLocYear
  - Clock%ThisLocMonth
  - Clock%ThisLocDay
  - Clock%ThisLocWD
  - Clock%ThisLocHour
  We now deallocate and nullify these fields in HcoClock_Cleanup
  so as to avoid memory leaks.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca
Copy link
Contributor Author

Also tagging @christophkeller

@yantosca yantosca added category: Feature Request New feature or request topic: Build Related to makefiles or the build sequence topic: Performance Related to HEMCO performance, parallelization, or memory issues labels Aug 16, 2022
@yantosca yantosca added this to the 3.5.1 milestone Aug 16, 2022
src/Extensions/hcox_dustdead_mod.F
src/Extensions/hcox_distginoux_mod.F90
src/Extensions/hcox_finn_mod.F90
src/Extensions/hcox_gc_POPs_mod.F90
src/Extensions/hcox_gc_RnPbBe_mod.F90
src/Extensions/hcox_gfed_mod.F90
src/Extensions/hcox_volcano_mod.F90
- Move the deallocation statements for fields of Inst outside of
  the IF ( ASSOCIATED( PrevInst ) ) block.  This was causing memory
  leaks in GEOS-Chem Classic, where there is only one instance
  (and PrevInst was set to NULL).  This caused fields of Inst never
  to be deallocated.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca added memory topic: Structural Modifications Related to HEMCO structural modifications (as opposed to scientific updates) and removed topic: Performance Related to HEMCO performance, parallelization, or memory issues labels Aug 18, 2022
@yantosca
Copy link
Contributor Author

Also see related issue #159

Continued the process of removing memory leaks from the prior commit
(#1efde62) in the InstRemove routines of all HEMCO extensions.

In particular:
(1) Pulled array deallocations out of the IF ( ASSOCIATED( PrevInst ) )
    block.  In GEOS-Chem Classic w/ HEMCO, PrevInst is always NULL
    since there is only one instance in which HEMCO runs, which
    leads to memory leaks.
(2) Now made array deallocations easier to read with IF/THEN/ENDIF blocks
(3) Nullify PrevInst and Inst pointers before exiting the routine
(4) Added/updated comments
(5) Also updated InstRemove in hcox_custom_mod.F90 and i
    hcox_template_mod.F90x so that people can follow these examples
    to make new HEMCO extensions w/o incurring memory leaks.

Signed-off-by: Bob Yantosca <[email protected]>
We have added a new CMake cache variable called SANITIZE.  When
configuring the HEMCO standalone with -DSANITIZE=y, this will add
"-fsanitize=address -fsanitize=leak -fsanitize=undefined" compiler
and linker flags to the build process.

Signed-off-by: Bob Yantosca <[email protected]>
src/Core/hco_diagn_mod.F90
- In routine DiagnCollection_Cleanup, we need to deallocate the memory
  assigned to ThisColl before pointing to NextColl

src/Core/hco_extlist_mod.F90
- In HCO_CleanupOpt, use a DEALLOCATE statement to free the memory
  from ThisOpt instead of a NULLIFY statement (which just sets
  ThisOpt to NULL without freeing any memory)

src/Core/hco_vertgrid_mod.F90
- Also deallocate the zGrid object and not just zGrid%Ap and zGrid%Bp

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca requested a review from Jourdan-He August 22, 2022 17:51
src/Core/hco_filedata_mod.F90
- Routine FileData_Init was rewritten to remove the local varaible
  NewFDta.  The prior code allocated NewFDta, and then set all of its
  fields to 0/False/NULL, and then pointed FileDta to NewFDta.  This
  caused a memory leak because the memory was never freed from the
  NewFDta object before exiting the subroutine.  We now operate on the
  FileDta argument directly, thus avoiding the memory leak.
- Also set FileDta => NULL() immediately before allocation.  This
  allows us to remove the Dta => NULL() in Config_ReadCont.

src/Core/hco_datacont_mod.F90
- In routine DataCont_Cleanup, make sure that memory is freed from
  sub-fields of the Dct object before deallocating Dct.

src/Core/hco_config_mod.F90
- In routine Config_ReadCont, we have removed the Dta => NULL()
  statements before calling FileData_Init.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca removed the request for review from Jourdan-He August 23, 2022 13:52
We have to remove the deallocation of Dct%Data from within the
IF ( DeepClean ) block of DataCont_Cleanup, as this was causing
a double-free error (i.e. trying to free a pointer that has been
already freed).

Signed-off-by: Bob Yantosca <[email protected]>
Instead of allocating memory to the NewLct object (which is never freed),
we operate directly on the Lct argument.  This will prevent a memory
leak.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca modified the milestones: 3.5.1, 3.6.0 Sep 16, 2022
@msulprizio msulprizio changed the base branch from dev to main October 18, 2022 17:30
@msulprizio msulprizio changed the base branch from main to dev/3.6.0 October 18, 2022 17:33
@lizziel
Copy link
Contributor

lizziel commented Nov 4, 2022

This is great! Were you able to estimate the size of memory leak from the log before you added in the fixes?

@yantosca yantosca marked this pull request as ready for review November 4, 2022 20:16
@yantosca yantosca changed the title [WIP] Fix memory leaks and other issues identified by the code sanitizer Fix memory leaks and other issues identified by the code sanitizer Nov 4, 2022
@yantosca
Copy link
Contributor Author

yantosca commented Nov 4, 2022

I am currently running integration tests with these updates merged up to HEMCO 3.5.1. I can create a new PR.

src/Extensions/hcox_gfed_mod.F90
- The Inst%GFED_EMFAC pointer was being deallocated in the InstRemove
  routine.  However, this just points to Inst%GFED_EMFAC4 and is
  never allocated.  The deallocation had caused a "double free or
  corruption" error, which means we are trying to deallocate a pointer
  (or array) that was never allocated (or has already been deallocated).
  The fix is to simpliy nullify Inst%GFED_EMFAC instead of trying
  to nullify it.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca
Copy link
Contributor Author

yantosca commented Nov 7, 2022

I have done a parallelization test with GEOS-Chem Classic using 8 cores vs. 5 cores. The results show that the sanitize updates are zero-diff:

Using configuration file rst.yml
... Printing totals and differences
... Only showing variables with |absolute difference| > 0.0
Variable                 Ref=5cores             Dev=8cores                Dev - Ref
```

Added to unreleased 3.5.2 section

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca
Copy link
Contributor Author

Also updated the CHANGELOG.md

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
src/Extensions/hcox_dustdead_mod.F Outdated Show resolved Hide resolved
Following the suggestion by @lizziel, that this is more of a fix
than a change.

Signed-off-by: Bob Yantosca <[email protected]>
@yantosca yantosca changed the base branch from dev/3.6.0 to dev/3.5.2 November 10, 2022 18:14
CHANGELOG.md
- Add text about sanitizer option for HEMCO standalone (under "Added")
  for version 3.5.2.
- Replace TAB characters with spaces (6 instance)
- Bullet points start w/ an indent of 2 chars

Signed-off-by: Bob Yantosca <[email protected]>
src/Extensions/hcox_dustdead_mod.F
- Removed code that was commented out with "this is not used" tags
  in the finalization routine

CHANGELOG.md
- Added a description of this update under "Changed" for 3.5.2.

Signed-off-by: Bob Yantosca <[email protected]>
Copy link
Contributor

@lizziel lizziel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Copy link
Contributor

@msulprizio msulprizio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These updates look good. I wonder if documentation should also be updated here for the sanitize option?

@yantosca yantosca merged commit ee75832 into dev/3.5.2 Nov 10, 2022
@yantosca yantosca deleted the feature/sanitize branch November 10, 2022 19:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: Feature Request New feature or request topic: Build Related to makefiles or the build sequence topic: Structural Modifications Related to HEMCO structural modifications (as opposed to scientific updates)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants