-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolving static lifetime problems #674
Conversation
# Remove timer classes. Fixes #614 (partially) The classes `Timer`and `ScopedTimer` both use static data, and so need to be refactored as part of issue #614. However, they are currently unused in the model and additionally need to be made thread-safe and possibly MPI aware, depending on how the timings should be reported. This PR entirely removes the classes from the code base.
# Remove the static implementaion of NullIterator. Fixes #614 (partially)
# Pull Request Title Fixes #(your issue number) ### Task List - [ ] Defined the tests that specify a complete and functioning change (*It may help to create a [design specification & test specification](../../../wiki/Specification-Template)*) - [ ] Implemented the source code change that satisfies the tests - [ ] Documented the feature by providing worked example - [ ] Updated the README or other documentation - [ ] Completed the pre-Request checklist below --- # Change Description *Please write a summary of the change and specify which Issue this fixes. This should cover the motivation for making the change and provide almost enough context to review the change in isolation. While the issue description is a good place to discuss the failure/requirements, this is the ideal space for explaining why this design fixes the issue and why you designed it the way you did.* - *If there are important design decisions, particularly those that leverage functionality outside this change or were caused by existing limitations, they should be noted here.* - *If there is a design specification which relates to this PR, please provide a link to it here.* - *If the PR relates to an open issue, please link the issue to the PR by using the 'fixes #issue_number' syntax.* - *If the PR relates to several open issue, make sure to use 'fixes #issue_1, fixes #issue_2, etc.' syntax.* - *If there are any dependencies (third-party or internal downstream) which are added or impacted, please note them here. And if their limitations informed your design, please note how they did so here (particularly if this information was not captured in a design specification).* Text goes here --- # Test Description *Please describe the tests you have written that verify the enhancement works or that the bug fix has been remedied. For a bug fix, please ensure a complete description of the failure is available here or in the issue such that a reviewer can verify that the testing is suitable. This is a good place to highlight any manual testing you may have had to perform.* *Please describe any operating system limitations placed on the testing (if appropriate).* *If your tests fixes any issue, please explain that in the comment you provided in the test file by referencing the issue within the comment and please record the mapping here e.g.* - *testA fixes issue 404* - *testB fixes issue 999* Text goes here --- # Documentation Impact *Please describe any changes to the documentation not captured above or, if made in a seperate pull request, please link to the corresponding PR or issue.* Text goes here --- # Other Details *If you have run any static analysis, including complexity analysis, or run coverage testing which has not been captured by an automated tool, please link or copy it here.* Text goes here --- ### Pre-Request Checklist - [ ] The requirements of this pull request are fully captured in an issue or design specification and are linked and summarised in the description of this PR - [ ] No new warnings are generated - [ ] The documentation has been updated (or an issue has been created to track the corresponding change) - [ ] Methods and Tests are commented such that they can be understood without having to obtain additional context - [ ] This PR/Issue is labelled as a bug/feature/enhancement/breaking change - [ ] File dates have been updated to reflect modification date - [ ] This change conforms to the conventions described in the README *Please complete your pull request description and delete the instructional text before submitting.* *Congratulations and thank you for making your contribution to neXtSIM_DG!*
# Change ConfigMap static constants to an enum. Fixes #614 (partially) --- # Change Description Change static const ints to an enum.
# Remove the `FileCallbackCloser` static list. Fixes #614 (partially) --- # Change Description Replace with a singleton function providing a list. --- # Test Description The class unit test still passes.
# ParaGridO static data Fixes #614 (partially) --- # Change Description Moved the maps between netCDF concepts and `ModelArray` equivalents to `const` instance variables that are initialized as part of the ctor. Moved the static maps from file names to netCDF files and current time indices to a singleton-like function. Also combined the maps, to ensure they remain consistent. Added the `Finalizer` class to to what `std::atexit` does, except at our convenience, rather than executable clean-up. This includes a new test for the new class. This change is the same as that contained in PR #683 --- # Test Description Added a test for the new `Finalizer` class. The `ParaGrid_test` still passes, this mainly tests the parts that pass through to `ParaGridIO`.
issue614_datalifetimes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes greatly improve resource lifetime management. In fact, it is now possible to get useful information out of valgrind:
==41968== LEAK SUMMARY:
==41968== definitely lost: 512 bytes in 3 blocks
==41968== indirectly lost: 3,569,016 bytes in 24 blocks
==41968== possibly lost: 1,186,944 bytes in 21 blocks
==41968== still reachable: 110,717 bytes in 719 blocks
==41968== suppressed: 0 bytes in 0 blocks
Not bad :D. Two of the leaks are from raw pointers holding heap data in the dynamics and they can be addressed in a different issue. The third one is in Iterator
. I think this leak should be addressed before merging the changes.
My other comments are just for cosmetic code changes, so they are not essential.
For the big picture of resource management, a possible pitfall which remains is that modules have to be manually registered for destruction via Module::finalize
. That seems manageable but is there a reason why this should be done manually instead of automatically, e.g. on construction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @timspainNERSC. Just a few comment/questions from my side.
bool Finalizer::registerUnique(const FinalFn& fn) | ||
{ | ||
if (contains(fn)) | ||
return false; | ||
registerFunc(fn); | ||
return true; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason this function returns a bool
. Looks like it is mostly used like a void
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value indicates whether the function was successfully registered (true
) or not (false
). I this is useful enough functionality, given the minimal cost. Even if it is not currently used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my concern was, if you thought you were using it but you were not. Or if it was left in for some use case that we no longer need.
Do you think it would be more useful to raise an exception if the user tries to register an already registered function? Given the use in the rest of the code, my concern is that new developers are not likely to test the boolean return value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's there as a service if the developer wants to know if the function was already registered. How it is used currently, I don't care, so I don't check. If a future developer does care, then they can check.
@@ -46,6 +46,8 @@ ConfigurationHelp::HelpMap& ERA5Atmosphere::getHelpRecursive(HelpMap& map, bool | |||
|
|||
void ERA5Atmosphere::configure() | |||
{ | |||
Finalizer::registerUnique(Module::finalize<IFluxCalculation>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a thought... is there a way to force finalization without relying on the user to remember they have to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost certainly! But what does this use case look like? Which classes would be automatically finalized? I think this is a big enough piece of work on its own that is should be a separate issue where we could discuss the scope and implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My use case was more around the fact the someone may forget to register the module finalization. By default we could finalize everything at the end if it hasn't been manually triggered beforehand. Maybe I am missing something though. Happy to discuss in the next meeting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So let's say I forget to do this step. How bad is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the CPU version? It's what develop
is doing right now without this code, so no problem at all.
In the GPU version, my understanding is that not cleaning up data buffers before the GPU process finishes is a big headache. That is what a lot of the changes in this PR are aiming to address.
Thanks @timspainNERSC , I have approved from my side. I left two conversations "unresolved" just so we can find them easily in the next team meeting and discuss. No outstanding issues. Thanks for addressing my comments |
I thought that there were cases where registering the final methods at construction would not work. But I cannot identify any right now. This might be worth looking at in the (not to distant) future, but can be a separate issue. |
# memory leaks in dynamics Fixes #716 # Change Description With #674 integrated, all dynamic buffers held by modules should get freed before the program terminates. This update replaces raw pointers with `std::unique_ptr` for heap buffers that are used internally by the dynamics. # Test Description A manual test was done by running the dynamics benchmark for 1 time step with valgrind: ``` ==74249== LEAK SUMMARY: ==74249== definitely lost: 0 bytes in 0 blocks ==74249== indirectly lost: 0 bytes in 0 blocks ==74249== possibly lost: 7,600 bytes in 19 blocks ==74249== still reachable: 110,406 bytes in 717 blocks ==74249== suppressed: 0 bytes in 0 blocks ``` The setup that was used is not suited for an automated test as it is takes too long (init + 1 step on 32x32 grid takes upwards of 30mins).
Resolving static lifetime problems
Fixes #614
Change Description
Deleted classes
ScopedTimer
Timer
Edited classes
ConfigMap
Changed the
static const
type constants to anenum
.Configurator
Made the static data into singletonoid functions. The data in question are:
AdditionalConfiguration
objectThe command line values were combined into a struct to avoid passing more than one return value.
The pointer to the additional configuration was set to a
nullptr
initially, with null checks used when the pointer value is used.Configured
class templateNothing depends on the keyMap template member variable being a template member variable. Redefine each key map as
static
(compilation unit localized)const std::map
. Since many classes are configured, this affects many files. Only the source.cpp
files are affected.ConfiguredModule
&Module
Changes to
Module<>
Module
independent of Nextsim. It is now a separate Module base class that can be used for any C++ project.NextsimModule
class that specializes the newly genericModule
for nextSIM-DG.#include "Module.hpp"
with#include "NextsimModule.hpp"
.Module
namespace alongside theModule
template class. This string-to-string map is not available elsewhere. The data is stored as a function-localstatic
variable.Module<>
template. This static instance is now a function-localstatic
variable, so at least the construction order is well defined. The lifetime is still until the executable epilogue.DiagnosticOutputModule
class that inherited fromModule<IDiagnosticModule>
, everything now refers to the template instantiation. The…Module
class was necessary to run the register the module at static start-up. With this no longer necessary,…Module.hpp
) have been removedmodule_header.py
has been removedCMakeLists.txt
to run the header script has been removed.#include
s of the…Module.hpp
file have been replaced with an#include
ofNextsimModule.hpp
and an#include
of the interface header file.Module<>
. It now depends on the definition of two classes before it is#include
d:HelpMap
Config
which defines astd::string Config::getImpl(const std::string&)
function.Module<>
to be function-local statics. This requires a slight change to the API and a different format in the generatedmodule.cpp
files.fn
type is now a plain function pointer, rather than astd::function
object. This required replacing a lambda with an ad-hoc function in one test.namespace Module::InterfaceClass
.module_builder.py
to support the above changes.Module<>
template. This will call the destructor on the static instance, allowing all the data arrays the static instance holds to be deleted.Changes to
ConfiguredModule
ConfiguredModule
. Module configurations are now parsed on demand.main
to parse the configuration for the modules.setImplementation()
.ConfiguredModule_test
now only checks that the module configuration options are parsed correctly. Everything else is handeled by theModule<>
template class.Miscellaneous other changes
ModuleLoaderTestModules
. These are very defunct.IceGrowth_test
. This is so far from the Module changes, I don't see how it could have been working before.FileCallbackCloser
Replaced the static
std::list
with a singleton function providing astd::list
.Iterator
Removed the static implementation of
NullIterator
.MissingData
Replaced the
static double
missing data value with a singleton, inlined access function. The missing data value is created on first access, but still destroyed at the end of execution. This is fine, since it is a scalar double.Removed the
MissingData.cpp
source file as it is no longer needed.ModelComponent
registerModule
and the data members it uses. Also removehField()
and similar functions, which don't now do anything, and were originally planned to be used alongside the module register.ModelArrayRefStore
to be lazily allocated.ModelConfig
Moved the key specification to an
.ipp
file to avoid repeating the text inModel.cpp
.ParaGridIO
ModelArray
equivalents toconst
instance variables that are initialized as part of the ctor.Finalizer
class to to whatstd::atexit
does, except at our convenience, rather than executable clean-up. This includes a new test for the new class. This change is the same as that contained in PR Finalize classes and objects without usingstd::atexit
#683New classes
Finalizer
Finalize
class, which mimicsstd::atexit()
, but can be called before the end ofmain()
.Finalizer.cpp
source file to the CMake system.Finalizer
class.Module::finalize<>
function template.registerUnique()
each module instantiation is only called once when the finalization occurs.Test Description
Apart from deleted tests for deleted classes, all tests should still pass.
ModelComponent
Adds some test to
ModelArray_test
to check sizes around assignment operations.Finalizer
There is a test for the new
Finalizer
class.Documentation Impact
No changes to existing documentation. All classes should provided the same interfaces as previously.