"MEMLEAK" indications in several acme_developer tests on edison #1636
Comments
Last night I looked at the memory values for one of the tests. On edison, the first day's value is lower than the rest, and then the values stabilize. On Cori, they are all fairly even. I was going to make some plots, but haven't had time yet. I also haven't had time to look closely at the formula to see how it decides, but it's possible that the first day is what's causing it to think there is a memleak. Would an acceptable solution be to ignore the first day's memory measurement?
Yeah, if it's comparing the first day with later days, that needs to change. It should not even try to diagnose a leak unless there are at least 2 days of integration.
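The suggestion in the two comments above could be sketched roughly as follows. This is only an illustration, not CIME's actual memleak check: the function name `detect_memleak`, the `(model_day, mem_mb)` sample layout, the 10% tolerance, and the reading of "at least 2 days" as "at least two samples left after dropping the first day" are all assumptions.

```python
def detect_memleak(samples, tolerance=0.1, skip_first=True):
    """Hypothetical memleak check on per-day memory samples.

    samples: list of (model_day, mem_mb) tuples in chronological order.
    Returns None when there is too little data to diagnose,
    otherwise True (leak suspected) or False.
    """
    # Drop the first day, whose value may reflect start-up rather than steady state.
    usable = samples[1:] if skip_first else samples

    # Assumption: do not diagnose with fewer than two usable days.
    if len(usable) < 2:
        return None

    first_mem = usable[0][1]
    last_mem = usable[-1][1]
    if first_mem <= 0:
        return None  # cannot compute a relative growth

    # Relative growth between the first and last usable day.
    growth = (last_mem - first_mem) / first_mem
    return growth > tolerance


# Made-up numbers shaped like the edison case: low first day, then stable.
edison_like = [(1, 500.0), (2, 900.0), (3, 901.0), (4, 902.0)]
print(detect_memleak(edison_like))                    # first day ignored
print(detect_memleak(edison_like, skip_first=False))  # first day included
```

With the first day included, the start-up jump alone exceeds the tolerance; with it ignored, the remaining days are nearly flat.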
New match attribute that can be 'last' or 'first' for values match in component.py

Currently there is confusion as to how matches are found for multiple <value> elements in a <values> node.

component.py currently uses a matching algorithm that picks the last match when multiple matches are found. This matching algorithm is used any time a Component object is instantiated (currently occurs in config_component.xml). By default, if the match attribute DOES NOT appear, the last match will be used, to keep things backwards compatible.

namelist_definition_<component>.xml uses the entry_id.py matching algorithm, which picks the first match when multiple matches are found. So for setting namelists, the first match is picked.

This PR adds a new, optional attribute to the <entry> element in EITHER a config_component.xml, config_compset.xml or namelist_definition_<component>.xml file.

    <entry id="<name>">
      <values match="last">   will pick the last best match
      <values match="first">  will pick the first best match
        <value>...</value>
        <value>...</value>
      </values>
    </entry>

As a result, there is new flexibility and transparency in how matches are determined in component.py, because the match attribute can be set to 'first' or 'last'. Making this explicit keeps developers from wrongly assuming a 'first' or 'last' match.

This capability has been added to the _get_value_match routine in BOTH entry_id.py AND component.py. However, the default values differ:
the default "match" value in entry_id.py is "first"
the default "match" value in component.py is "last"

Having these default values differ preserves backwards compatibility when the "match" attribute is not there. Moving forward, it would be good to always have a "match" attribute. The new match="last" attribute has been added to the config_component.xml files of all the data components and to config_component_cesm.xml and config_component_acme.xml.
Test suite: scripts_regressions_tests; also verified that running the prealpha and prebeta tests on cheyenne, with just namelist comparisons, resulted in identical namelists when compared to cesm2_0_alpha06m
Test baseline: cesm2_0_alpha06m for cesm
Test namelist changes: None
Test status: bit for bit
Fixes ESMCI/CIME issue 1617
User interface changes?: New match attribute elements that are children of <entry> nodes.
Code review: gold2718
This is a bug and will be fixed by #1639.
Swap mem highwater and usage logs for memleak tests

This is needed for correct parsing of mem highwater values from cpl.log. Previously, memory resident set size was getting parsed while checking for memleaks.

Fixes #1636
[BFB]

* origin/azamat/mem-usage/swap-highwater:
  Swap mem highwater and usage logs for memleak tests
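To illustrate why the order of the two values matters to a parser, here is a minimal sketch that picks the highwater value by its label instead of by its position in the line. It is not the CIME parser: the shape of the sample log lines, the regular expressions, and the name `parse_highwater` are assumptions made for the example, and real cpl.log lines may look different.

```python
import re

# Assumed line shape (hypothetical, for illustration only):
#   memory_write: model date = 00010102 0 memory = 1234.50 MB (highwater) 567.25 MB (usage) (pe= 0)
MEM_LINE = re.compile(r"model date =\s*(?P<date>\d+).*?memory =\s*(?P<rest>.*)")
VALUE = re.compile(r"(?P<mb>\d+(?:\.\d+)?)\s*MB\s*\((?P<kind>highwater|usage)\)")


def parse_highwater(lines):
    """Return (model_date, highwater_mb) pairs from memory lines.

    Values are selected by their '(highwater)' label, so the result does
    not depend on whether highwater or usage is printed first.
    """
    results = []
    for line in lines:
        match = MEM_LINE.search(line)
        if not match:
            continue  # not a memory line
        values = {
            v.group("kind"): float(v.group("mb"))
            for v in VALUE.finditer(match.group("rest"))
        }
        if "highwater" in values:
            results.append((match.group("date"), values["highwater"]))
    return results


sample = [
    "memory_write: model date = 00010102 0 memory = 1234.50 MB (highwater) 567.25 MB (usage) (pe= 0)",
    "memory_write: model date = 00010103 0 memory = 567.25 MB (usage) 1234.50 MB (highwater) (pe= 0)",
    "some unrelated coupler output",
]
print(parse_highwater(sample))
```

A parser that simply takes the first number after `memory =` would return the resident set size for one of these orderings, which is the kind of mix-up the commit message describes.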
I ran acme_dev on edison with next, and all of the MEMLEAKs are now PASSes.
Correct parsing of mem highwater values from cpl.log. Previously, memory resident set size was getting parsed while checking for memleaks. [BFB] Fixes #1636
Correct parsing of mem highwater values from cpl.log. Previously, memory resident set size was getting parsed while checking for memleaks. [BFB] Fixes E3SM-Project/E3SM#1636
For a while now, the "smallville" test has been issuing a MEMLEAK warning on several platforms, and it was suggested that we ignore it. But in the last month or so, I have seen 5 new MEMLEAKs in acme_developer tests, and only on edison. I don't see these warnings on cori-haswell or cori-knl. I also tried the intel v17 compiler on edison and see the same thing (edison currently uses intel v15).
It seems unlikely that there really is a memory leak, so this could be an issue with how the memory is reported or with how the decision to flag a MEMLEAK is made. I'm just documenting it here.
Rob suggested that the following PR may have changed some things with respect to this. I am not sure yet, but the timing of it matches.
#1532