Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements towards figuring out #118 #120

Merged
merged 6 commits into from
Nov 16, 2021
Merged

Improvements towards figuring out #118 #120

merged 6 commits into from
Nov 16, 2021

Conversation

rtobar
Copy link

@rtobar rtobar commented Nov 16, 2021

These are simple changes that, while not necessarily fixing #118, are self-contained, and might help figuring out where the memory is blowing exactly.

It will be useful in some places of the code to measure memory usage but
not necessarily to display it as a normal "report" in the logs. Thus,
splitting metrics collection from reporting will help us reuse the
collection bits in other places of the code.

While doing this I also took the chance of simplifying how these memory
collection metrics are performed. For starters, we don't need to know
about that many fields, as we're usually only interested on VM/RSS
curernt/peak usage. Additionally we can use a single file as the source
of the information, which should help speeding up this data collection
as well.

Signed-off-by: Rodrigo Tobar <[email protected]>
Pretty names are unnecessarily long: they don't really give any useful
information, specially because we already have the location in the file
where the report is generated

Signed-off-by: Rodrigo Tobar <[email protected]>
Memory reporting doesn't really need to use the Options object anymore
(it was previously used to keep track of peak memory usage, which the
kernel already does for us); additionally we already have the file/line
location showing up on the log statement when we generate a memory
report, so there's no need to have the information twice in the same
line.

Signed-off-by: Rodrigo Tobar <[email protected]>
These were not used in reality, or were previously not used correctly,
so there's no need to keep them around. If we need some memory tracking
functionality in the future, we can reinstate them in a different form.

Signed-off-by: Rodrigo Tobar <[email protected]>
VR's main function didn't have a high-level try/catch block to handle
uncaught exceptions rising from the code, leaving the behavior of the
program to the fate of whatever default the C++ runtime library
provides.

This commit adds such a block. When an uncaught exception is received,
an error is logged, and then either an MPI_Abort is issued in
COMM_WORLD, or a simpler exit is invoked on non-MPI builds.

Signed-off-by: Rodrigo Tobar <[email protected]>
This might help us understand what exactly is causing #118 (if logs are
correctly flushed and don't get too scrambled).

Signed-off-by: Rodrigo Tobar <[email protected]>
@icrar-velociraptor-bot
Copy link

3d-run

Relevant configuration parameters full file:

Parameter Value
Bound_halos 0
Physical_linking_length 0.10
FoF_Field_search_type 5

R_200crit v/s Mass_200crit:

R_200mean v/s Mass_200mean:

Mass_200crit histogram:

Mass_200mean histogram:

R_200crit histogram:

R_200mean histogram:

6d-run

Relevant configuration parameters full file:

Parameter Value
Bound_halos 0
Physical_linking_length 0.10
FoF_Field_search_type 3
Halo_6D_linking_length_factor 1.0
Halo_6D_vel_linking_length_factor 1.25

R_200crit v/s Mass_200crit:

R_200mean v/s Mass_200mean:

Mass_200crit histogram:

Mass_200mean histogram:

R_200crit histogram:

R_200mean histogram:

@rtobar rtobar merged commit 343f294 into master Nov 16, 2021
@rtobar rtobar deleted the issue-118 branch January 13, 2022 03:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants