DMO Zoom on-the-fly with SWIFT segfault #116
Comments
Note these also segfault standalone on the snapshots for the same setup. They don't run at all with MPI, so I'm running them without MPI.
Just like in 53c0289, the SO_angularmomentum vector only has a certain size in some situations, and that check wasn't performed in this code that was setting its values to zero. Moreover, the values don't need to be initialised to zero as they already are set to zero when the vector is sized in PropData.Allocate(). This addresses the problem reported in #116. Signed-off-by: Rodrigo Tobar <[email protected]>
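As a rough illustration of the failure mode described in this commit message, here is a minimal sketch. The `HaloProps` struct, its `allocate` method, and `reset_angular_momentum` are hypothetical stand-ins invented for this example and are not VELOCIraptor's actual `PropData` code. The sketch assumes only what the message states: the vector is sized in some configurations but not others, and sizing it already zero-fills it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a per-halo property record (not the real PropData).
struct HaloProps {
    // Stays empty unless the run configuration asks for this quantity.
    std::vector<double> so_angular_momentum;

    // resize() value-initialises new elements to 0.0, so a separate
    // zeroing pass after allocation is redundant.
    void allocate(std::size_t n_apertures, bool want_angular_momentum) {
        if (want_angular_momentum) so_angular_momentum.resize(n_apertures);
    }
};

// Unsafe pattern (not shown as code): looping i = 0..n_apertures-1 and writing
// so_angular_momentum[i] = 0 is undefined behaviour when the vector is empty.
// Safe pattern: touch only the elements that actually exist.
std::size_t reset_angular_momentum(HaloProps &props) {
    for (double &value : props.so_angular_momentum) value = 0.0;
    return props.so_angular_momentum.size();
}

// Small self-check covering both configurations.
void run_example() {
    HaloProps unsized;
    unsized.allocate(3, false);
    assert(reset_angular_momentum(unsized) == 0);  // nothing to touch, no crash

    HaloProps sized;
    sized.allocate(3, true);
    assert(sized.so_angular_momentum[2] == 0.0);   // already zero after sizing
}
```

Iterating over the vector itself, rather than over an externally supplied count, makes the size check implicit. The actual fix may instead guard the loop with an explicit condition or drop the zeroing entirely.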
I just reproduced this locally with gcc on a gdb session with a smaller input file:
I had a closer look and, as I mentioned previously, this looks very much like what happened in #78, which was fixed by 53c0289, so I prepared a similar fix. @stuartmcalpine I just pushed the commit to the new |
Yes, that seems to fix it for me, thanks!
The fix is merged now onto the |
I am doing some zoom runs on-the-fly with SWIFT, without MPI. These tests run on COSMA 8 with 128 threads. The DMO version segfaults on the 4th invocation, and the hydro version later, around the 10th. The DMO version consistently fails at the same place.
module load intel_comp/2018 intel_mpi/2018 fftw/3.3.7
module load gsl/2.5 hdf5/1.10.3
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-fPIC" -DVR_ZOOM_SIM=ON -DVR_MPI=OFF -DVR_MPI_REDUCE=OFF -DVR_USE_SWIFT_INTERFACE=ON ..
Config file:
vrconfig_3dfof_subhalos_SO_dmo.txt
Last bit of log:
[1726.081] [debug] search.cxx:3982 Getting Hierarchy 23
[1726.081] [debug] search.cxx:4015 Done
[1726.726] [ info] substructureproperties.cxx:5047 Sort particles and compute properties of 23 objects
[1726.726] [debug] substructureproperties.cxx:5059 Calculate properties using minimum potential particle as reference
[1726.726] [debug] substructureproperties.cxx:5062 Sort particles by binding energy
[1795.700] [debug] substructureproperties.cxx:5087 Memory report at substructureproperties.cxx:5087@long long **SortAccordingtoBindingEnergy(Options &, long long, NBody::Particle *, long long, long long *&, long long *, PropData *, long long): Average: 70.075 [GiB] Data: 72.269 [GiB] Dirty: 0 [B] Library: 0 [B] Peak: 81.744 [GiB] Resident: 68.478 [GiB] Shared: 8.734 [MiB] Size: 72.370 [GiB] Text: 4.180 [MiB]
[1795.701] [debug] substructureproperties.cxx:42 Getting CM
[1795.702] [debug] substructureproperties.cxx:320 Done getting CM in 1 [ms]
[1795.702] [debug] substructureproperties.cxx:4621 Getting energy
[1795.703] [debug] substructureproperties.cxx:4733 Have calculated potentials in 744 [us]
[1795.704] [debug] substructureproperties.cxx:5034 Done getting energy in 1 [ms]
[1795.704] [debug] substructureproperties.cxx:338 Getting bulk properties
[1795.706] [debug] substructureproperties.cxx:2194 Done getting properties in 1 [ms]
[1795.706] [debug] substructureproperties.cxx:3219 Done FOF masses in 4 [us]
[1795.706] [debug] substructureproperties.cxx:3236 Get inclusive masses
[1795.706] [debug] substructureproperties.cxx:3237 with masses based on full SO search (slower) for halos only
Line where it segfaults:
Originally posted by @stuartmcalpine in #53 (comment)