The periodic_test fails with a segmentation fault when built from master with the Kokkos Serial backend. Below is the valgrind output from one of the two processes; the other process produced a similar trace.
Versions:
omegah - master @ c5f1dc9d
kokkos - develop @ ed08974c7 (newer than last tagged version of 4.2.00)
simmetrix simmodsuite - 2023.1-230907dev
Valgrind output:
==3612296== Memcheck, a memory error detector
==3612296== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==3612296== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==3612296== Command: ./src/periodic_test /space/cwsmith/omegahKkVersions/omega_h_master/meshes/wedge_matchZ_12elem.sms /space/cwsmith/omegahKkVersions/omega_h_master/meshes/wedge_match.smd /space/cwsmith/omegahKkVersions/omega_h_master/meshes/wedge_matchZ_12elem_sync_2.osh 2
==3612296== Parent PID: 3612294
==3612296==
==3612296== Invalid read of size 4
==3612296== at 0x6654270: host_atomic_fetch_oper<desul::Impl::sub_operator<int, int const>, int, desul::MemoryOrderRelaxed> (Fetch_Op_ScopeCaller.hpp:44)
==3612296== by 0x6654270: host_atomic_fetch_sub<int, desul::MemoryOrderRelaxed, desul::MemoryScopeCaller> (Fetch_Op_Generic.hpp:40)
==3612296== by 0x6654270: atomic_fetch_sub<int, desul::MemoryOrderRelaxed, desul::MemoryScopeCaller> (Generic.hpp:60)
==3612296== by 0x6654270: atomic_fetch_sub<int> (Kokkos_Atomics_Desul_Wrapper.hpp:83)
==3612296== by 0x6654270: Kokkos::Impl::SharedAllocationRecord<void, void>::decrement(Kokkos::Impl::SharedAllocationRecord<void, void>*) (Kokkos_SharedAlloc.cpp:212)
==3612296== by 0x5213382: assign_direct (Kokkos_SharedAlloc.hpp:477)
==3612296== by 0x5213382: Kokkos::Impl::ViewTracker<Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > >::operator=(Kokkos::Impl::ViewTracker<Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > > const&) (Kokkos_ViewTracker.hpp:79)
==3612296== by 0x521076E: Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> >::operator=(Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > const&) (Kokkos_View.hpp:1288)
==3612296== by 0x520BA08: Omega_h::Write<int>::operator=(Omega_h::Write<int> const&) (Omega_h_array.hpp:49)
==3612296== by 0x5221F08: Omega_h::Read<int>::operator=(Omega_h::Read<int> const&) (Omega_h_array.hpp:88)
==3612296== by 0x5451023: Omega_h::Mesh::copy_meta() const (Omega_h_mesh.cpp:1235)
==3612296== by 0x54BE3C9: Omega_h::migrate_mesh(Omega_h::Mesh*, Omega_h::Dist, Omega_h_Parting, bool) (Omega_h_migrate.cpp:383)
==3612296== by 0x544D863: Omega_h::Mesh::balance(bool) (Omega_h_mesh.cpp:956)
==3612296== by 0x41CFCF: main (periodic_test.cpp:61)
==3612296== Address 0x38 is not stack'd, malloc'd or (recently) free'd
==3612296==
==3612296==
==3612296== Process terminating with default action of signal 11 (SIGSEGV)
==3612296== Access not within mapped region at address 0x38
==3612296== at 0x6654270: host_atomic_fetch_oper<desul::Impl::sub_operator<int, int const>, int, desul::MemoryOrderRelaxed> (Fetch_Op_ScopeCaller.hpp:44)
==3612296== by 0x6654270: host_atomic_fetch_sub<int, desul::MemoryOrderRelaxed, desul::MemoryScopeCaller> (Fetch_Op_Generic.hpp:40)
==3612296== by 0x6654270: atomic_fetch_sub<int, desul::MemoryOrderRelaxed, desul::MemoryScopeCaller> (Generic.hpp:60)
==3612296== by 0x6654270: atomic_fetch_sub<int> (Kokkos_Atomics_Desul_Wrapper.hpp:83)
==3612296== by 0x6654270: Kokkos::Impl::SharedAllocationRecord<void, void>::decrement(Kokkos::Impl::SharedAllocationRecord<void, void>*) (Kokkos_SharedAlloc.cpp:212)
==3612296== by 0x5213382: assign_direct (Kokkos_SharedAlloc.hpp:477)
==3612296== by 0x5213382: Kokkos::Impl::ViewTracker<Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > >::operator=(Kokkos::Impl::ViewTracker<Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > > const&) (Kokkos_ViewTracker.hpp:79)
==3612296== by 0x521076E: Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> >::operator=(Kokkos::View<int*, Kokkos::Device<Kokkos::Serial, Kokkos::HostSpace> > const&) (Kokkos_View.hpp:1288)
==3612296== by 0x520BA08: Omega_h::Write<int>::operator=(Omega_h::Write<int> const&) (Omega_h_array.hpp:49)
==3612296== by 0x5221F08: Omega_h::Read<int>::operator=(Omega_h::Read<int> const&) (Omega_h_array.hpp:88)
==3612296== by 0x5451023: Omega_h::Mesh::copy_meta() const (Omega_h_mesh.cpp:1235)
==3612296== by 0x54BE3C9: Omega_h::migrate_mesh(Omega_h::Mesh*, Omega_h::Dist, Omega_h_Parting, bool) (Omega_h_migrate.cpp:383)
==3612296== by 0x544D863: Omega_h::Mesh::balance(bool) (Omega_h_mesh.cpp:956)
==3612296== by 0x41CFCF: main (periodic_test.cpp:61)
==3612296== If you believe this happened as a result of a stack
==3612296== overflow in your program's main thread (unlikely but
==3612296== possible), you can try to increase the size of the
==3612296== main thread stack using the --main-stacksize= flag.
==3612296== The main thread stack size used in this run was 8388608.
==3612296==
==3612296== HEAP SUMMARY:
==3612296== in use at exit: 13,116,178 bytes in 4,205 blocks
==3612296== total heap usage: 15,374 allocs, 11,169 frees, 14,496,121 bytes allocated
==3612296==
==3612296== LEAK SUMMARY:
==3612296== definitely lost: 0 bytes in 0 blocks
==3612296== indirectly lost: 0 bytes in 0 blocks
==3612296== possibly lost: 10,525 bytes in 206 blocks
==3612296== still reachable: 13,105,653 bytes in 3,999 blocks
==3612296== suppressed: 0 bytes in 0 blocks
==3612296== Rerun with --leak-check=full to see details of leaked memory
==3612296==
==3612296== For lists of detected and suppressed errors, rerun with: -s
==3612296== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
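A note on reading the trace: the faulting address 0x38 is small and nonzero, which is what a member read through a null pointer typically looks like (null plus the member's offset). That would be consistent with decrement() being handed a null record pointer during the View assignment in copy_meta(), though the trace alone does not prove it. The sketch below only illustrates that arithmetic. FakeRecord, its field layout, use_count_offset and safe_decrement are all hypothetical names invented for this example; they are not the real layout or API of Kokkos::Impl::SharedAllocationRecord.

```cpp
#include <cstddef>

// Hypothetical stand-in for a reference-counted allocation record.
// The layout is invented for illustration; it is NOT the real layout of
// Kokkos::Impl::SharedAllocationRecord.
struct FakeRecord {
  void* slots[7];  // 7 pointers = 56 bytes (0x38) on a typical 64-bit platform
  int use_count;   // lands at offset 0x38 when pointers are 8 bytes wide
};

// Offset of the counter relative to the record pointer. A read of
// rec->use_count with rec == nullptr touches exactly this address.
std::size_t use_count_offset() { return offsetof(FakeRecord, use_count); }

// Null-guarded decrement: returns false instead of reading through nullptr.
bool safe_decrement(FakeRecord* rec) {
  if (rec == nullptr) return false;
  --rec->use_count;
  return true;
}
```

Calling safe_decrement(nullptr) returns false rather than faulting. In a debugger, the analogous check would be to inspect the record pointer held by the Read<int> being assigned at Omega_h_mesh.cpp:1235 before the assignment runs.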
A starting point for debugging would be to step into the "migrate_matches" routine. I am not sure when I'll be able to replicate and work on fixing this issue.