Fix segfault in release build with GCC 5. #419
GCC 5 causes a segfault in the Release build.
So the fault occurs only in the GitHub build, and we cannot reproduce it locally with GCC 5? |
We are using GCC 7.5 on the server, so there is no segfault. |
Is the stack trace the same with this version? There were 2 stack traces
printed last time, I think. Both are present?
|
With GCC 5,
For example, for this commit csukuangfj@924b7c1,
|
You can see that the stack traces are the same before and after replacing short lambdas. |
I want to set up a Docker image with GCC 5 installed so that we can do some experiments locally. |
I use GCC 5.5 on our server. IIRC, I could reproduce the fault before (when we first found this issue).
|
It can be reproduced on our server with the latest master branch using GCC 5.5.0
|
Here is the output from our server
|
Mm, what's the stack trace, from gdb?
…On Fri, Nov 27, 2020 at 3:24 PM Fangjun Kuang ***@***.***> wrote:
Here is the output from our server
fangjun:~/open-source/k2/build$ ./bin/cu_fsa_algo_test
Running main() from /root/fangjun/open-source/k2/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 15 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 3 tests from ArcSort
[ RUN ] ArcSort.EmptyFsa
[ OK ] ArcSort.EmptyFsa (0 ms)
[ RUN ] ArcSort.NonEmptyFsa
[ OK ] ArcSort.NonEmptyFsa (4242 ms)
[ RUN ] ArcSort.NonEmptyFsaVec
[ OK ] ArcSort.NonEmptyFsaVec (3 ms)
[----------] 3 tests from ArcSort (4245 ms total)
[----------] 12 tests from FsaAlgo
[ RUN ] FsaAlgo.LinearFsa
[ OK ] FsaAlgo.LinearFsa (0 ms)
[ RUN ] FsaAlgo.LinearFsaVec
[ OK ] FsaAlgo.LinearFsaVec (0 ms)
[ RUN ] FsaAlgo.IntersectFsaVec
[ OK ] FsaAlgo.IntersectFsaVec (0 ms)
[ RUN ] FsaAlgo.AddEpsilonSelfLoopsFsa
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ], fsa1+self-loops = [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] ], fsa1+self-loops = [ [ ] ], arc-map = [ ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ] ], fsa1+self-loops = [ [ ] [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ], fsa1+self-loops = [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] ], fsa1+self-loops = [ [ ] ], arc-map = [ ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ] ], fsa1+self-loops = [ [ ] [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[ OK ] FsaAlgo.AddEpsilonSelfLoopsFsa (2 ms)
[ RUN ] FsaAlgo.ShortestPath
Segmentation fault
fangjun:~/open-source/k2/build$
|
Here is the stack trace from gdb
|
Mm, that's odd, it isn't even in a lambda. How about valgrind, does it
show earlier errors?
…On Fri, Nov 27, 2020 at 3:35 PM Fangjun Kuang ***@***.***> wrote:
Here is the stack trace from gdb
fangjun:~/open-source/k2/build$ gdb ./bin/cu_fsa_algo_test
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bin/cu_fsa_algo_test...(no debugging symbols found)...done.
(gdb) r
Starting program: /root/fangjun/open-source/k2/build/bin/cu_fsa_algo_test
warning: File "/home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/lib/libstdc++.so.6.0.21-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
add-auto-load-safe-path /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/lib/libstdc++.so.6.0.21-gdb.py
line to your configuration file "/root/fangjun/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/fangjun/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Running main() from /root/fangjun/open-source/k2/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 15 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 3 tests from ArcSort
[ RUN ] ArcSort.EmptyFsa
[ OK ] ArcSort.EmptyFsa (0 ms)
[ RUN ] ArcSort.NonEmptyFsa
[New Thread 0x7fffa52ef700 (LWP 1332983)]
[New Thread 0x7fffa4aee700 (LWP 1332984)]
[ OK ] ArcSort.NonEmptyFsa (4364 ms)
[ RUN ] ArcSort.NonEmptyFsaVec
[ OK ] ArcSort.NonEmptyFsaVec (2 ms)
[----------] 3 tests from ArcSort (4367 ms total)
[----------] 12 tests from FsaAlgo
[ RUN ] FsaAlgo.LinearFsa
[ OK ] FsaAlgo.LinearFsa (1 ms)
[ RUN ] FsaAlgo.LinearFsaVec
[ OK ] FsaAlgo.LinearFsaVec (0 ms)
[ RUN ] FsaAlgo.IntersectFsaVec
[ OK ] FsaAlgo.IntersectFsaVec (0 ms)
[ RUN ] FsaAlgo.AddEpsilonSelfLoopsFsa
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ], fsa1+self-loops = [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] ], fsa1+self-loops = [ [ ] ], arc-map = [ ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ] ], fsa1+self-loops = [ [ ] [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ], fsa1+self-loops = [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] ], fsa1+self-loops = [ [ ] ], arc-map = [ ]
[I] /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:virtual void k2::FsaAlgo_AddEpsilonSelfLoopsFsa_Test::TestBody():283 fsa1 = [ [ ] [ [ 0 1 1 0.1 0 2 1 0.2 ] [ 1 3 2 0.3 ] [ 2 3 3 0.4 ] [ 3 4 -1 0.5 ] [ ] ] ], fsa1+self-loops = [ [ ] [ [ 0 0 0 0 0 1 1 0.1 0 2 1 0.2 ] [ 1 1 0 0 1 3 2 0.3 ] [ 2 2 0 0 2 3 3 0.4 ] [ 3 3 0 0 3 4 -1 0.5 ] [ ] ] ], arc-map = [ -1 0 1 -1 2 -1 3 -1 4 ]
[ OK ] FsaAlgo.AddEpsilonSelfLoopsFsa (1 ms)
[ RUN ] FsaAlgo.ShortestPath
Thread 1 "cu_fsa_algo_tes" received signal SIGSEGV, Segmentation fault.
0x00007ffff75f3f05 in std::_Sp_counted_ptr_inplace<k2::Region, std::allocator<k2::Region>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /root/fangjun/open-source/k2/build/lib/libk2context.so
(gdb) bt
#0 0x00007ffff75f3f05 in std::_Sp_counted_ptr_inplace<k2::Region, std::allocator<k2::Region>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() () from /root/fangjun/open-source/k2/build/lib/libk2context.so
#1 0x000000000041d11a in std::vector<k2::RaggedShapeDim, std::allocator<k2::RaggedShapeDim> >::~vector() ()
#2 0x00007ffff7635491 in k2::GetStateBatches(k2::Ragged<k2::Arc>&, bool) () from /root/fangjun/open-source/k2/build/lib/libk2context.so
#3 0x0000000000417b6f in k2::FsaAlgo_ShortestPath_Test::TestBody() ()
#4 0x00007ffff7ec48b3 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
from /root/fangjun/open-source/k2/build/lib/libgtest.so
#5 0x00007ffff7eb16e3 in testing::Test::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#6 0x00007ffff7eb183d in testing::TestInfo::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#7 0x00007ffff7eb1935 in testing::TestSuite::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#8 0x00007ffff7ebcb1c in testing::internal::UnitTestImpl::RunAllTests() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#9 0x00007ffff7ebcd91 in testing::UnitTest::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#10 0x00007ffff7ff20db in main () from /root/fangjun/open-source/k2/build/lib/libgtest_main.so
#11 0x00007fffaf6eeb97 in __libc_start_main (main=0x7ffff7ff20a0 <main>, argc=1, argv=0x7fffffffe9b8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe9a8)
at ../csu/libc-start.c:310
#12 0x000000000040855a in _start ()
|
I am adding |
Output from valgrind
|
Seems those crashes come from the latest code in ParallelRunner? |
I don't think that would have been changed by setting ParallelRunner to
ParallelRunnerDummy.
It could be something about `static thread_local`, which is how
g_stream_override is defined.
Perhaps it needs to be initialized somewhere?
|
It's not very nice to declare something thread_local in a header, perhaps.
I'm really unclear on how the code gets generated that initializes that
variable for each new thread.
|
The following stack trace is more informative; it was produced with -g while compiling k2.
|
I think the error in the CudaStreamOverride class, reported by valgrind,
was the key one.
I think it's overwriting some part of memory that it shouldn't, due to some
problem with how thread_local variables are initialized.
I don't really know enough about the standard to know whether the way we
declared that thread_local global variable was correct.
…On Fri, Nov 27, 2020 at 4:05 PM Fangjun Kuang ***@***.***> wrote:
The following stack trace is more informative, which is produced with -g
while compiling k2
(gdb) bt
#0 0x00007ffff75f3f05 in k2::Region::~Region (this=0x1fdf510, __in_chrg=<optimized out>) at /root/fangjun/open-source/k2/k2/csrc/context.h:315
#1 __gnu_cxx::new_allocator<k2::Region>::destroy<k2::Region> (this=<optimized out>, __p=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/ext/new_allocator.h:124
#2 std::allocator_traits<std::allocator<k2::Region> >::destroy<k2::Region> (__a=..., __p=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/alloc_traits.h:542
#3 std::_Sp_counted_ptr_inplace<k2::Region, std::allocator<k2::Region>, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1fdf500)
at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/shared_ptr_base.h:531
#4 0x000000000041d11a in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1fdf500) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/shared_ptr_base.h:150
#5 std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x760d8378, __in_chrg=<optimized out>)
at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/shared_ptr_base.h:659
#6 std::__shared_ptr<k2::Region, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x760d8370, __in_chrg=<optimized out>)
at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/shared_ptr_base.h:925
#7 std::shared_ptr<k2::Region>::~shared_ptr (this=0x760d8370, __in_chrg=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/shared_ptr.h:93
#8 k2::Array1<int>::~Array1 (this=0x760d8360, __in_chrg=<optimized out>) at /root/fangjun/open-source/k2/k2/csrc/array.h:37
#9 k2::RaggedShapeDim::~RaggedShapeDim (this=0x760d8340, __in_chrg=<optimized out>) at /root/fangjun/open-source/k2/k2/csrc/ragged.h:33
#10 std::_Destroy<k2::RaggedShapeDim> (__pointer=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/stl_construct.h:93
#11 std::_Destroy_aux<false>::__destroy<k2::RaggedShapeDim*> (__last=<optimized out>, __first=0x760d8340) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/stl_construct.h:103
#12 std::_Destroy<k2::RaggedShapeDim*> (__last=<optimized out>, __first=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/stl_construct.h:126
#13 std::_Destroy<k2::RaggedShapeDim*, k2::RaggedShapeDim> (__last=0x760d83d0, __first=<optimized out>) at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/stl_construct.h:151
#14 std::vector<k2::RaggedShapeDim, std::allocator<k2::RaggedShapeDim> >::~vector (this=0x7fffffffdd80, __in_chrg=<optimized out>)
at /home/linuxbrew/.linuxbrew/Cellar/gcc/5.5.0_7/include/c++/5.5.0/bits/stl_vector.h:424
#15 0x00007ffff7635491 in k2::RaggedShape::~RaggedShape (this=0x7fffffffdd80, __in_chrg=<optimized out>) at /root/fangjun/open-source/k2/k2/csrc/ragged.h:62
#16 k2::GetStateBatches (fsas=..., ***@***.***=true) at /root/fangjun/open-source/k2/k2/csrc/fsa_utils.cu:794
#17 0x0000000000417b6f in k2::FsaAlgo_ShortestPath_Test::TestBody (this=<optimized out>) at /root/fangjun/open-source/k2/k2/csrc/fsa_algo_test.cu:343
#18 0x00007ffff7ec48b3 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) ()
from /root/fangjun/open-source/k2/build/lib/libgtest.so
#19 0x00007ffff7eb16e3 in testing::Test::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#20 0x00007ffff7eb183d in testing::TestInfo::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#21 0x00007ffff7eb1935 in testing::TestSuite::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#22 0x00007ffff7ebcb1c in testing::internal::UnitTestImpl::RunAllTests() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#23 0x00007ffff7ebcd91 in testing::UnitTest::Run() () from /root/fangjun/open-source/k2/build/lib/libgtest.so
#24 0x00007ffff7ff20db in main () from /root/fangjun/open-source/k2/build/lib/libgtest_main.so
#25 0x00007fffaf6eeb97 in __libc_start_main (main=0x7ffff7ff20a0 <main>, argc=1, argv=0x7fffffffe948, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe938)
at ../csu/libc-start.c:310
#26 0x000000000040855a in _start ()
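To make the destructor chain in the trace above easier to follow, here is a minimal sketch of the ownership it implies: a shape holds a vector of dims, each dim holds an array, and each array holds a shared_ptr to a region. The `*Sketch` types and the member names `axes`, `row_splits` and `region` are assumptions for illustration, not the real k2 definitions, and the sketch does not reproduce the crash itself.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for k2::Region: counts how often it is destroyed.
struct RegionSketch {
  explicit RegionSketch(int *counter) : destroyed(counter) {}
  ~RegionSketch() { ++*destroyed; }  // plays the role of frame #0 in the trace
  int *destroyed;
};

// Hypothetical stand-ins for Array1<int>, RaggedShapeDim and RaggedShape.
struct Array1Sketch {
  std::shared_ptr<RegionSketch> region;
};
struct RaggedShapeDimSketch {
  Array1Sketch row_splits;
};
struct RaggedShapeSketch {
  std::vector<RaggedShapeDimSketch> axes;
};

// Builds a shape with two dims, lets it go out of scope, and returns how
// many regions were destroyed as a result.
int DestroyShapeAndCountRegions() {
  int destroyed = 0;
  {
    RaggedShapeSketch shape;
    for (int i = 0; i < 2; ++i) {
      RaggedShapeDimSketch dim;
      dim.row_splits.region = std::make_shared<RegionSketch>(&destroyed);
      shape.axes.push_back(dim);
    }
    // Leaving this scope runs ~RaggedShapeSketch, then ~vector, then each
    // dim's destructor, which releases the last shared_ptr to its region.
  }
  return destroyed;
}
```

If a region's memory or reference count has already been corrupted by an earlier bad write, the last step of this chain is where the fault would surface, which would be consistent with the trace not being inside a lambda.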
|
See here:
https://stackoverflow.com/questions/55191177/thread-local-at-block-scope
static thread_local at file scope gives the thread_local variable internal
linkage, which means there will be one copy per translation unit in the TLS
(each translation unit will resolve to its own variable at the TLS index
for the .exe, because the assembler will insert the variable in the rdata$t
section of the .o file and mark it in the symbol table as a local symbol
due to the lack of the .global directive on the symbol). extern thread_local at
file scope is legal like it is at block scope and uses the thread_local copy
defined in another translation unit. thread_local at file scope is not
implicitly static, because it can provide a global symbol definition for
another translation unit, which cannot be done by a local variable.
.. I think maybe it should be 'extern' in the header and then in some .cc
file we can declare the same thing static.
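A minimal sketch of the header/source split suggested above, collapsed into one file so it can be compiled on its own. `StreamOverrideSketch` and `g_override_sketch` are hypothetical stand-ins for `CudaStreamOverride` and `g_stream_override`, and `int` replaces `cudaStream_t` so no CUDA is needed. Note that the definition is not marked `static`: a name already declared `extern` cannot then be given internal linkage.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for CudaStreamOverride: a per-thread stack of streams.
class StreamOverrideSketch {
 public:
  void Push(int stream) { stack_.push_back(stream); }
  void Pop() { stack_.pop_back(); }
  std::size_t Depth() const { return stack_.size(); }

 private:
  std::vector<int> stack_;
};

// What would go in the header: a declaration only. Every translation unit
// that includes it refers to the same per-thread object.
extern thread_local StreamOverrideSketch g_override_sketch;

// What would go in exactly one source file: the definition, without static.
thread_local StreamOverrideSketch g_override_sketch;

// Starts a new thread, pushes once onto that thread's copy, and returns the
// depth the new thread observed.
std::size_t DepthSeenByNewThread() {
  std::size_t depth = 0;
  std::thread worker([&depth] {
    g_override_sketch.Push(1);
    depth = g_override_sketch.Depth();
  });
  worker.join();
  return depth;
}
```

With this layout each thread still gets its own copy, but there is only one such variable per thread for the whole program rather than one per translation unit.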
|
The segfault is caused by lines 381 to 384 in 57a3bc6.
The debug checks are ignored in the release build, so you can change lines 35 to 37 in 57a3bc6
to `constexpr bool kDisableDebug = false;` to see the error log. |
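As a rough illustration of why flipping that constant makes the error visible: a check gated on a compile-time flag is skipped entirely when the flag is true. The `#if defined(NDEBUG)` block mirrors the one in log.h; `CheckResult` and `DebugCheck` are hypothetical names for this sketch and are not part of k2.

```cpp
#include <cassert>

#if defined(NDEBUG)
constexpr bool kDisableDebug = true;   // release build: debug checks are off
#else
constexpr bool kDisableDebug = false;  // debug build: debug checks are on
#endif

enum class CheckResult { kSkipped, kPassed, kFailed };

// Hypothetical debug-only check. When kDisabled is true the condition is
// never looked at, so a violated invariant goes unreported and the program
// carries on until something else breaks.
template <bool kDisabled = kDisableDebug>
CheckResult DebugCheck(bool condition) {
  if (kDisabled) return CheckResult::kSkipped;
  return condition ? CheckResult::kPassed : CheckResult::kFailed;
}
```

Here `DebugCheck<true>(false)` reports `kSkipped`, while `DebugCheck<false>(false)` reports `kFailed`, which corresponds to the "Check failed" message appearing once the flag is set to false.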
There must be something wrong (maybe related to the calling code); we assume it will never be empty, given the way we call |
BTW, @csukuangfj does using extern for |
I am trying it. |
The segfault remains with extern thread_local.
And the output is
|
Mm, perhaps you could add print statements to constructor and destructor of
CudaStreamOverride?
IDK how easy it would be to obtain the thread_id and stack...
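One possible shape for such print statements, as a sketch only: `LoggedOverrideSketch` is a hypothetical stand-in for `CudaStreamOverride` (with `void *` in place of `cudaStream_t`), and `DescribeEvent` is an assumed helper. It logs the object address and the calling thread's id; capturing a stack trace is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Builds one log line containing the event name, the object address and the
// id of the calling thread.
std::string DescribeEvent(const char *event, const void *self) {
  std::ostringstream os;
  os << event << " this=" << self << " thread=" << std::this_thread::get_id();
  return os.str();
}

// Hypothetical stand-in for CudaStreamOverride with logging added.
class LoggedOverrideSketch {
 public:
  LoggedOverrideSketch() {
    std::cerr << DescribeEvent("constructed", this) << "\n";
  }
  ~LoggedOverrideSketch() {
    std::cerr << DescribeEvent("destroyed", this) << "\n";
  }
  void Push(void *stream) { stack_.push_back(stream); }
  std::size_t Depth() const { return stack_.size(); }

 private:
  std::vector<void *> stack_;
};
```

Comparing the `this` values across log lines would show whether more than one instance exists per thread, and whether any instance is used before its constructor has run.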
…On Fri, Nov 27, 2020 at 4:39 PM Fangjun Kuang ***@***.***> wrote:
The segfault remains with extern thread_local:
diff --git a/k2/csrc/context.cu b/k2/csrc/context.cu
index 86b5fb0..267025d 100644
--- a/k2/csrc/context.cu
+++ b/k2/csrc/context.cu
@@ -15,6 +15,8 @@

 namespace k2 {

+thread_local CudaStreamOverride g_stream_override;
+
 RegionPtr NewRegion(ContextPtr context, std::size_t num_bytes) {
   // .. fairly straightforward. Sets bytes_used to num_bytes, caller can
   // overwrite if needed.
diff --git a/k2/csrc/context.h b/k2/csrc/context.h
index 5a77228..9410770 100644
--- a/k2/csrc/context.h
+++ b/k2/csrc/context.h
@@ -390,7 +390,7 @@ class CudaStreamOverride {
   std::vector<cudaStream_t> stack_;
 };

-static thread_local CudaStreamOverride g_stream_override;
+extern thread_local CudaStreamOverride g_stream_override;

 class With {
  public:
diff --git a/k2/csrc/log.h b/k2/csrc/log.h
index 25099c0..1ef2f65 100644
--- a/k2/csrc/log.h
+++ b/k2/csrc/log.h
@@ -33,7 +33,7 @@ namespace k2 {
 namespace internal {

 #if defined(NDEBUG)
-constexpr bool kDisableDebug = true;
+constexpr bool kDisableDebug = false;
 #else
 constexpr bool kDisableDebug = false;
 #endif
And the output is
fangjun:~/open-source/k2/build$ ./bin/cu_fsa_algo_test
Running main() from /root/fangjun/open-source/k2/build/_deps/googletest-src/googletest/src/gtest_main.cc
[==========] Running 15 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 3 tests from ArcSort
[ RUN ] ArcSort.EmptyFsa
[ OK ] ArcSort.EmptyFsa (0 ms)
[ RUN ] ArcSort.NonEmptyFsa
[F] /root/fangjun/open-source/k2/k2/csrc/eval.h:void k2::EvalDevice(cudaStream_t, int32_t, LambdaT&) [with LambdaT = __nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, const int>; cudaStream_t = CUstream_st*; int32_t = int]:139 Check failed: stream != kCudaStreamInvalid
[ Stack-Trace: ]
/root/fangjun/open-source/k2/build/lib/libk2_log.so(k2::internal::GetStackTrace()+0x39) [0x7f27cbf6ffa9]
./bin/cu_fsa_algo_test(k2::internal::Logger::~Logger()+0x28) [0x41c1a8]
/root/fangjun/open-source/k2/build/lib/libk2context.so(void k2::EvalDevice<__nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, int const> >(CUstream_st*,
int, __nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, int const>&)+0x697) [0x7f27cb61d557]
/root/fangjun/open-source/k2/build/lib/libk2context.so(k2::RaggedShape::Validate(bool) const+0x72d) [0x7f27cb6e29dd]
/root/fangjun/open-source/k2/build/lib/libk2context.so(k2::RaggedShape::To(std::shared_ptr<k2::Context>) const+0x6a1) [0x7f27cb6e4211]
./bin/cu_fsa_algo_test(k2::Ragged<k2::Arc>::To(std::shared_ptr<k2::Context>) const+0x53) [0x4206a3]
./bin/cu_fsa_algo_test() [0x40e91a]
/root/fangjun/open-source/k2/build/lib/libgtest.so(void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x33) [0x7f27cbf418b3]
/root/fangjun/open-source/k2/build/lib/libgtest.so(testing::Test::Run()+0xc3) [0x7f27cbf2e6e3]
/root/fangjun/open-source/k2/build/lib/libgtest.so(testing::TestInfo::Run()+0x12d) [0x7f27cbf2e83d]
/root/fangjun/open-source/k2/build/lib/libgtest.so(testing::TestSuite::Run()+0xc5) [0x7f27cbf2e935]
/root/fangjun/open-source/k2/build/lib/libgtest.so(testing::internal::UnitTestImpl::RunAllTests()+0x3dc) [0x7f27cbf39b1c]
/root/fangjun/open-source/k2/build/lib/libgtest.so(testing::UnitTest::Run()+0x81) [0x7f27cbf39d91]
/root/fangjun/open-source/k2/build/lib/libgtest_main.so(main+0x3b) [0x7f27cc0700db]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f2783756b97]
./bin/cu_fsa_algo_test() [0x40858a]
Aborted
fangjun:~/open-source/k2/build$
|
@kkm000 or @galv I wonder if you might be able to help.
This is probably some subtlety about thread_local and dynamic libraries.
…On Fri, Nov 27, 2020 at 4:42 PM Daniel Povey ***@***.***> wrote:
Mm, perhaps you could add print statements to constructor and destructor
of CudaStreamOverride?
IDK how easy it would be to obtain the thread_id and stack...
On Fri, Nov 27, 2020 at 4:39 PM Fangjun Kuang ***@***.***>
wrote:
> The segfault remains with extern thread_local:
>
> diff --git a/k2/csrc/context.cu b/k2/csrc/context.cu
> index 86b5fb0..267025d 100644
> --- a/k2/csrc/context.cu
> +++ b/k2/csrc/context.cu
> @@ -15,6 +15,8 @@
>
> namespace k2 {
>
> +thread_local CudaStreamOverride g_stream_override;
> +
> RegionPtr NewRegion(ContextPtr context, std::size_t num_bytes) {
> // .. fairly straightforward. Sets bytes_used to num_bytes, caller can
> // overwrite if needed.
> diff --git a/k2/csrc/context.h b/k2/csrc/context.h
> index 5a77228..9410770 100644
> --- a/k2/csrc/context.h
> +++ b/k2/csrc/context.h
> @@ -390,7 +390,7 @@ class CudaStreamOverride {
> std::vector<cudaStream_t> stack_;
> };
>
> -static thread_local CudaStreamOverride g_stream_override;
> +extern thread_local CudaStreamOverride g_stream_override;
>
> class With {
> public:
> diff --git a/k2/csrc/log.h b/k2/csrc/log.h
> index 25099c0..1ef2f65 100644
> --- a/k2/csrc/log.h
> +++ b/k2/csrc/log.h
> @@ -33,7 +33,7 @@ namespace k2 {
> namespace internal {
>
> #if defined(NDEBUG)
> -constexpr bool kDisableDebug = true;
> +constexpr bool kDisableDebug = false;
> #else
> constexpr bool kDisableDebug = false;
> #endif
>
> And the output is
>
> fangjun:~/open-source/k2/build$ ./bin/cu_fsa_algo_test
> Running main() from /root/fangjun/open-source/k2/build/_deps/googletest-src/googletest/src/gtest_main.cc
> [==========] Running 15 tests from 2 test suites.
> [----------] Global test environment set-up.
> [----------] 3 tests from ArcSort
> [ RUN ] ArcSort.EmptyFsa
> [ OK ] ArcSort.EmptyFsa (0 ms)
> [ RUN ] ArcSort.NonEmptyFsa
> [F] /root/fangjun/open-source/k2/k2/csrc/eval.h:void k2::EvalDevice(cudaStream_t, int32_t, LambdaT&) [with LambdaT = __nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, const int>; cudaStream_t = CUstream_st*; int32_t = int]:139 Check failed: stream != kCudaStreamInvalid
>
>
> [ Stack-Trace: ]
> /root/fangjun/open-source/k2/build/lib/libk2_log.so(k2::internal::GetStackTrace()+0x39) [0x7f27cbf6ffa9]
> ./bin/cu_fsa_algo_test(k2::internal::Logger::~Logger()+0x28) [0x41c1a8]
> /root/fangjun/open-source/k2/build/lib/libk2context.so(void k2::EvalDevice<__nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, int const> >(CUstream_st*,
> int, __nv_dl_wrapper_t<__nv_dl_tag<void (k2::Array1<int>::*)(int), &k2::Array1<int>::operator=, 1u>, int*, int const>&)+0x697) [0x7f27cb61d557]
> /root/fangjun/open-source/k2/build/lib/libk2context.so(k2::RaggedShape::Validate(bool) const+0x72d) [0x7f27cb6e29dd]
> /root/fangjun/open-source/k2/build/lib/libk2context.so(k2::RaggedShape::To(std::shared_ptr<k2::Context>) const+0x6a1) [0x7f27cb6e4211]
> ./bin/cu_fsa_algo_test(k2::Ragged<k2::Arc>::To(std::shared_ptr<k2::Context>) const+0x53) [0x4206a3]
> ./bin/cu_fsa_algo_test() [0x40e91a]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x33) [0x7f27cbf418b3]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(testing::Test::Run()+0xc3) [0x7f27cbf2e6e3]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(testing::TestInfo::Run()+0x12d) [0x7f27cbf2e83d]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(testing::TestSuite::Run()+0xc5) [0x7f27cbf2e935]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(testing::internal::UnitTestImpl::RunAllTests()+0x3dc) [0x7f27cbf39b1c]
> /root/fangjun/open-source/k2/build/lib/libgtest.so(testing::UnitTest::Run()+0x81) [0x7f27cbf39d91]
> /root/fangjun/open-source/k2/build/lib/libgtest_main.so(main+0x3b) [0x7f27cc0700db]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f2783756b97]
> ./bin/cu_fsa_algo_test() [0x40858a]
>
> Aborted
> fangjun:~/open-source/k2/build$
>
>
>
|
After adding log statements to the constructor and destructor, the segfault is gone! The binary was run multiple times; there is no segfault anymore. As the segfault disappears with higher versions of GCC, I think there is no problem with the code.
|
I don't know the reason, but it seems when I add LOG in
Not sure if this helps to find the root cause. (@csukuangfj, wondering whether you can repro this; I have run it more than 10 times.) |
OK, let's leave this alone for now.
…On Fri, Nov 27, 2020 at 4:58 PM Fangjun Kuang ***@***.***> wrote:
After adding log statements to the constructor and destructor, the
segfault is gone! The binary is run for multiple times; there
is no segfault anymore.
As the segfault disappears for higher versions of GCC, so I think there is
no problem with the code.
diff --git a/k2/csrc/context.h b/k2/csrc/context.h
index 5a77228..2ca9e17 100644
--- a/k2/csrc/context.h
+++ b/k2/csrc/context.h
@@ -377,14 +377,19 @@ class CudaStreamOverride {
void Push(cudaStream_t stream) {
stack_.push_back(stream);
stream_override_ = stream;
+ K2_LOG(INFO) << "push: size: " << stack_.size();
}
void Pop(cudaStream_t stream) {
+ K2_LOG(INFO) << "pop: size: " << stack_.size();
K2_DCHECK(!stack_.empty());
K2_DCHECK_EQ(stack_.back(), stream);
stack_.pop_back();
}
- CudaStreamOverride() : stream_override_(0x0) {}
+ CudaStreamOverride() : stream_override_(0x0) {
+ K2_LOG(INFO) << "constructor";
+ }
+ ~CudaStreamOverride() { K2_LOG(INFO) << "in destructor"; }
cudaStream_t stream_override_;
std::vector<cudaStream_t> stack_;
diff --git a/k2/csrc/log.h b/k2/csrc/log.h
index 25099c0..1ef2f65 100644
--- a/k2/csrc/log.h
+++ b/k2/csrc/log.h
@@ -33,7 +33,7 @@ namespace k2 {
namespace internal {
#if defined(NDEBUG)
-constexpr bool kDisableDebug = true;
+constexpr bool kDisableDebug = false;
#else
constexpr bool kDisableDebug = false;
#endif
|
Yes, please see the comments I posted just before yours. |
I guess it has something to do with the compiler's inlining optimization. The following modification prevents the segfault (it does not appear after more than 10 runs of the binary).
|
|
Interesting. Perhaps it doesn't realize that other functions may be
reading from that `stack_` variable? I would have thought the compiler
would know that that was a possibility, though...
…On Fri, Nov 27, 2020 at 5:11 PM Fangjun Kuang ***@***.***> wrote:
I guess it has something to do with inline optimization of the compiler.
The following modification will prevent segfault (The segfault does not
appear after more than 10 runs of the binary)
diff --git a/k2/csrc/context.h b/k2/csrc/context.h
index 5a77228..8a91424 100644
--- a/k2/csrc/context.h
+++ b/k2/csrc/context.h
@@ -374,11 +374,11 @@ class CudaStreamOverride {
else
return stream;
}
- void Push(cudaStream_t stream) {
+ __attribute__((noinline)) void Push(cudaStream_t stream) {
stack_.push_back(stream);
stream_override_ = stream;
}
- void Pop(cudaStream_t stream) {
+ __attribute__((noinline)) void Pop(cudaStream_t stream) {
K2_DCHECK(!stack_.empty());
K2_DCHECK_EQ(stack_.back(), stream);
stack_.pop_back();
|
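The quoted change relies on the `noinline` function attribute supported by GCC and Clang. A minimal, CUDA-free sketch of the same idea is below. `StreamHandle` and `StreamOverrideSketch` are hypothetical stand-ins for `cudaStream_t` and `CudaStreamOverride`, and plain `assert` stands in for `K2_DCHECK`; this illustrates the attribute and is not the k2 code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using StreamHandle = void*;

// Simplified stand-in for CudaStreamOverride; not the k2 class itself.
class StreamOverrideSketch {
 public:
  // GCC/Clang attribute: ask the compiler not to inline this member,
  // even though its body is visible wherever the header is included.
  __attribute__((noinline)) void Push(StreamHandle stream) {
    stack_.push_back(stream);
    stream_override_ = stream;
  }

  __attribute__((noinline)) void Pop(StreamHandle stream) {
    assert(!stack_.empty());          // plays the role of K2_DCHECK
    assert(stack_.back() == stream);  // plays the role of K2_DCHECK_EQ
    stack_.pop_back();
  }

  // Number of streams currently on the override stack.
  std::size_t Depth() const { return stack_.size(); }

 private:
  StreamHandle stream_override_ = nullptr;
  std::vector<StreamHandle> stack_;
};
```

The attribute is compiler-specific, so a portable code base would typically hide it behind a macro.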
Moving the implementation of `Pop` and `Push` to context.cu would be fine as well. Have run 10+ times. |
Disable inline optimization of `CudaStreamOverride`.
Yeah, that's simpler. It has no opportunity to inline if it's not in the
header.
…On Fri, Nov 27, 2020 at 5:24 PM Haowen Qiu ***@***.***> wrote:
Moving the implementation of Pop and Push to context.cu would be fine as
well. Have run 10+ times
|
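The suggestion above, keeping only declarations in the header and defining `Push` and `Pop` in a source file, can be sketched as follows. The sketch is a single file for convenience, with comments marking which part would go where. `StreamHandle` and `StreamOverrideOutOfLine` are hypothetical stand-ins, not the k2 types. Note that inside one file the compiler may still inline these definitions; the effect discussed in the thread depends on the bodies living in a separate translation unit (and on link-time optimization being off).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cudaStream_t so the sketch builds without CUDA.
using StreamHandle = void*;

// ---- would live in the header (context.h in the thread) ----
class StreamOverrideOutOfLine {
 public:
  void Push(StreamHandle stream);  // declaration only, no body here
  void Pop(StreamHandle stream);   // declaration only, no body here

  // Number of streams currently on the override stack.
  std::size_t Depth() const { return stack_.size(); }

 private:
  StreamHandle stream_override_ = nullptr;
  std::vector<StreamHandle> stack_;
};

// ---- would live in the source file (context.cu in the thread) ----
void StreamOverrideOutOfLine::Push(StreamHandle stream) {
  stack_.push_back(stream);
  stream_override_ = stream;
}

void StreamOverrideOutOfLine::Pop(StreamHandle stream) {
  assert(!stack_.empty());          // plays the role of K2_DCHECK
  assert(stack_.back() == stream);  // plays the role of K2_DCHECK_EQ
  stack_.pop_back();
}
```

Compared with a compiler-specific attribute, this layout needs no extension: callers that include only the header never see the function bodies.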
Thanks, will move them to context.cu. |
It prevents the compiler from inlining them.
+2 |
Let's wait and see whether GitHub Actions segfaults or not. |
Tests have passed! Merging. |
It would be correct to say "allowed by the standard". The standard does not define what the scoped (
|
GCC 5 causes a segfault in the Release build.
Now all C++ tests can be run in GitHub Actions for both Debug and Release builds.