Segmentation fault produced by version check on some machines. #1177

Closed
cpockrandt opened this issue Jul 10, 2019 · 8 comments · Fixed by #1185
Labels
bug faulty or wrong behaviour of code

Comments

@cpockrandt
Contributor

cpockrandt commented Jul 10, 2019

Hi,

I am using the argument parser. On one machine the binary runs, on the other I get a segfault.

pocki@salz4 /ccb/salz4-2/pocki/HAPLO_RES % gdb ./kmc_unique
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /ccb/salz4-2/pocki/HAPLO_RES/kmc_unique...done.
(gdb) run
Starting program: /ccb/salz4-2/pocki/HAPLO_RES/./kmc_unique 
Detaching after fork from child process 8658.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x00000000004d3c61 in std::thread::detach() ()
#2  0x000000000042b8da in seqan3::detail::version_checker::operator()(std::promise<bool>) ()
#3  0x0000000000435e14 in seqan3::argument_parser::parse() ()
#4  0x0000000000405f8c in main ()

I can fix it by turning version-check off. Interestingly, of the two calls below, the first works but the second produces the same segmentation fault (it seems the order of arguments matters ... shouldn't all arguments be parsed first before sending data over the internet? 🤔 ):

% ./kmc_unique --version-check 0 --help
***HELP PAGE***
% ./kmc_unique --help --version-check 0
[1]    9131 segmentation fault (core dumped)  ./kmc_unique --help --version-check 0

I include omp.h but I don't do anything fancy in my program. If I share the entire code, I guess it will be hard to reproduce, as it only happens on one of my machines. Do you have any idea what I could try to fix it? Where is the information stored when you first open the program and it asks you whether you want to run version checks every time? Maybe this file got messed up on that machine.

@cpockrandt cpockrandt added the bug faulty or wrong behaviour of code label Jul 10, 2019
@eseiler
Member

eseiler commented Jul 10, 2019

Hey!

The dependence on argument order probably originates from

if (arg == "-h" || arg == "--help")
{
    format = detail::format_help{false};
    init_standard_options();
    return;
}
else if (arg == "-hh" || arg == "--advanced-help")

and following lines.
We first check --help (we return after that) and after a while we check for --version-check (and keep going after that). --version-check should probably be checked before --help.
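The ordering issue described above can be illustrated with a small sketch that scans every argument before acting on any of them. This is not the SeqAn3 implementation: `special_options` and `scan_special_options` are hypothetical names made up for illustration, and the sketch assumes that `--version-check` takes a value where `0` means "off".

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for this sketch; not part of SeqAn3.
struct special_options
{
    bool show_help{false};
    bool version_check{true}; // assumption: the check is on unless disabled
};

// Scan all arguments first, so the result does not depend on whether
// --help appears before or after --version-check.
special_options scan_special_options(std::vector<std::string> const & args)
{
    special_options result{};

    for (std::size_t i = 0; i < args.size(); ++i)
    {
        std::string const & arg = args[i];

        if (arg == "-h" || arg == "--help")
        {
            result.show_help = true; // remember it, but keep scanning
        }
        else if (arg == "--version-check" && i + 1 < args.size())
        {
            result.version_check = (args[i + 1] != "0");
            ++i; // skip the value that was just consumed
        }
    }

    return result;
}
```

With this approach both `--version-check 0 --help` and `--help --version-check 0` yield a disabled version check, because nothing returns early when `--help` is seen.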

The cookie files should be in $HOME/.config/seqan/ and be named kmc_unique_usr.timestamp or kmc_unique_dev.timestamp, and kmc_unique.version. The timestamp file stores whether we perform a check; the version file stores the version of the app and of SeqAn.
Additionally, we check whether the environment variable SEQAN3_NO_VERSION_CHECK is defined; if it is, we don't perform a check.
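For debugging, the two lookups mentioned here can be reproduced in a few lines. The sketch below is not SeqAn3 code: the function names are hypothetical, and the fallback to the current directory when HOME is unset is an assumption made only to keep the example self-contained. Only the path layout and the variable name come from the comment above.

```cpp
#include <cstdlib>
#include <string>

// True if SEQAN3_NO_VERSION_CHECK is defined at all, whatever its value.
bool version_check_disabled_by_env()
{
    return std::getenv("SEQAN3_NO_VERSION_CHECK") != nullptr;
}

// Builds $HOME/.config/seqan/<app>_usr.timestamp (or <app>_dev.timestamp).
// Hypothetical helper; falls back to "." if HOME is not set (assumption).
std::string timestamp_file(std::string const & app_name, bool developer = false)
{
    char const * home = std::getenv("HOME");
    std::string const base = home ? home : ".";

    return base + "/.config/seqan/" + app_name + (developer ? "_dev" : "_usr") + ".timestamp";
}
```

Printing `timestamp_file("kmc_unique")` shows which file to inspect or delete when the stored decision seems to be wrong on one machine.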

That's all I can say for now, but I'm sure @smehringer is happy to help 😁

@smehringer
Member

Which two machines, with which OS and which compiler, are you using? Do you link pthread?

% ./kmc_unique --version-check 0 --help
***HELP PAGE***
% ./kmc_unique --help --version-check 0
[1]    9131 segmentation fault (core dumped)  ./kmc_unique --help --version-check 0

@eseiler is right. In the first case, the version check is turned off. In the second case it is not, because as soon as we see --help we terminate the loop. This needs to be changed.

As to the segmentation fault, the stack trace is quite short :(
I would need to reproduce the error... maybe we have a similar machine as the one you use?

@cpockrandt
Contributor Author

cpockrandt commented Jul 11, 2019

Thanks @eseiler for the path. The reason I don't get segmentation faults on most machines is simply that I selected "never check for updates" there. That is, I get the same segmentation fault on every machine once I remove the config files.

It seems that this line in my CMake file leads to the segmentation fault. If I take it out, it runs:

set (CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -static")

I reproduced the error with this minimal example:

#include "omp.h"
#include <seqan3/argument_parser/all.hpp>

int main(int argc, char* argv[])
{
    seqan3::argument_parser myparser{"ALF", argc, argv};

    myparser.info.author = "Gordon Shumway";
    myparser.info.short_description = "Getting back to Melmac.";
    myparser.info.version = "0.0.0";

    try
    {
         myparser.parse();
    }
    catch (seqan3::parser_invalid_argument const & ext)
    {
        return -1;
    }
    return 0;
}

I run it with GCC 8.3.0 on the master branch.

@h-2
Member

h-2 commented Jul 11, 2019

Let me guess, the machine you built on is not Debian/Ubuntu, but the machine that crashes is?

You have to be really careful with static builds; that is why we went through a lot of trouble in SeqAn2 to make them work. In any case, this is not a problem of SeqAn3; it's a problem of how some Linux distributions do their packaging, and of static builds in general.

@h-2
Member

h-2 commented Jul 11, 2019

Adding this to your flags will likely make it work: -Wl,--whole-archive -lpthread -Wl,--no-whole-archive. However, I recommend just rebuilding the app on every machine. Doing that and using -march=native will also improve performance.

@cpockrandt
Contributor Author

cpockrandt commented Jul 11, 2019

Let me guess, the machine you built on is not Debian/Ubuntu, but the machine that crashes is?

It crashes on Debian as well as Red Hat machines.

It's not really a problem for me, I can work around it. But if it is not even working on Debian, it might be worth transferring the workarounds from SeqAn2 (I'm not sure whether this would have triggered a segmentation fault in SeqAn2)?

But on the other hand my code uses OpenMP (just not my small example above). Why would it only crash in the version check and not in my parallelized code?

@rrahn
Contributor

rrahn commented Jul 11, 2019

Well, we are not building apps in SeqAn3 anymore, so the app developer is responsible for this kind of thing. We only handled it before because we were packaging all the apps with SeqAn.

But on the other hand my code uses OpenMP (just not my small example above). Why would it only crash in the version check and not in my parallelized code?

Because you need pthreads, and if you link against it statically, not all of the pthread library is included. So many functions may work, but some (like futures, condition variables, etc.) won't. In the version check we spawn a thread using the future/promise idiom, so that is where you get the segfault.

That's why the fix suggested by @h-2 will likely solve the problem, as it includes the whole archive of pthreads.
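For readers unfamiliar with the idiom, here is a minimal sketch of a detached worker thread reporting back through a promise/future pair. It is not the SeqAn3 version checker: `background_check` and `run_check_detached` are hypothetical names, and the worker only pretends to do the check. The `detach()` call corresponds to the `std::thread::detach()` frame in the backtrace at the top of this issue.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <utility>

// Hypothetical stand-in for the real work (e.g. a network request).
void background_check(std::promise<bool> result)
{
    result.set_value(true); // pretend the check succeeded
}

// Starts the worker, detaches it, and waits a bounded time for the result.
bool run_check_detached()
{
    std::promise<bool> prom;
    std::future<bool> fut = prom.get_future();

    // The promise is moved into the worker; the future stays with the caller.
    std::thread worker{background_check, std::move(prom)};
    worker.detach(); // the caller no longer has to join the thread

    // Wait at most one second so a slow worker cannot block the caller.
    if (fut.wait_for(std::chrono::seconds{1}) == std::future_status::ready)
        return fut.get();

    return false;
}
```

Building this small program once dynamically and once with `-static` (plus `-pthread`) may be a quicker way to test a toolchain than rebuilding the whole application; whether the static build crashes depends on the distribution's libraries, as discussed above.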

@cpockrandt
Contributor Author

Ok, you have a point. As soon as the other bug (order of --help and --version-check) is fixed, we can close this :)

5 participants