Unable to get core dumps because of SIGSEGV handler #611

Closed
rcdailey opened this issue Mar 15, 2016 · 20 comments

@rcdailey

The Catch library contains Linux-specific code that installs signal handlers. Their primary purpose seems to be to print simple diagnostic messages, but I've observed in my tests that they prevent core dumps. Before running my test manually, I set this:

$ ulimit -c unlimited

When I run a test program that crashes, I get the following output:

/home/user/code/TestCheckController.cpp:725: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGINT - Terminal interrupt signal

===============================================================================
test cases:  43 |  42 passed | 1 failed
assertions: 716 | 715 passed | 1 failed

No core dump was produced. When I remove the SIGSEGV handler, I do get a core dump.

I expect that the signal handlers should either be disabled or forward the signal to the OS so that a core dump can happen. If neither of these is possible, perhaps design this system to use exceptions or write the minidumps yourself on Linux (as a configurable option).

I'm happy to help contribute a fix for this, but I'd like some design feedback from @philsquared before I do any work to get an idea of where he'd like this to go. I need to understand more about the original intention.

@hobbes1069

I'm seeing something similar with GCC 6 on Fedora Rawhide. Any help would be appreciated.

https://sourceforge.net/p/zipios/bugs/9/

@vasyan1337h4x0r

+1
Is there any plan on fixing this?

@philsquared
Collaborator

Sorry for the late reply (and for the issue in the first place).
I'm not an expert in signal handlers, so would value your input on this (in fact I would value it even if I were).
My best guess at how to resolve this would be to make the signal handler configurable (either on the command line or via a #define - or both).
If you know how we could get the best of both worlds (get the result you want without the need to configure) that would be even better!

Regarding the original intention, @rcdailey, it was just to try and do a best effort continuation of the Catch runner that treats the interrupt as a failure.

@rcdailey
Author

Thanks for responding. The logic used to handle the signal is fairly complex at first glance:

inline void fatal( std::string const& message, int exitCode ) {
    IContext& context = Catch::getCurrentContext();
    IResultCapture* resultCapture = context.getResultCapture();
    resultCapture->handleFatalErrorCondition( message );

    if( Catch::alwaysTrue() ) // avoids "no return" warnings
        exit( exitCode );
}

Which calls:

virtual void handleFatalErrorCondition( std::string const& message ) {
    ResultBuilder resultBuilder = makeUnexpectedResultBuilder();
    resultBuilder.setResultType( ResultWas::FatalErrorCondition );
    resultBuilder << message;
    resultBuilder.captureExpression();

    handleUnfinishedSections();

    // Recreate section for test case (as we will lose the one that was in scope)
    TestCaseInfo const& testCaseInfo = m_activeTestCase->getTestCaseInfo();
    SectionInfo testCaseSection( testCaseInfo.lineInfo, testCaseInfo.name, testCaseInfo.description );

    Counts assertions;
    assertions.failed = 1;
    SectionStats testCaseSectionStats( testCaseSection, assertions, 0, false );
    m_reporter->sectionEnded( testCaseSectionStats );

    TestCaseInfo testInfo = m_activeTestCase->getTestCaseInfo();

    Totals deltaTotals;
    deltaTotals.testCases.failed = 1;
    m_reporter->testCaseEnded( TestCaseStats(   testInfo,
                                                deltaTotals,
                                                "",
                                                "",
                                                false ) );
    m_totals.testCases.failed++;
    testGroupEnded( "", m_totals, 1, 1 );
    m_reporter->testRunEnded( TestRunStats( m_runInfo, m_totals, false ) );
}

What is the goal of all of this logic? Would it be unreasonable to remove the signal handlers completely and let the OS manage it? Honestly, the ideal solution would be to forward the signal to the OS so that it behaves normally, but I'm not certain we can do that. Thoughts?
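To illustrate why a handler that ends in exit() suppresses the core dump, here is a minimal sketch that is not Catch's code. It assumes a POSIX system; `exitFromHandler`, `kExitCode` and `exitingHandlerHidesSignal` are hypothetical names, and `_exit()` stands in for `exit()` because it is async-signal-safe. The parent only ever sees a normal exit status, so the kernel never applies the default SIGSEGV action (terminate and dump core).

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {

const int kExitCode = 3; // hypothetical exit code, chosen only for the demo

// Stand-in for a "report, then exit" handler. It leaves the process through
// _exit(), so the default SIGSEGV action is never taken.
void exitFromHandler(int) {
    _exit(kExitCode);
}

} // namespace

// Returns true when the parent observes a normal exit instead of
// termination by signal.
bool exitingHandlerHidesSignal() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        struct sigaction sa = {};
        sa.sa_handler = exitFromHandler;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, nullptr);
        raise(SIGSEGV); // simulate the crash
        _exit(0);       // not reached
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == kExitCode;
}
```

Calling `exitingHandlerHidesSignal()` from a small `main` should return true on a typical Linux system, which matches the symptom reported at the top of this issue.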

@philsquared
Collaborator

The original problem was that when a signal is raised there was no way to know which test case was executing (because test case names are written lazily). That might be OK if it happens during a debug session (or you can otherwise recreate the symbols), but otherwise it can make the failure hard to track down.
This handler tries to invoke the lazy printing of context before resigning.

@rcdailey
Author

@refi64

refi64 commented Oct 28, 2016

@rcdailey I just have to say, I really like your GitHub pic. :D

@tombh

tombh commented Dec 30, 2016

I'm pretty sure this is also a problem when using backward-cpp to show backtraces. When I comment out the relevant signal in Catch's code (SIGSEGV in my case), backward-cpp suddenly starts working again.

@philsquared
Collaborator

philsquared commented Jan 12, 2017

I've made some changes to the signal handling so that, after intercepting and doing the reporting, it then re-installs the original handler and re-raises the signal (instead of calling exit()).
So far I have these changes on the signals branch.
If you get a chance (@rcdailey, @tombh, @vasyan1337h4x0r and @hobbes1069 - if you're still watching) - could you try out:

https://raw.githubusercontent.com/philsquared/Catch/signals/single_include/catch.hpp

And let me know if you still have problems?
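The approach described above (report, restore the previous handler, re-raise) can be sketched roughly as follows. This is an illustration under assumptions, not the code on the branch: it targets POSIX, handles only SIGSEGV, and `g_previousSegv`, `reportAndReraise` and `childDiesBySignal` are hypothetical names.

```cpp
#include <cassert>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

namespace {

struct sigaction g_previousSegv; // disposition that was active before ours

void reportAndReraise(int sig) {
    // Reporting is limited to async-signal-safe calls in this sketch.
    static const char msg[] = "fatal signal intercepted, re-raising\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;

    // Put the previous disposition back, then raise the signal again. The
    // signal is blocked while this handler runs, so it is delivered (with
    // the restored disposition) as soon as the handler returns.
    sigaction(sig, &g_previousSegv, nullptr);
    raise(sig);
}

} // namespace

// Returns true when the child is terminated by SIGSEGV, i.e. the signal
// reached the default action instead of being swallowed by the handler.
bool childDiesBySignal() {
    pid_t pid = fork();
    if (pid < 0) return false;
    if (pid == 0) {
        struct sigaction sa = {};
        sa.sa_handler = reportAndReraise;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, &g_previousSegv);
        raise(SIGSEGV); // simulate the crash
        _exit(0);       // not reached
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

Because the process now dies from the signal itself, the OS can write a core dump (subject to the core size limit), and a previously installed handler such as one from a backtrace library would get its turn.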

@philsquared added the Resolved - pending review label Jan 12, 2017
@horenmar
Member

@philsquared If you are changing signal handling, can you also incorporate changes in #753 ?

@vasyan1337h4x0r

@philsquared Works perfectly for me. Cores are dumped and I can see the source of an error.
Looking forward to a release with this. Thanks!

@philsquared
Collaborator

Cool, thanks @vasyan1337h4x0r.
I'll probably merge the branch in shortly (possibly with some of @horenmar's other changes - which I hadn't seen until after I'd done this).
In the meantime it would still be good to hear from more of the commenters here.

@tombh

tombh commented Jan 13, 2017

@philsquared I can confirm the signals branch is working with backward-cpp now. Excellent :)

@horenmar
Member

@tombh Can you also check dev-signals? It should contain that change plus other signal handling changes, but I am unable to resolve an error that is starting to look like it might be confined to my specific machine, so more data would be useful.

@tombh

tombh commented Jan 13, 2017

@horenmar The single_include header from dev-signals works exactly the same as Phil's recent signal fix: backward-cpp now receives the signals.

@horenmar
Member

Since the dev-signals branch has been merged to master, I am going to close this.

@horenmar removed the Discussion, In progress, Resolved - pending review, and Tweak request labels Jan 25, 2017
mlin added a commit to vgteam/vg that referenced this issue Mar 14, 2017
make segfaults easier to debug: catchorg/Catch2#611
@rcdailey
Author

What's the plan for this? It's been a long time, and it would be nice to have a backtrace for segfaults on Linux, since most of the time we're running unit tests on our CI build server with no debugging tools or GUI.

@horenmar
Member

@rcdailey Current behaviour should be that the signal is caught, the failure is reported, and then the signal is passed upward to whoever is waiting. Does this not work for you?

@rcdailey
Author

I'm still not getting a backtrace. Maybe I'm doing something else wrong? Here is the output from one of my tests:

[1/2] cd /home/TABLETOPMEDIA/robert/frontend_unit_tests && /usr/local/bin/ctest -C Debug -T Test --output-on-failure
   Site: febld
   Build name: Linux-clang++-3.8
Test project /home/TABLETOPMEDIA/robert/frontend_unit_tests
    Start 1: Test_UI
1/5 Test #1: Test_UI ..........................***Exception: SegFault  0.04 sec
Exception during initilization of LinuxEventData std::exceptionException during initilization of LinuxEventData std::exception
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Test_UI is a Catch v1.9.6 host application.
Run with -? for options

-------------------------------------------------------------------------------
Checking JavaScript Timer.Start()
-------------------------------------------------------------------------------
/home/TABLETOPMEDIA/robert/frontend/Core/UI/Test/Source/JavaScript/TestJSTimer.cpp:36
...............................................................................

/home/TABLETOPMEDIA/robert/frontend/Core/UI/Test/Source/JavaScript/TestJSTimer.cpp:36: FAILED:
due to a fatal error condition:
  SIGSEGV - Segmentation violation signal

@horenmar
Member

horenmar commented Aug 28, 2017

I'm not sure what ctest's behaviour is, but when I create a minimal example and run it, I get a regular core dump:

~/scratch$ cat test.cpp 
#define CATCH_CONFIG_MAIN
#include "catch.hpp"

TEST_CASE("A", "[sigsegv]") {
    int* iptr = nullptr;
    REQUIRE(*iptr);
}

~/scratch$ g++ -std=c++11 test.cpp && ./a.out 

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
a.out is a Catch v2.0.0-develop.2 host application.
Run with -? for options

-------------------------------------------------------------------------------
A
-------------------------------------------------------------------------------
test.cpp:4
...............................................................................

test.cpp:4: FAILED:
due to a fatal error condition:
  SIGSEGV - Segmentation violation signal

===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

Segmentation fault (core dumped)
~/scratch$ ls
a.out  catch.hpp  core  test.cpp

The signal got to the system handler, which led to a core dump, and the core dump was collected into the core file (I should set a better core naming pattern).

Do note that I had to do a bit of a dance to enable core dump collection on my system first, so maybe your system has core dumps disabled?
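One quick way to check whether the process itself is allowed to write a core file is to query the soft `RLIMIT_CORE` limit, which is the value `ulimit -c` controls. The sketch below assumes a POSIX system; `describeCoreLimit` is a hypothetical helper name. It does not cover other settings that can affect core collection, such as the kernel's core pattern on Linux.

```cpp
#include <cassert>
#include <string>
#include <sys/resource.h>

// Hypothetical helper: describes the soft core-file size limit of the
// current process.
std::string describeCoreLimit() {
    struct rlimit lim = {};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return "unknown";
    if (lim.rlim_cur == 0) return "disabled";
    if (lim.rlim_cur == RLIM_INFINITY) return "unlimited";
    return std::to_string(static_cast<unsigned long long>(lim.rlim_cur)) + " bytes";
}
```

Printing the result at the start of a test run (for example from a CI job) would show whether the limit is 0, which rules out core dumps regardless of what the signal handler does.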

edit: This is from Catch 2, but the signal handling code hasn't been touched in quite a while. You might also try disabling signal handling completely with CATCH_CONFIG_NO_POSIX_SIGNALS and see if that gives you the desired behaviour.
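As a configuration fragment, that suggestion might look like the following. It assumes the single-header `catch.hpp` used in the example above, and that the macro has to be defined before the header is included in the translation unit that defines `CATCH_CONFIG_MAIN`; passing `-DCATCH_CONFIG_NO_POSIX_SIGNALS` on the compiler command line should be equivalent.

```cpp
// Assumption: defined before including catch.hpp in the file that provides main.
#define CATCH_CONFIG_NO_POSIX_SIGNALS
#define CATCH_CONFIG_MAIN
#include "catch.hpp"

TEST_CASE("A", "[sigsegv]") {
    int* iptr = nullptr;
    REQUIRE(*iptr); // crashes; with no Catch handler installed, the OS handles it directly
}
```

With the handlers disabled, Catch no longer prints the "fatal error condition" report, so the trade-off is losing the name of the failing test case in exchange for untouched default signal behaviour.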
