Skip to content
This repository has been archived by the owner on Apr 25, 2023. It is now read-only.

Emit stacktrace on segfault with instructions for core-dump only scenarios #32

Merged

Conversation

Strum355
Copy link
Contributor

@Strum355 Strum355 commented Nov 19, 2020

@Strum355
Copy link
Contributor Author

Strum355 commented Nov 19, 2020

@beyang @efritz could one (or both) of ye try this out on OSX please.
Heres a sample snippet for causing a sigsegv (from beyangs commit)

Put it into https://github.com/sourcegraph/lsif-clang/blob/llvmorg-10.0.0-lsif-clang/src/indexer/IndexerMain.cpp

void baz() {
 int *foo = (int*)-1; // make a bad pointer
  printf("%d\n", *foo);       // causes segfault
}

int main(int argc, const char **argv) {
  //sys::PrintStackTraceOnErrorSignal(argv[0]);

  baz();
  // ...
}

Make sure binutils is installed and let me know if theres a good backtrace with code snippet : )

Command to build ~= cmake -B build && make -C build -j<cores> if ye dont have it on-hand

@Strum355
Copy link
Contributor Author

With that latest commit pls ^ forgot to push

@Strum355 Strum355 linked an issue Nov 20, 2020 that may be closed by this pull request
@macraig macraig removed this from the 3.22 milestone Nov 20, 2020
@Strum355 Strum355 force-pushed the nsc/epic-backtrace-avec-backwardcpp branch from ccf9c2b to cc899cf Compare November 30, 2020 17:33
@Strum355
Copy link
Contributor Author

Further toying around shows that this isnt an absolute success. My suspicions are it doesnt apply to any of the worker threads that are spun up while indexing. The original pastebin was in the serial emitting of LSIF data on the main thread. Probably still worth merging this and seeing how it can be expanded upon

@shrouxm
Copy link
Contributor

shrouxm commented Jan 13, 2021

@Strum355 Curious before diving in: have you looked into writing our own ToolExecutor? Currently using the AllTUsToolExecutor but the implementation is rly smol and it seems like there'd be an obvious insertion point for what we want (Sourcegraph snippet widget when), but idk how backtraces/multiprocessing works in cpp any more than u do

@Strum355
Copy link
Contributor Author

@gbrik ah that's an interesting thought, hadnt clicked that the ToolExecutors were pluggable/abstract classes. Im in the same knowledge boat as you here, I didnt uncover anything about backtrace-cpp with multithreading, was under the impression that it should just work:tm: but apparently not? Im sure theres probably something obvious (or less so) that Im missing here.

If its an issue with threads, would subprocessing work I wonder?

@shrouxm
Copy link
Contributor

shrouxm commented Jan 13, 2021

@Strum355 I have binutils installed but I don't get a pretty stack trace:

Stack trace (most recent call last):
#3    Object "[0xffffffffffffffff]", at 0xffffffffffffffff, in 
#2    Object "bin/lsif-clang", at 0x5610aa26ebad, in 
#1    Object "/lib/x86_64-linux-gnu/libc.so.6", at 0x7fa96b7370b2, in __libc_start_main
#0    Object "bin/lsif-clang", at 0x5610aa272473, in 
Segmentation fault (Address not mapped to object [0xffffffffffffffff])
zsh: segmentation fault (core dumped)  bin/lsif-clang

This is just doing the baz() with bad pointer from above.

@Strum355
Copy link
Contributor Author

Just to confirm, do you have binutils-dev specifically? Im not sure if just binutils will do, as theyre two separate packages and the latter doesnt install the former afaik

@shrouxm
Copy link
Contributor

shrouxm commented Jan 13, 2021 via email

@Strum355 Strum355 force-pushed the nsc/epic-backtrace-avec-backwardcpp branch from cc899cf to 205b796 Compare January 13, 2021 20:19
@Strum355
Copy link
Contributor Author

Simplest way: src/index/SymbolCollector.cpp#602 add the baz function, and call it inside *SymbolCollector::addDeclaration immediately following it @gbrik
image

@shrouxm
Copy link
Contributor

shrouxm commented Jan 13, 2021

If I put the following:

const Symbol *SymbolCollector::addDeclaration(const NamedDecl &ND, SymbolID ID,
                                              bool IsMainFileOnly) {
  backward::SignalHandling sh;
  baz();

I think I get a pretty stack trace (abbrev'd):

...
        343:     // If OriginalDecl is preferred, replace the existing canonical
        344:     // declaration (e.g. a class forward declaration). There should be at most
#0  | Source "/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 611, in clang::clangd::SymbolCollector::addDeclaration(clang::NamedDecl const&, clang::clangd::SymbolID, bool)
    |   609:                                               bool IsMainFileOnly) {
    |   610:   backward::SignalHandling sh;
    | > 611:   baz();
    |   612:   auto &Ctx = ND.getASTContext();
    |   613:   auto &SM = Ctx.getSourceManager();
    | Source "/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 605, in clang::clangd::baz()
    |   603: void baz() {
    |   604:   int *foo = (int *)-1; // make a bad pointer
    | > 605:   printf("%d\n", *foo); // causes segfault
    |   606: }
      Source "/usr/include/x86_64-linux-gnu/bits/stdio2.h", line 107, in clang::clangd::SymbolCollector::addDeclaration(clang::NamedDecl const&, clang::clangd::SymbolID, bool) [0x5643050dc05e]
        104: __fortify_function int
        105: printf (const char *__restrict __fmt, ...)
        106: {
      > 107:   return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
        108: }
        109: # elif !defined __cplusplus
        110: #  define printf(...) \
Segmentation fault (Address not mapped to object [0xffffffffffffffff])
#1    Source "#1    Source "/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 341, in clang::clangd::SymbolCollector::handleDeclOccurrence(clang::Decl const*, unsigned int, llvm::ArrayRef<clang::index::SymbolRelation>, clang::SourceLocation, clang::index::IndexDataConsumer::ASTNodeInfo) [/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 341, in clang::clangd::SymbolCollector::handleDeclOccurrence(clang::Decl const*, unsigned int, llvm::ArrayRef<clang::index::SymbolRelation>, clang::SourceLocation, clang::index::IndexDataConsumer::ASTNodeInfo) [0x5643050dd598]
0x5643050dd598]
        339:   const Symbol *BasicSymbol = Symbols.find(*ID);
        340:   if (!BasicSymbol) // Regardless of role, ND is the canonical declaration.
      > 341:     BasicSymbol = addDeclaration(*ND, std::move(*ID), IsMainFileOnly);
        342:   else if (isPreferredDeclaration(*OriginalDecl, Roles))
        343:     // If OriginalDecl is preferred, replace the existing canonical
        344:     // declaration (e.g. a class forward declaration). There should be at most
        339:   const Symbol *BasicSymbol = Symbols.find(*ID);
        340:   if (!BasicSymbol) // Regardless of role, ND is the canonical declaration.
      > 341:     BasicSymbol = addDeclaration(*ND, std::move(*ID), IsMainFileOnly);
        342:   else if (isPreferredDeclaration(*OriginalDecl, Roles))
        343:     // If OriginalDecl is preferred, replace the existing canonical
        344:     // declaration (e.g. a class forward declaration). There should be at most
#1    Source "/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 341, in clang::cla
ngd::SymbolCollector::handleDeclOccurrence(clang::Decl const*, unsigned int, llvm::ArrayRef<clang::index::SymbolRelation>, clang::SourceLocation, clang::index::IndexDataConsumer::ASTNodeInfo) [0x5643050dd598]
        339:   const Symbol *BasicSymbol = Symbols.find(*ID);
        340:   if (!BasicSymbol) // Regardless of role, ND is the canonical declaration.
      > 341:     BasicSymbol = addDeclaration(*ND, std::move(*ID), IsMainFileOnly);
        342:   else if (isPreferredDeclaration(*OriginalDecl, Roles))
        343:     // If OriginalDecl is preferred, replace the existing canonical
        344:     // declaration (e.g. a class forward declaration). There should be at most
#1    Source "/home/arrow/sourcegraph/lsif-clang/src/index/SymbolCollector.cpp", line 341, in clang::clangd::SymbolCollector::handleDeclOccurrence(clang::Decl const*, unsigned int, llvm::ArrayRef<clang::index::SymbolRelation>, clang::SourceLocation, clang::index::IndexDataConsumer::ASTNodeInfo) [0x5643050dd598]
      zsh: segmentation fault (core dumped)  bin/lsif-clang compile_commands.json

so maybe try adding that line to a copypasta of AllTUs executor when it kicks off the threads?

@Strum355 Strum355 added this to the Code Intelligence Sprint 4 milestone Jan 18, 2021
@Strum355 Strum355 marked this pull request as ready for review January 18, 2021 22:35
@Strum355 Strum355 changed the title WIP: Pass one of backward-cpp based backtraces on failure Emit stacktrace on segfault with instructions for core-dump only scenarios Jan 18, 2021
@Strum355
Copy link
Contributor Author

Efforts to get stacktraces from within the executor workers was unsuccessful, so I explored how to utilize coredumps for debugging the times when backwards-cpp fails. I wrote a doc at https://github.com/sourcegraph/lsif-clang/pull/32/files#diff-8739bb7905639474aa523485b68fd8d6ced0fe5aec3707e8ba738896e598c12e to guide users on how to enable coredumps that we can use

@@ -0,0 +1,48 @@
# So you got a segfault

While time and effort have been put into ensuring a bug-free experience, some edge cases may still arise that result in a segfault.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally try to avoid writing defensively like this in technical docs. The user normally just cares about finding out how to troubleshoot. Some proposals:

  • rename file to segfault.md
  • use title "How to troubleshoot a segfault"
  • first sentence explains the purpose of the document, something like "This document explains the steps to report an issue related to segfaults".
  • create GitHub issue template for reporting a segfault. This could replace the "Info to be provided" header. Link to this document from the template.


![stacktrace in terminal](images/stacktrace.png)

### 1. No stacktrace? Core dump time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the goal of the "1." prefix? I generally try to formulate headers as "How to ...". This makes it quite easy for users to skim over the document to find actionable advise.

Suggested change
### 1. No stacktrace? Core dump time
### How to capture a core dump
It's helpful to include a core dump if the segfault produces no stack trace. The instructions to capture a core dump depend on the operating system.
....

@Strum355
Copy link
Contributor Author

Strum355 commented Feb 4, 2021

@olafurpg to get this into the hands of a customer to further debug an issue, Im moving the segfault doc into a separate PR to unblock this PR

@Strum355 Strum355 force-pushed the nsc/epic-backtrace-avec-backwardcpp branch from 7c7441e to 75e801d Compare February 4, 2021 19:37
@Strum355 Strum355 merged commit 9e26e13 into llvmorg-10.0.0-lsif-clang Feb 4, 2021
@Strum355 Strum355 deleted the nsc/epic-backtrace-avec-backwardcpp branch February 4, 2021 19:38
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve debuggability
5 participants