pdns_server crashes during startup when the first non-dynloaded module initializes #6624

Closed
tih opened this issue May 18, 2018 · 12 comments

@tih
Contributor

tih commented May 18, 2018

  • Program: Authoritative
  • Issue type: Bug report

Short description

After the big change from "theL" to "g_log", commit e6a9dde, pdns_server crashes during initialization of any module that is not dynamically loaded, when the module tries to log its initialization. If all modules are dynamically loaded, there is no crash. Removing that particular log action from the module, so that it is no longer the first thing that attempts to log something, will also avoid the crash: the module may then log things later, with no ill effects.

Environment

  • Operating system: NetBSD/amd64 8.0RC1 and NetBSD/amd64-current
  • Software version: master branch HEAD
  • Software source: Locally compiled from git clone

Steps to reproduce

  1. ./configure --with-modules="gpgsql"
  2. gmake

Expected behaviour

That pdns_server and its supporting files are properly built.

Actual behaviour

The freshly built pdns_server dumps core when it is run by gmake to generate its sample config file.

The backtrace looks like this:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000077a0174b7cf5 in ?? () from /usr/lib/libc.so.12
(gdb) bt
#0  0x000077a0174b7cf5 in ?? () from /usr/lib/libc.so.12
#1  0x000077a0174b93bb in free () from /usr/lib/libc.so.12
#2  0x000077a017c9b667 in std::string::reserve(unsigned long) () from /usr/lib/libstdc++.so.8
#3  0x000077a017c9b6dd in std::string::append(std::string const&) () from /usr/lib/libstdc++.so.8
#4  0x0000000156938451 in Logger::operator<< (s=..., this=0x156e00fe0 <g_log>) at logger.cc:177
#5  Logger::operator<< (this=0x156e00fe0 <g_log>, 
    s=s@entry=0x156b8ad48 "[gpgsqlbackend] This is the gpgsql backend version 0.0.g23de6c095") at logger.cc:183
#6  0x0000000156b3707f in gPgSQLLoader::gPgSQLLoader (this=<optimized out>) at gpgsqlbackend.cc:184
#7  0x000000015689704f in ?? ()
#8  0x00007f7fffa83fe0 in ?? ()
#9  0x0000000156895f69 in _init ()
#10 0x0000000000000001 in ?? ()
#11 0x0000000156896f57 in ___start ()
#12 0x00007f7f144033f5 in _rtld () from /usr/libexec/ld.elf_so

Other information

@Habbie and I looked at this together, and he confirmed the crash on a fresh NetBSD 8.0RC1 install (I'm running NetBSD-current). He did the bisecting that isolated the commit that introduces the problem. @zeha pointed out that with the change from theL to g_log, the logging object is no longer subject to the same auto-initialization. We therefore suspect that some explicit initialization may now be needed, and that the order of events during startup happens to be such that modules that are not dynamically loaded get initialized before the logging object.

It may be a toolchain difference between Linux and NetBSD, of course, but it is also possible that the lack of initialization just happens not to cause a crash on Linux at the moment.
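
To make the suspected ordering problem concrete, here is a minimal sketch of the general hazard and of one common workaround (construct on first use). It is not PowerDNS code and not the fix that was eventually applied: SketchLogger, getLogger and SketchLoader are hypothetical names made up for this illustration. A namespace-scope object like the loader is constructed before main(), so anything it logs to must already be usable at that point.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a logger; NOT the real PowerDNS Logger class.
class SketchLogger {
public:
  SketchLogger& operator<<(const std::string& s) {
    // Appends to a std::string member, so the logger object itself
    // must have been constructed before this is called.
    d_buffer.append(s);
    return *this;
  }
  const std::string& buffered() const { return d_buffer; }

private:
  std::string d_buffer;
};

// Construct-on-first-use: a function-local static is constructed the first
// time control passes through its declaration, even if that happens inside
// another static object's constructor, before main() starts.
SketchLogger& getLogger() {
  static SketchLogger logger;
  return logger;
}

// Hypothetical stand-in for a statically linked backend's loader object.
struct SketchLoader {
  SketchLoader() {
    // Runs during static initialization, i.e. before main().
    getLogger() << "[sketch] backend loaded";
  }
};

// Namespace-scope static: its constructor runs before main().
static SketchLoader s_loader;
```

With a plain global logger defined in another translation unit, the relative construction order of the logger and s_loader would be unspecified; going through getLogger() removes that dependency.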

@rgacogne
Member

I wonder if switching to thread_local instead of using the older pthread_XXXspecific() mechanism wouldn't solve this issue.
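
For readers unfamiliar with the suggestion, the following is a rough sketch of how per-thread state can be kept with thread_local instead of a pthread key. It only illustrates the language feature; it is an assumption about the general shape, not the contents of the actual patch, and PerThreadState, perThreadState, appendToLine and lineSeenByNewThread are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <thread>

// Hypothetical per-thread logging state (e.g. a partially built log line).
struct PerThreadState {
  std::string line;
};

// Each thread gets its own instance, constructed on that thread's first call.
// No key creation or explicit setup step is required beforehand.
PerThreadState& perThreadState() {
  thread_local PerThreadState state;
  return state;
}

void appendToLine(const std::string& s) {
  perThreadState().line.append(s);
}

// Starts a new thread, appends there, and returns what that thread saw.
std::string lineSeenByNewThread() {
  std::string seen;
  std::thread worker([&seen] {
    appendToLine("worker");
    seen = perThreadState().line;
  });
  worker.join();
  return seen;
}
```

The point of the sketch is that the state is usable from the very first call on any thread, which is the property that matters when something logs during startup.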

@tih
Contributor Author

tih commented May 20, 2018

Using your patch from #6625 does indeed resolve this. It does, however, also show that the modules log their initialization so early that the log level hasn't been set yet, so all the log entries where the modules announce their initialization are lost.
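
The lost-entries effect can be sketched as follows, together with one possible way around it: hold early messages until the level is known. This is purely illustrative and not how PowerDNS handles it; EarlyLogSketch, Level and all member names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical urgency levels; lower numbers are more urgent.
enum class Level { Error = 3, Info = 6 };

// Hypothetical logger that holds messages until a level has been configured.
class EarlyLogSketch {
public:
  void log(Level urgency, const std::string& msg) {
    if (!d_configured) {
      // Level not known yet: keep the message instead of filtering it
      // against a default level (which is where entries would get lost).
      d_pending.emplace_back(urgency, msg);
      return;
    }
    emit(urgency, msg);
  }

  void setLevel(Level level) {
    d_level = level;
    d_configured = true;
    // Replay everything that was logged before configuration.
    for (const auto& entry : d_pending) {
      emit(entry.first, entry.second);
    }
    d_pending.clear();
  }

  const std::vector<std::string>& emitted() const { return d_emitted; }

private:
  void emit(Level urgency, const std::string& msg) {
    if (static_cast<int>(urgency) <= static_cast<int>(d_level)) {
      d_emitted.push_back(msg);
    }
  }

  bool d_configured = false;
  Level d_level = Level::Error;
  std::vector<std::pair<Level, std::string>> d_pending;
  std::vector<std::string> d_emitted;
};
```

The trade-off is that early messages are delayed until configuration has been parsed, and would never appear if setLevel were not called.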

@tih
Contributor Author

tih commented May 22, 2018

Tested with the updated patch from @rgacogne, in #6639, which also resolves this issue.

@Habbie
Member

Habbie commented Jan 24, 2020

apparently, #6689 fixed this for auth, but not for rec? Is this still the case?

@tih
Contributor Author

tih commented Jan 24, 2020

I still suspect the observed problem with the recursor on 32-bit Arm was a toolchain issue related to thread-local variables. At the time, I didn't pursue it further, because recompiling and testing took an extremely long time on the Raspberry Pi I had to do it on, and now I don't even have that. Sorry!

It's working fine on aarch64, anyway.

@Habbie
Member

Habbie commented Jan 24, 2020

Thanks for the reply. I'll just close this then, and if it's still broken, somebody will eventually let us know :)

@Habbie closed this as completed Jan 24, 2020

@omoerbeek
Member

pdns_resolver --config works fine on OpenBSD armv7 at least

@Habbie
Member

Habbie commented Jan 27, 2020

> pdns_resolver --config works fine on OpenBSD armv7 at least

resolver? recursor? server?

@omoerbeek
Member

Damn, should not comment on issues before coffee... pdns_recursor

@Habbie
Member

Habbie commented Jan 27, 2020

> Damn, should not comment on issues before coffee... pdns_recursor

Ack - I don't think that was ever a problem on any platform; the issue is with the dynloading of .so files in auth.

@omoerbeek
Member

omoerbeek commented Jan 27, 2020

I was reacting to:

> apparently, #6689 fixed this for auth, but not for rec? Is this still the case?

@Habbie
Member

Habbie commented Jan 27, 2020

> fixed this for auth, but not for rec? Is this still the case?

I am also mixing up things. Thanks for checking :D
