Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ABI changes when compiled with ASAN #19224

Closed
coryan opened this issue Jun 4, 2019 · 13 comments
Closed

ABI changes when compiled with ASAN #19224

coryan opened this issue Jun 4, 2019 · 13 comments

Comments

@coryan
Copy link
Contributor

coryan commented Jun 4, 2019

What version of gRPC and what language are you using?

gRPC-1.21.0 / C++.

What operating system (Linux, Windows,...) and version?

Linux, Ubuntu 18.04 (Bionic Beaver)

What runtime / compiler are you using (e.g. python version or version of gcc)

$ clang --version
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

What did you do?

Compiled and installed gRPC. Then compiled my code with ASAN and ran existing unit tests that passed with gRPC-1.19.1

What did you expect to see?

I expected the test to pass, as they did before.

What did you see instead?

My program crashes with assertion failures deep inside gRPC.

GPR_ASSERT(clock_type == GPR_CLOCK_MONOTONIC ||

More details here:

https://source.cloud.google.com/results/invocations/5ea85d8f-5aed-4200-9dce-5e26335ef8c6/targets/cloud-cpp%2Fgithub%2Fdocker%2Fasan-presubmit/log

Diagnosis

gRPC has a different ABI when compiled with ASAN vs. a regular build. These classes change size:

#ifdef GRPC_ASAN_ENABLED

so this class changes size:

class Mutex {
public:
Mutex() { g_core_codegen_interface->gpr_mu_init(&mu_); }
~Mutex() { g_core_codegen_interface->gpr_mu_destroy(&mu_); }
Mutex(const Mutex&) = delete;
Mutex& operator=(const Mutex&) = delete;
gpr_mu* get() { return &mu_; }
const gpr_mu* get() const { return &mu_; }
private:
union {
gpr_mu mu_;
std::mutex do_not_use_sth_;
#ifdef GPR_HAS_PTHREAD_H
pthread_mutex_t do_not_use_pth_;
#endif
};
};

and therefore grpc::ClientContext, which is part of the public API, changes size:

grpc::internal::Mutex mu_;

And then it is game over, there is a violation of the ODR, and the application is in UB land.

Maybe gRPC does not want to support applications that use AddressSanitizer unless they compile everything from source. Is that the case? Is there some documentation where I can read more about what API/ABI guarantees does gRPC make, if any?

Anything else we should know about your project / environment?

I think that is it.

@nicolasnoble
Copy link
Member

Typically, when building sanitizers, you need to have the whole nine yards compiled with sanitizers enabled, which should be fine, right?

@coryan
Copy link
Contributor Author

coryan commented Jun 4, 2019

. > Typically, when building sanitizers, you need to have the whole nine yards compiled with sanitizers enabled,

I believe that is a requirements only for the MemorySanitizer:

https://clang.llvm.org/docs/MemorySanitizer.html#handling-external-code

The other sanitizers do not document similar restrictions and they work fine in practice. This bug is specifically about the AddressSanitizer, which is the one you are changing the ABI on.

which should be fine, right?

Sure. That works.

You (the gRPC team) need to decide if you want to support AddressSanitizer only when all the code is compiled with it. If you do decide that, do you mind documenting this? Or if the documentation exists, point me to it?

You also need to decide if changing the memory layout of your objects when running with AddressSanitizer affects the sanitizer results. Part of the goal is to make sure you detect out-of-bounds access, but you change the bounds when you compile with address sanitizer, so maybe that is not ideal? It may be an immaterial difference, but you may want to think about it.

@nicolasnoble
Copy link
Member

Right now, the Makefile offering to build with sanitizers is only for our own internal tests. We never really intended for this to leak as a public thing, and this wasn't intended for public consumption. We will eventually drop these build options from the Makefile to favor bazel's BUILD file only for targeting *san. In this scenario, you will still be able to test your product in conjunction with a sanitizer setting, and bazel's behavior will enforce that all the dependencies are built alongside your own code, ensuring consistency.

@coryan
Copy link
Contributor Author

coryan commented Jun 4, 2019

Right now, the Makefile offering to build with sanitizers is only for our own internal tests. We never really intended for this to leak as a public thing, and this wasn't intended for public consumption.

So? Once I install gRPC in /usr/local: can I sanitize my own code if I link with gRPC? The answer used to be "yes", and now it is "no". Stopping me from using ASAN if I link your library is very surprising behavior.

We will eventually drop these build options from the Makefile to favor bazel's BUILD file only for targeting *san. In this scenario, you will still be able to test your product in conjunction with a sanitizer setting, and bazel's behavior will enforce that all the dependencies are built alongside your own code, ensuring consistency.

I do not think that makes any difference for this question.

@nicolasnoble
Copy link
Member

This never intended to be externally usable, so the answer should always have been "no".

@coryan
Copy link
Contributor Author

coryan commented Jun 5, 2019

This never intended to be externally usable, so the answer should always have been "no".

Would you mind updating the documentation to say: "Only compiling with Bazel is supported for sanitizer builds" or something to that effect?

@lisongmin
Copy link

This problem should be fixed in my opinion.

When we check our own program with sanitize, we mainly care if our program is leaked but not grpc.

Since our program may link to many libraries, it is hard to rebuild all libraries(e.g. grpc) with sanitize enable and it is not necessary at all.

currently, when I sanitize my program, I havo to change grpc/impl/codegen/sync_posix.h manually

from

#ifdef GRPC_ASAN_ENABLED 

to

#if 0

@stale
Copy link

stale bot commented Apr 7, 2020

This issue/PR has been automatically marked as stale because it has not had any update (including commits, comments, labels, milestones, etc) for 180 days. It will be closed automatically if no further update occurs in 1 day. Thank you for your contributions!

@coryan
Copy link
Contributor Author

coryan commented Apr 7, 2020

Please keep this open. This continues to be a problem:

  • The ABI changes unexpectedly and contrary to common practice for OSS libraries.
  • The documentation does not describe this problem, nor does it describe that only Bazel is expected to work with the sanitizers (which would be less than ideal, but at least upfront about the deviation from common practices).

@stale stale bot removed the disposition/stale label Apr 7, 2020
@aurelienrb
Copy link

Hello,

I want to be make you understand the current issue with not only ASAN but also TSAN: grpc prevents the user from using ASAN or TSAN in its own code because of the way some public structs are defined in the public grpc headers.

This is related to the way GRPC_ASAN_ENABLED / GRPC_TSAN_ENABLED are defined in port_platform.h

#if defined(__SANITIZE_ADDRESS__)
#define GRPC_ASAN_ENABLED

If the client code compiles with ASAN or TSAN, this macros will be defined on his side only - thus the ODR.

This is a real pain because if I want to check my own code I need to use and link with another specific grpc build (I had to update our CI pipeline to use an extra checked build of grpc in order to be able to check our own code). And if you remove those ASAN flags from CMake this will make things worst: it will forbid your users to use ASAN/TSAN in their own code.

I have described the issue in details in #21838 and proposed PR #22533 for at least detect this issue at link time. But indeed the clean solution would be to have a stable public ABI regardless of ASAN or TSAN flags in the user code 👍

So maybe it's just about adjusting the way GRPC_ASAN_ENABLED / GRPC_TSAN_ENABLED macros are defined: make them defined from Bazel scripts only and never from the current compiler flags.

@coryan
Copy link
Contributor Author

coryan commented Jun 23, 2020

Seems like #22325 enables a workaround for this issue. In fact, I would not be opposed to calling it "fixed".

/cc: @veblush

@aurelienrb
Copy link

aurelienrb commented Jun 23, 2020

Looks good indeed, but the last bit seems missing: how do you enable those flags from CMake? it would be nice to have a CMake option, and IMHO this option should disable ASAN/TSAN by default so that the user will get proper build options by default

@veblush
Copy link
Contributor

veblush commented Sep 20, 2022

It looks okay to close this.

@veblush veblush closed this as completed Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants