Document behavior if applications schedule on completion queue after it has been shutdown #38112

Open
erin2722 opened this issue Nov 12, 2024 · 0 comments


What version of gRPC and what language are you using?

v1.59.2 gRPC C++

What operating system (Linux, Windows,...) and version?

Linux 5.19.0
22.04.1-Ubuntu
aarch64

What runtime / compiler are you using (e.g. python version or version of gcc)

clang version 12.0.1

What did you do?

I ran the following test to determine what happens when work is scheduled on a completion queue after it has been shut down:

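// The TEST macro comes from the project's own unit test framework; its header is omitted here.
#include <grpcpp/alarm.h>
#include <grpcpp/completion_queue.h>

#include <unistd.h>  // sleep
#include <iostream>
#include <thread>

TEST(GRPC, CompletionQueueShutdown) {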
    ::grpc::CompletionQueue cq;

    std::thread pollingThread([&]() {
        void* tag;
        bool ok = false;
        while (cq.Next(&tag, &ok)) {
            std::cout << ok << std::endl;
        }
    });

    // Schedule something successfully
    ::grpc::Alarm alarm;
    alarm.Set(&cq, gpr_timespec{0, 0, GPR_TIMESPAN}, 0);

    cq.Shutdown();

    sleep(5);  // wait for the completion queue shutdown to happen.

    ::grpc::Alarm alarm1;
    alarm1.Set(&cq, gpr_timespec{0, 0, GPR_TIMESPAN}, 0);  // this will crash the process.

    pollingThread.join();
}

What did you expect to see?

I would expect scheduling work on a completion queue after it has been shut down to fail gracefully rather than crash the process, or else for the documentation to state clearly that doing so crashes the process. The unit test above demonstrates the issue with Alarm, but I believe the same applies to other components that use the completion queue.

Of particular interest is the Call class, which on v1.59.2 appears to crash if the application attempts to schedule operations after the cq has been shut down:

GPR_ASSERT(grpc_cq_begin_op(cq_, notify_tag));

On master, it looks like this may no longer be the case:

if (!is_notify_tag_closure) grpc_cq_begin_op(cq_, notify_tag);
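
To make the difference between the two snippets concrete, here is a toy model. It is not gRPC code: `FakeQueue`, `BeginOp`, `WouldAbort`, and `ScheduleUnchecked` are hypothetical names, and the only assumption carried over from gRPC is that the begin-op call returns false once the queue has been shut down. What master does after discarding that result is outside the scope of this sketch.

```cpp
#include <atomic>

// Hypothetical stand-in for a completion queue; not a gRPC type.
class FakeQueue {
 public:
  // Mimics a begin-op call: succeeds only while the queue is still open.
  bool BeginOp() {
    if (shutdown_.load()) return false;
    ++pending_;
    return true;
  }
  void Shutdown() { shutdown_.store(true); }
  int pending() const { return pending_.load(); }

 private:
  std::atomic<bool> shutdown_{false};
  std::atomic<int> pending_{0};
};

// Asserting style: a false return is where an assert would abort the process.
// It is reported to the caller here so the example can keep running.
bool WouldAbort(FakeQueue& q) { return !q.BeginOp(); }

// Non-asserting style: the result is discarded and execution continues.
void ScheduleUnchecked(FakeQueue& q) { (void)q.BeginOp(); }
```

In this model, `WouldAbort` returns true only after `Shutdown()`, which corresponds to the assertion failure in the backtrace below, while `ScheduleUnchecked` returns normally in the same situation without registering a pending operation.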

Could gRPC document the guarantees and failure modes of scheduling on a shut-down completion queue? The ideal outcome (in my opinion) would be for gRPC to throw instead of crashing the process, but I'll leave that decision up to the maintainers; I am mainly requesting documentation here.
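
Until the behavior is documented, one possible application-side stopgap is to track shutdown state next to the queue and refuse to schedule once shutdown has been requested. The sketch below is a minimal illustration under that assumption. `ShutdownGuard` and its methods are hypothetical names, not gRPC API, and the guard only helps if every scheduling call and the shutdown call go through it.

```cpp
#include <mutex>
#include <utility>

// Hypothetical application-side guard; not part of gRPC.
class ShutdownGuard {
 public:
  // Runs `schedule` only if Shutdown() has not been called yet.
  // Returns false, without running `schedule`, after shutdown.
  template <typename F>
  bool TrySchedule(F&& schedule) {
    std::lock_guard<std::mutex> lock(mu_);
    if (shutdown_) return false;
    std::forward<F>(schedule)();
    return true;
  }

  // Runs `shutdown` (for example, a call to cq.Shutdown()) at most once.
  template <typename F>
  void Shutdown(F&& shutdown) {
    std::lock_guard<std::mutex> lock(mu_);
    if (shutdown_) return;
    shutdown_ = true;
    std::forward<F>(shutdown)();
  }

 private:
  std::mutex mu_;
  bool shutdown_ = false;
};
```

In the test above, the second alarm would then be set with something like `guard.TrySchedule([&] { alarm1.Set(&cq, deadline, tag); })`, where `deadline` and `tag` stand for the arguments already used in the test, and the caller would handle a `false` return instead of hitting the assertion. The mutex is held while the callable runs so that a concurrent shutdown cannot slip in between the check and the scheduling call; the callables must therefore not call back into the guard.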

What did you see instead?

The process crashed with the following backtrace, and there is no documentation indicating that scheduling on the cq after it has been shut down is process-fatal.

{"t":{"$date":"2024-11-12T22:05:23.938Z"},"s":"I",  "c":"CONTROL",  "id":31445,   "ctx":"main","msg":"Frame","attr":{"frame":{"a":"FFFFA10B7130","b":"FFFFA1090000","o":"27130","s":"abort","s+":"E4"}}}
{"t":{"$date":"2024-11-12T22:05:23.938Z"},"s":"I",  "c":"CONTROL",  "id":31445,   "ctx":"main","msg":"Frame","attr":{"frame":{"a":"FFFFB184BA50","b":"FFFFB1820000","o":"2BA50","s":"_ZN9grpc_core5CrashESt17basic_string_viewIcSt11char_traitsIcEENS_14SourceLocationE","C":"grpc_core::Crash(std::basic_string_view<char, std::char_traits<char> >, grpc_core::SourceLocation)","s+":"C8"}}}
{"t":{"$date":"2024-11-12T22:05:23.938Z"},"s":"I",  "c":"CONTROL",  "id":31445,   "ctx":"main","msg":"Frame","attr":{"frame":{"a":"FFFFB18490CC","b":"FFFFB1820000","o":"290CC","s":"gpr_assertion_failed","s+":"6C"}}}
{"t":{"$date":"2024-11-12T22:05:23.938Z"},"s":"I",  "c":"CONTROL",  "id":31445,   "ctx":"main","msg":"Frame","attr":{"frame":{"a":"FFFFB274B840","b":"FFFFB2690000","o":"BB840","s":"_ZN4grpc8internal9AlarmImpl3SetEPNS_15CompletionQueueE12gpr_timespecPv","C":"grpc::internal::AlarmImpl::Set(grpc::CompletionQueue*, gpr_timespec, void*)","s+":"1DC"}}}
{"t":{"$date":"2024-11-12T22:05:23.938Z"},"s":"I",  "c":"CONTROL",  "id":31445,   "ctx":"main","msg":"Frame","attr":{"frame":{"a":"AAAAE5E64194","b":"AAAAE5D70000","o":"F4194","s":"_ZN5mongo9transport4grpc12_GLOBAL__N_153UnitTest_SuiteNameGRPCTestNameCompletionQueueShutdown7_doTestEv","C":"mongo::transport::grpc::(anonymous namespace)::UnitTest_SuiteNameGRPCTestNameCompletionQueueShutdown::_doTest()","s+":"D4"}}}

Anything else we should know about your project / environment?

N/A
