MPPTask::runImpl is not exception safe #2322

Closed
JaySon-Huang opened this issue Jul 5, 2021 · 4 comments · Fixed by #2338
Labels
severity/major type/bug The issue is confirmed as a bug.


JaySon-Huang commented Jul 5, 2021

Steps to reproduce: check out this commit: JaySon-Huang@597dd86

I've added a failpoint exception_during_mpp_write_err_to_tunnel to mock this bug.

create table t (a int primary key clustered);
insert into test.t values (1),(2),(3),(4);
set tidb_enforce_mpp=1;

-- 1. With no failpoint injected, running this SQL is OK, MPPTask can be released as expected
tidb> select * from (select * from t) a join (select * from t) b;
ch>  select * from dt_tables where tidb_database='test' and tidb_table='t' \G
...
tidb_database:                            test
tidb_table:                               t
...
storage_stable_oldest_snapshot_lifetime:  0 <-- the snapshot will be released as expected

-- 2. With failpoint "exception_during_mpp_non_root_task_run" enabled
ch> DBGInvoke enable_fail_point(exception_during_mpp_non_root_task_run);
-- MPPTask can be released as expected too
tidb> select * from (select * from t) a join (select * from t) b;
ch>  select * from dt_tables where tidb_database='test' and tidb_table='t' \G
...
tidb_database:                            test
tidb_table:                               t
...
storage_stable_oldest_snapshot_lifetime:  0 <-- the snapshot will be released as expected

-- 3. With failpoint "exception_during_mpp_non_root_task_run" and "exception_during_mpp_write_err_to_tunnel" enabled
ch> DBGInvoke enable_fail_point(exception_during_mpp_non_root_task_run);
ch> DBGInvoke enable_fail_point(exception_during_mpp_write_err_to_tunnel);
-- MPPTask can not be released; it keeps holding the resources of IStorage and causes other trouble, such as:
--   TiFlash gets stuck after applying DDL operations
--   DeltaTree can not run GC on its delta data
tidb> select * from (select * from t) a join (select * from t) b;
ch>  select * from dt_tables where tidb_database='test' and tidb_table='t' \G
...
tidb_database:                            test
tidb_table:                               t
...
storage_stable_oldest_snapshot_lifetime:  45.19153881 <-- the snapshot will not be released
...

From the logs we can see that ~MPPTask() is not called in case 3: there is no logging like "finish MPPTask", while cases 1 and 2 have it.

JaySon-Huang@c4ba52f#diff-31a1170d1da0609fde7eb7068713665415014352587f2256fd15a29fe581b3caR302-R336
I guess these changes can resolve part of the problem, because MPPTunnel::close does not throw an exception the way MPPTunnel::write does when the tunnel gets closed.
However, if other exceptions are thrown in MPPTunnel::close, then MPPTask still gets stuck and cannot release the resources of the Storage layer.
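For illustration only, below is a minimal self-contained sketch of what an exception-safe error path could look like. The names writeErrToAllTunnels and closeAllTunnelsAndUnregister are placeholders, not the real TiFlash functions, and this is not the actual change in #2338. The point is that the secondary exception thrown while reporting the error is swallowed, so the clean-up that releases the tunnels and the storage snapshot always runs.

#include <iostream>
#include <stdexcept>
#include <string>

// Sketch of an exception-safe MPPTask::runImpl error path (placeholder names,
// not the real TiFlash implementation).
namespace sketch
{
// Placeholder for writing the error message to every tunnel of the task.
// In the bug scenario a tunnel is already closed by a cancel request,
// so this throws again, just like MPPTunnel::write does.
void writeErrToAllTunnels(const std::string & msg)
{
    throw std::runtime_error("Failed to write error \"" + msg + "\" to all tunnels");
}

// Placeholder for the clean-up that must always happen: close the tunnels,
// unregister the task and release the storage snapshot.
void closeAllTunnelsAndUnregister()
{
    std::cout << "tunnels closed, task unregistered, snapshot released" << std::endl;
}

void runImpl()
{
    try
    {
        throw std::runtime_error("Query was cancelled"); // the primary error
    }
    catch (const std::exception & e)
    {
        try
        {
            writeErrToAllTunnels(e.what());
        }
        catch (...)
        {
            // Swallow the secondary error: the error path must never skip
            // the clean-up below, otherwise ~MPPTask() is never reached.
        }
    }
    closeAllTunnelsAndUnregister();
}
} // namespace sketch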

JaySon-Huang commented Jul 5, 2021

Related logs: thread_15054_task162.zip

# This logging shows a snapshot that has been held for 7399.651 seconds. In fact, the snapshot was not released for more than 20 hours in the later processing, which caused trouble for GCing old data.
[2021/07/05 00:34:37.639 +08:00] [WARN] [<unknown>] ["PageStorage: db_45.t_154.data gcApply remove 26 invalid snapshots, 2594 snapshots left, longest lifetime 7399.651 seconds, created from thread_id 15054"] [thread_id=1532]

# Tracing back by the thread_id 15054, we find the following:
[2021/07/04 22:30:31.843 +08:00] [DEBUG] [<unknown>] ["FlashService: virtual grpc::Status DB::FlashService::DispatchMPPTask(grpc::ServerContext*, const mpp::DispatchTaskRequest*, mpp::DispatchTaskResponse*): Handling mpp dispatch request: meta {\n  start_ts: 426091221432665442\n  task_id: 162\n  ..."] [thread_id=15054]
[2021/07/04 22:30:31.870 +08:00] [DEBUG] [<unknown>] ["task 162: begin to register the task [426091221432665442,162]"] [thread_id=15054]
[2021/07/04 22:30:31.870 +08:00] [DEBUG] [<unknown>] ["task 162: begin to register the tunnel tunnel162+185"] [thread_id=15054]
# Executes some wait index steps and completes them
[2021/07/04 22:30:31.878 +08:00] [DEBUG] [<unknown>] ["task 162: begin to register the tunnel tunnel162+210"] [thread_id=15054]
[2021/07/04 22:30:31.879 +08:00] [DEBUG] [<unknown>] ["executeQuery: (from 10.4.131.15:44025, query_id: 3fbb0844-ff45-4b13-b816-75fcbf0333a1) ... "] [thread_id=15054]
[2021/07/04 22:30:31.988 +08:00] [DEBUG] [<unknown>] ["DAGQueryBlockInterpreter: Batch read index send 5877 request got 5877 response, cost 86ms"] [thread_id=15054]
[2021/07/04 22:30:31.990 +08:00] [DEBUG] [<unknown>] ["Region: [region 264641616, applied: term 6 index 452567] need to wait learner index: 452635"] [thread_id=15054]
[2021/07/04 22:30:35.949 +08:00] [DEBUG] [<unknown>] ["Region: [region 264641616] wait learner index 452635 done"] [thread_id=15054]
...
[2021/07/04 22:31:17.480 +08:00] [DEBUG] [<unknown>] ["Region: [region 302054948] wait learner index 357795 done"] [thread_id=15054]
[2021/07/04 22:31:17.507 +08:00] [DEBUG] [<unknown>] ["DAGQueryBlockInterpreter: Finish wait index | resolve locks | check memory cache for 5876 regions, cost 175ms"] [thread_id=15054]
[2021/07/04 22:31:17.534 +08:00] [DEBUG] [<unknown>] ["DAGQueryBlockInterpreter: [Learner Read] batch read index | wait index cost 1100 ms totally, regions_num=5876, concurrency=1"] [thread_id=15054]
[2021/07/04 22:31:17.536 +08:00] [DEBUG] [<unknown>] ["DAGQueryBlockInterpreter: DB::DAGQueryBlockInterpreter::getAndLockStorageWithSchemaVersion(DB::TableID, DB::Int64)::<lambda(const String&)> Table 154 schema OK, no syncing required. Schema version [storage, global, query]: [717, 725, 725]."] [thread_id=15054]
[2021/07/04 22:31:17.560 +08:00] [DEBUG] [<unknown>] ["StorageDeltaMerge: Read with tso: 426091221432665442"] [thread_id=15054]
# Get snapshot of the Storage layer
[2021/07/04 22:31:18.550 +08:00] [DEBUG] [<unknown>] ["DeltaMergeStore[db_45.t_154]: Read create segment snapshot done"] [thread_id=15054]
[2021/07/04 22:31:18.562 +08:00] [DEBUG] [<unknown>] ["DeltaMergeStore[db_45.t_154]: Read create stream done"] [thread_id=15054]
[2021/07/04 22:31:18.904 +08:00] [DEBUG] [<unknown>] ["DAGQueryBlockInterpreter: Start to retry 4 regions ({302889825,8716,1251},{321506838,8716,1251},{280139908,12298,1297},{265243024,12831,1241},)"] [thread_id=15054]
[2021/07/04 22:31:18.948 +08:00] [DEBUG] [<unknown>] ["pingcap/coprocessor: build 4 ranges."] [thread_id=15054]
[2021/07/04 22:31:19.229 +08:00] [DEBUG] [<unknown>] ["pingcap/coprocessor: has 4 tasks."] [thread_id=15054]
...
[2021/07/04 22:31:19.835 +08:00] [INFO] [<unknown>] ["DAGQueryBlockInterpreter: execution stream size for query block(before aggregation) __QB_2_ is 36"] [thread_id=15054]
[2021/07/04 22:31:19.852 +08:00] [INFO] [<unknown>] ["DAGQueryBlockInterpreter: execution stream size for query block(before aggregation) __QB_1_ is 36"] [thread_id=15054]
[2021/07/04 22:31:19.902 +08:00] [DEBUG] [<unknown>] ["executeQuery: Query pipeline:...\n"] [thread_id=15054]
[2021/07/04 22:31:20.704 +08:00] [DEBUG] [<unknown>] ["SquashingTransform: Squashing config - min_block_size_rows: 20000 min_block_size_bytes: 0"] [thread_id=15054]
[2021/07/04 22:31:20.774 +08:00] [INFO] [<unknown>] ["MPPHandler: processing dispatch is over; the time cost is 48854 ms"] [thread_id=15054]
...
[2021/07/04 22:31:20.775 +08:00] [INFO] [<unknown>] ["task 162: task starts running"] [thread_id=28459]
[2021/07/04 22:31:20.814 +08:00] [DEBUG] [<unknown>] ["task 162: begin read "] [thread_id=28459]
[2021/07/04 22:31:57.202 +08:00] [WARN] [<unknown>] ["task 162: Begin cancel task: [426091221432665442,162]"] [thread_id=30453]
[2021/07/04 22:31:57.464 +08:00] [WARN] [<unknown>] ["task 162: Finish cancel task: [426091221432665442,162]"] [thread_id=30453]
## Exception thrown while the task is running
[2021/07/04 22:32:00.041 +08:00] [ERROR] [<unknown>] ["task 162: task running meets error DB::Exception: Query was cancelled Stack Trace ...\n"] [thread_id=28459]
## Another exception thrown while writing the error to the tunnels
[2021/07/04 22:32:00.058 +08:00] [ERROR] [<unknown>] ["task 162: Failed to write error DB::Exception: Query was cancelled to all tunnels: Code: ..."] [thread_id=28459]
[2021/07/04 22:32:00.058 +08:00] [INFO] [<unknown>] ["task 162: task ends, time cost is 39350 ms."] [thread_id=28459]
[2021/07/04 22:32:00.104 +08:00] [DEBUG] [<unknown>] ["task 162: task unregistered"] [thread_id=28459]

JaySon-Huang commented Jul 6, 2021

The unreleased MPPTasks also make some threads wait forever. Here are the backtraces of all threads after the MPPTask gets stuck.
all_bt.log.zip

We may end up in a situation like [2], where TiFlash creates lots of threads and cannot release them, and finally throws a "Resource temporarily unavailable" exception when creating new threads and crashes.

[1] Some threads are stuck forever:

Thread 429 (Thread 0x7f35883f5700 (LWP 47270)):
#0  0x00007f3731287a35 in pthread_cond_wait@@GLIBC_2.3.2 () from /usr/lib64/libpthread.so.0
#1  0x00000000153f8b3c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>) at /root/flow/gcc-7.3.0/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:864
#2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../.././libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x000000001454389a in std::condition_variable::wait<DB::MPPTunnel::waitForFinish()::{lambda()#1}>(std::unique_lock<std::mutex>&, DB::MPPTunnel::waitForFinish()::{lambda()#1}) (this=0x7f3586266068, __lock=..., __p=...) at /usr/local/include/c++/7.3.0/condition_variable:99
#4  0x0000000014542f6e in DB::MPPTunnel::waitForFinish (this=0x7f3586266010) at /data1/jaysonhuang/tics/dbms/src/Flash/Mpp/MPPHandler.h:173
#5  0x0000000014539bf9 in DB::FlashService::EstablishMPPConnection (this=0x7f372ca21f10, grpc_context=0x7f3586242020, request=0x7f358620e3b0, writer=0x7f35883cba80) at /data1/jaysonhuang/tics/dbms/src/Flash/FlashService.cpp:201
#6  0x0000000014f5bdcb in std::__invoke_impl<grpc::Status, grpc::Status (tikvpb::Tikv::Service::* const&)(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*>
    (__f=@0x7f372c23f4b8: &virtual tikvpb::Tikv::Service::EstablishMPPConnection(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), __t=@0x7f35883cb9a8: 0x7f372ca21f10, __args#0=@0x7f35883cb9a0: 0x7f3586242020, __args#1=@0x7f35883cb998: 0x7f358620e3b0, __args#2=@0x7f35883cb990: 0x7f35883cba80)
    at /usr/local/include/c++/7.3.0/bits/invoke.h:73
#7  0x0000000014f577e4 in std::__invoke<grpc::Status (tikvpb::Tikv::Service::* const&)(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*> (
    __fn=@0x7f372c23f4b8: &virtual tikvpb::Tikv::Service::EstablishMPPConnection(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), __args#0=@0x7f35883cb9a8: 0x7f372ca21f10, __args#1=@0x7f35883cb9a0: 0x7f3586242020, __args#2=@0x7f35883cb998: 0x7f358620e3b0, 
    __args#3=@0x7f35883cb990: 0x7f35883cba80) at /usr/local/include/c++/7.3.0/bits/invoke.h:96
#8  0x0000000014f4e7a0 in std::_Mem_fn_base<grpc::Status (tikvpb::Tikv::Service::*)(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), true>::operator()<tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*> (
    this=0x7f372c23f4b8, __args#0=@0x7f35883cb9a8: 0x7f372ca21f10, __args#1=@0x7f35883cb9a0: 0x7f3586242020, __args#2=@0x7f35883cb998: 0x7f358620e3b0, __args#3=@0x7f35883cb990: 0x7f35883cba80) at /usr/local/include/c++/7.3.0/functional:175
#9  0x0000000014f3cb19 in std::_Function_handler<grpc::Status (tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*), std::_Mem_fn<grpc::Status (tikvpb::Tikv::Service::*)(grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*)> >::_M_invoke(std::_Any_data const&, tikvpb::Tikv::Service*&&, grpc_impl::ServerContext*&&, mpp::EstablishMPPConnectionRequest const*&&, grpc_impl::ServerWriter<mpp::MPPDataPacket>*&&) (__functor=..., __args#0=@0x7f35883cb9a8: 0x7f372ca21f10, __args#1=@0x7f35883cb9a0: 0x7f3586242020, __args#2=@0x7f35883cb998: 0x7f358620e3b0, 
    __args#3=@0x7f35883cb990: 0x7f35883cba80) at /usr/local/include/c++/7.3.0/bits/std_function.h:302
#10 0x0000000014fd1556 in std::function<grpc::Status (tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*)>::operator()(tikvpb::Tikv::Service*, grpc_impl::ServerContext*, mpp::EstablishMPPConnectionRequest const*, grpc_impl::ServerWriter<mpp::MPPDataPacket>*) const (
    this=0x7f372c23f4b8, __args#0=0x7f372ca21f10, __args#1=0x7f3586242020, __args#2=0x7f358620e3b0, __args#3=0x7f35883cba80) at /usr/local/include/c++/7.3.0/bits/std_function.h:706
#11 0x0000000014f87ecc in grpc_impl::internal::ServerStreamingHandler<tikvpb::Tikv::Service, mpp::EstablishMPPConnectionRequest, mpp::MPPDataPacket>::RunHandler(grpc::internal::MethodHandler::HandlerParameter const&)::{lambda()#1}::operator()() const (__closure=0x7f35883cbd20) at /usr/local/include/grpcpp/impl/codegen/method_handler_impl.h:195
#12 0x0000000014fd158b in grpc_impl::internal::CatchingFunctionHandler<grpc_impl::internal::ServerStreamingHandler<tikvpb::Tikv::Service, mpp::EstablishMPPConnectionRequest, mpp::MPPDataPacket>::RunHandler(grpc::internal::MethodHandler::HandlerParameter const&)::{lambda()#1}>(grpc_impl::internal::ServerStreamingHandler<tikvpb::Tikv::Service, mpp::EstablishMPPConnectionRequest, mpp::MPPDataPacket>::RunHandler(grpc::internal::MethodHandler::HandlerParameter const&)::{lambda()#1}&&) (handler=...) at /usr/local/include/grpcpp/impl/codegen/method_handler_impl.h:42
#13 0x0000000014f87f7b in grpc_impl::internal::ServerStreamingHandler<tikvpb::Tikv::Service, mpp::EstablishMPPConnectionRequest, mpp::MPPDataPacket>::RunHandler (this=0x7f372c23f4b0, param=...) at /usr/local/include/grpcpp/impl/codegen/method_handler_impl.h:190
#14 0x00000000150c69f1 in grpc_impl::Server::SyncRequest::CallData::ContinueRunAfterInterception() ()
#15 0x00000000150c7fdf in grpc_impl::Server::SyncRequestThreadManager::DoWork(void*, bool, bool) ()
#16 0x00000000150cd9ab in grpc::ThreadManager::MainWorkLoop() ()
#17 0x00000000150cdb4c in grpc::ThreadManager::WorkerThread::Run() ()
#18 0x0000000015300683 in grpc_core::(anonymous namespace)::ThreadInternalsPosix::ThreadInternalsPosix(char const*, void (*)(void*), void*, bool*, grpc_core::Thread::Options const&)::{lambda(void*)#1}::_FUN(void*) ()
#19 0x00007f3731283ea5 in start_thread () from /usr/lib64/libpthread.so.0
#20 0x00007f373083a9fd in clone () from /usr/lib64/libc.so.6

[2] (attached image)

JaySon-Huang commented:

@windtalker Do you have any idea how to fix this problem?

windtalker commented:

Looks like we need to make sure that MPPTunnel is closed in any case.
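For example (a sketch under assumed names, not necessarily how the eventual fix in #2338 does it), an RAII scope guard makes the close unconditional: the clean-up runs on every exit path of runImpl, including an escaping exception, and never lets its own failure propagate out of the destructor.

#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Generic scope guard: runs the given clean-up on every exit path and swallows
// any exception the clean-up itself throws (it runs inside a destructor).
struct ScopeGuard
{
    std::function<void()> cleanup;
    explicit ScopeGuard(std::function<void()> f) : cleanup(std::move(f)) {}
    ~ScopeGuard()
    {
        try
        {
            cleanup();
        }
        catch (...)
        {
            // Never let clean-up errors escape the destructor.
        }
    }
};

// Stand-in for MPPTask::runImpl: even though both the query and the
// error-reporting step throw, the guard still closes the (hypothetical) tunnels,
// so EstablishMPPConnection is not blocked in MPPTunnel::waitForFinish forever.
void runImplSketch()
{
    ScopeGuard closeTunnels([] { std::cout << "tunnels closed" << std::endl; });
    try
    {
        throw std::runtime_error("Query was cancelled");
    }
    catch (const std::exception & e)
    {
        // Writing the error to an already-closed tunnel throws again; with the
        // guard in place this no longer prevents the tunnels from being closed.
        throw std::runtime_error(std::string("Failed to write error: ") + e.what());
    }
}

Calling runImplSketch() inside a try/catch still prints "tunnels closed" during stack unwinding, before the rethrown error reaches the caller, which is the behaviour we want for the real tunnels.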
