
we should take care of the resource release sequence when the processes exit #716

Closed
wadeliuyi opened this issue Aug 2, 2019 · 3 comments

@wadeliuyi
Contributor

Fix a crash that happens when stopping the process: the meta client and the raft part both depend on the IO thread pool, but the IO thread pool is stopped first by gServer:
1. Raft backtrace:

```
(gdb) bt
#0 0x000000000208f587 in folly::IOThreadPoolExecutor::getEventBase (this=) at /usr/include/c++/8/bits/shared_ptr_base.h:1018
#1 0x0000000001b48970 in nebula::raftex::RaftPart::appendLogsInternal (this=0x7fbfdc168c10, iter=..., termId=8) at /home/wade.liu/rd/nebula/src/kvstore/raftex/RaftPart.cpp:512
#2 0x0000000001b47dd0 in nebula::raftex::RaftPart::appendLogAsync (this=0x7fbfdc168c10, source=0 '\000', logType=nebula::raftex::LogType::NORMAL, log="")
    at /home/wade.liu/rd/nebula/src/kvstore/raftex/RaftPart.cpp:452
#3 0x0000000001b501ff in nebula::raftex::RaftPart::sendHeartbeat (this=0x7fbfdc168c10) at /home/wade.liu/rd/nebula/src/kvstore/raftex/RaftPart.cpp:1270
#4 0x0000000001b4ce4f in nebula::raftex::RaftPart::statusPolling (this=0x7fbfdc168c10) at /home/wade.liu/rd/nebula/src/kvstore/raftex/RaftPart.cpp:940
#5 0x0000000001b4c9d8 in nebula::raftex::RaftPart::<lambda()>::operator()(void) const (__closure=0x7fbfd7ce7400) at /home/wade.liu/rd/nebula/src/kvstore/raftex/RaftPart.cpp:949
#6 0x0000000001b5d36c in std::__invoke_impl<void, nebula::raftex::RaftPart::statusPolling()::<lambda()>&>(std::__invoke_other, nebula::raftex::RaftPart::<lambda()> &) (__f=...)
    at /usr/include/c++/8/bits/invoke.h:60
#7 0x0000000001b5d2a1 in std::__invoke<nebula::raftex::RaftPart::statusPolling()::<lambda()>&>(nebula::raftex::RaftPart::<lambda()> &) (__fn=...) at /usr/include/c++/8/bits/invoke.h:95
#8 0x0000000001b5d19a in std::_Bind<nebula::raftex::RaftPart::statusPolling()::<lambda()>()>::__call(std::tuple<> &&, std::_Index_tuple<>) (this=0x7fbfd7ce7400, __args=...)
    at /usr/include/c++/8/functional:400
#9 0x0000000001b5ccaa in std::_Bind<nebula::raftex::RaftPart::statusPolling()::<lambda()>()>::operator()<>(void) (this=0x7fbfd7ce7400) at /usr/include/c++/8/functional:484
#10 0x0000000001b5c647 in std::_Function_handler<void(), std::_Bind<nebula::raftex::RaftPart::statusPolling()::<lambda()>()> >::_M_invoke(const std::_Any_data &) (__functor=...)
    at /usr/include/c++/8/bits/std_function.h:297
```

2. Meta backtrace:

```
#0 0x000000000208dc37 in folly::IOThreadPoolExecutor::getEventBase (this=) at /usr/include/c++/8/bits/shared_ptr_base.h:1018
#1 0x00000000018169eb in nebula::meta::MetaClient::getResponse<nebula::meta::cpp2::HBReq, nebula::meta::MetaClient::heartbeat()::<lambda(auto:110, auto:111)>, nebula::meta::MetaClient::heartbeat()::<lambda(nebula::meta::cpp2::HBResp&&)> >(nebula::meta::cpp2::HBReq, nebula::meta::MetaClient::<lambda(auto:110, auto:111)>, nebula::meta::MetaClient::<lambda(nebula::meta::cpp2::HBResp&&)>, bool) (
    this=0x7f7642d60600, req=..., remoteFunc=..., respGen=..., toLeader=true) at /home/wade.liu/rd/nebula/src/meta/client/MetaClient.cpp:254
#2 0x000000000180d6f6 in nebula::meta::MetaClient::heartbeat (this=0x7f7642d60600) at /home/wade.liu/rd/nebula/src/meta/client/MetaClient.cpp:987
#3 0x0000000001806204 in nebula::meta::MetaClient::heartBeatThreadFunc (this=0x7f7642d60600) at /home/wade.liu/rd/nebula/src/meta/client/MetaClient.cpp:85
#4 0x00000000018876da in std::__invoke_impl<void, void (nebula::meta::MetaClient::&)(), nebula::meta::MetaClient&> (
    __f=@0x7f7642dc1ea0: (void (nebula::meta::MetaClient::*)(nebula::meta::MetaClient * const)) 0x18061e0 nebula::meta::MetaClient::heartBeatThreadFunc(), __t=@0x7f7642dc1eb0: 0x7f7642d60600)
    at /usr/include/c++/8/bits/invoke.h:73
```
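Both backtraces show the same pattern: a background loop (raft status polling, meta heartbeat) calls `getEventBase()` on an IO thread pool that gServer has already released. A minimal standalone sketch of that race, assuming folly's `IOThreadPoolExecutor` (this intentionally reproduces the crash and is not the actual Nebula code):

```cpp
// Deliberately reproduces the shutdown race shown in the backtraces above.
#include <chrono>
#include <memory>
#include <thread>

#include <folly/executors/IOThreadPoolExecutor.h>

int main() {
    // gServer owns the pool; RaftPart/MetaClient hold non-owning pointers.
    auto ioPool = std::make_shared<folly::IOThreadPoolExecutor>(4);
    folly::IOThreadPoolExecutor* rawPool = ioPool.get();

    // Stands in for RaftPart::statusPolling / MetaClient::heartBeatThreadFunc.
    std::thread heartbeat([rawPool] {
        for (int i = 0; i < 1000; ++i) {
            rawPool->getEventBase();  // use-after-free once the pool is gone
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });

    // Stands in for gServer stopping first: the pool is destroyed while the
    // heartbeat loop is still running, matching frame #0 in both backtraces.
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    ioPool.reset();

    heartbeat.join();  // typically never reached cleanly
    return 0;
}
```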

@wadeliuyi
Contributor Author

wadeliuyi commented Aug 2, 2019

I can think of three ways to solve it, though maybe none is the best:

1. One process runs just one RPC server: we could combine the raft service and the main service, but we would still have to take care of the resource release sequence when the process exits.
2. The raft service and the main service stop sharing the IO thread pool, and raft uses its own. But the raft service has heavy IO traffic, so we would need to open many threads for it, which leaves the whole process with too many threads.
3. Define a stop interface for every module, and stop the modules in the right order when the process receives a stop signal (see the sketch below).
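A rough sketch of the third option, with hypothetical names (`Stoppable`, `Server::stopAll`; the real module interface would be Nebula's own): each module implements a stop method, and the server stops modules in reverse start order, so the dependents (raft parts, meta client) shut down before the IO thread pool they use.

```cpp
// Hypothetical sketch of option 3: explicit, ordered shutdown.
// All names here are illustrative, not actual Nebula types.
#include <memory>
#include <vector>

class Stoppable {
public:
    virtual ~Stoppable() = default;
    // Expected to be idempotent, and to block until the module no longer
    // touches resources owned by modules started before it.
    virtual void stop() = 0;
};

class Server {
public:
    // Register modules in start order, e.g. the IO thread pool first, then
    // the raft parts and the meta client that depend on it.
    void registerModule(std::shared_ptr<Stoppable> module) {
        modules_.push_back(std::move(module));
    }

    // On a stop signal, stop modules in reverse start order: the meta client
    // and raft parts go down before the shared IO thread pool does.
    void stopAll() {
        for (auto it = modules_.rbegin(); it != modules_.rend(); ++it) {
            (*it)->stop();
        }
    }

private:
    std::vector<std::shared_ptr<Stoppable>> modules_;
};
```

With something like this in place, the crash above disappears because the polling and heartbeat loops have already stopped by the time the IO thread pool is torn down.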

@jude-zhu jude-zhu added this to the R201910_RC1 milestone Aug 7, 2019
@sherman-the-tank sherman-the-tank changed the title from "we should take care about that resource release sequence when our process exit." to "we should take care of the resource release sequence when the processes exit" Aug 7, 2019
@dangleptr
Contributor

#732

@monadbobo monadbobo assigned dangleptr and unassigned monadbobo Aug 7, 2019
@dangleptr
Contributor

Close it now

liwenhui-soul pushed a commit to liwenhui-soul/nebula that referenced this issue May 10, 2022

## What type of PR is this?
- [ ] bug
- [ ] feature
- [X] enhancement

## What problem(s) does this PR solve?

#### Description:

Improve test case stability


Migrated from vesoft-inc#4044

Co-authored-by: Yee <[email protected]>