-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ML] Protect access to member with mutex in CBackgroundPersister #18
[ML] Protect access to member with mutex in CBackgroundPersister #18
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're correct that there's a concurrency problem in this class, but I think the proposed fix will have undesirable side effects.
The idea was that the m_IsBusy
flag would prevent two simultaneous background persists being run, which makes it unnecessary to move the lock in the method where you moved it.
Holding the lock for the entire duration of the background persist means that if another thread tries to kick off a background persist while one is in progress then the second attempt will block rather than returning false
quickly.
I think the best way to fix the concurrency problems is as follows:
- Replace the current change with a comment in
CBackgroundPersister::CBackgroundThread::run()
saying that it's undesirable to hold the lock for the actual persist, but thatm_IsBusy
should prevent modification ofm_PersistFuncs
during the persist. - In
CBackgroundPersister::startPersist()
move theif (m_PersistFuncs.empty())
check below theif (m_IsBusy)
check, becausestartPersist()
is a public method and should protect access tom_PersistFuncs
. - Hold the lock while returning in
CBackgroundPersister::isBusy()
. This will ensure that we won't answer the question on whether we're busy part way through a group of operations in another method related to adjusting the busy flag.
@@ -157,10 +161,10 @@ class API_EXPORT CBackgroundPersister : private core::CNonCopyable | |||
core::CFastMutex m_Mutex; | |||
|
|||
//! Is the background thread currently busy persisting data? | |||
volatile bool m_IsBusy; | |||
std::atomic_bool m_IsBusy; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On backporting to 6.3 you might find this needs to be changed to atomic_t::atomic_bool
, with the <atomic>
header changed to core/AtomicTypes.h
. Visual Studio 2013 didn't implement all the std
atomic types correctly. You might get lucky with atomic_bool
, as presumably it's one of the simpler ones, but if not then AtomicTypes.h
is how to fix it. (Basically it will use Boost atomic on Windows.)
lib/api/CBackgroundPersister.cc
Outdated
@@ -122,14 +122,14 @@ bool CBackgroundPersister::addPersistFunc(core::CDataAdder::TPersistFunc persist | |||
|
|||
bool CBackgroundPersister::startPersist(void) | |||
{ | |||
if (m_PersistFuncs.empty()) | |||
if (this->isBusy()) | |||
{ | |||
return false; | |||
} | |||
|
|||
core::CScopedFastLock lock(m_Mutex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this lock needs to be moved up above the busy check, otherwise the busy flag can be set in between checking it and locking the mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 that fixes the race condition I commented on
You're right holding the lock will cause another thread to block then immediately persist again once the lock is released. In theory there is a race condition in https://github.com/elastic/ml-cpp/blob/master/lib/api/CBackgroundPersister.cc#L130 If the first thread is somewhere between line 130 and line 148 then the second will pass the Given the way the class is used this is safe but I've added a comment to the docs on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
All usages of
m_PersistFuncs
should be protected by the mutex