Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML] Protect access to member with mutex in CBackgroundPersister #18

Merged
merged 3 commits into from
Mar 15, 2018
Merged

[ML] Protect access to member with mutex in CBackgroundPersister #18

merged 3 commits into from
Mar 15, 2018

Conversation

davidkyle
Copy link
Member

All usages of m_PersistFuncs should be protected by the mutex

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're correct that there's a concurrency problem in this class, but I think the proposed fix will have undesirable side effects.

The idea was that the m_IsBusy flag would prevent two simultaneous background persists being run, which makes it unnecessary to move the lock in the method where you moved it.

Holding the lock for the entire duration of the background persist means that if another thread tries to kick off a background persist while one is in progress then the second attempt will block rather than returning false quickly.

I think the best way to fix the concurrency problems is as follows:

  1. Replace the current change with a comment in CBackgroundPersister::CBackgroundThread::run() saying that it's undesirable to hold the lock for the actual persist, but that m_IsBusy should prevent modification of m_PersistFuncs during the persist.
  2. In CBackgroundPersister::startPersist() move the if (m_PersistFuncs.empty()) check below the if (m_IsBusy) check, because startPersist() is a public method and should protect access to m_PersistFuncs.
  3. Hold the lock while returning in CBackgroundPersister::isBusy(). This will ensure that we won't answer the question on whether we're busy part way through a group of operations in another method related to adjusting the busy flag.

@@ -157,10 +161,10 @@ class API_EXPORT CBackgroundPersister : private core::CNonCopyable
core::CFastMutex m_Mutex;

//! Is the background thread currently busy persisting data?
volatile bool m_IsBusy;
std::atomic_bool m_IsBusy;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On backporting to 6.3 you might find this needs to be changed to atomic_t::atomic_bool, with the <atomic> header changed to core/AtomicTypes.h. Visual Studio 2013 didn't implement all the std atomic types correctly. You might get lucky with atomic_bool, as presumably it's one of the simpler ones, but if not then AtomicTypes.h is how to fix it. (Basically it will use Boost atomic on Windows.)

@@ -122,14 +122,14 @@ bool CBackgroundPersister::addPersistFunc(core::CDataAdder::TPersistFunc persist

bool CBackgroundPersister::startPersist(void)
{
if (m_PersistFuncs.empty())
if (this->isBusy())
{
return false;
}

core::CScopedFastLock lock(m_Mutex);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this lock needs to be moved up above the busy check, otherwise the busy flag can be set in between checking it and locking the mutex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 that fixes the race condition I commented on

@davidkyle
Copy link
Member Author

You're right holding the lock will cause another thread to block then immediately persist again once the lock is released.

In theory there is a race condition in startPersist if 2 threads call the function at the same time

https://github.com/elastic/ml-cpp/blob/master/lib/api/CBackgroundPersister.cc#L130

If the first thread is somewhere between line 130 and line 148 then the second will pass the isBusy test then block on acquiring the lock. The first thread releases the lock on line 151 then the second thread will continue eventually hitting m_BackgroundThread.waitForFinish() where a deadlock occurs as the background thread cannot acquire the lock on line 246 as is it held by the 2 thread.

Given the way the class is used this is safe but I've added a comment to the docs on the startBackgroundPersistIfAppropriate method warning against this use.

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@davidkyle davidkyle merged commit 9cc58e6 into elastic:master Mar 15, 2018
@davidkyle davidkyle deleted the background-persist-protect branch March 15, 2018 15:20
@sophiec20 sophiec20 changed the title Protect access to member with mutex in CBackgroundPersister [ML] Protect access to member with mutex in CBackgroundPersister Mar 28, 2018
@sophiec20 sophiec20 added the :ml label Apr 4, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants