Resolve a class of ConcurrentModificationException during bulk requests #3094
…esolve ConcurrentModificationException on bulk request Signed-off-by: Craig Perkins <[email protected]>
Codecov Report
@@             Coverage Diff              @@
##               main    #3094      +/-   ##
============================================
+ Coverage     62.40%   62.42%   +0.01%
+ Complexity     3352     3350       -2
============================================
  Files           254      254
  Lines         19749    19749
  Branches       3334     3334
============================================
+ Hits          12325    12328       +3
+ Misses         5792     5788       -4
- Partials       1632     1633       +1
FYI here's the python script I've been using for testing:
Steps for reproducing the issue:
Repeat the steps after this change and it's stable. I'm not sure whether a change like this is easily unit testable.
This will definitely resolve the issue in PrivilegesEvaluator, but the linked issue also shows the exception thrown from another place, which I haven't been able to reproduce. The OP in the linked issue shows it being thrown from here: https://github.com/opensearch-project/security/blob/main/src/main/java/org/opensearch/security/securityconf/ConfigModelV7.java#L1302
I think this change is good, cleaner interfaces are better, but I don't think this fixes the underlying issue. Let's merge this, but I don't think we should say the root cause has been addressed. What do you think about addressing that issue separately?
Signed-off-by: Craig Perkins <[email protected]>
3cac3e4
Before we start using more advanced concurrency changes we should have an automated test that can verify the issue.
Can we write a unit/component test that creates a User and then manipulates it on many threads at once to reproduce? I don't think we need the full service reproduction of the issue to know whether it's fixed or not.
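A minimal sketch of what such a test harness could look like, under stated assumptions: `DemoUser` and `UserConcurrencyHarness` are hypothetical names, not classes from the plugin, and only the role set is modelled. Half of the threads add roles while the other half iterate over them, and the harness counts how many tasks failed. It is shown here against a concurrent set so that it passes; swapping in a plain `HashSet` may surface the exception, but a race like this is not guaranteed to reproduce on any given run.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the plugin's User class; only the role set is modelled.
final class DemoUser {
    // Swap in `new java.util.HashSet<>()` to try to reproduce the race (not guaranteed to fail).
    private final Set<String> securityRoles = ConcurrentHashMap.newKeySet();

    void addSecurityRoles(Collection<String> roles) {
        if (roles != null) {
            securityRoles.addAll(roles);
        }
    }

    Set<String> getSecurityRoles() {
        return securityRoles;
    }
}

final class UserConcurrencyHarness {
    /** Runs writers (even ids) and readers (odd ids) against one user; returns the number of failed tasks. */
    static int run(DemoUser user, int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<?>> futures = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            futures.add(pool.submit(() -> {
                start.await(); // release all threads at once to maximise contention
                for (int i = 0; i < iterations; i++) {
                    if (id % 2 == 0) {
                        user.addSecurityRoles(List.of("role-" + id + "-" + i));
                    } else {
                        int seen = 0;
                        // Iteration is where a fail-fast collection would throw.
                        for (String role : user.getSecurityRoles()) {
                            seen += role.length();
                        }
                    }
                }
                return null;
            }));
        }
        start.countDown();
        int failures = 0;
        for (Future<?> f : futures) {
            try {
                f.get(30, TimeUnit.SECONDS);
            } catch (Exception e) { // an ExecutionException would wrap ConcurrentModificationException
                failures++;
            }
        }
        pool.shutdownNow();
        return failures;
    }
}
```

A test would call `UserConcurrencyHarness.run(new DemoUser(), 8, 200)` and assert that the returned failure count is zero.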
    public synchronized final void addSecurityRoles(final Collection<String> securityRoles) {
        if (securityRoles != null && this.securityRoles != null) {
            this.securityRoles.addAll(securityRoles);
        }
    }

-   public final Set<String> getSecurityRoles() {
+   public synchronized final Set<String> getSecurityRoles() {
If you want to protect concurrent access to these methods you need a single locking mechanism. The `synchronized` keywords here guard only the bodies of the methods they are applied to, not every use of the collection.
I prefer to make a field object to handle each lock explicitly, e.g. `private final Object securityRolesLock = new Object();` then use `synchronized (securityRolesLock) { ... }` in all places where the securityRoles collection is interacted with.
IMO it isn't clear whether this will address the underlying issue, because concurrent modification can still happen when callers iterate over the collection outside of these methods.
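The explicit-lock suggestion could be sketched roughly as follows. `LockedRoles` is a hypothetical class used only for illustration, not code from this PR. The getter returns an immutable snapshot taken under the lock, which is one way (an assumption on my part, not something proposed in this thread) to keep callers from iterating the live set outside the lock.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class; it only models the securityRoles field discussed above.
final class LockedRoles {
    private final Object securityRolesLock = new Object();
    private final Set<String> securityRoles = new HashSet<>();

    void addSecurityRoles(Collection<String> roles) {
        if (roles == null) {
            return;
        }
        synchronized (securityRolesLock) {
            securityRoles.addAll(roles);
        }
    }

    /** Returns an immutable snapshot so callers never iterate the live set. */
    Set<String> getSecurityRoles() {
        synchronized (securityRolesLock) {
            return Collections.unmodifiableSet(new HashSet<>(securityRoles));
        }
    }
}
```

The trade-off is a copy on every read, which may matter on a hot path such as bulk requests.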
Right now I'm looking into splitting `getSecurityRoles` and `getMappedRoles` because the current logic intermingles the two, which causes the issue. `getSecurityRoles` is called within `mapRoles`, which can be problematic because after `mapRoles` has finished executing, `addSecurityRoles` is called within `PrivilegesEvaluator`.
What I'd like to see is one thread doing the mapping of the roles, with other threads reading its result if it has already been computed. I'll try different ways to write an automated test for this change.
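The "one thread maps, the others read the result" idea could be sketched as below. This is an illustration under assumptions: `MappedRolesCache` is a hypothetical name, the cache is keyed by user name, and the mapping function is passed in because the real `mapRoles` signature is not shown in this thread. It relies on `ConcurrentHashMap.computeIfAbsent`, which applies the mapping function at most once per absent key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache; the real mapRoles signature and inputs are assumptions here.
final class MappedRolesCache {
    private final ConcurrentMap<String, Set<String>> mappedRolesByUser = new ConcurrentHashMap<>();
    private final Function<String, Set<String>> mapRoles;
    // Exposed only so a test can check how often the mapping actually ran.
    final AtomicInteger mappingRuns = new AtomicInteger();

    MappedRolesCache(Function<String, Set<String>> mapRoles) {
        this.mapRoles = mapRoles;
    }

    Set<String> getMappedRoles(String userName) {
        // The function runs at most once per key; concurrent callers wait, then read the stored result.
        return mappedRolesByUser.computeIfAbsent(userName, name -> {
            mappingRuns.incrementAndGet();
            return Set.copyOf(mapRoles.apply(name)); // immutable copy, safe to iterate from any thread
        });
    }
}
```

Because the stored set is immutable, readers never observe a set that is still being populated. A real implementation would also need some invalidation when role mappings change, which this sketch leaves out.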
This PR would make this issue occur much less frequently, but obviously it would be best to ensure a situation like this can never happen.
Signed-off-by: Craig Perkins <[email protected]>
@peternied @willyborankin @RyanL1997 I updated this PR to remove the |
…esolve ConcurrentModificationException on bulk request (#3094) ### Description This PR changes the order of `setUserInfoInThreadContext` and `addSecurityRoles` to ensure that `addSecurityRoles` is called before `setUserInfoInThreadContext`. The ConcurrentModificationException would arise in scenarios where the first thread was done with `setUserInfoInThreadContext` and had moved onto `addSecurityRoles` to add the mapped roles into the user's set of security roles. On the first call to `addSecurityRoles` it may be populating the set with data, any subsequent call would be a noop in other threads. If simultaneously there is another thread executing `setUserInfoInThreadContext` while the first thread is in `addSecurityRoles` then a ConcurrentModificationException is thrown inside the `Sets.union(...)` call. By calling `addSecurityRoles` before `setUserInfoInThreadContext`, it can be guaranteed that no ConcurrentModificationException could be thrown because the user's security roles will already be set and any thread that attempts another call will be a noop. * Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation) Bug fix ### Issues Resolved #2263 ### Testing Tested by running a python script to bulk insert 100k documents and using tmux to run the script in multiple shells at once. Before the change the bug is reproducible regularly, but after the change the bug cannot be reproduced. ### Check List - [ ] New functionality includes testing - [ ] New functionality has been documented - [ ] Commits are signed per the DCO using --signoff By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin). 
--------- Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> (cherry picked from commit cd699bb) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…urityRoles to resolve ConcurrentModificationException on bulk request (#3113) Backport cd699bb from #3094. Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: Craig Perkins <[email protected]>
…urityRoles to resolve ConcurrentModificationException on bulk request (#3173) Backport cd699bb from #3094. Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
The backport to
To backport manually, run these commands in your terminal:
# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/security/backport-1.3 1.3
# Navigate to the new working tree
pushd ../.worktrees/security/backport-1.3
# Create a new branch
git switch --create backport/backport-3094-to-1.3
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 cd699bb7d3a07b8919ef2fb5e8fb4ccd2e622acb
# Push it to GitHub
git push --set-upstream origin backport/backport-3094-to-1.3
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/security/backport-1.3
Then, create a pull request where the
@cwperks please make sure to backport this change to 1.x as well
…urityRoles to resolve ConcurrentModificationException on bulk request (#3094) (#3193) Backport #3094 to 1.3 --------- Signed-off-by: Craig Perkins <[email protected]>
Description
This PR changes the order of `setUserInfoInThreadContext` and `addSecurityRoles` to ensure that `addSecurityRoles` is called before `setUserInfoInThreadContext`. The ConcurrentModificationException would arise in scenarios where the first thread was done with `setUserInfoInThreadContext` and had moved on to `addSecurityRoles` to add the mapped roles into the user's set of security roles. The first call to `addSecurityRoles` may populate the set with data; any subsequent call in other threads would be a no-op.
If another thread is simultaneously executing `setUserInfoInThreadContext` while the first thread is in `addSecurityRoles`, a ConcurrentModificationException is thrown inside the `Sets.union(...)` call.
By calling `addSecurityRoles` before `setUserInfoInThreadContext`, it can be guaranteed that no ConcurrentModificationException is thrown, because the user's security roles will already be set and any further call from another thread will be a no-op.
Category: Bug fix
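The reason a repeated call behaves as a no-op can be illustrated with a small single-threaded sketch. It is not the plugin's code: `NoOpAddDemo` is a hypothetical class, and it assumes the roles are held in a plain `HashSet`. Adding elements that are already present does not structurally modify a `HashSet`, so an iterator that is already in progress (as it would be inside a union view of the set) does not fail, whereas adding a new element does trigger the fail-fast check.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class NoOpAddDemo {
    /** Starts iterating, calls addAll(extra) after the first element, and reports whether iteration threw. */
    static boolean throwsCme(Set<String> roles, List<String> extra) {
        try {
            Iterator<String> it = roles.iterator();
            it.next();
            roles.addAll(extra); // stands in for addSecurityRoles running mid-iteration
            while (it.hasNext()) {
                it.next(); // the fail-fast check happens here
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // New element added mid-iteration: prints true
        System.out.println(throwsCme(new HashSet<>(List.of("a", "b")), List.of("c")));
        // Only already-present elements added: prints false
        System.out.println(throwsCme(new HashSet<>(List.of("a", "b")), List.of("a", "b")));
    }
}
```

This only demonstrates the fail-fast mechanism. A `HashSet` shared across threads without synchronization is still not thread-safe in the general sense, which is the concern the reviewers raise above.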
Issues Resolved
#2263
Testing
Tested by running a python script to bulk insert 100k documents and using tmux to run the script in multiple shells at once. Before the change the bug is reproducible regularly, but after the change the bug cannot be reproduced.
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.