Optimize DuplicateEventDetectionEventProcessor performance. #1247
Conversation
Using `WeakHashMap` to store objects opens up the possibility that the map grows to thousands of entries and causes performance loss when searching for a stored exception. This may happen either if references to exceptions are kept outside of the SDK code or if GC does not run for a long time. Changing the implementation to `ConcurrentLinkedDeque` gives the event processor control over how many exceptions are stored in memory.
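The bounded-buffer idea behind the change could be sketched roughly like this (a hypothetical sketch, not the SDK's actual class; `MAX_SIZE` and the names are illustrative):

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of a bounded de-duplication buffer. Unlike WeakHashMap,
// the size is capped explicitly, so lookup cost no longer depends on GC timing.
class BoundedThrowableBuffer {
    private static final int MAX_SIZE = 100; // illustrative cap, not an SDK value

    private final ConcurrentLinkedQueue<Throwable> captured = new ConcurrentLinkedQueue<>();

    /** Returns true if the throwable was already seen (a duplicate). */
    boolean isDuplicate(Throwable throwable) {
        if (captured.contains(throwable)) {
            return true;
        }
        captured.add(throwable);
        // Evict oldest entries so lookups stay O(MAX_SIZE) at worst.
        while (captured.size() > MAX_SIZE) {
            captured.poll();
        }
        return false;
    }
}
```

The trade-off versus `WeakHashMap` is that entries are evicted by buffer position rather than by garbage collection, which is exactly what the review below discusses.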
```diff
   @Override
   public SentryEvent process(final @NotNull SentryEvent event, final @Nullable Object hint) {
     final Throwable throwable = event.getOriginThrowable();
     if (throwable != null) {
-      if (capturedObjects.containsKey(throwable)
+      if (capturedObjects.contains(throwable)
```
Now that I read this, `allCauses` doesn't seem optimal.

Even with this refactor, if exceptions have many causes we'll be allocating new lists and running this thing for each `captureException`, and doing the lookup for each one of those. The 100 limit is multiplied by the number of inner exceptions.

Do we even need to allocate a list there? That `while` could be inlined here and the `contains` could be checked on each iteration.

And with this refactor, we'll build up a list of 100 entries and always keep them around, even long after the exception was collected. This means that for a long-running process we'll always look up the full buffer. It's still an improvement in case a service is throwing a ton of exceptions and for some reason doesn't run GC because it has terabytes of RAM, but it does sound suboptimal for other scenarios.
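The inlining suggested here could look roughly like the following (a sketch; the method and parameter names are illustrative, not the SDK's):

```java
import java.util.Collection;

// Hypothetical inlined cause-walk: follow the getCause() chain directly and
// check the buffer at each step, instead of first materializing an
// allCauses list and then searching it.
class CauseWalk {
    static boolean containsThrowableOrCause(Collection<Throwable> captured, Throwable throwable) {
        for (Throwable t = throwable; t != null; t = t.getCause()) {
            if (captured.contains(t)) {
                return true; // this throwable or one of its causes was already captured
            }
        }
        return false;
    }
}
```

This avoids one list allocation per captured exception, though each `contains` call still scans the buffer.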
Not sure if I understand. `allCauses` is executed on the exception that is being captured now. We are not looking into the causes of all previously captured exceptions.

> And with this refactor, we'll build up a list of 100 and always keep them around, even long after the exception was collected. This means for a long running process we'll always lookup on the full map.

That's true. Do you have a suggestion that would work in all scenarios? We could also store exceptions along with the time they were captured, and have `DuplicateEventProcessor`, while iterating to check for equality, also check whether an entry is older than X and, if so, remove it.
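The time-based eviction idea could be sketched like this (a hypothetical sketch; `MAX_AGE_MS`, the class name, and the explicit `nowMs` parameter are illustrative, not part of the SDK):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: store each throwable with its capture time and drop
// entries older than MAX_AGE_MS while performing the duplicate lookup.
class TimeBoundedBuffer {
    private static final long MAX_AGE_MS = 60_000; // illustrative cutoff

    private final Map<Throwable, Long> captured = new ConcurrentHashMap<>();

    /** Returns true if the throwable was captured within the last MAX_AGE_MS. */
    boolean isDuplicate(Throwable throwable, long nowMs) {
        // Evict stale entries during the lookup, as proposed above.
        captured.entrySet().removeIf(e -> nowMs - e.getValue() > MAX_AGE_MS);
        return captured.putIfAbsent(throwable, nowMs) != null;
    }
}
```

This bounds the buffer by age rather than by count, so a long-running but quiet process keeps the map nearly empty.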
```diff
@@ -6,6 +6,7 @@
 * Enchancement: Support @SentrySpan and @SentryTransaction on classes and interfaces. (#1243)
 * Enchancement: Do not serialize empty collections and maps (#1245)
 * Ref: Simplify RestTemplate instrumentation (#1246)
+* Ref: Optimize DuplicateEventDetectionEventProcessor performance (#1247).
```
should also go under the breaking change section
It's changing the algorithm that sits under the hood. I am not sure why this would be qualified as a breaking change.
It's a behavior change: there's a `bufferSize` now, and there is the possibility of reporting the same exception twice, right?
Unlikely — it would only happen if an application sends hundreds or thousands of errors per second. The buffer size is hidden from the user. I can add it to breaking changes if you like, but I don't believe we should bump a version because of this.
Codecov Report
```
@@            Coverage Diff             @@
##             main    #1247      +/-   ##
============================================
+ Coverage   74.91%   74.95%   +0.04%
- Complexity   1743     1748       +5
============================================
  Files         183      183
  Lines        6118     6130      +12
  Branches      609      610       +1
============================================
+ Hits         4583     4595      +12
  Misses       1256     1256
  Partials      279      279
```
Continue to review full report at Codecov.
```diff
 /** Deduplicates events containing throwable that has been already processed. */
 public final class DuplicateEventDetectionEventProcessor implements EventProcessor {
-  private final WeakHashMap<Throwable, Object> capturedObjects = new WeakHashMap<>();
+  private final ConcurrentLinkedQueue<Throwable> capturedObjects = new ConcurrentLinkedQueue<>();
```
I'm not comfortable with this, to be honest. Can we discuss it before going ahead?
Sure sure, I was waiting for your blessing anyway :-) We must change the `WeakHashMap` though (https://kdowbecki.github.io/WeakHashMap-is-not-thread-safe/ — this likely contributed to our issues). An option could be to synchronize the `WeakHashMap`, but that comes with drawbacks too.
There's no thread-safe `WeakHashMap` in the Java standard library, but perhaps we can include this dependency? https://github.com/raphw/weak-lock-free
Or, considering that deduplication is necessary only with the Spring + logging framework integration, perhaps we can move this event processor to the Spring module and use `ConcurrentReferenceHashMap` from Spring Framework. This would solve the problem, assuming that thread safety is the only issue.
We discussed this on a call and decided to go with @maciejwalkowiak's suggestion of using `Collections.synchronizedMap`.
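The approach agreed on here could be sketched like this (a minimal sketch; the wrapper class and method names are illustrative, not the SDK's):

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch: wrapping WeakHashMap with Collections.synchronizedMap keeps the
// GC-driven eviction behavior while making individual map operations thread
// safe. Note that iterating over the map still requires an external
// synchronized block on the wrapper.
class SynchronizedWeakBuffer {
    private final Map<Throwable, Object> capturedObjects =
        Collections.synchronizedMap(new WeakHashMap<>());

    /** Returns true if the throwable was already recorded. */
    boolean isDuplicate(Throwable throwable) {
        return capturedObjects.put(throwable, Boolean.TRUE) != null;
    }
}
```

Once the throwable itself becomes unreachable, the weak key lets GC drop the entry, so no explicit size cap is needed.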
@marandaneto take a look please at the recent changes. What do you think about this implementation?
```kotlin
val options = SentryOptions().apply {
    if (enableDeduplication != null) {
        this.setEnableDeduplication(enableDeduplication)
    }
}
```
You could init `SentryOptions` outside of `getSut` but still inside of `Fixture` and call the options setters directly in each test; I think that's the way we've been doing it so far. `getSut` only takes params if the Sut (e.g. `DuplicateEventDetectionEventProcessor`) itself requires those params.
Looks good to me. I left a comment about the testing; it's a nit though, feel free to ignore.
📜 Description

Optimize a potential `DuplicateEventDetectionEventProcessor` performance issue by changing how objects are stored, from `WeakHashMap` to `ConcurrentLinkedQueue`.

💡 Motivation and Context

Using `WeakHashMap` to store objects opens up the possibility that the map grows to thousands of entries and causes performance loss when searching for a stored exception. This may happen either if references to exceptions are kept outside of the SDK code or if GC does not run for a long time.

Changing the implementation to `ConcurrentLinkedQueue` gives the event processor control over how many exceptions are stored in memory.

💚 How did you test it?

Unit tests.

📝 Checklist