Event Hubs: Performance degradation between 5.6.0 and 5.7.0 #20841

Closed
Tracked by #18819
conniey opened this issue Apr 20, 2021 · 10 comments
Labels: amqp, customer-reported, Event Hubs, pillar-performance, pillar-reliability

conniey commented Apr 20, 2021

From #19698:

I replaced in my project
azure-messaging-eventhubs: 5.6.0 and azure-messaging-eventhubs-checkpointstore-blob: 1.5.0
with
azure-messaging-eventhubs: 5.7.0 and azure-messaging-eventhubs-checkpointstore-blob: 1.6.0
and unfortunately I see a performance decrease of up to 50%.

I also tested
azure-messaging-eventhubs: 5.6.0 and azure-messaging-eventhubs-checkpointstore-blob: 1.6.0
as well as
azure-messaging-eventhubs: 5.7.0 and azure-messaging-eventhubs-checkpointstore-blob: 1.5.0
with the same performance decrease.

I haven't looked into this in detail yet.
It would be nice if you could check this behavior on your side.
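
For reference, a sketch of how the reported combination could be declared in a Maven pom.xml (the com.azure group ID is the standard one for these artifacts; swap the versions to test the other combinations listed above):

<!-- Sketch: the 5.7.0 / 1.6.0 combination reported above. -->
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-messaging-eventhubs</artifactId>
    <version>5.7.0</version>
</dependency>
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-messaging-eventhubs-checkpointstore-blob</artifactId>
    <version>1.6.0</version>
</dependency>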

Related #20791

conniey added the Event Hubs, pillar-performance, pillar-reliability, amqp, and customer-reported labels on Apr 20, 2021

the-mod commented Apr 20, 2021

@conniey: sorry for the late reply, and thanks for creating a new issue.

I ran my test app with 3 replicas on AKS 1.19.9 with 3 nodes of Standard_D8as_v4.
Each JVM (zulu-openjdk 16) had 3 GB of heap space assigned.
The code was compiled for Java 13 and uses Spring Boot 2.3.2.RELEASE.
For each run I send 10 million messages to an event hub (let's call it the input event hub) that my app reads from.
After processing, my app writes the results round-robin to two event hubs (the output event hubs).
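
For illustration, a hypothetical sketch of that round-robin forwarding (the class name, event hub names, connection string handling, and the use of synchronous producer clients are assumptions of mine, not details from this issue):

import com.azure.messaging.eventhubs.EventData;
import com.azure.messaging.eventhubs.EventHubClientBuilder;
import com.azure.messaging.eventhubs.EventHubProducerClient;
import java.util.Collections;
import java.util.concurrent.atomic.AtomicLong;

public class RoundRobinForwarder {
    private final EventHubProducerClient[] outputs;
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinForwarder(String connectionString) {
        // Two producer clients, one per output event hub (names are placeholders).
        this.outputs = new EventHubProducerClient[] {
            new EventHubClientBuilder()
                .connectionString(connectionString, "output-eventhub-1")
                .buildProducerClient(),
            new EventHubClientBuilder()
                .connectionString(connectionString, "output-eventhub-2")
                .buildProducerClient()
        };
    }

    // Sends one processed result to the next output event hub in round-robin order.
    public void forward(EventData result) {
        int index = (int) (counter.getAndIncrement() % outputs.length);
        outputs[index].send(Collections.singletonList(result));
    }
}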

  • dark blue line: incoming messages of the input event hub
  • turquoise line: outgoing messages of the input event hub; these are the messages read by my app
  • red/blue lines: incoming messages of the output event hubs; these are the messages my app is writing
  • the resolution of the chart is 1 minute

Here are my results:
[chart: benchmarking results]

We can see that the SDK combinations mentioned above seem to hit a limit of about 1 million messages per minute when reading/receiving.
Let me know if I can provide more information.
Regards

mikeharder (Member) commented:

@the-mod: We believe this performance regression should be fixed in [email protected] and [email protected]. Could you please try upgrading to these versions and let us know if it has fixed your issue?


the-mod commented Aug 27, 2021

Hi @mikeharder, thanks for pinging me.

I tried the versions you mentioned and saw reactor.core.Exceptions$OverflowException exceptions being thrown:

reactor.core.Exceptions$OverflowException:
    at reactor.core.Exceptions.failWithOverflow (Exceptions.java:220)
    at reactor.core.publisher.FluxWindowTimeout$WindowTimeoutSubscriber.onNext (FluxWindowTimeout.java:241)
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext (FluxPeek.java:200)
    at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext (FluxDoFinally.java:130)
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext (FluxPeek.java:200)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext (FluxMap.java:120)
    at com.azure.messaging.eventhubs.implementation.AmqpReceiveLinkProcessor.drainQueue (AmqpReceiveLinkProcessor.java:486)
    at com.azure.messaging.eventhubs.implementation.AmqpReceiveLinkProcessor.drain (AmqpReceiveLinkProcessor.java:447)
    at com.azure.messaging.eventhubs.implementation.AmqpReceiveLinkProcessor.lambda$onNext$8 (AmqpReceiveLinkProcessor.java:261)
    at reactor.core.publisher.LambdaSubscriber.onNext (LambdaSubscriber.java:160)
    at reactor.core.publisher.FluxOnBackpressureBufferStrategy$BackpressureBufferDropOldestSubscriber.innerDrain (FluxOnBackpressureBufferStrategy.java:270)
    at reactor.core.publisher.FluxOnBackpressureBufferStrategy$BackpressureBufferDropOldestSubscriber.drain (FluxOnBackpressureBufferStrategy.java:234)
    at reactor.core.publisher.FluxOnBackpressureBufferStrategy$BackpressureBufferDropOldestSubscriber.onNext (FluxOnBackpressureBufferStrategy.java:199)
    at reactor.core.publisher.FluxFlatMap$FlatMapMain.tryEmit (FluxFlatMap.java:543)
    at reactor.core.publisher.FluxFlatMap$FlatMapInner.onNext (FluxFlatMap.java:984)
    at reactor.core.publisher.MonoCreate$DefaultMonoSink.success (MonoCreate.java:160)
    at com.azure.core.amqp.implementation.ReactorReceiver.lambda$new$0 (ReactorReceiver.java:78)
    at com.azure.core.amqp.implementation.handler.DispatchHandler.onTimerTask (DispatchHandler.java:34)
    at com.azure.core.amqp.implementation.ReactorDispatcher$WorkScheduler.run (ReactorDispatcher.java:184)
    at org.apache.qpid.proton.reactor.impl.SelectableImpl.readable (SelectableImpl.java:118)
    at org.apache.qpid.proton.reactor.impl.IOHandler.handleQuiesced (IOHandler.java:61)
    at org.apache.qpid.proton.reactor.impl.IOHandler.onUnhandled (IOHandler.java:390)
    at com.azure.core.amqp.implementation.handler.CustomIOHandler.onUnhandled (CustomIOHandler.java:41)
    at org.apache.qpid.proton.engine.BaseHandler.onReactorQuiesced (BaseHandler.java:87)
    at org.apache.qpid.proton.engine.BaseHandler.handle (BaseHandler.java:206)
    at org.apache.qpid.proton.engine.impl.EventImpl.dispatch (EventImpl.java:108)
    at org.apache.qpid.proton.engine.impl.EventImpl.delegate (EventImpl.java:129)
    at org.apache.qpid.proton.engine.impl.EventImpl.dispatch (EventImpl.java:114)
    at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch (ReactorImpl.java:324)
    at org.apache.qpid.proton.reactor.impl.ReactorImpl.process (ReactorImpl.java:291)
    at com.azure.core.amqp.implementation.ReactorExecutor.run (ReactorExecutor.java:86)
    at reactor.core.scheduler.SchedulerTask.call (SchedulerTask.java:68)
    at reactor.core.scheduler.SchedulerTask.call (SchedulerTask.java:28)
    at java.util.concurrent.FutureTask.run (FutureTask.java:264)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run (ScheduledThreadPoolExecutor.java:304)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1130)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:630)
    at java.lang.Thread.run (Thread.java:831)

mikeharder (Member) commented:

The OverflowException in [email protected] is a known issue and we are working on a fix.

As a workaround, I believe the EventProcessorClientBuilder.processEvent() and EventProcessorClientBuilder.processEventBatch() overloads that do not accept a maxWaitTime parameter should avoid the OverflowException, if those would work in your application.
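
A minimal sketch of that workaround, assuming a blob checkpoint store as in the versions above; the connection strings, entity names, and the batch size of 100 are placeholders, not values from this thread:

import com.azure.messaging.eventhubs.EventProcessorClient;
import com.azure.messaging.eventhubs.EventProcessorClientBuilder;
import com.azure.messaging.eventhubs.checkpointstore.blob.BlobCheckpointStore;
import com.azure.storage.blob.BlobContainerAsyncClient;
import com.azure.storage.blob.BlobContainerClientBuilder;

public class BatchWithoutMaxWaitTime {
    public static void main(String[] args) {
        // Blob container backing the checkpoint store (placeholder connection string and container name).
        BlobContainerAsyncClient blobClient = new BlobContainerClientBuilder()
            .connectionString(System.getenv("STORAGE_CONNECTION_STRING"))
            .containerName("checkpoints")
            .buildAsyncClient();

        EventProcessorClient processor = new EventProcessorClientBuilder()
            .connectionString(System.getenv("EVENTHUB_CONNECTION_STRING"), "input-eventhub") // placeholders
            .consumerGroup("$Default")
            .checkpointStore(new BlobCheckpointStore(blobClient))
            // Overload without maxWaitTime: a batch is delivered once maxBatchSize events are available.
            .processEventBatch(batchContext -> {
                batchContext.getEvents().forEach(event -> {
                    // process the event here
                });
                batchContext.updateCheckpoint();
            }, 100)
            .processError(errorContext ->
                System.err.println("Error on partition "
                    + errorContext.getPartitionContext().getPartitionId()
                    + ": " + errorContext.getThrowable()))
            .buildEventProcessorClient();

        processor.start();
    }
}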


the-mod commented Sep 8, 2021

@mikeharder

I tried processEventBatch() without the maxWaitTime parameter as suggested.
It ran stably at a high level of performance.

I only got some errors in the log without any stack trace:

com.azure.core.amqp.implementation.ReactorDispatcher - ReactorDispatcher instance is closed.


mikeharder commented Sep 8, 2021

@the-mod: I see existing issues with the error "ReactorDispatcher instance is closed", but they are all closed: #19698, #19753

@conniey, @anuchandy: Do you know more about this issue and whether it has been fixed?


conniey commented Sep 10, 2021

This may be a false warning (one we need to suppress), because we emit that message any time work is scheduled on a closed reactor. It's normal for reactor instances to be closed while we are recreating connections, etc.

Does this message impact your application? Do you see it not recovering?


the-mod commented Sep 13, 2021

@conniey The app was running fine; I only saw this popping up in the logs.


conniey commented Sep 13, 2021

Thanks for confirming! @anuchandy and I were discussing how useful this log message is, since it seems to add noise rather than point to a root cause.


conniey commented Sep 21, 2021

It sounds like we were able to solve the issue. We are still looking at resolving the overflow exception via #23950. Please feel free to open another issue if it crops up again. Thanks!

@conniey conniey closed this as completed Sep 21, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023