Messages from few partitions are getting delayed #18467

ponsakthi · 2021-02-05T03:43:09Z

We are using Microsoft.Azure.EventHubs.Processor version 4.2.0 on .NET Core 3.1, running as pods on an OpenShift cluster.
We have 8 pods listening to an Event Hub with 32 partitions.

Quite often we see a pod fail to receive messages from a few partitions while it continues to pull messages from the others. The delay is as high as 10 minutes at times. The problem resolves on its own, and we then receive all the delayed messages in a burst. But we have no visibility into why there is such a large delay on a few partitions. Is there a trace or log that can show us what is happening behind the scenes while polling the Event Hub?

Is this the same issue that was fixed as part of #12691 in the latest version, Microsoft.Azure.EventHubs.Processor 4.3.1?

Mohit-Chakraborty · 2021-02-05T05:20:44Z

Thank you for your feedback. Tagging and routing to the team best able to assist.

ghost · 2021-02-05T19:41:32Z

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @samuelkoppes.

Issue Details

Author:	ponsakthi
Assignees:	-
Labels:	`Client`, `Event Hubs`, `Service Attention`, `customer-reported`, `needs-team-attention`, `question`
Milestone:	-

ponsakthi · 2021-02-11T07:22:45Z

Update:
We upgraded to Microsoft.Azure.EventHubs.Processor 4.3.1 and we are seeing the same issue. Note that we are using EventProcessorHost (v4) and not EventProcessorClient (v5).

One more thing that we observed: say Pod A is holding the lease for partition 1. The lease ownership changes to Pod B if Pod A does not successfully renew the lease, but Pod B does not pull messages even though it now owns the lease. Later, Pod A reclaims the lease and successfully starts processing from partition 1 again. For the entire time that Pod B was holding the lease, no messages were processed, hence the latency. Is this a known issue in the v4 client, and has it been addressed in the v5 client?

serkantkaraca · 2021-04-23T04:01:35Z

I investigated a very similar issue with another customer on K8s, and it turned out to be a stuck downstream write.

Can you please add tracing at the entry ('in') and exit ('out') of your ProcessEventsAsync code? See if you have a blocking call that is causing the stuck behavior. Each 'in' should have a corresponding 'out' with a reasonable delay.
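One way to get the 'in'/'out' tracing described above is a small wrapper around the handler body. This is only a sketch: `PartitionTrace` and `RunAsync` are hypothetical names, not part of the Event Hubs SDK, and `Console.WriteLine` stands in for whatever logger the application already uses.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper (not part of the Event Hubs SDK): logs an "in" line before
// the handler body runs and an "out" line with the elapsed time once it finishes.
public static class PartitionTrace
{
    public static async Task RunAsync(string partitionId, Func<Task> body)
    {
        var stopwatch = Stopwatch.StartNew();
        Console.WriteLine($"{DateTime.UtcNow:O} in  partition={partitionId}");
        try
        {
            await body();
        }
        finally
        {
            // The finally block runs even when the body throws, so an "in" line
            // with no matching "out" line means the body has not returned yet.
            Console.WriteLine(
                $"{DateTime.UtcNow:O} out partition={partitionId} elapsedMs={stopwatch.ElapsedMilliseconds}");
        }
    }
}
```

Inside `ProcessEventsAsync`, the existing body could be wrapped as `PartitionTrace.RunAsync(context.PartitionId, () => HandleBatchAsync(messages))`, where `HandleBatchAsync` is a placeholder for the current processing logic. A partition whose log shows an 'in' with no 'out' for minutes would point to a blocked call inside the handler rather than to the receive path.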

ghost · 2021-06-22T20:00:26Z

Hi, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!

Mohit-Chakraborty added the `Client`, `Event Hubs`, and `needs-team-attention` labels Feb 5, 2021

ghost removed the `needs-triage` label Feb 5, 2021

jsquire added the `Service Attention` label Feb 5, 2021

ramya-rao-a added the `needs-author-feedback` label and removed the `needs-team-attention` label Jun 15, 2021

ghost added the `no-recent-activity` label Jun 22, 2021

ghost closed this as completed Jul 7, 2021

github-actions bot locked and limited conversation to collaborators Mar 28, 2023

This issue was closed.
