[Service Bus] Batching receiver in receiveAndDelete mode loses messages when working with large number of messages #5757

Closed
ramya0820 opened this issue Oct 23, 2019 · 8 comments
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Service Bus

Comments

@ramya0820
Member

  • Package Name: @azure/service-bus
  • Package Version: 1.1.0

Describe the bug
The batching receiver in Service Bus does not work as expected in receiveAndDelete mode: when a large number of messages is requested, the number of messages returned to the caller differs significantly from the number removed from the entity.
For example, a call to receiveMessages(300) often returns only about 100 messages, while about 120 messages are deleted from the Service Bus entity.

To Reproduce

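    // Imports the snippet relies on: the SDK client types and chai's `should` helper
    import { ServiceBusClient, ReceiveMode } from "@azure/service-bus";
    import * as chai from "chai";
    const should = chai.should();

    // Placeholders: replace with values for your own namespace
    const connectionString = "<insert-connection-string>";
    const topicName = "<insert-topic-name>";
    const subscriptionName = "<insert-subscription-name>";

    const sbClient = ServiceBusClient.createFromConnectionString(connectionString);

    const subscriptionClient = sbClient.createSubscriptionClient(topicName, subscriptionName);
    const receiver = await subscriptionClient.createReceiver(ReceiveMode.receiveAndDelete);

    const numOfMessages = 300;
    try {
      // Request the whole batch in one call, then close the receiver
      const messages = await receiver.receiveMessages(numOfMessages);
      await receiver.close();

      should.equal(
        messages.length,
        numOfMessages,
        `Expected ${numOfMessages} but received ${messages.length}`
      );

      // Manually check message count on the Service Bus entity

      await subscriptionClient.close();
    } finally {
      // Runs even if the assertion above throws
      await sbClient.close();
    }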

Expected behavior
As in peekLock mode, the receiver should preferably return exactly the number of messages requested. In any case, the number of messages returned must equal the number of messages deleted from the Service Bus entity.
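
The manual count check can be reduced to simple arithmetic. Below is a minimal sketch, assuming the entity's message count is read by some other means before and after the receive call. `reportDrain` and `DrainReport` are hypothetical names for illustration and are not part of @azure/service-bus. The starting count of 500 is an arbitrary assumption; the other numbers are the approximate ones from the description above.

```typescript
// Hypothetical helper, not part of @azure/service-bus.
interface DrainReport {
  removedFromEntity: number; // messages the entity no longer holds
  lost: number;              // removed from the entity but never returned to the caller
  shortOfRequest: number;    // how far the batch fell short of the requested size
}

function reportDrain(
  countBefore: number,
  countAfter: number,
  requested: number,
  received: number
): DrainReport {
  const removedFromEntity = countBefore - countAfter;
  return {
    removedFromEntity,
    lost: removedFromEntity - received,
    shortOfRequest: requested - received,
  };
}

// 300 requested, about 100 received, about 120 removed from the entity.
// The starting count (500) is an assumed value.
const report = reportDrain(500, 380, 300, 100);
console.log(report);
```

With the expected behavior, `lost` is 0 for every call, whether or not the batch was filled completely.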

@ramya0820 ramya0820 added bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. Service Bus labels Oct 23, 2019
ramya0820 pushed a commit to ramya0820/azure-sdk-for-js that referenced this issue Nov 1, 2019
@ramya0820 ramya0820 self-assigned this Nov 5, 2019
@soates

soates commented Nov 19, 2019

When will this be merged and released? We are seeing this exact issue. I don't think the batch size matters: we see it with a batch size of 20.

@ramya-rao-a
Contributor

Thanks for reporting @soates

We are working on the fix at the moment and should have a patch update ready soon.
We will post an update here when ready.

Meanwhile, can you share which version of the service bus library you are currently using?

@ramya0820 ramya0820 added the customer-reported Issues that are reported by GitHub users external to the Azure organization. label Nov 20, 2019
@soates

soates commented Nov 25, 2019

Hey, we are using version 1.1.0. We really want to use this package in production, but we are running into many stability issues that we did not expect.

@ramya-rao-a
Contributor

@soates We should be able to get the patch update out by tomorrow.

Can you create a new issue for all the other issues you are finding with the package? We would like to help in any way we can and would appreciate any feedback you have to make the package better.

@ramya-rao-a
Contributor

@soates This bug has been fixed in the latest version (1.1.1) of the @azure/service-bus package.

Thanks for your patience.

@ramya-rao-a
Contributor

Some details on the root cause for this bug for future reference:

In PR #6265, we remove this timer when receiveMessages is used in ReceiveAndDelete mode which fixes the data loss issue.
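
One way to picture the failure mode, going by the mention of newMessageWaitTime later in this thread, is the toy model below. It is a sketch under assumptions, not the SDK's implementation: it assumes the receiver hands the batch back once the gap between two consecutive messages exceeds a wait time, while the service keeps sending (and, in receiveAndDelete mode, removing) messages up to the requested count. `simulateReceiveAndDelete` and all of its parameters are hypothetical names.

```typescript
// Toy model only; names and behavior are assumptions, not SDK code.
interface BatchOutcome {
  deliveredToCaller: number; // messages included in the resolved batch
  removedFromEntity: number; // messages the service sent and therefore deleted
}

function simulateReceiveAndDelete(
  arrivalTimesMs: number[],   // arrival times in ascending order
  maxMessageCount: number,    // size of the requested batch
  newMessageWaitTimeMs: number,
  useNewMessageTimer: boolean
): BatchOutcome {
  // Assumption: the service sends at most `maxMessageCount` messages,
  // and each one is removed from the entity as soon as it is sent.
  const sent = arrivalTimesMs.slice(0, maxMessageCount);

  let deliveredToCaller = 0;
  let batchResolved = false;
  let lastArrival: number | undefined;

  for (const t of sent) {
    // If the gap since the previous message exceeds the wait time,
    // the batch has already been handed back to the caller.
    if (
      useNewMessageTimer &&
      lastArrival !== undefined &&
      t - lastArrival > newMessageWaitTimeMs
    ) {
      batchResolved = true;
    }
    if (!batchResolved) {
      deliveredToCaller++;
      lastArrival = t;
    }
    // Messages arriving after the batch resolved are dropped in this model.
  }

  return { deliveredToCaller, removedFromEntity: sent.length };
}

const arrivals = [0, 100, 200, 1500, 1600];
const withTimer = simulateReceiveAndDelete(arrivals, 5, 1000, true);
const withoutTimer = simulateReceiveAndDelete(arrivals, 5, 1000, false);
console.log(withTimer, withoutTimer);
```

In this model, the run with the timer delivers 3 messages while 5 are removed from the entity, and the run without it delivers all 5. The model ignores the overall wait time of the receive call.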

@ramya0820
Member Author

So far we have only seen the issue in practice in receiveAndDelete mode, and it could be a service-side behavior or issue. But could the new message timer also be exceeded in peekLock mode, at least in theory? (The changes in #6601 got me thinking: if we don't handle the credit drain event, then in the case where newMessageWaitTime is exceeded we might end up receiving messages over the wire but not in code.)

For peekLock mode, this may increment the delivery count of messages that are received over the link but not by our code, which may be problematic across a large number of requests.
If so, we might want to apply the #5757 changes to peekLock mode as well?

The concern about locks being lost on messages seems to be expected service behavior, and may be something to address as well, most likely by the user. From the SDK side, we could do so by disallowing receivers in peekLock mode from holding onto messages for too long, or by using autoLockRenewal. (Sorry if another issue already addresses this; I recall discussing it earlier.)
cc: @ramya-rao-a @chradek
