Ability to cancel receive operations #19955

Merged
merged 7 commits into from
Mar 30, 2021

Conversation

danielmarbach
Contributor

Alternative to #19888

Closes #19306

All SDK Contribution checklist:

This checklist is used to make sure that common guidelines for a pull request are followed.

  • Please open PR in Draft mode if it is:
    • Work in progress or not intended to be merged.
    • Encountering multiple pipeline failures and working on fixes.
  • If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.
  • I have read the contribution guidelines.
  • The pull request does not introduce breaking changes.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

SDK Generation Guidelines

  • The generate.cmd file for the SDK has been updated with the version of AutoRest, as well as the commitid of your swagger spec or link to the swagger spec, used to generate the code. (Track 2 only)
  • The *.csproj and AssemblyInfo.cs files have been updated with the new version of the SDK. Please double check nuget.org current release version.

Additional management plane SDK specific contribution checklist:

Note: Only applies to Microsoft.Azure.Management.[RP] or Azure.ResourceManager.[RP]

  • Include updated management metadata.
  • Update AzureRP.props to add/remove version info to maintain up to date API versions.

Management plane SDK Troubleshooting

  • If this is the very first SDK for a service and you are adding new service folders directly under /SDK, please add a new service label and/or contact the assigned reviewer.
  • If the check fails at the Verify Code Generation step, please ensure:
    • Do not modify any code in generated folders.
    • Do not selectively include/remove generated files in the PR.
    • Do use generate.ps1/cmd to generate this PR instead of calling autorest directly.
      Please pay attention to the @microsoft.csharp version output after running generate.ps1. If it is lower than the current released version (2.3.82), please run it again, as it should pull down the latest version.

Old outstanding PR cleanup

Please note:
If a PR (including drafts) has been open for more than 60 days and there has been no response to our queries or follow-ups, it will be closed to maintain a concise list for our reviewers.

@ghost added the Service Bus and customer-reported (issues that are reported by GitHub users external to the Azure organization) labels Mar 30, 2021

ghost commented Mar 30, 2021

Thank you for your contribution @danielmarbach! We will review the pull request and get back to you soon.

@ghost added the Community Contribution (community members are working on the issue) label Mar 30, 2021

TimeSpan.FromMilliseconds(20),
maxWaitTime ?? timeout,
callback,
(link, receiveMessagesCompletionSource));
Contributor Author

This comes at the cost of boxing, yet I still think it is better than the alternative described above.

Member

Agreed.

}, receiveMessagesCompletionSource, useSynchronizationContext: false);

// in case BeginReceiveRemoteMessages throws exception will be materialized on the synchronous path
_ = Task.Factory
Contributor Author

At first I had an approach following this pattern:

var receiveTask = Receive(...);
var completed = await Task.WhenAny(receiveTask, tcs.Task).ConfigureAwait(false);
if (completed == tcs.Task)
    await tcs.Task.ConfigureAwait(false); // the cancellation won the race, so this await throws

await receiveTask.ConfigureAwait(false);

Contributor Author

@danielmarbach danielmarbach Mar 30, 2021

But then I figured that with this approach we always pay the price of the WhenAny array allocation, plus the additional conditions on the path, including the state machinery. So I ended up always using the TCS to materialize either the result, the exceptions from the End call, or the cancellation.

Contributor Author

Let me know if you prefer the WhenAny approach. While it involves more state machinery and allocates the WhenAny array, it wouldn't require us to box the value tuple and might make the code slightly more straightforward to read, at the cost of more allocations.

Member

I like your implementation better; I sketched out the WhenAny approach in the issue, but had no idea that Register was a thing. While this does introduce a bit more density in the FromAsync machinery, it already had some complexity. With your current approach, I find the flow easier to follow than the "which task completed" juggling that WhenAny would require.
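
For readers following the thread, the trade-off discussed above can be illustrated with a minimal sketch of the completion-source pattern. This is not the code from the PR: the class name `CancellableWaitSketch`, the method `RunAsync`, and the `start` delegate are hypothetical stand-ins for the SDK's Begin/End receive plumbing. The only real APIs used are `TaskCompletionSource<T>` and `CancellationToken.Register`. One completion source materializes the result, an exception, or the cancellation, so no `Task.WhenAny` array is needed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableWaitSketch
{
    // Hypothetical helper: 'start' begins a callback-based operation and receives
    // two delegates it can use to report success or failure.
    public static async Task<T> RunAsync<T>(
        Action<Action<T>, Action<Exception>> start,
        CancellationToken cancellationToken)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);

        // The completion source is passed as state, so the callback captures nothing.
        // (Here the state is a reference type; the PR passes a value tuple, which is
        // where the boxing mentioned in the thread comes from.)
        using CancellationTokenRegistration registration = cancellationToken.Register(
            state => ((TaskCompletionSource<T>)state).TrySetCanceled(),
            tcs,
            useSynchronizationContext: false);

        try
        {
            start(result => tcs.TrySetResult(result), ex => tcs.TrySetException(ex));
        }
        catch (Exception ex)
        {
            // A synchronous failure while starting is surfaced through the same task.
            tcs.TrySetException(ex);
        }

        // Whichever of result, exception, or cancellation arrives first wins.
        return await tcs.Task.ConfigureAwait(false);
    }
}
```

Note that in this sketch cancellation only stops the caller from waiting; whatever `start` kicked off keeps running until it completes on its own.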


await processor.StartProcessingAsync();
await tcs.Task;
await Task.Delay(10000); // wait long enough to be hanging in the next receive on the empty queue
Contributor Author

@jsquire I tested the other proposed approach and unfortunately those tests passed even when I reverted my cancellation changes in the receive method. They passed because it wasn't guaranteed that the code was hanging in another receive attempt

@@ -99,6 +99,43 @@ await using (var scope = await ServiceBusScope.CreateWithQueue(enablePartitionin
}
}

[Test]
public async Task ReceiveMessagesWhenQueueEmpty()
Member

I think we also want to prove that cancelling won't increment the delivery count for any messages that were already in the AMQP library's local buffer. This may be hard to do, but maybe we can try sending a message just before we cancel?

Member

I can also add these tests in a follow up PR.

using var cancellationTokenSource = new CancellationTokenSource(TimeSpan.FromSeconds(3));

var start = DateTime.UtcNow;
await processor.StopProcessingAsync(cancellationTokenSource.Token);
Member

@JoshLove-msft JoshLove-msft Mar 30, 2021

Can we also have a test that doesn't pass a token here (or update an existing test to assert the time elapsed)? It should still stop processing pretty quickly (or at least as quickly as the user handler takes to complete). We would also want a test that verifies that stopping still allows in-flight user handlers to complete.

Member

I can also add these tests in a follow up PR.

@JoshLove-msft
Member

Addresses #17734

receivedMessages.Add(AmqpMessageConverter.AmqpMessageToSBMessage(message));
message.Dispose();
}

return receivedMessages;
}
catch (OperationCanceledException)
Member

Should we check cancellationToken.IsCancellationRequested? And also possibly restrict the catch to TaskCanceledException?

Member

IIRC, we can't restrict there because the completion source will throw the OperationCanceledException. We try to normalize everything to TaskCanceledException around the SDK so that is what callers see. The Service Bus and Event Hubs troubleshooting guides attribute a specific meaning to OperationCanceled that indicates service behavior and we wanted to avoid confusion.

Contributor Author

OperationCanceledException is the base class of TaskCanceledException, so I figured the `when` conditions can be removed.
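
The normalization described in this thread can be sketched roughly as follows. The class and method names (`CancellationNormalizationSketch`, `NormalizeAsync`) are hypothetical and this is not the SDK's implementation; it only illustrates why a single `catch (OperationCanceledException)` clause without a `when` filter is enough to give callers one consistent exception type.

```csharp
using System;
using System.Threading.Tasks;

public static class CancellationNormalizationSketch
{
    // Hypothetical wrapper: runs an operation and surfaces any cancellation
    // to the caller as TaskCanceledException.
    public static async Task<T> NormalizeAsync<T>(Func<Task<T>> operation)
    {
        try
        {
            return await operation().ConfigureAwait(false);
        }
        catch (OperationCanceledException)
        {
            // TaskCanceledException derives from OperationCanceledException, so this
            // clause catches both; callers always observe the derived type.
            throw new TaskCanceledException();
        }
    }
}
```

A real implementation would likely preserve the original exception details; the sketch drops them for brevity.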

@JoshLove-msft
Member

/azp run net - servicebus - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@JoshLove-msft
Member

This is awesome! Thanks @danielmarbach

Member

@jsquire jsquire left a comment

Late to the party, but LGTM.

@danielmarbach danielmarbach deleted the cancellation-take2 branch March 31, 2021 14:41
},
(link, maxMessages, maxWaitTime, timeout),
default
var receiveMessagesCompletionSource =
Member

Unfortunately, I think we may need to revert this. @jsquire pointed out that with this approach we will just be leaving receive operations hanging on the AMQP link, which will cause a backup. I confirmed this with a test that attempts to receive after a previous cancel. I think we will need to either limit the scope of this change to just StopProcessing calls, because in that case it is okay that receive operations are blocked, or better yet, see if we can contribute Cancellation token support to the AMQP library.
/cc @xinchen10

Member

        public async Task CancellingDoesNotBlockSubsequentReceives(bool prefetch)
        {
            await using (var scope = await ServiceBusScope.CreateWithQueue(enablePartitioning: false, enableSession: false))
            {
                await using var client = CreateClient();

                ServiceBusSender sender = client.CreateSender(scope.QueueName);
                var receiver = client.CreateReceiver(scope.QueueName, new ServiceBusReceiverOptions { PrefetchCount = prefetch ? 10 : 0 });

                using var cancellationTokenSource = new CancellationTokenSource(2000);
                var start = DateTime.UtcNow;

                Assert.That(
                    async () => await receiver.ReceiveMessageAsync(TimeSpan.FromSeconds(60), cancellationToken: cancellationTokenSource.Token),
                    Throws.InstanceOf<TaskCanceledException>());

                await sender.SendMessageAsync(GetMessage());
                var msg = await receiver.ReceiveMessageAsync();
                Assert.AreEqual(1, msg.DeliveryCount);
                var end = DateTime.UtcNow;
                Assert.NotNull(msg);
                Assert.Less(end - start, TimeSpan.FromSeconds(5));
            }
        }

Member

The above test fails on the second receive call as we are blocked on the cancelled receive.

Contributor Author

That's why we originally closed the link in the other PR, but that also has other drawbacks. I think even StopProcessing can be problematic because the processor is designed to be restarted, right?

Member

@JoshLove-msft JoshLove-msft Mar 31, 2021

Yes, the processor can be restarted - actually StopProcessing just stops receiving rather than closing any links. Close/Dispose would close links. I really think the best way forward is to try to get this integrated into the AMQP lib, so that we can actually end the operations early instead of ignoring them.

Member

@JoshLove-msft JoshLove-msft Mar 31, 2021

I would vote for option 2.

Member

@JoshLove-msft JoshLove-msft Apr 1, 2021

For 1, I don't think cancelling pending receive calls when totalCredit is 0 is sufficient because there could be concurrent receives occurring on the same link. Even with option 2, we wouldn't be able to correlate receive calls with ReceiveAsyncResults. IMO the cancellation token provides the best user experience.

Contributor Author

I really have a hard time understanding all the pushback against CancellationToken. Cooperative cancellation is the de facto standard in .NET for I/O-bound operations. It is present almost everywhere in modern async-enabled APIs, in the runtime as well as across the ecosystem. Even the Azure SDK guidance, to which a lot of people have contributed and for which intense user studies have been done, adheres to those principles because this is how the ecosystem works. So why so much pushback?

Member

@danielmarbach the pushback is not against cancellation tokens. It's more about supporting the shutdown scenario in a better way that also makes sense for AMQP (I admit that I am influenced more by other AMQP implementations, especially the Apache Qpid products and their JMS implementation). Your PR to the AMQP library (thank you for that) adds a cancellation token to the receive method only. It gives the feeling that the library API is created on an as-needed basis and that this was done just to make the shutdown scenario work. To properly support cancellation tokens, we will also need to look at the other Task-based APIs.

Contributor Author

Fair enough. Unfortunately, I'm only a community contributor without corporate backing, so the only thing I could commit to in my precious spare time was exactly that. Your comment puts things in a different light, and it sounds more like the door is open rather than closed, which is what I had, perhaps wrongly, read into the conversations. I appreciate you taking the time to clarify that.

If there were some way to openly share plans, ideas, and directions on the repo, including things that could be done, I'm happy to contribute a few things when I have time and when it fits the small area of the AMQP lib that I know. For me, it boils down to having this project under some sort of active governance and communication plan to see where things are heading (or not).

azure-sdk pushed a commit to azure-sdk/azure-sdk-for-net that referenced this pull request Aug 29, 2022
[SQL] Bump ManagedDatabaseRestoreDetails and ManagedDatabase version in v5 tag (Azure#19955)

* Bump managedDatabaseRestoreDetails version

* bump managed databases version as well
Labels
Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. Service Bus
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate Force-Closing AMQP Links for Cancellation
4 participants