Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose current retry count in the context of the function (Issue #2595) #2658

Merged
merged 10 commits into from
Jan 21, 2021

Conversation

alrod
Copy link
Member

@alrod alrod commented Jan 9, 2021

Related to #2595
Resolves #2661

@alrod alrod requested review from pragnagopa and mathewc January 9, 2021 06:50
@pragnagopa pragnagopa requested a review from jeffhollan January 11, 2021 18:26
@pragnagopa
Copy link
Member

pragnagopa commented Jan 11, 2021

@alrod - Please open a separate issue in host repo to track updating RPC layer to include RetryContext and link to this PR

/// <summary>
/// Gets or sets value for retry count.
/// </summary>
public int RetryCount { get; set; }
Copy link
Member

@mathewc mathewc Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue #2595 mentions that the use case is to be able to check if RetryCount = MaxRetries. This doesn't enable that, right? Was it decided that that wasn't needed? However, the logic for determining whether this is truly the last retry would also have to also take the message dequeue count into account in the case of SB/Queues, since there are two levels of retries going on. E.g. if we're at the max dequeue count AND the max retry, we know this is the very last attempt.

Copy link
Member Author

@alrod alrod Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not read #2595 carefully enough. I added IRetryStratagy to RetryPolicy and exposed 'RetryPolicy' in 'ExecutionContext'.

However, the logic for determining whether this is truly the last retry would also have to also take the message dequeue count into account in the case of SB/Queues, since there are two levels of retries going on. E.g. if we're at the max dequeue count AND the max retry, we know this is the very last attempt.

We need to point this out in our docs. By design the retry logic only execute the function code inside one instance and do not know about specific trigger retry mechanism(it's possible that another instance will retry execution for a storage/SB message).

@pragnagopa, is it ok add IRetryStratagy for RPC layer? or we need to use simple types?

Copy link
Member

@mathewc mathewc Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a little strange to add IRetryStrategy to RetryContext, because that context was designed to pass into IRetryStrategy.GetNextDelay. We're now wanting to pass RetryContext in other contexts and want more info on it. Another option would be to add MaxRetries to RetryContext. That is also a bit of duplication from the standpoint of an IRetryStrategy implementor, but at least leaves us with a single context object with all the info needed for inspection by other consumers (e.g. ExecutionContext). Thoughts @pragnagopa? I don't think IRetryStrategy itself is needed for ExecutionContext, since that interface is for determining the next delay, something that we wouldn't want function code calling since calling it would advance the retry state.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anyway, I added MaxRetriesCount to ExecutionContext instead of IRetryStrategy. I think it fits better.

@@ -22,6 +22,8 @@ public interface IFunctionInstance
IFunctionInvoker Invoker { get; }

FunctionDescriptor FunctionDescriptor { get; }

int RetryCount { get; }
Copy link
Member

@mathewc mathewc Jan 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be a breaking change to a public interface. There's an existing extensibility issue here as evidenced by the IFunctionInstanceEx interface which was added recently to add a single new property. One way we might solve this for now w/o breaking changes:

  • add internal property RetryCount to FunctionInstance class. Alternatively to address Pragna's suggestion above this could be RetryContext itself
  • add internal property FunctionInstance to FunctionBindingContext
  • in the function executor extension where we're invoking (right before call to TryExecuteAsync), try/cast to FunctionInstance and set RetryCount (or RetryContext)
  • in ExecutionContextBindingProvider, try/cast to FunctionInstance to get RetryCount/RetryContext and FunctionInstance.FunctionDescriptor.RetryStrategy.MaxRetries (from FunctionBindingContext) to flow through via ExecutionContext

This keeps it all internal - we can improve in the future w/o any breaking changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

/// <param name="functionInvocationServices">The user logger.</param>
/// <param name="functionDescriptor">Current function being executed. </param>
/// /// <param name="functionDescriptor">Current retry count. </param>
public FunctionBindingContext(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just add a new ctor taking parameters IFunctionInstanceEx and CancellationToken. All the values needed come from the function instance. FunctionExecutor can then be simplified to call this overload.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@alrod
Copy link
Member Author

alrod commented Jan 13, 2021

@alrod - Please open a separate issue in host repo to track updating RPC layer to include RetryContext and link to this PR

Azure/azure-functions-host#7048

@alrod alrod requested review from mathewc and pragnagopa January 13, 2021 01:14
@alrod alrod changed the title Expose current retry count in the context of the function (Issue #2595) [pack] Expose current retry count in the context of the function (Issue #2595) Jan 14, 2021
@alrod alrod changed the title [pack] Expose current retry count in the context of the function (Issue #2595) Expose current retry count in the context of the function (Issue #2595) Jan 14, 2021
/// </summary>
/// <param name="functionInstanceId">The instance ID of the function being bound to.</param>
/// <param name="functionCancellationToken">The <see cref="CancellationToken"/> to use.</param>
/// <param name="functionInvocationServices">The user logger.</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These params don't match

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

retryContext = new RetryContext
{
RetryCount = attempt,
Exception = retryContext == null ? null : retryContext.Exception, // Pass exception from previos attempt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spelling "previos"

@@ -27,5 +27,7 @@ public interface IFunctionInstance
public interface IFunctionInstanceEx : IFunctionInstance
{
IServiceProvider InstanceServices { get; }

RetryContext RetryContext { get; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again - this is a breaking change. You can't add new properties to public interfaces - it would break any existing implementations. I don't understand why you're adding this here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though that IFunctionInstanceEx is internal by some reason. Fixed.

@@ -21,6 +21,11 @@ public class RetryContext
/// </summary>
public Exception Exception { get; set; }

/// <summary>
/// Gets or sets max retry stratagy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "Gets or sets the retry strategy"

@alrod alrod requested a review from mathewc January 16, 2021 00:35
@@ -11,11 +13,18 @@ namespace Microsoft.Azure.WebJobs.Host
/// </summary>
public class RetryContext
{
private Dictionary<string, object> _stateDictionary = new Dictionary<string, object>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this Lazy so it's only created when needed. E.g. for FixedDelay we'll never need it, so don't want the perf hit of unused object creation.

/// <summary>
/// Gets the state dictionary.
/// </summary>
internal IDictionary<string, object> StateDictionary => _stateDictionary;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just name this "State"? We can also add a doc section here to the effect "This property can be used by retry strategies to store arbitrary state for a retried invocation.".

/// <summary>
/// Gets the state dictionary.
/// </summary>
internal IDictionary<string, object> StateDictionary => _stateDictionary;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just name this "State"? We can also add a doc section here to the effect "This property can be used by retry strategies to store arbitrary state for a retried invocation.".

Copy link
Member

@mathewc mathewc Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this needs to be public - it's part of the programming model now for retry strategy authors. IRetryStrategy implementations used across invocations, so they need a place to store per invocation state.

else
{
delayStrategy = new RandomizedExponentialBackoffStrategy(_parsedMinimumInterval, _parsedMaximumInterval);
context.StateDictionary.Add(nameof(RandomizedExponentialBackoffStrategy), new RandomizedExponentialBackoffStrategy(_parsedMinimumInterval, _parsedMaximumInterval));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for the dictionary key, can we use something like _msDelayStrategy? In general when we add fx properties to user accessible collecitons we use a MS prefix to avoid collisions.

if (functionInstance is FunctionInstance instance)
{
// Build retry context
retryContext = new RetryContext
Copy link
Member

@mathewc mathewc Jan 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to copy State as well here. If you don't have a test that is verifying that custom state is handled properly across retries, please add one. I wonder if to facilitate this, we shouldn't just make State a settable property rather than a Lazy.

@@ -49,9 +48,21 @@ public ExponentialBackoffRetryAttribute(int maxRetryCount, string minimumInterva

public override TimeSpan GetNextDelay(RetryContext context)
{
IDelayStrategy delayStrategy;

if (context.StateDictionary.TryGetValue(nameof(RandomizedExponentialBackoffStrategy), out object stateObject))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to point out that this new State dictionary now enables our ExponentialBackoffRetryAttribute implementation as it existed. Note that if we weren't reusing the RandomizedBackoffRetryAttribute stateful class, this retry strategy could have been implemented without using State, since all required exponential backoff state can be determined/incremented solely based on the retry count which is maintained on the RetryContext.

So our addition of State is good since it adds flexibility for strategy implementors, but it'll be rare that people actually need to use it.

};

instance.RetryContext = retryContext;
functionResult = await executor.TryExecuteAsync(instance, cancellationToken);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this code can be simplified by removing the else block below and executing these two lines outside of this if.

@alrod alrod requested a review from mathewc January 19, 2021 21:43
Copy link
Member

@pragnagopa pragnagopa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Added few comments. Since this is changing public surface, it would be good to prototype changes on the host and Rpc layer to ensure no other changes in SDK are needed.

@@ -62,7 +62,6 @@ public void ExponentialBackOffDelay_Constructor_Throws()
{
Assert.Throws<ArgumentOutOfRangeException>(() => new ExponentialBackoffRetryAttribute(5, "01:020000000000:25", "00:00:10"));
Assert.Throws<ArgumentOutOfRangeException>(() => new ExponentialBackoffRetryAttribute(5, "01:02:25", "100000000"));
Assert.Throws<ArgumentException>(() => new ExponentialBackoffRetryAttribute(5, "05:02:25", "00:00:10"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

intentionally removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the ExponentialBackoffRetryAttribute ctr does not throw the exception anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good to have this validation in place to avoid adding additional validation logic in the host. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can retain the validation by refactoring the RandomizedExponentialBackoffStrategy ctor validation into a static Validate method on the RandomizedExponentialBackoffStrategy class that both that class ctor and ExponentialBackoffRetryAttribute ctor can call.

@alrod
Copy link
Member Author

alrod commented Jan 20, 2021

Thanks for the PR. Added few comments. Since this is changing public surface, it would be good to prototype changes on the host and Rpc layer to ensure no other changes in SDK are needed.

I checked the change with host/node worker locally:

https://github.com/alrod/azure-webjobs-sdk-script/tree/alrod/rpcretry
https://github.com/Azure/azure-functions-nodejs-worker/tree/alrod/retryCount

@alrod alrod requested a review from pragnagopa January 20, 2021 03:11
Copy link
Member

@pragnagopa pragnagopa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the prototyping changes in RPC layer! Looking at https://github.com/alrod/azure-webjobs-sdk-script/tree/alrod/rpcretry , we should define "RetryContext" to ensure we can expose all the properties - including exception. We can discuss these changes in Azure/azure-functions-language-worker-protobuf#52 . Rest looks OK.

}

[Fact]
public void ExponentialBackOffDelay_GetNextDelay_Throws()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do the Validate refactor suggested above, this test can go away in favor of the existing ctor tests.

@@ -22,6 +22,8 @@ public static async Task<IDelayedException> TryExecuteAsync(this IFunctionExecut
var attempt = 0;
IDelayedException functionResult = null;
ILogger logger = null;
// Build retry context
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove comment

@@ -36,7 +36,6 @@ public static async Task<IDelayedException> TryExecuteAsync(this IFunctionExecut
// Set retry context only if retry strategy exists and this is not first attempt
if (functionInstance.FunctionDescriptor.RetryStrategy != null && attempt > 0 && functionInstance is FunctionInstance instance)
{
retryContext = retryContext ?? new RetryContext();
Copy link
Member

@mathewc mathewc Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we can also simplify the above if clause (and comment) - we only need if(retryContext != null && functionInstance is FunctionInstance instance) since attempt will always be > 0 at that point, and we short circuit below if RetryStrategy is null.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ExponentialBackoffRetryAttribute shares does not work properly as it shares state with other executions
3 participants