-
Notifications
You must be signed in to change notification settings - Fork 356
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose current retry count in the context of the function (Issue #2595) #2658
Conversation
src/Microsoft.Azure.WebJobs.Host/Bindings/ExecutionContext/ExecutionContext.cs
Show resolved
Hide resolved
@alrod - Please open a separate issue in host repo to track updating RPC layer to include RetryContext and link to this PR |
/// <summary> | ||
/// Gets or sets value for retry count. | ||
/// </summary> | ||
public int RetryCount { get; set; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue #2595 mentions that the use case is to be able to check if RetryCount = MaxRetries. This doesn't enable that, right? Was it decided that that wasn't needed? However, the logic for determining whether this is truly the last retry would also have to also take the message dequeue count into account in the case of SB/Queues, since there are two levels of retries going on. E.g. if we're at the max dequeue count AND the max retry, we know this is the very last attempt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not read #2595 carefully enough. I added IRetryStratagy
to RetryPolicy
and exposed 'RetryPolicy' in 'ExecutionContext'.
However, the logic for determining whether this is truly the last retry would also have to also take the message dequeue count into account in the case of SB/Queues, since there are two levels of retries going on. E.g. if we're at the max dequeue count AND the max retry, we know this is the very last attempt.
We need to point this out in our docs. By design the retry logic only execute the function code inside one instance and do not know about specific trigger retry mechanism(it's possible that another instance will retry execution for a storage/SB message).
@pragnagopa, is it ok add IRetryStratagy
for RPC layer? or we need to use simple types?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a little strange to add IRetryStrategy to RetryContext, because that context was designed to pass into IRetryStrategy.GetNextDelay. We're now wanting to pass RetryContext in other contexts and want more info on it. Another option would be to add MaxRetries to RetryContext. That is also a bit of duplication from the standpoint of an IRetryStrategy implementor, but at least leaves us with a single context object with all the info needed for inspection by other consumers (e.g. ExecutionContext). Thoughts @pragnagopa? I don't think IRetryStrategy itself is needed for ExecutionContext, since that interface is for determining the next delay, something that we wouldn't want function code calling since calling it would advance the retry state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
anyway, I added MaxRetriesCount to ExecutionContext instead of IRetryStrategy. I think it fits better.
@@ -22,6 +22,8 @@ public interface IFunctionInstance | |||
IFunctionInvoker Invoker { get; } | |||
|
|||
FunctionDescriptor FunctionDescriptor { get; } | |||
|
|||
int RetryCount { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a breaking change to a public interface. There's an existing extensibility issue here as evidenced by the IFunctionInstanceEx interface which was added recently to add a single new property. One way we might solve this for now w/o breaking changes:
- add internal property RetryCount to FunctionInstance class. Alternatively to address Pragna's suggestion above this could be RetryContext itself
- add internal property FunctionInstance to FunctionBindingContext
- in the function executor extension where we're invoking (right before call to TryExecuteAsync), try/cast to FunctionInstance and set RetryCount (or RetryContext)
- in ExecutionContextBindingProvider, try/cast to FunctionInstance to get RetryCount/RetryContext and FunctionInstance.FunctionDescriptor.RetryStrategy.MaxRetries (from FunctionBindingContext) to flow through via ExecutionContext
This keeps it all internal - we can improve in the future w/o any breaking changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
/// <param name="functionInvocationServices">The user logger.</param> | ||
/// <param name="functionDescriptor">Current function being executed. </param> | ||
/// /// <param name="functionDescriptor">Current retry count. </param> | ||
public FunctionBindingContext( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just add a new ctor taking parameters IFunctionInstanceEx and CancellationToken. All the values needed come from the function instance. FunctionExecutor can then be simplified to call this overload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
cde0db2
to
6407f91
Compare
|
/// </summary> | ||
/// <param name="functionInstanceId">The instance ID of the function being bound to.</param> | ||
/// <param name="functionCancellationToken">The <see cref="CancellationToken"/> to use.</param> | ||
/// <param name="functionInvocationServices">The user logger.</param> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These params don't match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
retryContext = new RetryContext | ||
{ | ||
RetryCount = attempt, | ||
Exception = retryContext == null ? null : retryContext.Exception, // Pass exception from previos attempt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: spelling "previos"
@@ -27,5 +27,7 @@ public interface IFunctionInstance | |||
public interface IFunctionInstanceEx : IFunctionInstance | |||
{ | |||
IServiceProvider InstanceServices { get; } | |||
|
|||
RetryContext RetryContext { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again - this is a breaking change. You can't add new properties to public interfaces - it would break any existing implementations. I don't understand why you're adding this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I though that IFunctionInstanceEx is internal by some reason. Fixed.
@@ -21,6 +21,11 @@ public class RetryContext | |||
/// </summary> | |||
public Exception Exception { get; set; } | |||
|
|||
/// <summary> | |||
/// Gets or sets max retry stratagy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "Gets or sets the retry strategy"
@@ -11,11 +13,18 @@ namespace Microsoft.Azure.WebJobs.Host | |||
/// </summary> | |||
public class RetryContext | |||
{ | |||
private Dictionary<string, object> _stateDictionary = new Dictionary<string, object>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this Lazy so it's only created when needed. E.g. for FixedDelay we'll never need it, so don't want the perf hit of unused object creation.
/// <summary> | ||
/// Gets the state dictionary. | ||
/// </summary> | ||
internal IDictionary<string, object> StateDictionary => _stateDictionary; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just name this "State"? We can also add a doc section here to the effect "This property can be used by retry strategies to store arbitrary state for a retried invocation.".
/// <summary> | ||
/// Gets the state dictionary. | ||
/// </summary> | ||
internal IDictionary<string, object> StateDictionary => _stateDictionary; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just name this "State"? We can also add a doc section here to the effect "This property can be used by retry strategies to store arbitrary state for a retried invocation.".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this needs to be public - it's part of the programming model now for retry strategy authors. IRetryStrategy implementations used across invocations, so they need a place to store per invocation state.
else | ||
{ | ||
delayStrategy = new RandomizedExponentialBackoffStrategy(_parsedMinimumInterval, _parsedMaximumInterval); | ||
context.StateDictionary.Add(nameof(RandomizedExponentialBackoffStrategy), new RandomizedExponentialBackoffStrategy(_parsedMinimumInterval, _parsedMaximumInterval)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the dictionary key, can we use something like _msDelayStrategy? In general when we add fx properties to user accessible collecitons we use a MS prefix to avoid collisions.
if (functionInstance is FunctionInstance instance) | ||
{ | ||
// Build retry context | ||
retryContext = new RetryContext |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to copy State as well here. If you don't have a test that is verifying that custom state is handled properly across retries, please add one. I wonder if to facilitate this, we shouldn't just make State a settable property rather than a Lazy.
@@ -49,9 +48,21 @@ public ExponentialBackoffRetryAttribute(int maxRetryCount, string minimumInterva | |||
|
|||
public override TimeSpan GetNextDelay(RetryContext context) | |||
{ | |||
IDelayStrategy delayStrategy; | |||
|
|||
if (context.StateDictionary.TryGetValue(nameof(RandomizedExponentialBackoffStrategy), out object stateObject)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to point out that this new State dictionary now enables our ExponentialBackoffRetryAttribute implementation as it existed. Note that if we weren't reusing the RandomizedBackoffRetryAttribute stateful class, this retry strategy could have been implemented without using State, since all required exponential backoff state can be determined/incremented solely based on the retry count which is maintained on the RetryContext.
So our addition of State is good since it adds flexibility for strategy implementors, but it'll be rare that people actually need to use it.
}; | ||
|
||
instance.RetryContext = retryContext; | ||
functionResult = await executor.TryExecuteAsync(instance, cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this code can be simplified by removing the else block below and executing these two lines outside of this if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. Added few comments. Since this is changing public surface, it would be good to prototype changes on the host and Rpc layer to ensure no other changes in SDK are needed.
src/Microsoft.Azure.WebJobs.Host/Executors/FunctionExecutorExtensions.cs
Show resolved
Hide resolved
@@ -62,7 +62,6 @@ public void ExponentialBackOffDelay_Constructor_Throws() | |||
{ | |||
Assert.Throws<ArgumentOutOfRangeException>(() => new ExponentialBackoffRetryAttribute(5, "01:020000000000:25", "00:00:10")); | |||
Assert.Throws<ArgumentOutOfRangeException>(() => new ExponentialBackoffRetryAttribute(5, "01:02:25", "100000000")); | |||
Assert.Throws<ArgumentException>(() => new ExponentialBackoffRetryAttribute(5, "05:02:25", "00:00:10")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
intentionally removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, the ExponentialBackoffRetryAttribute
ctr does not throw the exception anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good to have this validation in place to avoid adding additional validation logic in the host. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can retain the validation by refactoring the RandomizedExponentialBackoffStrategy ctor validation into a static Validate method on the RandomizedExponentialBackoffStrategy class that both that class ctor and ExponentialBackoffRetryAttribute ctor can call.
I checked the change with host/node worker locally: https://github.com/alrod/azure-webjobs-sdk-script/tree/alrod/rpcretry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the prototyping changes in RPC layer! Looking at https://github.com/alrod/azure-webjobs-sdk-script/tree/alrod/rpcretry , we should define "RetryContext" to ensure we can expose all the properties - including exception. We can discuss these changes in Azure/azure-functions-language-worker-protobuf#52 . Rest looks OK.
} | ||
|
||
[Fact] | ||
public void ExponentialBackOffDelay_GetNextDelay_Throws() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do the Validate refactor suggested above, this test can go away in favor of the existing ctor tests.
@@ -22,6 +22,8 @@ public static async Task<IDelayedException> TryExecuteAsync(this IFunctionExecut | |||
var attempt = 0; | |||
IDelayedException functionResult = null; | |||
ILogger logger = null; | |||
// Build retry context |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove comment
@@ -36,7 +36,6 @@ public static async Task<IDelayedException> TryExecuteAsync(this IFunctionExecut | |||
// Set retry context only if retry strategy exists and this is not first attempt | |||
if (functionInstance.FunctionDescriptor.RetryStrategy != null && attempt > 0 && functionInstance is FunctionInstance instance) | |||
{ | |||
retryContext = retryContext ?? new RetryContext(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we can also simplify the above if clause (and comment) - we only need if(retryContext != null && functionInstance is FunctionInstance instance)
since attempt will always be > 0 at that point, and we short circuit below if RetryStrategy is null.
Related to #2595
Resolves #2661