Alternative implementation of `AtomicState` leveraging WaitAsync #6109

ismaelhamed · 2022-09-23T08:13:14Z

Might be related to #6106

The circuit breaker, in its current implementation, can give false positives: it signals tasks as done when in reality they failed.

I ported the CircuitBreakerStressSpec and modified it slightly so that every task takes longer than the CB's CallTimeout, which means all of them should fail. Instead, running the test shows that they are all "marked" as succeeded (DoneCount).

Upon further inspection:

The tasks did all fail with a TimeoutException, but the exception is swallowed by the CallFail method. This makes it impossible to capture the TimeoutException outside the CB, as demonstrated by the StressActor.
Because we await the task first, and only after it finishes check whether it took longer than the CallTimeout, we could end up waiting indefinitely for the task to complete. This PR uses the new Task.WaitAsync in .NET 6 instead.
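
To illustrate the difference described above, here is a minimal sketch of the two shapes. It is not the code in AtomicState; the class and method names (`TimeoutSketch`, `AwaitThenCheck`, `AwaitWithDeadline`) are made up for this example, and it assumes .NET 6 or later, where `Task.WaitAsync(TimeSpan)` is available.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper names, used only for illustration.
public static class TimeoutSketch
{
    // "Await first, check later": the timeout is only detected once the
    // body has finished, so a body that never completes blocks forever.
    public static async Task<bool> AwaitThenCheck(Func<Task> body, TimeSpan callTimeout)
    {
        var stopwatch = Stopwatch.StartNew();
        await body();
        return stopwatch.Elapsed > callTimeout; // true means "timed out"
    }

    // "Await with a deadline": WaitAsync throws TimeoutException as soon
    // as callTimeout elapses, even if the body is still running.
    public static async Task<bool> AwaitWithDeadline(Func<Task> body, TimeSpan callTimeout)
    {
        try
        {
            await body().WaitAsync(callTimeout);
            return false; // completed in time
        }
        catch (TimeoutException)
        {
            return true; // deadline hit; the body may still be running
        }
    }
}
```

With a body such as `() => Task.Delay(TimeSpan.FromSeconds(2))` and a 50 ms timeout, the second method reports the timeout after roughly 50 ms, while the first only reports it after the full two seconds.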

When compared with the results of the test in the previous PR, we now get the correct behavior:

BEFORE

FailCount:0, DoneCount:1000, CircCount:0, TimeoutCount:0
FailCount:0, DoneCount:1000, CircCount:0, TimeoutCount:0
FailCount:0, DoneCount:1000, CircCount:0, TimeoutCount:0

AFTER

FailCount:0, DoneCount:0, CircCount:106753, TimeoutCount:1001
FailCount:0, DoneCount:0, CircCount:110008, TimeoutCount:1001
FailCount:0, DoneCount:0, CircCount:110216, TimeoutCount:1001

ismaelhamed · 2022-09-23T08:14:31Z

NOTE: for whatever reason, Task.WaitAsync in .NET 6 also fails sometimes.

ismaelhamed · 2022-09-23T11:31:48Z

BTW, some of the CB's tests are failing now because they seem tweaked to work with the current CB implementation. If this PR goes ahead, I'll make sure to fix them all.

Aaronontheweb · 2022-09-23T13:52:00Z

> BTW, some of the CB's tests are failing now because they seem tweaked to work with the current CB implementation. If this PR goes ahead, I'll make sure to fix them all.

got it - thanks for letting us know!

Aaronontheweb · 2022-09-23T14:31:56Z

Does #6108 need to be merged first or does this?

Aaronontheweb

LGTM - please proceed

src/core/Akka.Tests/Pattern/CircuitBreakerStressSpec.cs

src/core/Akka/Util/Internal/AtomicState.cs

src/core/Akka/Util/Extensions/TaskExtensions.cs

ismaelhamed · 2022-09-25T07:51:53Z

> Does #6108 need to be merged first or does this?

No, I've included the CircuitBreakerStressSpec in this one too.

ismaelhamed · 2022-09-26T07:42:53Z

For WithSyncCircuitBreaker #L259, I wonder if we should just do:

public void WithSyncCircuitBreaker(Action body) => 
    WithCircuitBreaker(body, b => Task.Run(b)).GetAwaiter().GetResult();

Aaronontheweb · 2022-09-28T08:20:44Z

LMK when this is ready for review

Aaronontheweb · 2022-09-28T08:21:16Z

> For WithSyncCircuitBreaker #L259, I wonder if we should just do:
>
> public void WithSyncCircuitBreaker(Action body) => 
>     WithCircuitBreaker(body, b => Task.Run(b)).GetAwaiter().GetResult();

I think that's fine IMHO

ismaelhamed · 2022-09-28T09:56:47Z

@Aaronontheweb this is ready for review. I have an improved version of WaitAsync with some optimizations and support for cancellation tokens (taken from the .NET 6 implementation), but I'd prefer an initial review first.

Aaronontheweb · 2022-09-28T11:10:04Z

Looks like you have a test suite issue here:

Akka.Tests.Pattern.ASynchronousCircuitBreakerThatIsClosed.A synchronous circuit breaker that is closed must increment failure count on callTimeout before call finishes
System.AggregateException : One or more errors occurred. (Timeout 00:00:00.1000000 expired while waiting for condition.
Expected: True
Actual:   False)
---- Timeout 00:00:00.1000000 expired while waiting for condition.
Expected: True
Actual:   False

Aaronontheweb · 2022-10-31T14:02:14Z

Going to re-review this this week

ismaelhamed · 2022-10-31T14:19:57Z

I haven't had the time to work on this some more, and I'm still not sure why that test keeps failing. I even went ahead and reimplemented Within and AwaitCond (the latter especially doesn't work like the JVM version, so it might be skewing tests), but no luck.

Aaronontheweb

I think I have your failing test figured out

Aaronontheweb · 2022-10-31T14:13:12Z

src/core/Akka.TestKit/TestKitBase_AwaitConditions.cs

@@ -293,10 +293,59 @@ protected static async Task<bool> InternalAwaitConditionAsync(Func<Task<bool>> c
            return true;
        }

-        private static void ConditionalLog(ILoggingAdapter logger, string format, params object[] args)
+        protected void AwaitCond(Func<bool> p, TimeSpan? max = null, TimeSpan? interval = null, string message = "")


Don't we already have an AwaitCondition method or does this do something different?

The behavior of AwaitCondition is different from the JVM's, which is why I tried a straight port instead. No luck, it keeps failing.

AwaitCondition arbitrarily calculates the interval (when not specified) as a tenth of max. It should be a fixed 100 ms; otherwise tests ported from the JVM might not behave as expected.
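
As a rough illustration of the two defaulting rules described in this comment (not the actual TestKit code), the sketch below computes the polling interval both ways. The class and method names are hypothetical, and the "tenth of max" rule is taken from the comment above rather than from the TestKit source.

```csharp
using System;

// Hypothetical names; only the two defaulting rules come from the discussion.
public static class PollIntervalSketch
{
    // Default described for AwaitCondition: a tenth of max when no interval is given.
    public static TimeSpan TenthOfMax(TimeSpan max, TimeSpan? interval = null) =>
        interval ?? TimeSpan.FromTicks(max.Ticks / 10);

    // Default proposed in the comment: a fixed 100 ms when no interval is given.
    public static TimeSpan FixedDefault(TimeSpan max, TimeSpan? interval = null) =>
        interval ?? TimeSpan.FromMilliseconds(100);
}
```

With a max of 3 seconds, the first rule polls every 300 ms while the second polls every 100 ms, so a condition that flips briefly could be observed by one and missed by the other.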

Aaronontheweb · 2022-10-31T14:22:53Z

src/core/Akka.Tests/Pattern/CircuitBreakerSpec.cs

-    {
-        [Fact(DisplayName = "A synchronous circuit breaker that is half open should pass call and transition to close on success")]
-        public void Should_Pass_Call_And_Transition_To_Close_On_Success( )
+        [Fact(DisplayName = "A synchronous circuit breaker that is closed must increment failure count on callTimeout before call finishes")]


So this spec is currently racy

Aaronontheweb · 2022-10-31T14:24:19Z

src/core/Akka.Tests/Pattern/CircuitBreakerSpec.cs

+            var breaker = ShortCallTimeoutCb();
+            Task.Run(() => breaker.Instance.WithSyncCircuitBreaker(() => Thread.Sleep(Dilated(TimeSpan.FromSeconds(1)))));
+            Within(TimeSpan.FromMilliseconds(900),
+                () => AwaitCond(() => breaker.Instance.CurrentFailureCount == 1, Dilated(TimeSpan.FromMilliseconds(100))));


I think your problem here is your parameters on AwaitCond - you meant to set Dilated(TimeSpan.FromMilliseconds(100)) as the interval but it's being used as the max value here. @ismaelhamed

https://github.com/akka/akka/blob/13892f7d01e2b0b0a4fbb7e20b3bed3bafee3dde/akka-actor-tests/src/test/scala/akka/pattern/CircuitBreakerSpec.scala#L413

@Aaronontheweb, see my comment above. In the meantime, I've fixed it by passing both max and interval.
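
The parameter mix-up is easy to reproduce in isolation. The stand-in below copies only the parameter order of the AwaitCond signature shown in the diff and returns what each argument was bound to; it is a hypothetical helper for illustration, not the TestKit method.

```csharp
using System;

public static class AwaitCondBindingSketch
{
    // Hypothetical stand-in: same parameter order as the AwaitCond in the diff,
    // but it only reports which optional parameters received a value.
    public static (TimeSpan? Max, TimeSpan? Interval) AwaitCond(
        Func<bool> p, TimeSpan? max = null, TimeSpan? interval = null, string message = "")
        => (max, interval);
}
```

Calling `AwaitCond(() => true, TimeSpan.FromMilliseconds(100))` binds the 100 ms to `max`, whereas `AwaitCond(() => true, interval: TimeSpan.FromMilliseconds(100))` or passing both `max` and `interval` explicitly binds it as intended.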

src/core/Akka/Util/Extensions/TaskExtensions.cs

Aaronontheweb · 2023-02-09T02:29:03Z

@ismaelhamed is this ready for review?

ismaelhamed · 2023-02-09T07:23:28Z

> @ismaelhamed is this ready for review?

The implementation is, yes, but I couldn't figure out what the problem with the specs was. I'll give it another shot soon.

ismaelhamed · 2023-03-22T09:24:14Z

Unrelated test failing now.

Aaronontheweb · 2023-03-23T14:45:00Z

@ismaelhamed is this good for me to review again?

Aaronontheweb · 2023-03-23T14:45:34Z

ah yes, it is - you just requested one from me! I'll get right on it.

Aaronontheweb

LGTM

Aaronontheweb · 2023-03-24T15:23:56Z

Queued for auto-merge - nice work @ismaelhamed

Aaronontheweb added the akka-actor, akka-persistence, and bug-reproduction labels Sep 23, 2022

Aaronontheweb approved these changes Sep 23, 2022

ismaelhamed force-pushed the circuit-breaker-wait branch from e74696f to 61885c4 Compare September 25, 2022 07:50

ismaelhamed mentioned this pull request Sep 25, 2022

Ported CircuitBreakerStressSpec #6108

Closed

ismaelhamed force-pushed the circuit-breaker-wait branch 2 times, most recently from 9dcef85 to c22486f Compare September 26, 2022 07:38

ismaelhamed force-pushed the circuit-breaker-wait branch 3 times, most recently from bf42423 to bda4de1 Compare September 28, 2022 06:07

ismaelhamed force-pushed the circuit-breaker-wait branch from bda4de1 to d97fc2f Compare September 28, 2022 09:54

ismaelhamed marked this pull request as ready for review September 28, 2022 09:54

lucavice mentioned this pull request Sep 30, 2022

Exceeding max-concurrent-recoveries triggers circuit breaker #6106

Open

ismaelhamed marked this pull request as draft October 11, 2022 07:54

ismaelhamed force-pushed the circuit-breaker-wait branch 5 times, most recently from 8b69365 to 29c012a Compare October 13, 2022 10:47

ismaelhamed force-pushed the circuit-breaker-wait branch from 05fed9a to d9a288f Compare October 17, 2022 10:43

Aaronontheweb reviewed Oct 31, 2022

ismaelhamed force-pushed the circuit-breaker-wait branch from 96d228b to d8545d0 Compare November 1, 2022 07:35

ismaelhamed force-pushed the circuit-breaker-wait branch 4 times, most recently from 052c442 to ffe4c43 Compare December 28, 2022 08:39

ismaelhamed force-pushed the circuit-breaker-wait branch 2 times, most recently from 867be27 to 30a69af Compare March 22, 2023 06:47

ismaelhamed marked this pull request as ready for review March 22, 2023 09:24

ismaelhamed force-pushed the circuit-breaker-wait branch from 30a69af to 93ac7e8 Compare March 23, 2023 07:07

ismaelhamed requested a review from Aaronontheweb March 23, 2023 08:42

Aaronontheweb added this to the 1.5.2 milestone Mar 23, 2023

ismaelhamed force-pushed the circuit-breaker-wait branch from 93ac7e8 to dc45d73 Compare March 24, 2023 06:40

Aaronontheweb approved these changes Mar 24, 2023

Aaronontheweb enabled auto-merge (squash) March 24, 2023 15:23

ismaelhamed force-pushed the circuit-breaker-wait branch from fd41320 to 8b793ed Compare March 25, 2023 06:40

Alternative implementation of AtomicState leveraging WaitAsync

3f1490c

ismaelhamed force-pushed the circuit-breaker-wait branch from 8b793ed to 3f1490c Compare March 28, 2023 05:37

Aaronontheweb disabled auto-merge March 28, 2023 18:03

Aaronontheweb merged commit d156ff4 into akkadotnet:dev Mar 28, 2023

Arkatufus mentioned this pull request Apr 5, 2023

Update RELEASE_NOTES.md for v1.5.2 release #6635

Merged
