Added `TestControl`, with first-class support for time mocking #2276

djspiewak · 2021-08-30T05:06:46Z

Time mocking is back and better than ever! Yay!!! Thoughts very much welcome. Folks who have expressed an interest in this include @SystemFw, @ChristopherDavenport, @benhutchison, and many others. Now is the time to give feedback! I tried to be comprehensive in the TestControl scaladoc.

Note that this also includes a binary-compatible (but source- and semantically-incompatible) reworking of TestContext. The previous API conflated too many concerns and was very error-prone. Time advancement and schedule ticking is now fully separated. Also, as a bonus, I found (and fixed!) a bug in tickAll which was causing all of our test suites to take dramatically longer than necessary, and now they run much faster both in CI and on my laptop. Drinks all around.

The primary concept here is that TestControl represents a relatively low level mechanism for controlling the runtime. Higher level mechanisms should be built on top of this and integrated within specific downstream test frameworks (such as munit). Note that, just as before, there's still no substitute for running your program on the real runtime, and there are several programs which will deadlock on TestContext but not on the real thing. However, for anything involving sleep, this kind of thing is basically indispensable.

In theory this should be a lot more convenient than what we had to do in CE2 (creating artificial Clock instances), since the runtime is automatically threaded through at all points, meaning that you can just use the normal Clock and Temporal instances and everything will just work out of the box. In fact, time mocking even works on the bare IO functions (e.g. IO.sleep, IO.monotonic, and IO.realTime).

As a final goodie, I added support for configuring the source of randomness used in TestContext, similar to how ScalaCheck supports optionally setting the seed as a base64-encoded value. I would imagine that downstream test framework support will print out the seed in any error messages, as well as provide an API for optionally configuring that seed and passing it to the TestControl they use internally to run your examples.

As I said, now is the time for feedback! This is a pretty meaningful API addition and I'd like to make sure we get it right.

…in `TestContext`

djspiewak · 2021-08-30T05:19:53Z

Published as 3.3-7-579f468 so that we can experiment in downstream libraries like https://github.com/typelevel/munit-cats-effect and https://github.com/typelevel/cats-effect-testing.

vasilmkd

Very excited. Marking this as request changes but the changes are pretty minor.

Regarding the binary compatibility, because of the change in behavior, at least to me, the whole situation seems to give a false sense of security that this could be a drop in replacement.

Is there a way to make the private[testkit] implementations delegate to the new implementations in a way that is semantically correct in the new model? For example, do the tick() then advanceAndTick() trick or something.

If the answer is no, maybe it would indeed be better to not guarantee binary compatibility at all and just advise people to upgrade, recompile, fix source changes and change to the new semantics at the same time. Of course, you're much more experienced than me in this regard, so I won't fight any final decision.

kernel-testkit/shared/src/main/scala/cats/effect/kernel/testkit/TestContext.scala

tests/shared/src/test/scala/cats/effect/ResourceSpec.scala

testkit/shared/src/main/scala/cats/effect/testkit/TestControl.scala

djspiewak · 2021-08-30T18:31:42Z

Noodled a bit on some Specs2 support built on top of this API: typelevel/cats-effect-testing#193

"execute a simple test with mock time" in {
  var first = false
  var second = false
  var third = false
  val wait = IO.sleep(1.hour)

  val program = for {
    _ <- IO { first = true }
    _ <- wait
    _ <- IO { second = true }
    _ <- wait
    _ <- IO { third = true }
  } yield 42

  program must execute { (control, result) =>
    result() must beNone

    first must beFalse
    second must beFalse
    third must beFalse

    control.tick()

    first must beTrue
    second must beFalse
    third must beFalse

    control.advanceAndTick(1.hour)

    first must beTrue
    second must beTrue
    third must beFalse

    control.advanceAndTick(1.hour)

    first must beTrue
    second must beTrue
    third must beTrue

    result() must beSome(beRight(42))
  }
}

… spin as aggressively

alexandru · 2021-08-31T06:25:00Z

kernel-testkit/shared/src/main/scala/cats/effect/kernel/testkit/TestContext.scala

+   * where the scheduler ticks see different task lists despite
+   * identical configuration.
+   */
+  def apply(seed: String): TestContext =


Inclusion of a seed is a very good idea 👍

testkit/shared/src/main/scala/cats/effect/testkit/TestControl.scala

SystemFw · 2021-08-31T21:44:43Z

testkit/shared/src/main/scala/cats/effect/testkit/TestControl.scala

+ * rely on `realTime` and `monotonic`, either directly on `IO` or via the
+ * typeclass abstractions.
+ *
+ * WARNING: ''Never'' use this runtime on programs which use the [[IO#evalOn]]


I think it's worth adding a sentence about behaviour of blocking

Good call. To more directly answer, it behaves like JavaScript: the blocking pool and the compute pool are both TestContext.

testkit/shared/src/main/scala/cats/effect/testkit/TestControl.scala

kernel-testkit/shared/src/main/scala/cats/effect/kernel/testkit/TestContext.scala

djspiewak · 2021-09-08T01:27:07Z

Bumping this a bit, @SystemFw. Any follow-ups needed here? Should we look at the munit integration first?

SystemFw · 2021-09-08T07:53:14Z

Yeah munit integration would be good, and tbh I think it'd be great to have a snapshot to try it out. I'm happy with it though, I hadn't realised that wasn't clear

djspiewak · 2021-09-08T12:48:28Z

Actually published a snapshot quite a while ago. 😃 That's what the Specs2 integration is based on. I'll dig into munit.

djspiewak · 2021-09-08T14:39:16Z

New snapshot published as 3.3-115-fab668b

djspiewak · 2021-09-08T16:27:49Z

Odd failure:

[info] IO monad should
2869
[error]   ! evaluate fibers correctly in presence of a parasitic execution context
2870
[error]    java.util.concurrent.TimeoutException: null (Runners.scala:109)
2871
[error] cats.effect.Runners.$anonfun$timeout$1(Runners.scala:109)
2872
[info] Total for specification ParasiticECSpec

vasilmkd · 2021-09-08T17:47:33Z

Might not be an odd failure. The test uses ticker to evaluate equality between generated IO values.

vasilmkd

I am amazed by this PR and the scaladoc is very informative. Thank you for your effort. 🙏🏻

One question still remains, do people think we need to build a higher level ADT for making the different outcomes more explicit. I'm personally worried about Left(CancelationException) not being great from an ergonomics point of view.

vasilmkd · 2021-09-10T21:42:12Z

tests/jvm/src/test/scala/cats/effect/ParasiticECSpec.scala

+      val test = {
+        implicit val ticker = Ticker()

-      val test = IO(implicitly[Arbitrary[IO[Int]]].arbitrary.sample.get).flatMap { io =>
-        IO.delay(io.eqv(io))
+        IO(implicitly[Arbitrary[IO[Int]]].arbitrary.sample.get).flatMap { io =>
+          IO.delay(io.eqv(io))
+        }


I'm pretty sure this change can be reverted.

Baccata · 2021-09-11T06:08:56Z

One question still remains, do people think we need to build a higher level ADT for making the different outcomes more explicit. I'm personally worried about Left(CancelationException) not being great from an ergonomics point of view.

I agree. Even Option[Outcome[Id, Throwable, A]] would be more consistent with the rest of CE

vasilmkd · 2021-09-11T06:42:04Z

I don't hate that. I'm not sure if Daniel considers it a solution if we only do a last-mile wrapping of the Option[Either[Throwable, A]] into a Option[Outcome[Id, Throwable, A]].

Baccata · 2021-09-11T08:45:06Z

As long as cancellation is not surfaced as a cancellation exception, I'm happy 😉

vasilmkd · 2021-09-11T09:02:25Z

Most definitely, I'm really not comfortable with that.

djspiewak · 2021-09-11T15:04:29Z

Okay okay. 😛 I'll try the Option if Outcome approach…

SystemFw · 2021-09-12T17:54:26Z

Well, remember that the UX here is writing assertions, and I think Outcome might actually be the worst UX, given that writing assertions that involve pattern matching is generally a bit annoying. In some ways the most immediate return type is actually IO[Option[A]] (and CancelationException plus other exceptions getting thrown), which integrates with every test framework and favours the most common case

P.S. I'm integrating this into upperbound and the experience is otherwise great, especially the fact that you can just call TestControl.executeFully with no other changes to munit-cats-effect nor a specific test setup beyond basic IO suites

djspiewak · 2021-09-12T23:50:10Z

See Discord for a long diatribe on why we replaced executeFully with executeEmbed. :-)

djspiewak · 2021-09-13T01:05:31Z

Published as 3.3-161-09d830c

SystemFw · 2021-09-15T13:02:54Z

Ok, so I've tried executeEmbed

TL;DR: it's much better, but we should remove the Option from there as well.

Note that I'm only talking executeEmbed here, not stepping through. Here are my arguments (some of them might be repeated from Discord):

Most tests will only care about the actual results, which makes dealing with an Option pointless boilerplate.
If you are calling executeEmbed, you are running the IO to completion, and nontermination is an exceptional outcome, and should therefore be represented as an exception. Note that this isn't the case with results inside execute, since you are legitimately sampling there.
It's ok for the resulting Exception to be defined inside TestControl, e.g. as TestControl.NonTerminationException. This may appear bespoke, but being able to materialise nontermination (currently as an Option) is bespoke to TestControl, so a bespoke exception actually seems very appropriate.
Option is bad UX on failed tests . This might actually be the strongest argument. With Option, a non terminating program will fail as either The result was None or NoSuchElementException. Compare to a proper exception: TestControl.NonTerminationException: the program you have run through TestControl never completed, which might indicate deadlock etc, see these docs etc.

Hence, having executeEmbed: F[A] makes successful tests easier to write, and failed tests easier to understand, and should be preferred to the current API.

djspiewak · 2021-09-15T13:25:30Z

If you are calling executeEmbed, you are running the IO to completion, and nontermination is an exceptional outcome, and should therefore be represented as an exception. Note that this isn't the case with results inside execute, since you are legitimately sampling there.

I was thinking about it over the past couple days and came round to a similar opinion. I think this is probably the most compelling argument. Also I agree with the counterpoint on "bespokeness".

Option is bad UX on failed tests . This might actually be the strongest argument. With Option, a non terminating program will fail as either The result was None or NoSuchElementException. Compare to a proper exception: TestControl.NonTerminationException: the program you have run through TestControl never completed, which might indicate deadlock etc, see these docs etc.

I think it gets even worse than that if someone writes their assertions within the IO, but then that IO deadlocks before a failed assertion gets hit, and then they discard the Option (e.g. by just returning the results of executeEmbed from test). In that scenario, there's no failure, but the test is definitely failing. Arguably that's a test where someone failed to account for every possible failure scenario (in this case, nontermination), but the UX should guide them towards checking for such.

So overall I think I'm in alignment. I'll make the change.

djspiewak · 2021-09-16T16:13:01Z

Changes released in 3.3-162-2022ef9

djspiewak · 2021-09-19T20:49:24Z

Final thoughts @SystemFw or others?

SystemFw · 2021-09-19T21:59:21Z

Really excited this is pretty much done :)
I'm happy with the api, are we set on names? I was thinking about run and stepThrough, but I wouldn't be mad if we kept the current names at all

djspiewak · 2021-09-20T03:39:20Z

I actually really like the names. Honestly, run strikes me as a bit too innocuous. 🙂 Given the weirdness it's performing, I want people to be aware that they're flirting with dragons. Also the documentation pretty consistently refers to "execute" as the verb when running an IO (despite the fact that the production prefix is unsafeRun), so it fits pretty well.

The "Embed" suffix also works really well when comparing with the Fiber api, which is not an incorrect comparison. It also nudges people a bit to consider the stepping API first, which is good IMO even if it isn't the most commonly optimal function, simply because it invites people to understand a bit better what's going on under the surface.

I'm bikeshedding. 😃 Names matter but probably not as much as I'm implicitly asserting here. I just rather like the current ones.

SystemFw · 2021-09-20T16:38:26Z

I just rather like the current ones.

That's ok, let's go ahead with the current ones :)

I think this is a little stale now, and all comments have been addressed

bastewart · 2021-10-22T12:08:25Z

Useless comment: just wanted to say I really like this change (i've "borrowed" it temporarily, pending a release of 3.3)! It's a huge improvement over the hand-rolling required in the past to use TestContext. Very unrelated as well but I think it's a big validation of the design of CE3 that TestControl just exposes a function IO ~> IO and end-users don't have to do anything like overwriting the ContextShift and Timer.

yanns · 2021-11-28T08:15:18Z

This is a really great addition!!! 🥇

djspiewak added 3 commits August 29, 2021 22:57

Completely reworked TestContext API

6a81288

Added TestControl for creating and controlling a mock runtime

7c143dd

Added support for reading (and configuring) the source of randomness …

125ad7e

…in `TestContext`

djspiewak added 🍄 enhancement 🔬 experiment ❗ source breaking ❗ labels Aug 30, 2021

djspiewak added 2 commits August 29, 2021 23:11

Removed unnecessary import

c902d9e

Added mima exclusion for new TestContext constructor

579f468

This was referenced Aug 30, 2021

Experiment with time mocking APIs typelevel/munit-cats-effect#123

Closed

Experiment with time mocking APIs typelevel/cats-effect-testing#192

Closed

vasilmkd previously requested changes Aug 30, 2021

View reviewed changes

djspiewak added 3 commits August 30, 2021 08:43

Fixed scaladoc reference

0a62cbd

Adjusted bincompat shim to be closer to the same semantics

fe9502f

Made everything blocking and stuff

fc65cc2

Removed negative epoch since a) it was confusing, and b) we no longer…

9168ec3

… spin as aggressively

djspiewak added this to the v3.3.0 milestone Aug 30, 2021

alexandru reviewed Aug 31, 2021

View reviewed changes

djspiewak marked this pull request as ready for review August 31, 2021 12:15

SystemFw reviewed Aug 31, 2021

View reviewed changes

djspiewak removed the 🔬 experiment label Sep 1, 2021

Merge branch 'series/3.x' into feature/time-mocking

fab668b

Get a new ticker for each test iteration to avoid flakiness

851ce12

djspiewak requested review from vasilmkd and SystemFw September 10, 2021 20:46

vasilmkd reviewed Sep 10, 2021

View reviewed changes

djspiewak added 2 commits September 12, 2021 10:17

Replaced "unknown" with dots to mirror IO's Show

d227caa

Revised API to work in terms of Outcome rather than Either

154f2d1

djspiewak added 2 commits September 12, 2021 16:54

Replaced executeFully with executeEmbed

4e6597e

Merge branch 'series/3.x' into feature/time-mocking

09d830c

Adjusted the signature of executeEmbed to produce an IO[A]

2022ef9

SystemFw approved these changes Sep 20, 2021

View reviewed changes

djspiewak merged commit a2f2fb7 into typelevel:series/3.x Sep 20, 2021

armanbilge mentioned this pull request Nov 7, 2021

@implicitNotFound on Ticker says to use ticked but it's not part of the testkit #1875

Open

armanbilge mentioned this pull request Apr 17, 2022

Highlight the munitExecutionContext parasitic peculiarity in docs scalameta/munit#500

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added `TestControl`, with first-class support for time mocking #2276

Added `TestControl`, with first-class support for time mocking #2276

djspiewak commented Aug 30, 2021

djspiewak commented Aug 30, 2021

vasilmkd left a comment

djspiewak commented Aug 30, 2021 •

edited

Loading

alexandru Aug 31, 2021

SystemFw Aug 31, 2021

djspiewak Aug 31, 2021

djspiewak commented Sep 8, 2021

SystemFw commented Sep 8, 2021

djspiewak commented Sep 8, 2021

djspiewak commented Sep 8, 2021

djspiewak commented Sep 8, 2021

vasilmkd commented Sep 8, 2021

vasilmkd left a comment

vasilmkd Sep 10, 2021

Baccata commented Sep 11, 2021

vasilmkd commented Sep 11, 2021

Baccata commented Sep 11, 2021

vasilmkd commented Sep 11, 2021

djspiewak commented Sep 11, 2021

SystemFw commented Sep 12, 2021

djspiewak commented Sep 12, 2021

djspiewak commented Sep 13, 2021

SystemFw commented Sep 15, 2021

djspiewak commented Sep 15, 2021

djspiewak commented Sep 16, 2021

djspiewak commented Sep 19, 2021

SystemFw commented Sep 19, 2021

djspiewak commented Sep 20, 2021

SystemFw commented Sep 20, 2021

bastewart commented Oct 22, 2021 •

edited

Loading

yanns commented Nov 28, 2021

Added TestControl, with first-class support for time mocking #2276

Added TestControl, with first-class support for time mocking #2276

Conversation

djspiewak commented Aug 30, 2021

djspiewak commented Aug 30, 2021

vasilmkd left a comment

Choose a reason for hiding this comment

djspiewak commented Aug 30, 2021 • edited Loading

alexandru Aug 31, 2021

Choose a reason for hiding this comment

SystemFw Aug 31, 2021

Choose a reason for hiding this comment

djspiewak Aug 31, 2021

Choose a reason for hiding this comment

djspiewak commented Sep 8, 2021

SystemFw commented Sep 8, 2021

djspiewak commented Sep 8, 2021

djspiewak commented Sep 8, 2021

djspiewak commented Sep 8, 2021

vasilmkd commented Sep 8, 2021

vasilmkd left a comment

Choose a reason for hiding this comment

vasilmkd Sep 10, 2021

Choose a reason for hiding this comment

Baccata commented Sep 11, 2021

vasilmkd commented Sep 11, 2021

Baccata commented Sep 11, 2021

vasilmkd commented Sep 11, 2021

djspiewak commented Sep 11, 2021

SystemFw commented Sep 12, 2021

djspiewak commented Sep 12, 2021

djspiewak commented Sep 13, 2021

SystemFw commented Sep 15, 2021

djspiewak commented Sep 15, 2021

djspiewak commented Sep 16, 2021

djspiewak commented Sep 19, 2021

SystemFw commented Sep 19, 2021

djspiewak commented Sep 20, 2021

SystemFw commented Sep 20, 2021

bastewart commented Oct 22, 2021 • edited Loading

yanns commented Nov 28, 2021

Added `TestControl`, with first-class support for time mocking #2276

Added `TestControl`, with first-class support for time mocking #2276

djspiewak commented Aug 30, 2021 •

edited

Loading

bastewart commented Oct 22, 2021 •

edited

Loading