Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stateful tracing, revisted #108

Closed
wants to merge 4 commits into from
Closed

Stateful tracing, revisted #108

wants to merge 4 commits into from

Conversation

rossabaker
Copy link
Member

Inspired by @djspiewak's toot, here's IOLocal[Ref[IO, Scope]]. This structure passes the following:

  statefulIO(0).flatMap {
    (stateful: cats.mtl.Stateful[IO, Int]) =>
      stateful.set(1).race(IO.never) >>
      stateful.get.flatMap(IO.println)
  }

In so doing, it puts @iRevive's original stateful design back on the table, at the expense of Kleisli et al.

Performance is probably hot trash, and there are probably dragons with Resource#allocated and releasing in an order abominable unto the acquisitions. But it passes the interruptedScope test!

@armanbilge
Copy link
Member

Huh, does this actually work? I never understood how it's supposed to work with Ref. What happens if I do something like this:

trace.span("outer") {
  IO.both(
    trace.span("inner-left")(...),
    trace.span("inner-right")(...)
  )
}

@rossabaker
Copy link
Member Author

A mess happens. An absolute mess.

@rossabaker
Copy link
Member Author

Yeah, my implementation always calls .get on the IOLocal, such that it might as well just be a Ref.

@rossabaker
Copy link
Member Author

This old FiberRef is not far off what I did here, but it adds a locally method. To use that, we're back to passing the spanned effect instead of exposing resources, and I'm too tired to think of how we're any better off than we are with the "local semantics".

@armanbilge
Copy link
Member

armanbilge commented Feb 3, 2023

typelevel/cats-effect#3100 (comment) suggests that this "FiberRef" construct needs some sort of explicit fork operator.

So maybe my example would be:

trace.span("outer") {
  IO.both(
    trace.fork *> trace.span("inner-left")(...),
    trace.fork *> trace.span("inner-right")(...)
  )
}

Still, I'd worry this is brittle in practice—to fork or not to fork? Non-trace-aware libraries wouldn't be able to do this, so it would be responsibility of traced callers to apply forking at all the right places.

@rossabaker
Copy link
Member Author

5a7a771 was a mad dash to compilation before I go to my next appointment, but it does the right thing on @armanbilge's challenge.

@armanbilge
Copy link
Member

Ah, so the idea is we should fork every time we start a new span? 💡 😃

Comment on lines 86 to 90
private def createScope(scope: Scope): Resource[F, Unit] =
Resource(
local.modify(oldRef =>
(Ref.unsafe[F, Scope](scope), () -> local.set(oldRef).to[F])).to[F]
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, this one's cheating. It fails the interruptScope test. It's pretty, though.

Comment on lines +90 to +92
_ <- Resource
.make(local.set(newRef).to[F])(_ => local.set(oldRef).to[F])
ref <- Resource.eval(local.get.to[F])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref is not guaranteed to be newRef!

@rossabaker
Copy link
Member Author

As discussed in typelevel/cats-effect#3405, there are things that can go wrong with allocated, when a library uses it under the covers. Even though this guards against the primary way FS2 uses it (or, pedantically, pattern matches against it), it is still trivial to corrupt the state with it (not compiled, I'm in a hurry):

for {
  a <- createScope(...).allocated
  b <- createScope(...).allocated
  _ <- a._2 // sets back to original scope
  _ <- b._2 // sets back to b's parent's scope, a
} yield ()

The stateful scopes are always going to inhabit a minefield that the local scopes do not. As much as I prefer this API, I'm about ready to concede.

@bpholt
Copy link
Member

bpholt commented Feb 6, 2023

The stateful scopes are always going to inhabit a minefield that the local scopes do not. As much as I prefer this API, I'm about ready to concede

Is there an updated overview anywhere of what you mean by local vs stateful scopes? I remember a table from some old Natchez issues but it would be helpful to me to have something more recent (and that incorporates everything you've learned since then).

@rossabaker
Copy link
Member Author

I'll riff on an old table.

Local Stateful
acquire/release under resource
use under resource 🚫 1
Supports IOLocal
Supports Kleisli 🚫 2
Supports Ref 🚫 ⚠️ 3
Supports ThreadLocal 🚫 ⚠️ 3
span[A](name: String)(fa: F[A]): F[A] ⚠️ 3
span(name: String): Resource[F, Unit] 🚫 ⚠️ 4
resource[A](name: String)(r: Resource[F, A]): Resource[F, A] ⚠️ 4
stream[A](name: String)(r: Stream[F, A]): Stream[F, A] 5
stream(name: String): Stream[F, Unit] 🚫 5

Footnotes

  1. wrapResource simulates this, but it doesn't return an actual Resource

  2. can probably transform it to Kleisli, but the environment is not the span context

  3. this will compile but do weird shit under concurrency 2 3

  4. this will compile but do weird shit with .allocated and FS2 2

  5. I haven't tried to break these yet but I'm pretty sure weird shit is possible here, too 2

@armanbilge
Copy link
Member

Supports ThreadLocal

I bet we can get Local to support ThreadLocal. The nice thing about Local is that the underlying implementation is always in terms of tracing F[A], so it is a lot easier to reason about.

@bpholt asked about this in typelevel/natchez#706 as well. Can someone point me at something that relies on ThreadLocals so I can see how it works?

Basically I imagine you want to transform foreignCode: F[A] to something like dumpLocal *> foreignCode <* retrieveLocal. There's some trickyness with thread scheduling though which is why I'd have to see some specific use-cases.

@rossabaker
Copy link
Member Author

ThreadLocal is potentially very useful for integration with synchronous Java libraries, but it's going to behave weirdly unless you can guarantee foreignCode is synchronous. It's going to be like fs2 and its hidden fiber races.

@armanbilge
Copy link
Member

I mean, if foreignCode is async, how does foreign library expect you to use thread locals in the first place? If we control the EC, then we can do tricky stuff to dump/retrieve thread locals before/after each runnable. It's all possible, just depends on the specifics.

@rossabaker
Copy link
Member Author

I'm not sure what we're disagreeing about. You can make an implementation of Local[ThreadLocal, E], but it won't work for arbitrary F[A]. It should work just fine for F[A] that doesn't hop threads.

@armanbilge
Copy link
Member

armanbilge commented Feb 6, 2023

I think, my claim is that you can make an implementation, even with the thread-hopping :) but this is better discussed/prototyped with a concrete foreign lib.

@rossabaker
Copy link
Member Author

I don't have a library handy but I have a stupid effect. It's a lawful local in that it won't be observed outside the scope, but it's a footgun in that it won't be observed inside the evalOn.

    implicit def threadLocalLocal[E](threadLocal: ThreadLocal[E]) = new Local[IO, E] {
      val applicative = Applicative[IO]
      def ask[E2 >: E] = IO(threadLocal.get())
      def local[A](fa: IO[A])(f: E => E): IO[A] = {
        IO {
          val p = threadLocal.get()
          threadLocal.set(f(p))
          p
        }.bracket(_ => fa)(p => IO(threadLocal.set(p)))
      }
    }

    IO {
      val tl = new ThreadLocal[Int]()
      tl.set(0)
      tl
    }.map(threadLocalLocal).flatMap { implicit local =>
      (
        local.scope(local.ask[Int])(1),
        local.scope(local.ask[Int].evalOn(scala.concurrent.ExecutionContext.global))(2)
      ).tupled
    }

@armanbilge
Copy link
Member

Yeah, ok, you can't make that work with vanilla IO—you'd need a newtype with a custom Async that does thread-local propagation at the evalOn.

But I think that's the wrong approach. I think what we should do is use an ordinary Local (either Kleisli or IOLocal) and whenever you need to do foreign interop, to ask the span and "load" it into a ThreadLocal right before calling the foreign APIs.

@rossabaker
Copy link
Member Author

The ThreadLocal is interesting, this branch no longer is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants