Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add SplitByTypes observables #116

Merged
merged 13 commits into from
Nov 9, 2024
Merged

Add SplitByTypes observables #116

merged 13 commits into from
Nov 9, 2024

Conversation

HollandDM
Copy link
Contributor

@HollandDM HollandDM commented Jan 27, 2024

As discussed in #103, this is the implementation of SplitByType macro for observable.
The macro will expand into match/case definition, so it can takes advantage of exhaustive and reachable checking of the compiler.

@HollandDM HollandDM requested a review from raquo as a code owner January 27, 2024 05:10
@ngbinh
Copy link
Contributor

ngbinh commented May 29, 2024

@kitlangton @raquo is there anything left to do with this PR? I think it is quite useful

@raquo
Copy link
Owner

raquo commented May 29, 2024

It's definitely very useful, thanks for your work on this! I am yet to properly review this, but on the high level it looks good.

Sorry for such a delay, this was just too much for me to include in v17. I plan to merge this when I start working on the next Laminar release, likely around September-October. Since all this ties into Airstream with extension methods, I assume that you can use this code privately for now.

@ngbinh
Copy link
Contributor

ngbinh commented May 30, 2024

Awesome! Thanks @raquo

@HollandDM HollandDM force-pushed the split-by-type branch 2 times, most recently from e67afbc to ee67837 Compare May 30, 2024 16:05
@HollandDM HollandDM requested review from raquo and kitlangton May 31, 2024 03:26
@ngbinh
Copy link
Contributor

ngbinh commented Sep 13, 2024

@raquo https://github.com/felher/laminouter/#how-to-get-rid-of-the-asinstanceof is an example of why this PR can be useful. Can you please take another look? Thanks

@raquo
Copy link
Owner

raquo commented Sep 13, 2024

@ngbinh Sorry for such a long delay on this, it's definitely a very useful feature, the blame is entirely on my availability and schedule. I feel bad for letting such a good contribution go unmerged for so long. The reason is, I'm currently working overdrive on an unrelated project, and won't be able to spend a significant time on OSS until late October, maybe even early November. I'm itching very hard to publish this and other new features once I'm able to dedicate the time. Merging this PR will be one of my first priorities when I get back into it.

handleTypeImpl[Self, I, O, T]('{ matchSplitObservable }, '{ casePf })
}

inline def handleType[T]: MatchTypeObservable[Self, I, O, T] = handlePfType[T] { case t: T => t }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I need to take this "detour" is that, originally this should be handleType[T][O1 >:O](inline handleFn: ((T, Signal[T])) => O1). O1>:O here is very important because it helps refining the output type from the ground "Nothing" up to the output type we want. Sadly Scala 3 right now can only do handleType[T, O1 >:O], and this force user to explicitly named the output type of the handleFn function.
Until SIP-47 is widely adopted, we will have to stick into this detour

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that calling handleType[Foo[A]] is equivalent to the user writing handleType { case f: Foo[A @unchecked] => f } – that is, the Foo will be checked, but A will be unchecked? That's fine and that's what I would expect, given type erasure, just wanted to double check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I believe so, the end result of this macro will create a simple match/cases code, before any transformation happens. So it should follow the normal behavior of scala

@HollandDM HollandDM requested a review from raquo September 16, 2024 09:41
Comment on lines 32 to 37
val signal = myVar.signal
.splitMatch
.handleCase {
case Bar(Some(str)) => str
case Bar(None) => "null"
} { case (str, strSignal) =>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that the public API requires the user to provide a function expecting a single argument – a tuple of (A, Signal[A]) – does it have to be like this, instead of accepting two arguments, A and Signal[A]?

The single tuple argument basically requires the end user to provide a partial function using case, where a total function is expected. If the function accepted two arguments, we could write the same but without the second case:

      .handleCase {
        case Bar(Some(str)) => str
        case Bar(None) => "null"
      } { (str, strSignal) =>
        ...
      }

Copy link
Contributor Author

@HollandDM HollandDM Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be changed, it's just an old code from the initial implementations that slips through

@raquo
Copy link
Owner

raquo commented Sep 24, 2024

If it's not too much to ask, could you please add another helper – handleValue[A, B](value: A)(result: B) = handleCase { case a if a == value => a } { _ => result } (EDIT - see correction below). This would help transition Waypoint to use these new methods.

* ```
*/

opaque type MatchValueObservable[Self[+_] <: Observable[_], I, O, V0, V1] = Unit
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V0 and V1 should never be different, but we cannot just do MatchValueObservable[Self[+_] <: Observable[_], I, O, V] because after type erasure, compiler cannot differentiates MatchValueObservable[Self[+_] <: Observable[_], I, O, V] from MatchTypeObservable[Self[+_] <: Observable[_], I, O, T].

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH the implications of this are over my head, but it sounds like it's just an unfortunate implementation detail that we have to deal with. In that case, fine by me, I certainly don't have better ideas on this.

I think we're all good now, thanks for all the updates!

I am super pumped for these new operators, this will be one of the first things that I will merge for 18.x, proooobably in a few weeks (but apologies in advance if my 18.x gets delayed a bit, honestly, it's possible, but I'll do my best to carve out some time for it asap).

Copy link
Contributor Author

@HollandDM HollandDM Sep 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ambiguity happens because of the opaque types. I changed them back to value classes to remove this implementation hack. It should be easier to read and understand now.

handleValueApplyImpl('{ matchValueObservable }, '{ handleFn })
}

inline def apply[O1 >: O](inline handleFn: Signal[V1] => O1): MatchSplitObservable[Self, I, O1] = deglate { (_, vSignal) => handleFn(vSignal) }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

difference in signature (Signal[V1] => O1 here) means we need another (quite redundant IMHO) dummy class and dummy "detour".

@HollandDM HollandDM requested review from raquo September 27, 2024 11:11
@HollandDM
Copy link
Contributor Author

HollandDM commented Sep 30, 2024

@raquo should we have another variant for Seq[Bar].
In my code, I use to create something like splitEithers, which is kind of like a split & splitEither hybrid, it split a Seq[Either[A, B]] in to Seq[render(Signal[A]) | render(Signal[B])], and use it quite alot.
But if raquo/Laminar#157 can be resolved, I think we can have some workarounds in this case

@HollandDM
Copy link
Contributor Author

Also, I see that Var can be splitted now, should we also support Var[Bar] too?

@raquo
Copy link
Owner

raquo commented Sep 30, 2024

should we also support Var[Bar] too?

@HollandDM Hmmm yeah, if it's actually possible, it would be nice...

trait Bar
case class Foo(int: Int) extends Bar
case obect Baz extends Bar

To mirror the normal Var.split API, ideally we would want to provide an API like the following, I guess?

val barVar: Var[Bar] = ???
barBar
  .matchSplit
  .handleCase { case Foo(num) => num } { (num: Int, numVar: Var[Int]) => ... }
  .handleValue(Baz) { (bazVar: Var[Baz]) => ... }

So then, we would need to e.g. create such a Var[Int] that would write Foo(int) into barVar – but I don't think that possible to automatically create such a Var, just from case Foo(num) => num – which is an arbitrary function.

So then, it looks like we would need to add another parameter, e.g.:

barBar
  .matchSplit
  .handleCase { case Foo(num) => num } { num => Foo(num) } { (num: Int, numVar: Var[Int]) => ... }
  .handleValue(Baz) { (bazVar: Var[Baz]) => ... }

Which is a bit verbose, but ultimately fine – it would be similar to how we have two params in the zoom method.

@raquo
Copy link
Owner

raquo commented Sep 30, 2024

PR for such a splitEithers method would be welcome. I guess if the user provides e.g. _.id as the splitting key, you would need to use v => (v.id, v.isRight) as the real key under the hood, and then call either the left or the right callback depending on isRight.

I do plan to resolve #157 eventually, although I'm not sure how it relates to splitEithers.

@HollandDM
Copy link
Contributor Author

HollandDM commented Oct 1, 2024

For Seq.splitMatch and Var.splitMatchOne , I believe we can support them. I spent some time last weekend to play around the macros and quite confident in the results. The only remain things are the signatures of these methods.

In Seq.splitMatch (I think I'll change the current splitMatch into splitMatchOne, as its closer to it underlying mechanism), I think we could go with:

  • handleCase { case A => B } { B => Key } { (B, Signal[B]) => O } or handleCase { case A => B }(key: B => Key, project: (B, Signal[B]) => O).
  • handleType[T](T => Key)((T, Signal[T]) => O) (looked pretty similar with the current split signature) or handleType[T](key: T => Key, project: (T, Signal[T]) => O).
  • handleValue(Bar)(=> Key)(Signal[Bar] => O) or handleValue(Bar)(key: => Key, project: Signal[T] => O).

For Var.splitMatch, in the codebase right now we only support split, not splitOne, so the signature would be more verbose if we introduce them. In case you decided to support splitOne for Var, I think (A => B)(B => A) is unavoidable.

Also, I think we will need a way to let users pass their distinctCompose and DuplicateKeysConfig into these methods. What do you think?

I do plan to resolve #157 eventually, although I'm not sure how it relates to splitEithers.

This is more of a Laminar only solution, as you can do:

children <-- eitherSeq.split(eitherToKey) { (_, _, eitherSignal) =>
   child <-- eitherSignal.splitEither(
       left = ???,
       right = ???
   )
}

@raquo
Copy link
Owner

raquo commented Oct 2, 2024

@HollandDM So uh this is a lot of stuff at once... sorry if I mixed anything up.

This new splitMatch that you're proposing (let's call it splitMatchSeq to differentiate from the current splitMatch aka splitMatchOne in discussions) would basically be a more generalized version of splitEithers, it would be to splitMatch what splitEithers would be to splitEither, and split to splitOne, ok. I like the idea.

Let's ignore the Var splitting component to this for now – one thing at a time.

How to define split keys in splitMatchSeq? One option is to define it as a single "global" config when calling splitMatchSeq:

val signalOfFoos: Signal[List[Foo]] = ???

signalOfFoos
  .splitMatchSeq(
    key = {
      case Foo(num) => num
      case Foo2(idStr) => idStr
      case Bar => ()
    },
    distinctCompose = ???,
    duplicateKeys = ???
  )
  .handleCase { case Foo(num) => num } { (num: Int, numSignal: Signal[Int]) => ... }
  .handleType[Foo2] { (foo2: Foo2, foo2Signal: Signal[Foo2])  => ... }
  .handleValue(Bar) { (barSignal: Signal[Bar.type]) => ... ]

This introduces some redundancy between the definition of key and the handleCase-s, however this simplifies the API of handleCase and other helpers, and lets us expand this syntax for Var-s as well (where we'll need to provide another B => A zoomOut callback). I think adding key callbacks to each handleCase would simply be too unwieldy – too many callbacks in one method invocation.

Perhaps the shared key definition could be simplified if the common Foo type had a property like id that all subtypes shared.

The real key used by split under the hood would need to be a tuple of (interrnalMacroCaseNumber, userKey). Thus, userKey could be Any, the keys for different cases don't need to be of the same type, they are simply compared with == internally.

And so, another downside of defining the keys separately like this is that we don't get the key in each render callback. But we do get the initial value (num or foo2 in my example), and the key can usually be trivially derived from that (often it's just initialValue.id or similar).

OTOH I think all of the above can be implemented just in terms of split similarly to how you implemented splitMatch in terms of splitOne. But if you wanted to e.g. provide a separatedistinctCompose to each handleCase, I don't know if that would be possible, it may require changes to the source of SplitSignal, which would be unpleasant.

Overall I think I still to prefer this syntax over specifying the key in every case. What do you think, would it fit your use cases? Am I missing some more ergonomic concerns?


Regarding the naming, I think splitMatchSeq is a better name for this, even if it does not align with the name of the split operator. Me naming that method split was a mistake, in hindsight it should have been splitSeq, then we could have named splitOne as simply split, and the names of all the other helpers like splitOption, splitMatch, etc. would also be consistent. I think eventually I'll rename split to splitSeq, but for now, I don't want to break the code in the videos that mention the split operator, so it's ok to name these new operators inconsistently with split / splitOne (but consistently with other split* helpers).

@HollandDM
Copy link
Contributor Author

using a single "global" function make can make the keys out of sync with the handles branch. E.g:

signalOfFoos
  .splitMatchSeq(
    key = {
      case Foo(num) if num > 0 => num
      case Foo2(idStr) => idStr
      case _ => ()
    },
    distinctCompose = ???,
    duplicateKeys = ???
  )
  .handleCase { case Foo(num) => num } { (num: Int, numSignal: Signal[Int]) => ... }
  .handleType[Foo2] { (foo2: Foo2, foo2Signal: Signal[Foo2])  => ... }
  .handleValue(Bar) { (barSignal: Signal[Bar.type]) => ... ]

This is a valid syntax, but keys and handleCases are out of sync.

Perhaps we should just separate the key definition here from the handle* calls, then internal macros can use the key (interrnalMacroCaseNumber, userKey) like you said, but with a known type for userKey, and we can also put the key in the handle callback as well.

For me personally, beside my custom splitEithers, I haven't find any other need for this splitMatchSeq yet, so my use cases is quite limited. I guess we kinda have to decide the signatures first and then see if it works well enough.

@raquo
Copy link
Owner

raquo commented Oct 2, 2024

This is a valid syntax, but keys and handleCases are out of sync.

I see that there is room for this, but in practice the keys will depend on the type only, so there should be no value-based conditions like if num > 0 even if they're present in the handleCase-s. And of course, using case _ => when the only remaining option is known is IMO a bad practice that bypasses exhaustiveness checks, that one is really on the user. I think Scala's built-in exhaustiveness check here provides enough of a safeguard to eliminate the vast majority of bugs.

I can imagine very complex cases where you would need to pay attention to get the keys right, but realistically, those would be very rare, and I'm ok to take the tradeoff of putting an ergonomic burden on those rare use cases to get a simpler API for the 99% of typical uses, because we don't have a perfect alternative that offers clean syntax and good ergonomics for the typical use cases, especially when we consider the eventual support for splitting Var-s with this operator.

but with a known type for userKey, and we can also put the key in the handle callback as well.

A shared type for userKey would only be useful in a narrow set of cases, but e.g. for an Either, it could easily end up Int | String, or more realistically it would resolve to Any without explicit type ascriptions. Ideally we would want key types to be specific to each handleCase, but that is not possible without moving key callbacks into individual handleCase calls.

I think in practice we can live without providing the key to the handle callback. Usually the key would be something obvious like _.id, so it's not a big deal just to get it from the initial value provided by the callback.

@HollandDM
Copy link
Contributor Author

HollandDM commented Oct 3, 2024

A shared type for userKey would only be useful in a narrow set of cases, but e.g. for an Either, it could easily end up Int | String, or more realistically it would resolve to Any without explicit type ascriptions. Ideally we would want key types to be specific to each handleCase, but that is not possible without moving key callbacks into individual handleCase calls.

I think Its quite impossible for key to be specific for each handle* method if we use this syntax, that why I suggest we don't associate key with handle* at all. For a M[I] input type, we make key to be I => K, and proceed with the handle* methods like this:

signalOfFoos
  .splitMatchSeq(
    key = Foo => K,
    distinctCompose = ???,
    duplicateKeys = ???
  )
  .handleCase { case Foo(num) => num } { (num: Int, numSignal: Signal[Int]) => ... }
  .handleType[Foo2] { (foo2: Foo2, foo2Signal: Signal[Foo2])  => ... }
  .handleValue(Bar) { (barSignal: Signal[Bar.type]) => ... ]

In the above snippet, key doesn't concern itself with what handle*s we're using.
I'm not too concern with initial key being provided or not. I just want to know if keys need to follow handle*s definitions or not.

@raquo
Copy link
Owner

raquo commented Oct 3, 2024

Ok, yes, I think we're on the same page.

If I understand correctly – no, the user-provided keys don't need to follow handleCase-s. Any Foo => K function will work, e.g. if Foo has a common id field, the key function could just be _.id. The only constraint on the keys is as always – each item in the Seq must have a unique key, but here, this constraint is even relaxed a bit – each item in the Seq must have a key that is unique only within the subset of items in the Seq that are rendered with same handleCase – because the real key used under the hood will have a macro-generated handleCaseIndex in it: (handleCaseIndex, userProvidedKey). Is that what you were asking?

Also, just wanted to mention that since the type K in Foo => K won't be exposed anywhere, it could just be Any, no need for a type param to encode it.

@HollandDM
Copy link
Contributor Author

Yeah, I just want to clarify the key point before committing, I see that we agree the key could be any function of type Foo => K.
I'll update the ticket to include splitMatchSeq implementations and testings first (ideally I would follow SplitSpec).
About Var, I think we can continue after splitMatchSeq is good to merge.

@raquo
Copy link
Owner

raquo commented Oct 3, 2024

Sounds good, thank you for all this 🙏

@HollandDM
Copy link
Contributor Author

HollandDM commented Oct 4, 2024

@raquo splitMatchSeq is pushed, please have a look.
I also changed the old splitMatch to splitMatchOne for user to distinguish them better.
The spec of splitMatchSeq follow SplitSignalSpec style, but I'm not copy/paste all of them because of redundancy, so please let me know if you want to include a specific test.

@HollandDM HollandDM changed the title Add SplitByType observables Add SplitByTypes observables Oct 4, 2024
@ngbinh
Copy link
Contributor

ngbinh commented Nov 5, 2024

@raquo is there anything else left to do for this PR? Thanks

Copy link
Owner

@raquo raquo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All is good, thanks! Super stoked to have this feature. I'm finally able to slowly start work on the next version, so I'll be merging this soon.

@raquo raquo merged commit b0fa6a1 into raquo:master Nov 9, 2024
@raquo
Copy link
Owner

raquo commented Nov 9, 2024

@HollandDM Sorry I was too eager to merge this, but now I see there are some issues yet remaining.

Note – I did not expect to need more changes, so I've squash-merged the PR, as I usually do to clean up git history, but because of that you'll need to force rebase on Airstream's master to reconnect to that history. Sorry for the inconvenience :|

Using too many cases at once slows down the compiler, eventually resulting in an OOO crash

To reproduce, add this test to SplitMatchOneSpec:

it("xxx") {
    val myVar = Var[Foo](Bar(Some("initial")))
    val signal = myVar.signal
      .splitMatchOne
      .handleCase { case Bar(Some(str)) => str } { (str, strSignal) => () }
      .handleCase { case Bar(Some(str)) => str } { (str, strSignal) => () }
      // !!! TODO: To crash the compiler, insert the same handleCase line 20-40 times here
      .handleCase { case Bar(Some(str)) => str } { (str, strSignal) => () }
      .handleCase { case Bar(Some(str)) => str } { (str, strSignal) => () }
      .toSignal
  }

For me it usually takes 10-20 seconds for the compiler to crash. Depends on CPU and heap I guess. If it doesn't crash for you, you may need to add more lines. It will still be slow though, and it may give different warnings:

In the last 10 seconds, 5.333 (53.8%) were spent in GC. [Heap: 0.00GB free of 1.00GB, max 1.00GB] Consider increasing the JVM heap using -Xmxor try a different collector, e.g.-XX:+UseG1GC, for better performance.

scala.tools.asm.MethodTooLargeException: Method too large: com/raquo/app/views$package$.<clinit> ()V

I ran into this issue when trying to convert these waypoint cases in laminar-demo app.

The new Airstream tests compile with Scala 3.3.3, but not with 3.3.4 or higher.

  • I get lots of "Macro expansion failed, please use handleCase instead" errors.

  • Also, Scala 3.3.4 starts throwing these warnings:

[warn] -- [E197] Potential Issue Warning: /Users/raquo/code/scala/airstream/src/main/scala-3/com/raquo/airstream/split/SplitMatchSeqMacros.scala:49:148
[warn] 49 |    inline def handleValue[V](inline v: V)(using inline valueOf: ValueOf[V]): SplitMatchSeqValueObservable[Self, I, K, O, CC, V] = handlePfValue[V] { case _: V => v }
[warn]    |                                                                                                                                                    ^
[warn]    |   New anonymous class definition will be duplicated at each inline site
[warn]    |----------------------------------------------------------------------------
[warn]    | Explanation (enabled by `-explain`)
[warn]    |- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
[warn]    | Anonymous class will be defined at each use site, which may lead to a larger number of classfiles.
[warn]    |
[warn]    | To inline class definitions, you may provide an explicit class name to avoid this warning.
[warn]     ----------------------------------------------------------------------------
[warn] four warnings found
  • The warnings were introduced here: Warn or optimize when inline given inlines anonymous class creation scala/scala3#16723
  • The Scala issue contemplates fixes like this, but if I understand correctly, I think the compiler is complaining that { case _: V => v } will create a new anonymous class every time the user calls handleValue(value). So I'm kind of not sure if this is a real problem in our case. Is there a way to see the scala code generated by macros, to confirm that it's generating the code we expect?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants