Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve jsoniter encoding performance #2181

Merged
merged 5 commits into from
Apr 8, 2024

Conversation

kyri-petrou
Copy link
Collaborator

TIL that the number of cases in when pattern-matching affects performance in an interesting way:

  1. When the number of cases is relatively low (~less than 10), the cost of pattern-matching is relatively constant.
  2. When using type-checking instead of unapply methods, we can increase the number of cases a bit while keeping the cost constant
  3. When the number of cases increases above a threshold (which is not always constant!), then the cost of pattern-matching increases by a fair bit

I'm not sure exactly why this happens, but my guess is that the JVM manages to JIT/inline the pattern matching when the number of cases is relatively low.

Unfortunately, the only way to keep the number of cases below this threshold (12 type-checked cases in our case) was to remove the type-checking on StreamValue and use the .toString method. This should work OK as long as we don't need to extend ResponseValue with another kind of value, and forget to add it to the pattern matching. Given that the chances of adding another kind of ResponseValue type are very very small, I think this approach should be okay

With these changes, we see ~20% improvement in throughput when encoding ResponseValues 🎉 🚀

PS: I also noticed that we were parsing floats into a BigDecimalNumber for all floats, so I changed it to first try and parse them into a Double which is more memory efficient, and fallback to BigDecimalNumber only in case of an error

@@ -5,6 +5,7 @@ import caliban._
import caliban.parsing.adt.LocationInfo
import com.github.plokhotnyuk.jsoniter_scala.core._

import scala.annotation.nowarn
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What makes a warning?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, dev leftover

case v: FloatValue.FloatNumber => out.writeVal(v.value)
case v: FloatValue.DoubleNumber => out.writeVal(v.value)
case v: FloatValue.BigDecimalNumber => out.writeVal(v.value)
case v => out.writeVal(v.toString)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to bite us some day 🥲
Couldn't we group the remaining IntValue (and FloatValue) into one case and do a second pattern matching?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the same thing, but then I thought it's very unlikely we'll add more ResponseValue types. But the more I think of it, I think you're right, better be safe than sorry.

Copy link
Collaborator Author

@kyri-petrou kyri-petrou Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a couple of different ways in order to fully preserve type-safety, and the only feasible way to do it was to have a "full encoder" to fallback to. Encoding non-common types won't be too great performance-wise, but at least we get really good performance when for common types, and we get to preserve type-safety

Copy link
Owner

@ghostdogpr ghostdogpr Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the issue with

case v: IntValue    => some method that only pattern match on IntValue
case v: FloatValue  => some method that only pattern match on FloatValue

at the end and no case _ =>?
We can still keep IntNumber at the top.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the issue with IntValue is that both IntNumber and LongNumber are commonly used (LongNumber is used a fair bit in some wrappers like ApolloTracing), so handling it generically doesn't make sense cause the only case left is BigIntNumber.

I ended up handling only FloatValue generically, while leaving DoubleNumber in the main pattern match cause I think that's something that's somewhat commonly used. Running the benchmarks shows that we're still OK with the number of cases that we have this way

case v: IntValue.LongNumber => out.writeVal(v.value)
case v: IntValue.BigIntNumber => out.writeVal(v.value)
case v: FloatValue.FloatNumber => out.writeVal(v.value)
case v: FloatValue.DoubleNumber => out.writeVal(v.value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn’t we put the cases that are not included in the fast encoder first? Or does it not make a difference to the compiler?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up only handling FloatValue in a separate method, seems that the number of cases is still low enough to get the ~20% improvement

@kyri-petrou kyri-petrou merged commit 79e363a into series/2.x Apr 8, 2024
10 checks passed
@kyri-petrou kyri-petrou deleted the improve-jsoniter-encoding branch April 8, 2024 01:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants