Slowdown in JSON3 from 1.8 to 1.9 #48229
I can confirm that running the original issue on JSON3 1.12 is fixed on master:

```julia
julia> @time json3_readlines(kprm_file_json)
  0.776450 seconds (3.40 M allocations: 759.053 MiB, 8.05% gc time, 1.24% compilation time)
```

but not on the current
(I could probably do the bisection process again if helpful; i.e. identify which commit introduced the regression and then which commit fixed it, if that would help in figuring out a backport solution here.)
I'll have time to take a look this weekend. Both a bisection and a subtyping MWE would be appreciated.
Here's the smallest repro I could come up with; going to try to do a bisection using it today:

```julia
using Pkg
Pkg.add(Pkg.PackageSpec(; name="JSON3", version="1.12.0"))
using JSON3

JSON3.read("[]"); # warmup
t = @elapsed JSON3.read("[]");
if t > 0.0001
    # bad
    exit(1)
else
    exit(0)
end
```
Bisect identifies 19f44b6 as the first bad commit.
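For anyone reproducing this, a bisection like the one above can be automated with `git bisect run`. This is a hedged sketch, not the exact commands used here: it assumes the repro script is saved as `repro.jl`, that we are inside a JuliaLang/julia checkout, and that v1.8.0 is a known-good starting point.

```shell
# Sketch only: drive the bisection automatically.
# The repro script exits 1 when slow (bad) and 0 when fast (good),
# which is exactly the contract `git bisect run` expects.
git bisect start
git bisect bad HEAD          # assumed bad revision (e.g. release-1.9 tip)
git bisect good v1.8.0       # assumed good revision
# Exit code 125 tells bisect to skip commits that fail to build.
git bisect run sh -c 'make -j8 || exit 125; ./julia repro.jl'
git bisect reset
```

The key design point is that the repro encodes "bad" purely in its exit code, so no manual judgment is needed at each step.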
Ooof, so this is really probably just a dup of #48612, where the answer is "won't fix" for 1.9?
Hmmm, hold up just a bit; it looks like the 633 commit helped, but on latest master I'm actually still seeing a fairly sizeable regression. Let me dig in a bit more.
The runtime sparam computation is still there on master. I ran some benchmarks locally, which show that the regression migrated due to some performance tuning on Line 868 in f61bbfb, but the cost of the subtyping that follows remains the same.
Thanks for the help investigating @N5N3; I'm a little unclear on what you're concluding here, though. Are you confirming what I mentioned, that this still seems regressed on master, even though it's slightly better than the current release-1.9? Also curious if you have thoughts on what to do now. It sounds like further bisection isn't really valuable; I could probably come up with a smaller reproduction if helpful. Anything else I could do, or anyone else we could ping for help here?
Yes.
Due to several 1.9 performance regressions that heavily impact JSON3, I decided to just dig in and see if we can fix it on our side, since the JuliaLang changes seem pretty involved (see issues [here](JuliaLang/julia#50762) and [here](JuliaLang/julia#48229) for context). From what I can tell, the issue really boils down to this one `getvalue` internal method on `JSON3.Array`, where it was having to do some expensive "sparams" computation to instantiate the `JSON3.Array` type. By manually unrolling this method a bit and fully specifying all the type parameters, this seems to resolve the performance issues. Benchmarks look like:

Julia 1.8:

```julia
julia> @benchmark JSON3.defaultminimum(x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  31.625 μs … 140.959 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     32.250 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   32.602 μs ±   1.562 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%
 31.6 μs    Histogram: frequency by time    36.6 μs <
 Memory estimate: 6.70 KiB, allocs estimate: 84.
```

Julia 1.9:

```julia
julia> @benchmark JSON3.defaultminimum(x)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  165.584 μs … 10.519 ms  ┊ GC (min … max): 0.00% … 94.46%
 Time  (median):     168.166 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   170.886 μs ± 103.652 μs ┊ GC (mean ± σ):  0.58% ± 0.94%
 166 μs    Histogram: frequency by time    185 μs <
 Memory estimate: 10.11 KiB, allocs estimate: 166.
```

This PR:

```julia
julia> @benchmark JSON3.defaultminimum(x)
BenchmarkTools.Trial: 10000 samples with 3 evaluations.
 Range (min … max):  8.431 μs … 497.806 μs  ┊ GC (min … max): 0.00% … 95.49%
 Time  (median):     8.820 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.157 μs ±   9.056 μs  ┊ GC (mean ± σ):  1.92% ± 1.91%
 8.43 μs    Histogram: log(frequency) by time    10.8 μs <
 Memory estimate: 6.20 KiB, allocs estimate: 72.
```
* Refactor internal getvalue for JSON3.Array to improve performance

* Remove print-related test that breaks post 1.9
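The "fully specify all the type parameters" idea from the PR description can be illustrated with a minimal, self-contained sketch. The names (`Wrap`, `build_generic`, `build_explicit`) are hypothetical and are not JSON3's actual code; the point is only the contrast between a call site where a type parameter must be solved from the argument and one where every parameter appears in the method signature.

```julia
# Hypothetical sketch, not JSON3's real definitions.
struct Wrap{T, S <: AbstractVector{UInt8}}
    buf::S
end

# S is reconstructed via typeof(buf) inside the body; when buf arrives
# abstractly typed, instantiating Wrap{T, typeof(buf)} can require
# runtime type-parameter ("sparam") computation plus a subtype check.
build_generic(::Type{T}, buf::AbstractVector{UInt8}) where {T} =
    Wrap{T, typeof(buf)}(buf)

# All parameters are pinned down in the signature, so each compiled
# specialization knows the concrete Wrap{T, S} statically.
build_explicit(::Type{T}, buf::S) where {T, S <: AbstractVector{UInt8}} =
    Wrap{T, S}(buf)

w = build_explicit(Int, codeunits("[]"))
@assert w isa Wrap{Int, Base.CodeUnits{UInt8, String}}
```

Both functions return the same value; the difference is only in how much type computation the compiler can hoist to specialization time.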
This is a write-up of the issue reported in https://twitter.com/mberesewicz/status/1612721251209908224.
To repro, add version 1.12.0 of JSON3:
Repro script:
On 1.9 this takes about:
while on 1.8 it takes:
Looking at the profile, on 1.9 we see a bunch of subtyping in the call to `read`, which seems to come from the constructor in https://github.com/quinnj/JSON3.jl/blob/faa521e37317826a38bd211898b512be592be5c7/src/JSON3.jl#L22. This is not present in 1.8:
An interesting thing is that the performance issue has been somewhat fixed on the main branch of JSON3 by commit quinnj/JSON3.jl@faa521e:
However, from what I can see, the difference this commit made is actually worse type inference: https://github.com/quinnj/JSON3.jl/blob/faa521e37317826a38bd211898b512be592be5c7/src/read.jl#L65 now gets inferred as (looking at the optimized IR):

1.12.0 (slow):

main (fast):

So in the fast case we only know that the second type parameter `S` is `<: AbstractVector{UInt8}`, but in the slow case we know that `S == Base.CodeUnits{UInt8, String}`. It seems bad here that more precise type information causes a bunch of subtyping that slows down the code.
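For readers who want to reproduce this kind of comparison, the inferred return type of a method can be inspected programmatically with `Base.code_typed` (or interactively with `@code_warntype` from InteractiveUtils). A small self-contained example with a hypothetical function `f`:

```julia
# Hypothetical example: check what the compiler infers for a call,
# analogous to comparing the optimized IR above.
f(s::String) = codeunits(s)

# code_typed returns (CodeInfo => return type) pairs, one per method match.
ci, rt = only(Base.code_typed(f, (String,)))

# Here inference recovers the fully concrete type, the "precise" case
# discussed above (S == Base.CodeUnits{UInt8, String}).
@assert rt == Base.CodeUnits{UInt8, String}
```

Running `@code_warntype f("[]")` in the REPL shows the same information with abstract or `Any`-typed slots highlighted, which is how imprecise inference is usually spotted.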