v1.11 generates 50% larger cache files #53570

Open
joa-quim opened this issue Mar 3, 2024 · 14 comments
Labels
performance Must go faster regression Regression in behavior compared to a previous version

Comments

@joa-quim

joa-quim commented Mar 3, 2024

The compile time is quite variable, but the differences are comparable to this.

  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 48 seconds. 87 already precompiled.
 49.657821 seconds (4.51 M allocations: 328.234 MiB, 0.21% gc time, 1.84% compilation time)

  | | |_| | | | (_| |  |  Version 1.11.0-alpha1 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 60 seconds. 112 already precompiled.
 61.731323 seconds (4.22 M allocations: 263.147 MiB, 0.35% gc time, 1.69% compilation time: 15% of which was recompilation)

v1.11 cache file -> ~89.5 MB
v1.10 cache file -> ~59.5 MB
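For anyone wanting to reproduce the size comparison, here is a minimal sketch. It assumes the cache files live under `<depot>/compiled/vX.Y/<Package>/`, which matches the paths shown later in this thread. The helper names `cache_file_sizes` and `percent_larger` are hypothetical and not part of Julia or any package.

```julia
# Hypothetical helper (not from the thread): byte sizes of one package's cache files.
# Assumes the <depot>/compiled/vMAJOR.MINOR/<pkg>/ layout.
function cache_file_sizes(pkg::AbstractString;
                          depot::AbstractString = first(DEPOT_PATH),
                          version::VersionNumber = VERSION)
    dir = joinpath(depot, "compiled", "v$(version.major).$(version.minor)", pkg)
    isdir(dir) || return Pair{String,Int64}[]
    return Pair{String,Int64}[f => filesize(joinpath(dir, f)) for f in readdir(dir)]
end

# Relative growth of `new` over `old`, in percent.
percent_larger(new, old) = 100 * (new - old) / old

# Sizes reported above: ~89.5 MB on v1.11 versus ~59.5 MB on v1.10.
println(round(percent_larger(89.5, 59.5); digits = 1), "% larger")
```

Calling `cache_file_sizes("GMT")` in each Julia version should list both the `.ji` file and the native package image, so the two can be compared separately.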

Load times

  | | |_| | | | (_| |  |  Version 1.10.2 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time_imports using GMT
               ┌ 2.7 ms SuiteSparse_jll.__init__()
     25.6 ms  SuiteSparse_jll 85.64% compilation time
               ┌ 5.0 ms SparseArrays.CHOLMOD.__init__() 98.93% compilation time
    123.1 ms  SparseArrays 3.99% compilation time
      0.7 ms  Statistics
      0.2 ms  DataValueInterfaces
      0.6 ms  DataAPI
      0.2 ms  IteratorInterfaceExtensions
      0.2 ms  TableTraits
      6.3 ms  Tables
      0.2 ms  Reexport
     12.1 ms  Preferences
      0.3 ms  PrecompileTools
      5.7 ms  StringManipulation
     10.3 ms  Crayons
      0.6 ms  LaTeXStrings
     63.7 ms  PrettyTables
               ┌ 17.4 ms GMT.Gdal.__init__()
               ├ 16.6 ms GMT.__init__()
    319.5 ms  GMT

  | | |_| | | | (_| |  |  Version 1.11.0-alpha1 (2024-03-01)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> @time_imports using GMT
      0.7 ms  Statistics
      0.3 ms  DataValueInterfaces
      0.5 ms  DataAPI
      0.2 ms  IteratorInterfaceExtensions
      0.2 ms  TableTraits
      6.8 ms  Tables
      0.4 ms  Reexport
      8.9 ms  Preferences
      0.4 ms  PrecompileTools
      5.8 ms  StringManipulation
     11.7 ms  Crayons
      0.6 ms  LaTeXStrings
     69.4 ms  PrettyTables
               ┌ 18.5 ms GMT.Gdal.__init__()
               ├ 108.3 ms GMT.__init__() 88.15% compilation time (100% recompilation)
    814.4 ms  GMT 54.63% compilation time (59% recompilation)
@oscardssmith oscardssmith added performance Must go faster regression Regression in behavior compared to a previous version labels Mar 3, 2024
@KristofferC KristofferC added this to the 1.11 milestone Mar 3, 2024
@JeffBezanson
Member

JeffBezanson commented Mar 8, 2024

This looks like it is partly due to new invalidations in GMT.__init__() (and therefore possibly in other things too)?

@JeffBezanson
Member

Maybe similar to #53511 ?

@KristofferC
Member

Would be interesting to put back all stdlibs into the sysimage and re-time this to see if the effect is purely for moving out stdlibs or if there are other reasons as well.

@mkitti
Contributor

mkitti commented May 5, 2024

> Would be interesting to put back all stdlibs into the sysimage and re-time this to see if the effect is purely for moving out stdlibs or if there are other reasons as well.

We should consider building this and releasing it as an artifact. It would be useful at least for testing, and for some load-time-sensitive applications as well.

@jaakkor2
Contributor

jaakkor2 commented May 6, 2024

The worst offender in precompiled size that I have seen is https://github.com/Gnimuc/GLTF.jl. Its cache on v1.11.0-beta1 (211 MB) is about 5x bigger than on v1.10.3 (43 MB).

@KristofferC
Member

As you showed in quinnj/JSON3.jl#279, the issue there seems to be egregious use of `@inline` on large functions. It isn't obvious why that would change between 1.10 and 1.11, but one reason could be that we are better at precompiling now, so more of the functions (which are very big due to `@inline`) get saved to the image file.

@KristofferC
Member

The load-time difference seems to be more or less fixed on the 1.11 backport branch. This is using the master branch of GMT.jl:

julia> @time using GMT
  0.638436 seconds (668.47 k allocations: 51.051 MiB, 5.54% gc time, 2.08% compilation time)

julia> VERSION
v"1.10.3"
julia> @time using GMT
  0.657463 seconds (730.44 k allocations: 49.958 MiB, 1.27% gc time, 1.48% compilation time)

julia> VERSION
v"1.11.0-beta1.40"

@joa-quim
Author

joa-quim commented May 6, 2024

But the precompiled image is still ~50% larger.

@mkitti
Contributor

mkitti commented May 6, 2024

@KristofferC KristofferC changed the title from "v1.11 Takes longer to compile, generates 50% larger cache files and loads slower" to "v1.11 generates 50% larger cache files" May 6, 2024
@KristofferC KristofferC removed this from the 1.11 milestone May 6, 2024
@KristofferC KristofferC reopened this May 6, 2024
@joa-quim
Author

joa-quim commented May 6, 2024

> Try https://github.com/timholy/PkgCacheInspector.jl

I did some time ago and spent quite some effort trying to improve the situation (I have forgotten many of the details). But now I can't even use that package anymore.

  | | |_| | | | (_| |  |  Version 1.10.0 (2023-12-25)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> cd("c:/v"); @time using GMT
Precompiling GMT
  1 dependency successfully precompiled in 44 seconds. 87 already precompiled.
 46.397674 seconds (4.63 M allocations: 340.133 MiB, 0.28% gc time, 1.79% compilation time)

julia> using PkgCacheInspector

julia> info_cachefile("GMT")
ERROR: Error reading package image file.

  | | |_| | | | (_| |  |  Version 1.11.0-beta1 (2024-04-10)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> using PkgCacheInspector

julia> info_cachefile("GMT")
ERROR: MethodError: no method matching parse_cache_header(::IOStream)
The function `parse_cache_header` exists, but no method is defined for this combination of argument types.

But note that this cache size issue is common to other packages. For example, the Makie cache is 60% larger in 1.11 than in 1.10.

@fatteneder
Member

That last error is due to #49866, which changed the signature of an internal method used by PkgCacheInspector.jl. It should be straightforward to fix; I will make a PR later.

@KristofferC
Member

1.11:

Contents of /Users/kristoffercarlsson/.julia/compiled/v1.11/GMT/EoU0j_u25Qm.dylib:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  22730 new specializations of external methods (Base 80.0%, Base.Broadcast 14.5%, Base.Iterators 2.5%, ...)
  1371 external methods with new roots
  36342 external targets
  28454 edges
  file size:   92678656 (88.385 MiB)
  Segment sizes (bytes):
  system:      25084036 ( 29.27%)
  isbits:      57449436 ( 67.04%)
  symbols:       149474 (  0.17%)
  tags:          410413 (  0.48%)
  relocations:  2523569 (  2.94%)
  gvars:          44312 (  0.05%)
  fptrs:          33536 (  0.04%)

1.10:

Contents of /Users/kristoffercarlsson/.julia/compiled/v1.10/GMT/EoU0j_Vg0I0.dylib:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  40014 new specializations of external methods (Base 72.8%, Base.Broadcast 14.0%, GMT 5.9%, ...)
  1157 external methods with new roots
  31428 external targets
  24264 edges
  file size:   56215744 (53.612 MiB)
  Segment sizes (bytes):
  system:      18509732 ( 36.03%)
  isbits:      30370140 ( 59.12%)
  symbols:       150178 (  0.29%)
  tags:          271567 (  0.53%)
  relocations:  1986075 (  3.87%)
  gvars:          50072 (  0.10%)
  fptrs:          30080 (  0.06%)

I haven't done any more analysis than that. It seems a bit strange that 1.11 has far fewer specializations but still a larger file size.
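As a rough way to see where the extra bytes go, the two segment listings above can be diffed directly. This is only a sketch: the numbers are copied from the two reports, while `segment_growth` and the variable names are made up for illustration and are not part of PkgCacheInspector.jl.

```julia
# Segment sizes in bytes, copied from the two reports above.
v110 = (system = 18509732, isbits = 30370140, symbols = 150178, tags = 271567,
        relocations = 1986075, gvars = 50072, fptrs = 30080)
v111 = (system = 25084036, isbits = 57449436, symbols = 149474, tags = 410413,
        relocations = 2523569, gvars = 44312, fptrs = 33536)

# Hypothetical helper: per-segment growth, sorted by absolute byte increase.
function segment_growth(old::NamedTuple, new::NamedTuple)
    rows = [(name = k, bytes = new[k] - old[k],
             percent = 100 * (new[k] - old[k]) / old[k]) for k in keys(old)]
    return sort(rows; by = r -> r.bytes, rev = true)
end

for r in segment_growth(v110, v111)
    println(rpad(string(r.name), 12), lpad(string(r.bytes), 10), "  ",
            round(r.percent; digits = 1), "%")
end
```

Among the listed segments, `isbits` grows by far the most in absolute terms, followed by `system`. The listed segments do not add up to the full file size, so this only covers part of the difference.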

@fatteneder
Member

> Try https://github.com/timholy/PkgCacheInspector.jl
>
> I did some time ago and spent quite some effort trying to improve the situation (forgot many of the details). But now I can't even use that package anymore.

@joa-quim Please update PkgCacheInspector.jl to v1.0.1 and try again. JuliaRegistries/General#106291

@joa-quim
Author

joa-quim commented May 6, 2024

Thanks. It works now ... but only once. Anyway, I don't know how to use the info to understand what makes the cache file so big.

julia> using PkgCacheInspector

julia> x = info_cachefile("GMT")
Contents of C:\Users\j\.julia\compiled\v1.11\GMT\EoU0j_tYaDV.dll:
  modules: Any[GMT.Gdal, GMT.Drawing, GMT]
  init order: Any[GMT.Gdal, GMT]
  1503 external methods
  22890 new specializations of external methods (Base 80.4%, Base.Broadcast 14.0%, Base.Iterators 2.5%, ...)
  1384 external methods with new roots
  36769 external targets
  28713 edges
  file size:   102150656 (97.418 MiB)
  Segment sizes (bytes):
  system:      25201556 ( 29.22%)
  isbits:      57871532 ( 67.10%)
  symbols:       147304 (  0.17%)
  tags:          412695 (  0.48%)
  relocations:  2534014 (  2.94%)
  gvars:          44560 (  0.05%)
  fptrs:          33416 (  0.04%)


julia> x = info_cachefile("GMT")
ERROR: Error reading package image file.


7 participants