-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use a Dict instead of an IdDict for caching of the cwstring
for Windows env variables
#52758
Conversation
…dows env variables `IdDict` methods have a bunch of `nospecialize` annotations which specifically when using an `Array` value causes methods to be compiled that are very vulnerable to invalidations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike IdDict, the Dict code is not thread-safe for concurrent lookups, so will need to move those lookup access behind a lock
moving the accesses behind a lock could slow things down a bit right? Choosing dict types based just on opaque non-local invalidation effects seems like a kinda unhappy situation. I wonder if there's a way to keep the |
For my learning, what causes this unsafeness?
I agree, but the current situation is also unhappy.
The |
#52760 for example. |
What about going back to the non-Dict original form of #51371 and just store the cwstring for |
The lock performance should be fine. The function you call here (the environment variables lookup) also already always needed to take a lock internally on all platforms, so the added overhead should be slight. |
Lock-version does not compile, output I defined |
…dows env variables (#52758) Should fix #52711. My analysis of the invalidation is as follows: We added code to cache the conversion to `cwstring` in env handling on Windows (#51371): ```julia const env_dict = IdDict{String, Vector{UInt16}}() function memoized_env_lookup(str::AbstractString) ... env_dict[str] = cwstring(str) ... end function access_env(onError::Function, str::AbstractString) var = memoized_env_lookup(str) ... end ``` Since `IdDict` has `@nospecialize` on `setindex!` we compile this method: ```julia setindex!(::IdDict{String, Vector{UInt16}}, ::Any, ::Any) ``` which has an edge to: ```julia convert(Type{Vector{Int64}}, Any}) ``` But then StaticArrays comes along and adds a method ```julia convert(::Type{Array{T, N}}, ::StaticArray) ``` which invalidates the `setindex!` (due to the edge to `convert`) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular, the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays. There should be no performance penalty to this since strings already does a hash for their `objectid`. (cherry picked from commit b7c24ed)
Backported PRs: - [x] #51095 <!-- Fix edge cases where inexact conversions to UInt don't throw --> - [x] #52583 <!-- Don't access parent of triangular matrix in powm --> - [x] #52645 <!-- update --gcthreads section in command line options --> - [x] #52423 <!-- update nthreads info in versioninfo --> - [x] #52721 <!-- inference: Guard TypeVar special case against vararg --> - [x] #52637 <!-- fix finding bundled stdlibs even if they are e.g. devved in an environment higher in the load path --> - [x] #52752 <!-- staticdata: handle cycles in datatypes --> - [x] #52758 <!-- use a Dict instead of an IdDict for caching of the `cwstring` for Windows env variables --> - [x] #51375 <!-- Insert hardcoded backlinks to stdlib doc pages --> - [x] #52994 <!-- place work-stealing queue indices on different cache lines to avoid false-sharing --> - [x] #53015 <!-- Add type assertion in iterate for logicalindex --> - [x] #53032 <!-- Fix a list in GC devdocs --> - [x] #52748 - [x] #52856 - [x] #52878 - [x] #52754 - [x] #52228 - [x] #52924 - [x] #52569 <!-- Fix GC rooting during rehashing of iddict --> - [x] #52605 <!-- Default uplo in symmetric/hermitian --> - [x] #52618 <!-- heap snapshot: add gc roots and gc finalist roots to fix unrooted nodes --> - [x] #52781 <!-- fix type-stability bugs in Ryu code --> - [x] #53055 <!-- Profile: use full terminal cols to show function name --> - [x] #53096 - [x] #53076 - [x] #52841 <!-- Extensions: make loading of extensions independent of what packages are in the sysimage --> - [x] #52078 <!-- Replace `⇔` by `↔` in documentation --> - [x] #53035 <!-- use proper cache-line size variable in work-stealing queue --> - [x] #53066 <!-- doc: replace harr HTML entity by unicode --> - [x] #52996 <!-- Apple silicon has 128 byte alignment so fix our defines to match --> - [x] #53121 Non-merged PRs with backport label: - [ ] #52694 <!-- Reinstate similar for AbstractQ for backward compatibility --> - [ ] #51479 <!-- prevent code loading from lookin in the versioned environment when building Julia -->
…dows env variables (JuliaLang#52758) Should fix JuliaLang#52711. My analysis of the invalidation is as follows: We added code to cache the conversion to `cwstring` in env handling on Windows (JuliaLang#51371): ```julia const env_dict = IdDict{String, Vector{UInt16}}() function memoized_env_lookup(str::AbstractString) ... env_dict[str] = cwstring(str) ... end function access_env(onError::Function, str::AbstractString) var = memoized_env_lookup(str) ... end ``` Since `IdDict` has `@nospecialize` on `setindex!` we compile this method: ```julia setindex!(::IdDict{String, Vector{UInt16}}, ::Any, ::Any) ``` which has an edge to: ```julia convert(Type{Vector{Int64}}, Any}) ``` But then StaticArrays comes along and adds a method ```julia convert(::Type{Array{T, N}}, ::StaticArray) ``` which invalidates the `setindex!` (due to the edge to `convert`) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular, the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays. There should be no performance penalty to this since strings already does a hash for their `objectid`. (cherry picked from commit b7c24ed)
Should fix #52711.
My analysis of the invalidation is as follows:
We added code to cache the conversion to
cwstring
in env handling on Windows (#51371):Since
IdDict
has@nospecialize
onsetindex!
we compile this method:which has an edge to:
convert(Type{Vector{Int64}}, Any})
But then StaticArrays comes along and adds a method
which invalidates the
setindex!
(due to the edge toconvert
) which invalidates the whole env handling on Windows which causes 4k other methods downstream to be invalidated, in particular the artifact string macro which causes a significant delay in the next jll package you load after loading StaticArrays.There should be no performance penalty to this since strings already does a hash for their
objectid
.cc @jaakkor2, could you try this PR out?
cc @MasonProtter