Add a hashtable to lookup method roots #52073
base: master
Is there a reason this isn't just using an |
Didn't know I could in C. |
I suppose I can draw from how |
so on the C side, IDDicts are just |
I tried to move to the iddict, and it seemed to be working, but then when I cleaned and rebuilt I started getting segfaults. Probably some subtle aspect of the API I overlooked? |
The GC checker tells me I am "Passing non-rooted value as argument to function that may GC" but I'm not sure what I should be doing there. |
Hmm. Using iddict and making it part of the type, the sysimage expands 15 MB. Is it possible to avoid serializing this entry and regenerate it on load? I presume I was accidentally doing that previously, with the table not included as part of the Julia type. Also, unlike the C |
I suspect not storing this will be the actual answer (regenerating it at the start of compressing IR, and discarding it at the end), though it is still questionable that we should be having any extra roots being created, as those can interact very badly with pkgimage too. However, I also happened to be simultaneously working on an update to the smallintset code (which is a general wrapper that allows adding a compact index to an existing array) to make it a general-purpose IdSet implementation, with the intent to save on space. I hope to have that in PR form very soon. |
Fair enough, not married to that idea. Moved to generating the iddict at the top of |
I just noticed that there are some arrays storing reusable boxed integers in Was curious to find out if the boxing operation in here is less costly than I assumed. |
This is looking quite reasonable. It is pretty cheap to box them, but this exact use is also what |
@vtjnash I'm trying the idset out, and maybe I'm using it wrong, but values 254 and over aren't being stored or retrieved properly? If I try to retrieve an element with values in that range immediately after I store it, |
Seems to work for me?
Anyways, typical usage for your case would be:
assert(m->roots->ref.mem->ptr == m->roots->ref.ptr_or_offset); // expected offset == 0
s.roots_ids = jl_idset_put_idx(m->roots->ref.mem, jl_an_empty_memory_any, -jl_array_nrows(m->roots)); |
Hmm, I don't know...
If I pass |
You need |
Sorry for the confusion, I was just going with the negative because you included it above. So breaking down the conditions further, it isn't all values that fail... 2 or 3 insert/retrieve fine, then one will fail to be found. And it appears that the roots that fail are always
|
I can't tell what is incorrect about it without a link to the code |
Thanks, but no worries I got it working. Not entirely sure what was wrong, but it got there. I love C 😶. I'll benchmark later. |
02acf38 to 2134d46
This PR implements a hash table for looking up method roots. Compression of methods with many roots should be sped up, removing a barrier to adding additional roots (such as to address #41099, #50082).
Currently, each time a root is added, all previous roots have to be checked, leading to O(n^2) runtime.
The hash table adds minimal memory overhead for each method (as a hidden field) to convert this step to O(n). The table was first implemented with the existing htable_t type, using jl_object_id as the hash function and jl_egal as the equality function. It is now implemented with an IdDict as a field of the temporary jl_ircode_state struct generated during compression.
Some notes:
smallintset/IdSet, after planned updates
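For readers unfamiliar with the trade-off, here is a minimal standalone C sketch of the idea described above: a linear scan over existing roots versus a side index keyed on pointer identity. It is not the code in this PR and does not use the Julia runtime. The names root_table_t, hash_ptr, linear_root_index and hashed_root_index are hypothetical, and pointer equality stands in for jl_egal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a method's root list; not a Julia runtime type. */
typedef struct {
    void **roots;    /* root values, compared by pointer identity */
    size_t nroots;
    size_t cap;
    int32_t *slots;  /* open-addressing index: position in roots, or -1 if empty */
    size_t nslots;   /* always a power of two */
} root_table_t;

/* Simple pointer mixer (an assumption; the runtime hashes with jl_object_id). */
static size_t hash_ptr(const void *p, size_t mask)
{
    uintptr_t x = (uintptr_t)p;
    x ^= x >> 16;
    x *= (uintptr_t)0x9E3779B1u;
    x ^= x >> 13;
    return (size_t)x & mask;
}

static void root_table_init(root_table_t *t)
{
    t->cap = 8;
    t->nroots = 0;
    t->roots = malloc(t->cap * sizeof(void *));
    t->nslots = 16;
    t->slots = malloc(t->nslots * sizeof(int32_t));
    assert(t->roots && t->slots);
    for (size_t i = 0; i < t->nslots; i++)
        t->slots[i] = -1;
}

/* Scan approach: every lookup walks all existing roots, so n adds cost O(n^2). */
static int32_t linear_root_index(const root_table_t *t, const void *v)
{
    for (size_t i = 0; i < t->nroots; i++)
        if (t->roots[i] == v)
            return (int32_t)i;
    return -1;
}

/* Record roots[idx] in the index using linear probing. */
static void slot_insert(root_table_t *t, int32_t idx)
{
    size_t mask = t->nslots - 1;
    size_t h = hash_ptr(t->roots[idx], mask);
    while (t->slots[h] != -1)
        h = (h + 1) & mask;
    t->slots[h] = idx;
}

/* Double the index and re-insert every root. */
static void rehash(root_table_t *t)
{
    t->nslots *= 2;
    t->slots = realloc(t->slots, t->nslots * sizeof(int32_t));
    assert(t->slots);
    for (size_t i = 0; i < t->nslots; i++)
        t->slots[i] = -1;
    for (size_t i = 0; i < t->nroots; i++)
        slot_insert(t, (int32_t)i);
}

/* Hashed approach: expected O(1) per lookup; appends v if it is not present. */
static int32_t hashed_root_index(root_table_t *t, void *v)
{
    size_t mask = t->nslots - 1;
    for (size_t h = hash_ptr(v, mask); t->slots[h] != -1; h = (h + 1) & mask)
        if (t->roots[t->slots[h]] == v)
            return t->slots[h];
    if (t->nroots == t->cap) {
        t->cap *= 2;
        t->roots = realloc(t->roots, t->cap * sizeof(void *));
        assert(t->roots);
    }
    t->roots[t->nroots] = v;
    int32_t idx = (int32_t)t->nroots++;
    if (2 * t->nroots > t->nslots)
        rehash(t);              /* keeps the load factor at or below 1/2 */
    else
        slot_insert(t, idx);
    return idx;
}
```

Both lookups return the same index for a root that is already present. The difference is that hashed_root_index keeps each add at expected constant time, which is where the O(n) total comes from, at the cost of the extra slots array.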