Reduce memory usage in nested JSON parser - tree generation #11864
This will probably be a good point to do some precomputations (hashmap or sparse bitmap) if this kernel becomes a performance issue.
This one takes ~20% of the parent_node_ids computation time. It takes significant time, but is not the critical part.
A hash map takes extra memory: I tried it now, and memory increases from 7.97 GiB to 9.271 GiB. It's also much slower than lower_bound: 12 ms for lower_bound vs. 133 ms for the hash map.
I am interested to learn about the "sparse bitmap" approach.
It's an approach I often use to speed up binary-search-like lookups where the indices are unique. The fundamental idea is a bitvector with rank support:

You store a bitvector, in groups of 32-bit words, containing a 1 if the corresponding token is a node and a 0 otherwise, and compute an exclusive prefix sum over the popcounts of the words. Computing the lower bound is then

prefix_sum[i / 32] + popcnt(prefix_mask(i % 32) & bitvector[i / 32])

where prefix_mask(i) = (1u << i) - 1 has all bits smaller than its parameter set. Overall, this uses 2 bits of storage per token.

If the number of tokens is much larger than the number of nodes, you can make the data structure even sparser by storing only the 32-bit words that are not all 0 (basically a reduce_by_key over the tokens) and using a normal bitvector to store one bit per 32-bit word denoting whether it was non-zero. Then you have a two-level lookup (though the second level could also be another lookup data structure like a hashmap). The data structure has pretty good caching properties, since locality in indices translates to locality in memory, which hashmaps purposefully don't have. But with the number of nodes not being smaller than the number of tokens by a huge factor, I don't think this would be worth the effort.