-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add VarZeroLengthlessSlice, use in MultiFieldsULE to optimize #5593
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deceptively simple
let len = self.fields.len(); | ||
if len == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Praise: Not many changes in this file making it easy to review
/// # Panics | ||
/// | ||
/// Panics if the buffer is not exactly the correct length. | ||
pub fn write_serializable_bytes<T, A, F>(elements: &[A], output: &mut [u8]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: It might be nice to refactor this code into an EncodeAsVarULE
impl, which has well-defined safety requirements and reduces indirection (not now, maybe later)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that makes sense, I think this is the thing that underlies the EncodeAsVarULE infrastructure. But maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is VarZeroSlice, which is one layer above EncodeAsVarULE, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is VarZeroVecComponents; which is extremely low level. components.rs is basically where all of the actual VZV format algorithms live.
Co-authored-by: Shane F. Carr <[email protected]>
It is impossible for an IndexN array to need more than a length integer of size N, anyway, the max index is always `>=` the length. Part of #5523 Builds on #5593 We could in theory just have a `VZVFormatCombo<Index, Len>` type that allows free selection, however I'm trying to keep this minimal. Overall the main use case for that is picking things like "a small array of ;argely-sized elements" and we could just expose Index16Len8 for that. I can see that being useful for things like #5580, though it also feels like a data microoptimization. The "total" lines in fingerprints.csv are interspersed in giant diffs, and this basically only gets a max of 2-6 byte wins per data, but the overall data size went down by 200KB. Not amazing, not terrible. ```rust [18:26:22] मanishearth@manishearth-glaptop2 ~/dev/icu4x/provider/data ^_^ $ rg total | awk '{ gsub(/B,/, "", $3); s +=$3} END{print s}' 23501369 [18:26:08] मanishearth@manishearth-glaptop2 ~/dev/icu4x/provider/data ^_^ $ rg total | awk '{ gsub(/B,/, "", $3); s +=$3} END{print s}' 23391499 ```
Fixes #5378, #5240