perf: improve deserialization of value encoding for struct/list #5897

waruto210 · 2022-10-18T06:39:16Z

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

improve deserialization of value encoding for struct/list by preallocating vector.

The benchmark result is as follows, the deserialization time was reduced by about 40-50%

bench Struct of Bool (len = 100) (key encoding deserialization)
                        time:   [2.0542 µs 2.0583 µs 2.0625 µs]
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

bench Struct of Bool (len = 100) (value encoding deserialization)
                        time:   [1.1362 µs 1.1388 µs 1.1414 µs]
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

bench List of Bool (len = 100) (key encoding deserialization)
                        time:   [1.1913 µs 1.1944 µs 1.1975 µs]
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild

bench List of Bool (len = 100) (value encoding deserialization)
                        time:   [1.1403 µs 1.1436 µs 1.1476 µs]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

Checklist

~~- [ ] I have written necessary rustdoc comments~~
~~- [ ] I have added necessary unit tests and integration tests~~

All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

#5713

codecov · 2022-10-18T06:57:19Z

Codecov Report

Merging #5897 (180dc10) into main (6d8c3c5) will increase coverage by 0.06%.
The diff coverage is 66.55%.

@@            Coverage Diff             @@
##             main    #5897      +/-   ##
==========================================
+ Coverage   74.85%   74.92%   +0.06%     
==========================================
  Files         922      923       +1     
  Lines      146824   146931     +107     
==========================================
+ Hits       109911   110082     +171     
+ Misses      36913    36849      -64

Flag	Coverage Δ
rust	`74.92% <66.55%> (+0.06%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
src/common/src/types/interval.rs	`83.79% <0.00%> (-3.34%)`	⬇️
src/common/src/types/mod.rs	`64.58% <0.00%> (ø)`
src/common/src/util/mod.rs	`0.00% <ø> (ø)`
src/common/src/util/value_encoding/mod.rs	`46.66% <0.00%> (+0.28%)`	⬆️
src/expr/src/vector_op/cast.rs	`84.29% <0.00%> (-2.20%)`	⬇️
src/frontend/src/handler/util.rs	`77.35% <0.00%> (-0.99%)`	⬇️
src/stream/src/from_proto/agg_common.rs	`0.00% <0.00%> (ø)`
src/stream/src/from_proto/global_simple_agg.rs	`0.00% <0.00%> (ø)`
src/stream/src/from_proto/hash_agg.rs	`0.00% <0.00%> (ø)`
src/utils/pgwire/src/pg_field_descriptor.rs	`63.54% <37.50%> (ø)`
... and 24 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

BugenZhao · 2022-10-18T07:08:55Z

This is interesting. I find that the lower bound of the size hint in GenericShunt is always set to 0.

impl<I, R> Iterator for GenericShunt<'_, I, R>
where
    I: Iterator<Item: Try<Residual = R>>,
{
    type Item = <I::Item as Try>::Output;

    fn next(&mut self) -> Option<Self::Item> {
        self.try_for_each(ControlFlow::Break).break_value()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        if self.residual.is_some() {
            (0, Some(0))
        } else {
            let (_, upper) = self.iter.size_hint();
            (0, upper)
        }
    }

So when we're calling Result::from_iter, it won't hit the specialization of initializing with capacity in Vec::from_iter (in spec_from_iter_nested.rs). Therefore, the performance is worse than manually initializing and pushing in a for loop.

impl<T, I> SpecFromIterNested<T, I> for Vec<T>
where
    I: Iterator<Item = T>,
{
    default fn from_iter(mut iterator: I) -> Self {
        // Unroll the first iteration, as the vector is going to be
        // expanded on this iteration in every case when the iterable is not
        // empty, but the loop in extend_desugared() is not going to see the
        // vector being full in the few subsequent loop iterations.
        // So we get better branch prediction.
        let mut vector = match iterator.next() {
            None => return Vec::new(),
            Some(element) => {
                let (lower, _) = iterator.size_hint();
                let initial_capacity =
                    cmp::max(RawVec::<T>::MIN_NON_ZERO_CAP, lower.saturating_add(1));
                let mut vector = Vec::with_capacity(initial_capacity);
                unsafe {
                    // SAFETY: We requested capacity at least 1
                    ptr::write(vector.as_mut_ptr(), element);
                    vector.set_len(1);
                }
                vector
            }
        };
        // must delegate to spec_extend() since extend() itself delegates
        // to spec_from for empty Vecs
        <Vec<T> as SpecExtend<T, I>>::spec_extend(&mut vector, iterator);
        vector
    }
}

So almost all try_collect in our codebase suffer from the same issue. We lack an "optimistic try_collect". 😄

waruto210 · 2022-10-18T07:14:34Z

So almost all try_collect in our codebase suffer from the same issue. We lack an "optimistic try_collect". 😄

That's too bad, I have seen so many try_collect in our codebase.

waruto210 · 2022-10-18T08:01:44Z

So when we're calling Result::from_iter, it won't hit the specialization of initializing with capacity in Vec::from_iter (in spec_from_iter_nested.rs). Therefore, the performance is worse than manually initializing and pushing in a for loop.

There is already an issue here rust-lang/rust#48994

improve value encoding for struct/list

f284636

waruto210 requested review from fuyufjh and BugenZhao October 18, 2022 06:39

github-actions bot added the type/perf label Oct 18, 2022

BugenZhao requested a review from TennyZhuang October 18, 2022 07:09

waruto210 mentioned this pull request Oct 18, 2022

common: value encoding for list and struct is slower than key encoding #5713

Closed

BugenZhao approved these changes Oct 18, 2022

View reviewed changes

waruto210 added the mergify/can-merge label Oct 18, 2022

Merge branch 'main' into waruto/improve_value_encoding

180dc10

mergify bot merged commit 52e0ca4 into main Oct 18, 2022

mergify bot deleted the waruto/improve_value_encoding branch October 18, 2022 08:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: improve deserialization of value encoding for struct/list #5897

perf: improve deserialization of value encoding for struct/list #5897

waruto210 commented Oct 18, 2022

codecov bot commented Oct 18, 2022 •

edited

Loading

BugenZhao commented Oct 18, 2022 •

edited

Loading

waruto210 commented Oct 18, 2022

waruto210 commented Oct 18, 2022

perf: improve deserialization of value encoding for struct/list #5897

perf: improve deserialization of value encoding for struct/list #5897

Conversation

waruto210 commented Oct 18, 2022

What's changed and what's your intention?

Checklist

Refer to a related PR or issue link (optional)

codecov bot commented Oct 18, 2022 • edited Loading

Codecov Report

BugenZhao commented Oct 18, 2022 • edited Loading

waruto210 commented Oct 18, 2022

waruto210 commented Oct 18, 2022

codecov bot commented Oct 18, 2022 •

edited

Loading

BugenZhao commented Oct 18, 2022 •

edited

Loading