36919: exec: add null ranges and optimize nulls in the vect. merge joiner r=georgeutsin a=georgeutsin

Added a new member function to coldata.Nulls that sets a range of null values, given a start and end index. This is useful, for example, in the merge joiner when building the left groups, since writing nulls with a bitwise operation is faster than setting each row to null one by one. This required refactoring Nulls to use a slice of uint64 instead of int64 for the null bitmap under the hood, since right-shifting a mask (commonly -1) doesn't work as expected for a signed number: the sign bit does not get shifted out.

Some benchmarks:

```
name                                       old time/op    new time/op    delta
MergeJoiner/rows=1024-8                    40.0µs ± 1%    39.8µs ± 1%    -0.57%  (p=0.023 n=10+10)
MergeJoiner/rows=4096-8                     152µs ± 0%     150µs ± 1%    -1.12%  (p=0.000 n=10+10)
MergeJoiner/rows=16384-8                    591µs ± 3%     580µs ± 1%    -1.88%  (p=0.000 n=9+10)
MergeJoiner/rows=1048576-8                 37.1ms ± 3%    35.4ms ± 2%    -4.48%  (p=0.000 n=9+10)
MergeJoiner/oneSideRepeat-rows=1024-8      41.3µs ± 3%    40.2µs ± 0%    -2.49%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=4096-8       154µs ± 1%     150µs ± 1%    -2.21%  (p=0.000 n=9+10)
MergeJoiner/oneSideRepeat-rows=16384-8      592µs ± 1%     587µs ± 1%    -0.76%  (p=0.008 n=9+10)
MergeJoiner/oneSideRepeat-rows=1048576-8   37.6ms ± 7%    37.4ms ± 3%      ~     (p=0.853 n=10+10)
MergeJoiner/bothSidesRepeat-rows=1024-8    43.0µs ± 0%    42.7µs ± 0%    -0.87%  (p=0.000 n=8+9)
MergeJoiner/bothSidesRepeat-rows=4096-8     189µs ± 1%     186µs ± 1%    -1.59%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=16384-8   1.42ms ± 2%    1.39ms ± 0%    -2.03%  (p=0.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=32768-8   4.86ms ± 2%    4.80ms ± 1%    -1.26%  (p=0.001 n=10+10)

name                                       old speed      new speed      delta
MergeJoiner/rows=1024-8                    1.64GB/s ± 1%  1.65GB/s ± 1%  +0.58%  (p=0.023 n=10+10)
MergeJoiner/rows=4096-8                    1.73GB/s ± 0%  1.75GB/s ± 1%  +1.14%  (p=0.000 n=10+10)
MergeJoiner/rows=16384-8                   1.77GB/s ± 3%  1.81GB/s ± 1%  +1.90%  (p=0.000 n=9+10)
MergeJoiner/rows=1048576-8                 1.81GB/s ± 3%  1.89GB/s ± 2%  +4.68%  (p=0.000 n=9+10)
MergeJoiner/oneSideRepeat-rows=1024-8      1.59GB/s ± 3%  1.63GB/s ± 0%  +2.53%  (p=0.000 n=10+10)
MergeJoiner/oneSideRepeat-rows=4096-8      1.71GB/s ± 1%  1.75GB/s ± 1%  +2.25%  (p=0.000 n=9+10)
MergeJoiner/oneSideRepeat-rows=16384-8     1.77GB/s ± 1%  1.79GB/s ± 1%  +0.76%  (p=0.008 n=9+10)
MergeJoiner/oneSideRepeat-rows=1048576-8   1.79GB/s ± 6%  1.80GB/s ± 3%    ~     (p=0.853 n=10+10)
MergeJoiner/bothSidesRepeat-rows=1024-8    1.52GB/s ± 0%  1.54GB/s ± 0%  +0.88%  (p=0.000 n=8+9)
MergeJoiner/bothSidesRepeat-rows=4096-8    1.38GB/s ± 1%  1.41GB/s ± 1%  +1.61%  (p=0.000 n=10+9)
MergeJoiner/bothSidesRepeat-rows=16384-8    738MB/s ± 2%   753MB/s ± 0%  +2.07%  (p=0.000 n=10+10)
MergeJoiner/bothSidesRepeat-rows=32768-8    431MB/s ± 2%   437MB/s ± 1%  +1.26%  (p=0.001 n=10+10)

name                                       old alloc/op   new alloc/op   delta
MergeJoiner/rows=1024-8                    7.40B ±32%     6.20B ±45%       ~     (p=0.370 n=10+10)
MergeJoiner/rows=4096-8                    27.0B ± 0%     27.0B ± 0%       ~     (all equal)
MergeJoiner/rows=16384-8                    104B ±31%       90B ± 0%       ~     (p=0.211 n=10+10)
MergeJoiner/rows=1048576-8                 5.45kB ± 0%    5.45kB ± 0%      ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024-8      9.00B ± 0%     9.00B ± 0%       ~     (all equal)
MergeJoiner/oneSideRepeat-rows=4096-8      27.0B ± 0%     27.0B ± 0%       ~     (all equal)
MergeJoiner/oneSideRepeat-rows=16384-8     90.0B ± 0%     90.0B ± 0%       ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1048576-8   6.90kB ±32%    5.45kB ± 0%      ~     (p=0.087 n=10+10)
MergeJoiner/bothSidesRepeat-rows=1024-8    9.00B ± 0%     9.00B ± 0%       ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=4096-8    27.0B ± 0%     27.0B ± 0%       ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=16384-8    272B ± 0%      272B ± 0%       ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=32768-8    908B ± 0%      908B ± 0%       ~     (all equal)

name                                       old allocs/op  new allocs/op  delta
MergeJoiner/rows=1024-8                      0.00           0.00           ~     (all equal)
MergeJoiner/rows=4096-8                      0.00           0.00           ~     (all equal)
MergeJoiner/rows=16384-8                     0.00           0.00           ~     (all equal)
MergeJoiner/rows=1048576-8                   1.00 ± 0%      1.00 ± 0%      ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1024-8        0.00           0.00           ~     (all equal)
MergeJoiner/oneSideRepeat-rows=4096-8        0.00           0.00           ~     (all equal)
MergeJoiner/oneSideRepeat-rows=16384-8       0.00           0.00           ~     (all equal)
MergeJoiner/oneSideRepeat-rows=1048576-8     1.40 ±43%      1.00 ± 0%      ~     (p=0.087 n=10+10)
MergeJoiner/bothSidesRepeat-rows=1024-8      0.00           0.00           ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=4096-8      0.00           0.00           ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=16384-8     0.00           0.00           ~     (all equal)
MergeJoiner/bothSidesRepeat-rows=32768-8     0.00           0.00           ~     (all equal)
```

Release note: None

36934: testutils: add a helper for removing setup code from -memprofile r=tbg a=danhhz

AllocProfileDiff writes two alloc profiles, one before running the closure and one after. This is similar in spirit to passing the -memprofile flag to a test or benchmark, but makes it possible to subtract out setup code, which -memprofile does not.

Example usage:

    setupCode()
    AllocProfileDiff(t, "mem.before", "mem.after", func() {
        interestingCode()
    })

The resulting profiles are then diffed via:

    go tool pprof -base mem.before mem.after

I've wanted this a number of times when working on bulkio stuff, but it was recently so useful that I decided to clean it up and check it in.
The result of -memprofile on BenchmarkImportWorkload/AddSSTable:

    (pprof) top10
    Showing nodes accounting for 1921.59MB, 74.06% of 2594.70MB total
    Dropped 496 nodes (cum <= 12.97MB)
    Showing top 10 nodes out of 119
          flat  flat%   sum%        cum   cum%
      463.28MB 17.85% 17.85%   463.28MB 17.85%  github.com/cockroachdb/cockroach/pkg/ccl/workloadccl/format.ToSSTable.func2
      266.46MB 10.27% 28.12%   415.37MB 16.01%  github.com/cockroachdb/cockroach/vendor/github.com/golang/leveldb/table.(*Reader).readBlock
      256.60MB  9.89% 38.01%   289.10MB 11.14%  github.com/cockroachdb/cockroach/pkg/sql.GenerateInsertRow
      241.35MB  9.30% 47.32%   241.35MB  9.30%  github.com/cockroachdb/cockroach/pkg/storage.defaultSubmitProposalLocked
      239.81MB  9.24% 56.56%   239.81MB  9.24%  github.com/cockroachdb/cockroach/pkg/storage/storagepb.(*ReplicatedEvalResult_AddSSTable).Unmarshal
      148.91MB  5.74% 62.30%   148.91MB  5.74%  github.com/cockroachdb/cockroach/vendor/github.com/golang/snappy.Decode
       79.94MB  3.08% 65.38%    79.94MB  3.08%  github.com/cockroachdb/cockroach/pkg/storage/engine.gobytes
       78.72MB  3.03% 68.41%    78.72MB  3.03%  github.com/cockroachdb/cockroach/pkg/ccl/importccl.(*rowConverter).sendBatch
       77.51MB  2.99% 71.40%    84.61MB  3.26%  github.com/cockroachdb/cockroach/pkg/workload/tpcc.(*tpcc).tpccOrderLineInitialRowBatch
       69.02MB  2.66% 74.06%    69.02MB  2.66%  github.com/cockroachdb/cockroach/pkg/roachpb.(*Value).SetTuple

The result of using this and diffing the profiles:

    (pprof) top10
    Showing nodes accounting for 299.54MB, 99.02% of 302.50MB total
    Dropped 9 nodes (cum <= 1.51MB)
    Showing top 10 nodes out of 66
          flat  flat%   sum%        cum   cum%
          95MB 31.40% 31.40%   139.67MB 46.17%  github.com/cockroachdb/cockroach/vendor/github.com/golang/leveldb/table.(*Reader).readBlock
       79.94MB 26.43% 57.83%    79.94MB 26.43%  github.com/cockroachdb/cockroach/pkg/storage.defaultSubmitProposalLocked
       79.94MB 26.43% 84.26%    79.94MB 26.43%  github.com/cockroachdb/cockroach/pkg/storage/storagepb.(*ReplicatedEvalResult_AddSSTable).Unmarshal
       44.67MB 14.77% 99.02%    44.67MB 14.77%  github.com/cockroachdb/cockroach/vendor/github.com/golang/snappy.Decode
             0     0% 99.02%     2.45MB  0.81%  github.com/cockroachdb/cockroach/pkg/ccl/importccl_test.BenchmarkImportWorkload.func1.1
             0     0% 99.02%     2.45MB  0.81%  github.com/cockroachdb/cockroach/pkg/ccl/importccl_test.benchmarkAddSSTable
             0     0% 99.02%    63.22MB 20.90%  github.com/cockroachdb/cockroach/pkg/ccl/importccl_test.benchmarkAddSSTable.func1
             0     0% 99.02%    79.94MB 26.43%  github.com/cockroachdb/cockroach/pkg/internal/client.(*CrossRangeTxnWrapperSender).Send
             0     0% 99.02%    79.94MB 26.43%  github.com/cockroachdb/cockroach/pkg/internal/client.(*DB).AddSSTable
             0     0% 99.02%    79.94MB 26.43%  github.com/cockroachdb/cockroach/pkg/internal/client.(*DB).Run

Release note: None

Co-authored-by: George Utsin <[email protected]>
Co-authored-by: Daniel Harrison <[email protected]>
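Returning to the 36919 change: the word-at-a-time null write, and why the bitmap had to become uint64 (right-shifting a signed -1 smears the sign bit instead of shifting in zeros), can be illustrated with a standalone sketch. The nulls type and setNullRange helper below are hypothetical; the real coldata.Nulls API and its bit convention may differ:

```go
package main

import "fmt"

// nulls is a toy bitmap where bit i == 1 means "row i is null".
type nulls struct{ bitmap []uint64 }

func newNulls(n int) *nulls {
	return &nulls{bitmap: make([]uint64, (n+63)/64)}
}

// setNullRange marks rows [start, end) null with whole-word writes instead
// of one bit per row. The masks must be built from unsigned values:
// ^uint64(0) >> k shifts in zeros, whereas a signed -1 >> k stays -1.
func (n *nulls) setNullRange(start, end int) {
	firstWord, lastWord := start/64, (end-1)/64
	startMask := ^uint64(0) << uint(start%64)    // ones at and above bit start%64
	endMask := ^uint64(0) >> uint(63-(end-1)%64) // ones at and below bit (end-1)%64
	if firstWord == lastWord {
		n.bitmap[firstWord] |= startMask & endMask
		return
	}
	n.bitmap[firstWord] |= startMask
	for w := firstWord + 1; w < lastWord; w++ {
		n.bitmap[w] = ^uint64(0) // 64 rows nulled in one store
	}
	n.bitmap[lastWord] |= endMask
}

func (n *nulls) nullAt(i int) bool { return n.bitmap[i/64]&(1<<uint(i%64)) != 0 }

func main() {
	ns := newNulls(256)
	ns.setNullRange(5, 200)
	fmt.Println(ns.nullAt(4), ns.nullAt(5), ns.nullAt(199), ns.nullAt(200))
	// prints: false true true false
}
```

Nulling 64 rows costs one store rather than 64 read-modify-write operations, which is where the merge-joiner speedup in the benchmarks above comes from.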