feat(serverv2): add benchmarks of (old) cacheKV vs branch #22497
Conversation
📝 Walkthrough
The changes introduce enhancements to the benchmarking tests.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant B as Benchmark
    participant S as Store
    participant C as CacheStack
    B->>C: Benchmark_CacheStack_Set()
    C->>S: Set(key, value)
    S-->>C: Result (stored in sink)
    B->>C: Benchmark_Get()
    C->>S: Get(key)
    S-->>C: Result (stored in sink)
    B->>C: Benchmark_GetSparse()
    C->>S: Get(unique_key)
    S-->>C: Result (stored in sink)
    B->>C: Benchmark_Iterate()
    C->>S: Iterate()
    S-->>C: Key-Value pairs
```
@testinginprod your pull request is missing a changelog!
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (8)
server/v2/stf/branch/bench_test.go (3)
35-35: Consider adding more detailed performance metrics.
Given the significant performance regression (320× slower) mentioned in the PR objectives, consider adding:
- Memory statistics using
b.ReportAllocs()
- Sub-benchmarks for different key patterns
- CPU profiling for detailed analysis
Example enhancement:
```diff
 func Benchmark_Get(b *testing.B) {
 	var sink any
 	for _, stackSize := range stackSizes {
 		b.Run(fmt.Sprintf("StackSize%d", stackSize), func(b *testing.B) {
 			bs := makeBranchStack(b, stackSize)
+			// Add CPU profiling
+			if stackSize == 100 { // Profile only larger sizes
+				f, _ := os.Create("cpu.prof")
+				pprof.StartCPUProfile(f)
+				defer pprof.StopCPUProfile()
+			}
 			b.ResetTimer()
 			b.ReportAllocs()
 			for i := 0; i < b.N; i++ {
 				sink, _ = bs.Get([]byte{0})
 			}
 		})
 	}
 	if sink == nil {
 		b.Fatal("prevent compiler optimization")
 	}
 }
```

Also applies to: 42-48
51-72: LGTM! Consider adding a parallel benchmark variant.
The benchmark is well-structured with proper key generation outside the timed section. Given the significant performance improvement (11.77× faster), it would be valuable to also test parallel access patterns.
Add a parallel variant:
```diff
+func Benchmark_GetSparse_Parallel(b *testing.B) {
+	for _, stackSize := range stackSizes {
+		b.Run(fmt.Sprintf("StackSize%d", stackSize), func(b *testing.B) {
+			bs := makeBranchStack(b, stackSize)
+			keys := make([][]byte, b.N)
+			for i := 0; i < b.N; i++ {
+				keys[i] = numToBytes(i)
+			}
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.RunParallel(func(pb *testing.PB) {
+				var sink any
+				i := 0
+				for pb.Next() {
+					sink, _ = bs.Get(keys[i%len(keys)])
+					i++
+				}
+				if sink == nil {
+					b.Fatal("prevent compiler optimization")
+				}
+			})
+		})
+	}
+}
```
118-120: LGTM! Consider adding documentation.
The numToBytes function is well-implemented using generics and efficient binary encoding.
Add documentation to clarify the encoding format:

```diff
+// numToBytes converts an integer to its big-endian binary representation.
+// The function always returns an 8-byte slice regardless of the input value.
 func numToBytes[T ~int](n T) []byte {
 	return binary.BigEndian.AppendUint64(nil, uint64(n))
 }
```

store/cachekv/branch_bench_test.go (5)
13-17: Consider using a constant where possible and adding documentation.
These values are never modified at runtime, so `elemsInStack` can be a constant (`stackSizes` must remain a `var`, since Go does not allow constant slices). Adding documentation would also clarify their purpose in the benchmarks.

```diff
-var (
-	stackSizes   = []int{1, 10, 100}
-	elemsInStack = 10
-)
+// stackSizes defines the different stack depths to benchmark.
+var stackSizes = []int{1, 10, 100}
+
+// elemsInStack defines the number of elements to populate in each stack layer.
+const elemsInStack = 10
```
18-29: Consider enhancing benchmark descriptions.
While the benchmark implementation is solid, the sub-benchmark descriptions could be more descriptive to better indicate what's being tested.

```diff
-b.Run(fmt.Sprintf("StackSize%d", stackSize), func(b *testing.B) {
+b.Run(fmt.Sprintf("StackSize=%d/SingleKeySet", stackSize), func(b *testing.B) {
```
74-96: Add documentation for sink variables.
The iterator benchmark correctly uses sink variables to prevent compiler optimizations, but their purpose should be documented.

```diff
-var keySink, valueSink any
+// keySink and valueSink prevent the compiler from optimizing away the iterator operations.
+var keySink, valueSink any
```
98-112: Remove unused parameter in makeBranchStack.
The testing.B parameter is not used in the function.

```diff
-func makeBranchStack(_ *testing.B, stackSize int) *cachekv.Store {
+func makeBranchStack(stackSize int) *cachekv.Store {
```
114-116: Add documentation for the numToBytes function.
The generic function should have documentation explaining its purpose and encoding details.

```diff
+// numToBytes converts a number to its big-endian byte representation.
+// It uses a generic constraint to accept any integer type.
 func numToBytes[T ~int](n T) []byte {
```
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (3)
- server/v2/stf/branch/bench_test.go (4 hunks)
- server/v2/stf/branch/store.go (1 hunks)
- store/cachekv/branch_bench_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
server/v2/stf/branch/bench_test.go (2)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
- Pattern `**/*_test.go`: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"

server/v2/stf/branch/store.go (1)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.

store/cachekv/branch_bench_test.go (2)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
- Pattern `**/*_test.go`: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"
🔇 Additional comments (6)
server/v2/stf/branch/bench_test.go (2)
107-107: LGTM! Key generation ensures uniqueness across branches.
The key generation logic correctly combines branch and element indices using the efficient numToBytes function.
18-18: LGTM! Compiler optimization prevention is properly implemented.
The sink variable pattern is correctly used to prevent the compiler from optimizing away the benchmark operations.
Let's verify that the benchmark results are stable and not affected by compiler optimizations:
Also applies to: 25-25, 29-31
✅ Verification successful
Sink variable usage follows established patterns across the codebase.
The sink variable declaration and usage in server/v2/stf/branch/bench_test.go follows the consistent pattern found across the codebase's benchmark tests. This is a well-established practice in the project, with similar implementations in other benchmark files such as store/cachekv/store_bench_test.go and math/dec_test.go.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Run the benchmark multiple times to check result stability
go test -bench=Benchmark_CacheStack_Set -count=5 ./server/v2/stf/branch/... | tee results.txt
# Check if results are consistent (should have similar numbers)
awk '/Benchmark_CacheStack_Set/ { print $3 }' results.txt | sort -n | uniq -c
```

Length of output: 398
Script:

```shell
#!/bin/bash
# Verify the sink variable usage pattern in the benchmark tests
rg -U "var\s+sink\s+.*?\n.*?=.*?sink" --type go
# Check for other benchmark functions and their sink usage
ast-grep --pattern 'func Benchmark$_($_) {
  $$$
}'
# Look for any other sink variable declarations
rg "var\s+sink\s+" --type go
```

Length of output: 666
server/v2/stf/branch/store.go (2)
Line range hint 32-36: LGTM: Comment changes improve clarity.
The updated comment accurately describes the control flow of the Get method.
Line range hint 32-36: Consider optimizing Get method performance.
According to the PR objectives, there's a significant performance regression (320× slower) in `_GetCached` operations. Consider implementing the following optimizations:
- Add read-through caching to reduce parent store lookups
- Implement bulk operations for batch retrievals
- Add prefetching mechanisms for frequently accessed keys
Let's analyze the usage patterns to identify optimization opportunities:
store/cachekv/branch_bench_test.go (2)
1-11: LGTM! Package structure and imports are well-organized.
The package follows Go conventions with proper import grouping.
31-47: Verify benchmark coverage matches PR objectives.
The Get and GetSparse benchmarks align with the PR objectives showing the 320× slowdown and 11.77× speedup respectively. However, consider adding more granular test cases to investigate the significant performance regression in Get operations.
Consider adding benchmarks with different key sizes and value sizes to better understand the performance characteristics:
```diff
+var (
+	benchmarkKeySizes = []int{8, 64, 256}    // in bytes
+	benchmarkValSizes = []int{8, 1024, 4096} // in bytes
+)
```
Also applies to: 49-72
Just a drive-by code review: for proper benchmarks, you need to make the sink a global and also reset it to nil after every benchmark.
store/cachekv/branch_bench_test.go (outdated)

```go
// Gets the same key from the branch store.
func Benchmark_Get(b *testing.B) {
	var sink any
```
Needs to be global and to be reset after use.
@testinginprod could you implement @odeke-em's comments?
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (7)
server/v2/stf/branch/bench_test.go (2)
51-74: Add documentation explaining the sparse testing strategy.
While the implementation is solid, please add a comment explaining the purpose of sparse key testing and how it differs from regular Get benchmarks.

```diff
+// Benchmark_GetSparse measures the performance of Get operations with sparsely distributed keys,
+// which helps evaluate the efficiency of the key lookup mechanism under more realistic conditions.
 func Benchmark_GetSparse(b *testing.B) {
```
124-126: Add documentation about the endianness choice.
Please add a comment explaining the choice of big-endian encoding for consistency and compatibility.

```diff
+// numToBytes converts a number to its big-endian byte representation for consistent
+// and platform-independent key generation in benchmarks.
 func numToBytes[T ~int](n T) []byte {
```

store/cachekv/branch_bench_test.go (5)
13-16: Consider constants and clearer names for the benchmark configuration.
These values represent configuration that shouldn't change during runtime. Consider:
- Using `const` where possible (Go does not permit constant slices, so `stackSizes` must remain a `var`)
- More descriptive names (e.g., `benchmarkStackSizes`, `elementsPerStack`)

```diff
-var (
-	stackSizes   = []int{1, 10, 100}
-	elemsInStack = 10
-)
+// benchmarkStackSizes defines the different stack depths to benchmark.
+var benchmarkStackSizes = []int{1, 10, 100}
+
+// elementsPerStack defines the number of elements to populate in each stack layer.
+const elementsPerStack = 10
```
18-29: Add cleanup between benchmark runs.
Consider adding `b.Cleanup()` to ensure proper cleanup of resources between benchmark runs, especially since we're dealing with store operations.
49-100: Add benchmark methodology documentation.
Consider adding documentation that explains:
- The purpose of the different stack sizes
- The methodology for sparse key generation
- The expected performance characteristics

This will help future maintainers understand the benchmark design decisions.
🧰 Tools
🪛 golangci-lint
77-77: File is not `gofumpt`-ed with `-extra` (gofumpt)
102-120: Enhance helper functions with documentation and error handling.
- Add documentation explaining the branch stack structure and its relationship to the benchmarks
- Consider adding bounds checking in `numToBytes` for negative numbers
77-77: Fix formatting with gofumpt.
Run `gofumpt -w -extra` on the file to ensure consistent formatting.

🧰 Tools
🪛 golangci-lint
77-77: File is not `gofumpt`-ed with `-extra` (gofumpt)
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (2)
- server/v2/stf/branch/bench_test.go (5 hunks)
- store/cachekv/branch_bench_test.go (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
server/v2/stf/branch/bench_test.go (2)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
- Pattern `**/*_test.go`: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"

store/cachekv/branch_bench_test.go (2)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
- Pattern `**/*_test.go`: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"
🪛 golangci-lint
store/cachekv/branch_bench_test.go
77-77: File is not `gofumpt`-ed with `-extra` (gofumpt)
🔇 Additional comments (4)
server/v2/stf/branch/bench_test.go (4)
24-27: LGTM: Proper error handling added.
Good addition of error handling for the Set operation, ensuring benchmark reliability.
33-34: LGTM: Global sink variable properly implemented.
The global sink variable correctly prevents compiler optimizations in benchmarks.
46-48: LGTM: Proper benchmark validation.
Good addition of a benchmark validation check with a clear error message.
Line range hint 76-102: LGTM: Well-structured iterator benchmark.
Good implementation with proper sink management and benchmark validation.
LGTM with a comment, but thank you @testinginprod, and thanks for the nudge @julienrbrt.
```go
			}
		})
	}
	if sink == nil {
		b.Fatal("benchmark did not run")
	}
```
Please add sink = nil
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
server/v2/stf/branch/bench_test.go (1)
52-75: Consider optimizing key generation for large N.
The pre-generation of keys is good practice, but the keys slice grows with `b.N`, which could impact memory usage for large benchmark runs. Consider using a fixed-size key pool or generating keys on-the-fly using modulo to limit memory usage:

```diff
-	keys := func() [][]byte {
-		var keys [][]byte
-		for i := 0; i < b.N; i++ {
-			keys = append(keys, numToBytes(i))
-		}
-		return keys
-	}()
-	b.ResetTimer()
-	b.ReportAllocs()
-	for _, key := range keys {
+	b.ResetTimer()
+	b.ReportAllocs()
+	for i := 0; i < b.N; i++ {
+		key := numToBytes(i % 1000) // Use a reasonable upper bound
 		sink, _ = bs.Get(key)
 	}
```
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
📒 Files selected for processing (1)
- server/v2/stf/branch/bench_test.go (5 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
server/v2/stf/branch/bench_test.go (2)
- Pattern `**/*.go`: Review the Golang code for conformity with the Uber Golang style guide, highlighting any deviations.
- Pattern `**/*_test.go`: "Assess the unit test code assessing sufficient code coverage for the changes associated in the pull request"
🔇 Additional comments (5)
server/v2/stf/branch/bench_test.go (5)
24-27: LGTM: Proper error handling added.
The error handling for `bs.Set` is implemented correctly using `b.Fatal`, which is the appropriate way to handle errors in Go benchmarks.
33-34: LGTM: Proper benchmark result verification.
The implementation correctly:
- Uses a global sink variable to prevent compiler optimizations
- Verifies benchmark execution with a nil check
- Resets the sink after use

Also applies to: 46-49
77-80: LGTM: Consistent benchmark verification pattern.
The implementation correctly follows the established pattern of:
- Using global sinks to prevent compiler optimizations
- Verifying benchmark execution
- Cleaning up global variables

Also applies to: 99-103
114-114: LGTM: Improved key generation.
The key generation is now more consistent and type-safe using the `numToBytes` function.
125-127: LGTM: Well-implemented generic helper.
The `numToBytes` function is well-implemented:
- Uses generics appropriately
- Ensures consistent byte ordering with BigEndian
- Has a clear, single responsibility
* main:
  feat(serverv2): add benchmarks of (old) cacheKV vs branch (#22497)
  fix: match comets bls implmentation (#22613)
  feat(math): Upstream GDA based decimal type (#21982)
  fix(client/v2/autocli): prevent duplicate addition of customCommands (#22576)
  build(deps): Bump cosmossdk.io/math from 1.3.0 to 1.4.0 (#22580)
Description
Summary of Benchmark Performance Comparison
Bench Test Type: `_CacheStack_Set`
This sets the same key in state. Note: stack size should have no impact here, since the top branch is the one where the cache is being set.
StackSize1-14
StackSize10-14
StackSize100-14
Bench Test Type: `_GetCached`
Gets the same key repeatedly.
NOTE: CacheKV caches the key on the first get, making the same key expensive to fetch only the first time and inexpensive on subsequent fetches. This works well only when we fetch the same key; GetSparse shows the opposite view.
StackSize1-14
StackSize10-14
StackSize100-14
Bench Test Type: `_GetSparse`
Always fetches a new key from storage.
StackSize1-14
StackSize10-14
StackSize100-14
Bench Test Type: `_Iterate`
StackSize1-14
StackSize10-14
StackSize100-14
Overall Performance
Memory Usage
Bytes per Operation (B/op):
_CacheStack_Set
_GetCached
_GetSparse
_Iterate
Allocations per Operation (allocs/op):
_CacheStack_Set
_GetSparse
_Iterate
Conclusion:
The current code improves on the previous version in the `_CacheStack_Set`, `_GetSparse`, and `_Iterate` benchmarks, especially with larger stack sizes. The exception is the `_GetCached` benchmarks, where the current code is slower than the previous version.
This summary provides a detailed comparison of the benchmark tests, highlighting where the current code has improved or declined in performance relative to the previous version.
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
- confirmed `!` in the type prefix if API or client breaking change
- added a changelog entry to `CHANGELOG.md`
Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
Please see Pull Request Reviewer section in the contributing guide for more information on how to review a pull request.
I have...
Summary by CodeRabbit

New Features
- Introduced benchmarks for the `cachekv` package, measuring performance for setting and retrieving values, as well as iteration over key-value pairs.

Bug Fixes
- Updated comments in the `Get` method within the `Store` struct to better reflect the logic flow.

These enhancements aim to improve performance evaluation and code clarity for users interacting with the caching system.