perf: use fast unsafe bytes->string conversion #525
Could you provide some before and after benchmarks to verify the improvements?
Benchmark:
It turns out that it's a significant optimization.
Hi @robert-zaremba. Great work on this optimization. I had a couple of questions. Please let me know what you think.
// UnsafeStrToBytes uses unsafe to convert string into byte array. Returned bytes
// must not be altered after this function is called as it will cause a segmentation fault.
func UnsafeStrToBytes(s string) []byte {
	var buf []byte
What do you think about the following to avoid copying the slice internals by directly casting the string:
func unsafeGetBytes(s string) []byte {
return (*[0x7fff0000]byte)(unsafe.Pointer(
(*reflect.StringHeader)(unsafe.Pointer(&s)).Data),
)[:len(s):len(s)]
}
I'm not sure if there are any constraints preventing that, but this could be a further optimization.
What is [0x7fff0000]byte? Looks like a big static array. Where did you get that solution from?
It is the maximum process address space value
I grabbed this suggestion from here:
https://stackoverflow.com/questions/59209493/how-to-use-unsafe-get-a-byte-slice-from-a-string-without-memory-copy
Original source: https://groups.google.com/g/golang-nuts/c/Zsfk-VMd_fU/m/O1ru4fO-BgAJ
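For readers on Go 1.20 or later, the same zero-copy view can be written without the oversized array type. In the suggestion above, `(*[0x7fff0000]byte)` is only a pointer to a very large array type, used so that the result can be sliced; nothing of that size is allocated. The sketch below is not the code from this PR: `stringToBytes` is a hypothetical name, and it assumes the caller never writes to the returned slice.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytes returns a []byte that shares memory with s (no copy).
// Hypothetical helper for illustration; requires Go 1.20+.
// The result must be treated as read-only: strings are immutable, and
// writing through the slice is undefined behavior.
func stringToBytes(s string) []byte {
	if s == "" {
		return nil // unsafe.StringData is unspecified for empty strings
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	s := "hello"
	b := stringToBytes(s)
	fmt.Println(len(b), cap(b), string(b) == s)
	// Both views start at the same address, so no bytes were copied.
	fmt.Println(unsafe.StringData(s) == unsafe.SliceData(b))
}
```

Setting the capacity to `len(s)` matches the `[:len(s):len(s)]` slice in the suggestion: an `append` on the result then reallocates instead of writing past the string's data.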
@@ -95,14 +97,14 @@ func (nc *lruCache) Len() int {
 }

 func (c *lruCache) Remove(key []byte) Node {
-	if elem, exists := c.dict[string(key)]; exists {
+	if elem, exists := c.dict[ibytes.UnsafeBytesToStr(key)]; exists {
I'm surprised that this is helpful because, from my understanding, copying during []byte to string conversion when accessing a map by key should be optimized by the Go compiler.
Sources:
I'm wondering if the benchmark would remain the same as it is right now if the map[string] changes are reverted.
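One way to check the compiler-optimization claim is to count allocations directly. The sketch below is an illustration under assumptions, not the PR's benchmark: `bytesToStr`, `lookupAllocs`, and the 64-byte key are made up for the example, and it needs Go 1.20+ for `unsafe.String`. It compares a map lookup keyed by `string(key)`, the same lookup through an unsafe conversion, and a plain conversion whose result escapes.

```go
package main

import (
	"fmt"
	"testing"
	"unsafe"
)

// sink is package-level so that the converted string escapes to the heap.
var sink string

// bytesToStr reinterprets b as a string without copying (hypothetical helper).
// b must not be modified while the returned string is in use.
func bytesToStr(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// lookupAllocs reports the average allocations per call for three operations.
func lookupAllocs() (safeLookup, unsafeLookup, plainConv float64) {
	key := make([]byte, 64)
	m := map[string]int{string(key): 1}

	// Map lookup with an inline conversion.
	safeLookup = testing.AllocsPerRun(100, func() { _ = m[string(key)] })
	// Map lookup through the unsafe conversion.
	unsafeLookup = testing.AllocsPerRun(100, func() { _ = m[bytesToStr(key)] })
	// Conversion whose result is stored, so a copy has to be made.
	plainConv = testing.AllocsPerRun(100, func() { sink = string(key) })
	return
}

func main() {
	safeLookup, unsafeLookup, plainConv := lookupAllocs()
	fmt.Println("m[string(key)] allocs:    ", safeLookup)
	fmt.Println("m[bytesToStr(key)] allocs:", unsafeLookup)
	fmt.Println("sink = string(key) allocs:", plainConv)
}
```

The inline-lookup optimization covers reads only. Storing with `m[string(key)] = v` still has to copy the key, so any savings in a cache may come from other call sites rather than from lookups like the one in `Remove`.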
I can check; however, for consistency I prefer to keep the casting here.
Co-authored-by: Roman <[email protected]>
After removing the fast conversion, there is a slight performance degradation:
I repeated it a few times and the memory allocation results were similar (there was more variation in the ns/op).
LGTM. Thanks for trying the map benchmarks.
Let's go with the approach that benchmarks show is best.
We can investigate this further separately.
Avoid unnecessary allocation by using fast unsafe bytes -> string conversion
Notes: