Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i#3544 RV64: Optimize private memcpy and memset #6800

Merged
merged 8 commits into from
May 10, 2024

Conversation

chenhy0106
Copy link
Contributor

@chenhy0106 chenhy0106 commented May 8, 2024

  1. Optimize private memcpy and memset for RV64.
  2. Add test to compare private and libc memset.
  3. Compare private memcpy with libc memcpy on more small sizes.
  4. Fix a bug of core/CMakeLists.txt. For unit_tests, to compare private and libc memcpy, we should link unit_tests to drmemfuncs but not link to libc.

Compare original memcpy&memset, optimized private memcpy&memset and glibc memcpy&memset.

Test command:

./bin64/unit_tests

When we use original memcpy and memset, outputs:

our_memcpy_time: size=1 time=0
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=2
libc_memcpy_time: size=4 time=2
our_memcpy_time: size=128 time=16
libc_memcpy_time: size=128 time=4
our_memcpy_time: size=512 time=57
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=824
libc_memcpy_time: size=8192 time=79
our_memcpy_time: size=20480 time=2080
libc_memcpy_time: size=20480 time=183
our_memset_time: 4129
libc_memset_time: 292
io all done
testing string
done testing string

When we use optimized memcpy and memset, outputs:

our_memcpy_time: size=1 time=1
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=1
libc_memcpy_time: size=4 time=3
our_memcpy_time: size=128 time=2
libc_memcpy_time: size=128 time=3
our_memcpy_time: size=512 time=7
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=72
libc_memcpy_time: size=8192 time=69
our_memcpy_time: size=20480 time=184
libc_memcpy_time: size=20480 time=175
our_memset_time: 307
libc_memset_time: 302
io all done
testing string
done testing string

Issue: #3544

chenhy0106 added 2 commits May 8, 2024 01:18
1. Refer to musl lib to optimize private memcpy and memset.
2. Add test to compare private and libc memset.
3. Compare our memcpy with libc memcpy on more small sizes.
4. Fix a bug of core/CMakeLists.txt. For unit_tests, to
compare private and libc memcpy, we should link unit_tests
to drmemfuncs but not link to libc.

Issue: DynamoRIO#3544
Copy link
Contributor

@ksco ksco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@ksco ksco requested a review from derekbruening May 8, 2024 11:32
core/arch/riscv64/memfuncs.asm Outdated Show resolved Hide resolved
@chenhy0106 chenhy0106 requested a review from derekbruening May 9, 2024 06:32
core/io.c Outdated Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Outdated Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Outdated Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Show resolved Hide resolved
core/arch/riscv64/memfuncs.asm Show resolved Hide resolved
@ksco
Copy link
Contributor

ksco commented May 10, 2024

The newly committed comments look good to me, and PR is approved, I'm going to merge this. Thanks.

@ksco ksco merged commit ef1cd6f into DynamoRIO:master May 10, 2024
17 checks passed
@chenhy0106 chenhy0106 deleted the i#3544 branch May 17, 2024 05:27
ksco added a commit that referenced this pull request Jul 11, 2024
We optimized RV64 memfuncs in PR #6800. However, there was a mistake
where the return value of memcpy was incorrectly set to ARG2 which led
to segfaults in the release build. This PR addresses this issue.
Additionally, our unit_tests did not catch this error because
`ret = our_memcpy(&i, &j, sizeof(i));` was optimized by the compiler to
`i = j; ret = &i;`. We resolved this by utilizing the `noinline` attribute.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants