-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make strlen
more like musl version
#3
Conversation
I didn't add tests directly but I do did some unit tests: https://play.rust-lang.org/?gist=1953956b2b941525bb89abcda6925db1&version=nightly&backtrace=0&run=1 |
c07ace3
to
e8e7353
Compare
Thanks for the PR! I need to document this, but I'm relying on the test suite used for musl to check any changes. If you're on Linux (or a Linux VM), you should be able to run |
pub unsafe extern "C" fn strlen(mut s: *const c_schar) -> size_t { | ||
macro_rules! align { () => { core::mem::size_of::<size_t>() } } | ||
macro_rules! ones { () => { !0usize } } | ||
macro_rules! highs { () => { usize::wrapping_mul(ones!(), usize::MAX)/2 + 1 } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these three items (align, ones, highs) would be more appropriate as const's instead of macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm afraid CTFE of Rust is not powerful enough.
See rust-lang/rfcs#322
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, size_of won't be const for a while. That said, I still don't think that macros add anything here, and make it less readable. Perhaps these could be instantiated as stack variables instead?
I think there are 2 tricks here: * If you read from memory, it is best that the memory address is aligned to word-boundary * If you check a string of memory and you can check N bytes within the same time, you are N times faster than you check one byte at a time
} | ||
return usize::MAX; | ||
// compare with word-length | ||
w = s as _; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, inferred casts are not particularly well aligned with one of the things I've enjoyed about writing in Rust -- it encourages explicit casts to make it clear what conversions are involved.
Nits addressed |
http://doc.rust-lang.org/reference.html#conditional-compilation |
Ah I forgot to add tests. |
https://crates.io/crates/pointer-width |
@bombless Isn't this what |
Thanks @eddyb, but isn't this feature deprecated? |
Looks great, thanks! I'll run this through the test suite soon. |
pub unsafe extern "C" fn strlen(mut s: *const c_schar) -> size_t { | ||
|
||
#[inline] | ||
fn align() -> usize { core::mem::size_of::<size_t>() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this isn't a root module, the core
namespace isn't automatically available, and this won't build. Changing it to this works fine:
#[inline]
fn align() -> usize {
use core::mem;
mem::size_of::<size_t>()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done that.
But I'm not sure how it fix the problem.
I see core::usize
is at the top of this strlen.rs
originally, so I guess ::core
is already available?
I should check the project structure at first though, I'll spend some time to investigate.
Looks wierd? |
Unfortunately, musl doesn't pass all of it's own tests (here's a list of ones which fail on x86_64 with the pure C version: https://github.com/dikaiosune/rusl/blob/master/baseline_failures). The tests actually "pass" in that they don't regress any further. The CI build failed because I just now added a script which gates the build on rustfmt having nothing to change. Could you run the patch through rustfmt 0.5.0 ( EDIT: Also, sorry for not giving you a heads up on the change, I just did it earlier tonight and was in the process of closing other issues. |
Also, another heads up -- I'm going to be including clippy as a build dependency, so you may wish to hold off on further attempts until I've got that sorted out and you can rebase with the changes. |
I'll find a Linux machine tomorrow, guess that will make things more smoothly :-) |
OK, sounds good. I'll probably have clippy set up by then, so I imagine you'll want to |
|
||
#[inline] | ||
fn has_zero(x: usize) -> bool { | ||
usize::wrapping_sub(x, ones() & !x & highs()) != 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't ones() & x == x
? Seems unnecessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
has_zero
is easily answered: a binary numer has a zero if it's not completely ones, so x != ones()
is all it takes for this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, but you're looking for zero bytes. Maybe call the function has_zero_byte
? ;-)
I went back and looked at the C version: #define ALIGN (sizeof(size_t))
#define ONES ((size_t)-1/UCHAR_MAX)
#define HIGHS (ONES * (UCHAR_MAX/2+1))
#define HASZERO(x) ((x)-ONES & ~(x) & HIGHS) That's quite different from what's in this PR. In Rust, it would be: let align = mem::size_of::<usize>();
let ones = usize::MAX / (u8::MAX as usize);
let highs = ones.wrapping_mul((u8::MAX / 2 + 1) as usize);
let has_zero = |x: usize| (x.wrapping_sub(ones) & !x & highs) != 0; However, this can be made much simpler: const ONES: usize = 0x0101010101010101_u64 as usize;
const HIGHS: usize = 0x8080808080808080_u64 as usize;
let has_zero = |x: usize| (x.wrapping_sub(ONES) & !x & HIGHS) != 0; |
You can also use |
I think there are 2 tricks here: