-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster whitespace detection #26
Comments
fancy |
Did he explain it to you? He talked me through it - it's simple, cunning and opportunistic. :) |
I don't remember exactly how it worked. I will try to reverse engineer it (or ask him) later. |
The basic idea, IIRC, was that because all the whitespace characters are low in the ASCII table, you can shift 1.byte left by each byte, AND it with a bitmask containing a few strategically-placed 1s, and if the answer is nonzero, then it's whitespace. I forget how much of a saving this was, but Matthias had worked out exactly what the difference was at the machine level, and it saved a tiny amount in a particularly hot hotspot... |
@propensive If you can a link to some cod that does this I'd be appreciative. My instinct was first to just try an initial |
I don't remember exactly -- I only saw the code on his computer, but recollection is that it was something like this:
where I calculated that magic number from Though do you need it to fail fast? Could you do the shift (as above), and OR the result from each byte into an accumulator which you check once at the end of parsing, something like this? Just throwing ideas out there... I've no idea if this actually works, or profiles well... |
At nescala Matthias suggested there was a faster way to check for possible whitespace characters using bitwise logic. We should look into that.
The text was updated successfully, but these errors were encountered: