Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MD5 SIMD #56558

Open
ronag opened this issue Jan 11, 2025 · 10 comments
Open

MD5 SIMD #56558

ronag opened this issue Jan 11, 2025 · 10 comments
Labels
feature request Issues that request new features to be added to Node.js.

Comments

@ronag
Copy link
Member

ronag commented Jan 11, 2025

What is the problem this feature will solve?

MD5 hashing is quite commonly used for file handling in HTTP. However, the algorithm is quite slow and is not able to fully utilize modern hardware.

What is the feature you are proposing to solve the problem?

While it's not possible to do much more to optimize a single hashing instance there are techniques such as https://github.com/minio/md5-simd which is able to run multiple hashing instances over SIMD which can process 8 MD5 hashes in parallel using SIMD instructions.

In a real world application such as a http file server/client (using Content-MD5) with many parallel request this would provide a 2-6x real-world performance improvement. In some of our applications the MD5 hash takes more than 50% of cpu time.

This should be possible to implement without changing our API's.

What alternatives have you considered?

Run hashing in a thread pool. One does not necessarily exclude the other. Using a thread pool would be more about avoiding latency spikes as in terms of throughput just forking the http server provides similar results.

@ronag ronag added the feature request Issues that request new features to be added to Node.js. label Jan 11, 2025
@github-project-automation github-project-automation bot moved this to Awaiting Triage in Node.js feature requests Jan 11, 2025
@ronag
Copy link
Member Author

ronag commented Jan 11, 2025

@nodejs/performance @mcollina @anonrig @lemire

@mcollina
Copy link
Member

How are you doing md5? In order support simd in this fashion, you'd need to batch the md5 requests and execute them in a separate thread anyway (maybe a libuv worker). It has to be asynchronous because you want other http requests to come in and request md5 hashing.

Literally this is worth doing only if your application is doing many md5 hashes concurrently (to the point it's mostly doing md5s).

Have you tried wrapping that module in a native addon and experimenting?

@tniessen
Copy link
Member

In a real world application such as a http file server/client (using Content-MD5) with many parallel request this would provide a 2-6x real-world performance improvement. In some of our applications the MD5 hash takes more than 50% of cpu time.

Could you share the JavaScript/Node.js code that led to these numbers?

@lemire
Copy link
Member

lemire commented Jan 11, 2025

@tniessen My suspicion is that @ronag refers to credible/potential gains and not to an actual Node.js implementation.

@mcollina
Copy link
Member

I think the information and discussion in fastify/fastify-etag#91 can be relevant.

@tniessen
Copy link
Member

FWIW, MD5 is not a good hash function by any means and new applications should avoid it except to interact with older protocols. That being said, even for hash functions that should be used, I am not sure that this kind of optimization makes sense within Node.js. I am not particularly optimistic about OpenSSL adding this internally, but that still seems like a better place to start optimizing the hash function implementation.

@ronag
Copy link
Member Author

ronag commented Jan 13, 2025

FWIW, MD5 is not a good hash function by any means and new applications should avoid it except to interact with older protocols.

Indeed. However, Content-MD5 is commonly used https://datatracker.ietf.org/doc/html/rfc1864

I am not sure that this kind of optimization makes sense within Node.js.

Why?

I am not particularly optimistic about OpenSSL adding this internally,

Why does it need to live in OpenSSL?

but that still seems like a better place to start optimizing the hash function implementation.

I don't think the MD5 hash implementation can be much more optimized than it already is in terms of processing a single stream.

@lemire
Copy link
Member

lemire commented Jan 13, 2025

I don't think the MD5 hash implementation can be much more optimized than it already is in terms of processing a single stream.

On ARM hardware, Bun is measurably faster than Node at MD5 hashing.

JavaScript hashing speed comparison: MD5 versus SHA-256

So I would not rule out small gains (e.g. 20%).

@tniessen
Copy link
Member

Indeed. However, Content-MD5 is commonly used https://datatracker.ietf.org/doc/html/rfc1864

Yes, this falls under the "older protocols" I mentioned above, but of course I did not mean to imply that these older protocols are not being used anymore. The unfortunate popularity of MD5 is also why OpenSSL has not removed it yet despite being fundamentally broken and slow.

Why does it need to live in OpenSSL?

We already use OpenSSL's implementation of MD5, so if that can be made faster for all downstream consumers, that'd be my first approach. Avoiding yet another dependency or even our own bespoke implementation of cryptographic primitives is a good rule of thumb in my opinion.

@joyeecheung
Copy link
Member

joyeecheung commented Jan 14, 2025

On ARM hardware, Bun is measurably faster than Node at MD5 hashing.

If you are using crypto.createHash() the overhead primarily comes from wrapper management/V8's GC performance, instead of having something to do with actual hashing, like what we already found in nodejs/performance#136 - to eliminate these factors, use crypto.hash() (which is why it's significantly faster than crypto.createHash() for one off hashes, though this doesn't matter all that much when hashing a big stream in chunks which is what crypto.createHash() is really for). Even using cppgc-based wrappers would be enough to make crypto.createHash() 8-10% faster in a microbenchmark that misuse it for one-off hashing, which shows how little this have to do with actual hashing and more having to do with GC...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Issues that request new features to be added to Node.js.
Projects
Status: Awaiting Triage
Development

No branches or pull requests

5 participants