[Foundation] Data: Hash the entire contents, not just an arbitrary subset #23876
This was cherry-picked from commit b711ed9 in #23832. `Data` already implements `hash(into:)`, so arguably this change doesn't belong in that PR anyway.

This is one case where the existing hashing implementation deliberately fails to feed all relevant information into the hasher. This makes it trivial to induce an arbitrary number of collisions, despite all the effort that went into implementing a high-quality universal hash function.
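
To illustrate the collision problem, here is a minimal sketch that does not use `Data` itself. `PrefixHashedBytes`, `FullyHashedBytes`, and the 8-byte cutoff are hypothetical names and values invented for this example; they are not taken from the Foundation implementation. The first type feeds only a prefix of its bytes to the hasher, the second feeds everything.

```swift
// Hypothetical wrapper that hashes only a fixed-length prefix of its bytes.
// The cutoff of 8 is an arbitrary value chosen for illustration.
struct PrefixHashedBytes: Hashable {
    static let prefixLength = 8
    var bytes: [UInt8]

    func hash(into hasher: inout Hasher) {
        // Only the prefix reaches the hasher; the tail is ignored.
        hasher.combine(Array(bytes.prefix(Self.prefixLength)))
    }
}

// Hypothetical wrapper that feeds the entire contents to the hasher.
struct FullyHashedBytes: Hashable {
    var bytes: [UInt8]

    func hash(into hasher: inout Hasher) {
        hasher.combine(bytes)
    }
}

// 1000 distinct buffers that share the same 8-byte prefix.
let sharedPrefix = [UInt8](repeating: 0, count: 8)
let inputs: [[UInt8]] = (0..<1000).map { i in
    sharedPrefix + [UInt8(i % 256), UInt8(i / 256)]
}

let prefixHashes = Set(inputs.map { PrefixHashedBytes(bytes: $0).hashValue })
let fullHashes = Set(inputs.map { FullyHashedBytes(bytes: $0).hashValue })

// Every buffer collides when only the prefix is hashed.
print(prefixHashes.count)  // 1
// With full hashing, the values are expected to be distinct in practice.
print(fullHashes.count)
```

With prefix-only hashing, a `Set` or `Dictionary` keyed on such values would put all 1000 buffers in the same bucket chain, no matter how good the underlying hash function is.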
This seems like a particularly bad idea for `Data`, which is the standard type for safely exchanging byte buffers across module boundaries. It is reasonable to expect `Data` to implement the same high-quality hashing as `Array` does; silently doing otherwise is not a good idea.

We previously discussed this issue in #21754.