Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix - [CodeQL:Warning]: SM04514: Weak hashes (in make-util.js) - AB#2236221 #20728

Merged
merged 2 commits into from
Dec 16, 2024

Conversation

AdityaMankal-MS
Copy link
Contributor

@AdityaMankal-MS AdityaMankal-MS commented Dec 12, 2024

Context

  • In the make-util.js, MD5 was being used to solve the problem of creating a unique, filesystem-safe identifier for each archive based on its URL. This identifier is used to name the directory where the archive is downloaded and extracted. Using a hash algorithm solves the following problems.
    1. Ensure Filename Uniqueness: The hash algorithm creates a unique identifier for each URL to avoid filename collisions, even if sanitized URLs are similar.
    2. Filesystem Safety: Hashing ensures the generated filenames are free of invalid characters that could cause issues on different filesystems.
    3. Compact Representation: The hash produces a short, consistent, and manageable string, avoiding overly long or complex filenames derived directly from URLs.
    4. Consistency Across Platforms: The hash guarantees uniform directory names regardless of the operating system or filesystem.

Change Description

This PR Addresses the CodeQL warning of using a weak Hash algorithm in make-util.js by replacing the MD5 algorithm with the SHA256 algorithm as per the recommendation - [SM04514] Weak hashes.

Considerations

The length of targetPath will increase proportionally with the hash output length.

  • MD5: Shortest at 32 characters.
  • SHA-256: Doubles the hash length to 64 characters.
  • SHA-384: Triples it to 96 characters.
  • SHA-512: Quadruples it to 128 characters.

Most modern filesystems (e.g., NTFS, ext4) support filenames up to 255 characters and full paths up to ~32,000 characters. However, if our Agent runs in environments with stricter limits, this could be a concern.

Change Validation

Created a simple NodeJS project and tested the downloadArchiveAsync function with MD5 and SHA algorithms -
javascript-project-crypto.zip

Observation: Length of the name of the archive when using SHA256 hash is double when compared to the MD5 hash.

Note: The '.completed' suffix has nothing to do with the chosen algorithm. This string is being appended in the script.

Task Name: NA

Documentation changes required: (Y/N) N

Added unit tests: (Y/N) N

Attached related issue: (Y/N) Y

Checklist:

  • Task version was bumped - please check instruction how to do it
  • Checked that applied changes work as expected

Related Links

  1. [SM04514] Weak hashes

@tarunramsinghani tarunramsinghani enabled auto-merge (squash) December 16, 2024 06:03
@tarunramsinghani tarunramsinghani merged commit 3224341 into master Dec 16, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants