Unexpected archive maxDepth behavior #2942

Open
rgmz opened this issue Jun 7, 2024 · 4 comments
Contributor

rgmz commented Jun 7, 2024

TruffleHog Version

3.78.0

Trace Output

2024-06-07T10:23:31-04:00       info-0  trufflehog      running source  {"source_manager_worker_id": "43W0T", "with_units": true}
2024-06-07T10:23:31-04:00       info-0  trufflehog      archiver.Decompressor   {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "depth_before": 0, "depth_after": 1}
2024-06-07T10:23:31-04:00       info-0  trufflehog      archiver.Extractor      {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "depth_before": 1, "depth_after": 2}
2024-06-07T10:23:31-04:00       info-0  trufflehog      archiver.Decompressor   {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "filename": "mariner-4.3.2.tgz", "size": 905, "depth_before": 2, "depth_after": 3}
2024-06-07T10:23:31-04:00       info-0  trufflehog      archiver.Extractor      {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "filename": "mariner-4.3.2.tgz", "size": 905, "depth_before": 3, "depth_after": 4}
2024-06-07T10:23:31-04:00       info-0  trufflehog      archiver.Decompressor   {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "filename": "mariner-4.3.2.tgz", "size": 905, "filename": "albatross-0.1.0.tgz", "size": 321, "depth_before": 4, "depth_after": 5}
2024-06-07T10:23:31-04:00       error   trufflehog      error unarchiving chunk.        {"source_manager_worker_id": "43W0T", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "error": "error extracting archive with format: .tar: handling file: frobnitz/charts/mariner-4.3.2.tgz: error extracting archive with format: .tar: handling file: mariner/charts/albatross-0.1.0.tgz: max archive depth reached"}
2024-06-07T10:23:31-04:00       info-0  trufflehog      finished scanning       {"chunks": 10, "bytes": 979, "verified_secrets": 0, "unverified_secrets": 0, "scan_duration": "7.209832ms", "trufflehog_version": "dev"}

Expected Behavior

Max archive depth should reflect the real nesting depth of an archive. For example, a depth of 5 should correspond to 5 nested archives.

if depth >= maxDepth {
	h.metrics.incMaxArchiveDepthCount()
	return ErrMaxDepthReached
}

Actual Behavior

An archive with 2 levels of nesting is being reported as 5 levels.

error extracting archive with format: .tar: handling file: frobnitz/charts/mariner-4.3.2.tgz: 
-> error extracting archive with format: .tar: handling file: mariner/charts/albatross-0.1.0.tgz: max archive depth reached

Proof:

# First archive (depth 0)
$ tar -vtf frobnitz-1.2.3.tgz
drwxr-xr-x mattbutcher/staff 0 2016-05-24 13:42 frobnitz/
...
-rw-r--r-- mattbutcher/staff 905 2016-05-25 18:24 frobnitz/charts/mariner-4.3.2.tgz

# Second archive (depth 1)
$ tar -vtf mariner-4.3.2.tgz
drwxr-xr-x mattbutcher/staff 0 2016-05-24 13:38 mariner/
...
-rw-r--r-- mattbutcher/staff 321 2016-05-25 18:24 mariner/charts/albatross-0.1.0.tgz

# Third archive (depth 2)
$ tar -vtf albatross-0.1.0.tgz
drwxr-xr-x mattbutcher/staff 0 2016-05-25 18:20 albatross/
-rw-r--r-- mattbutcher/staff 81 2016-05-24 13:37 albatross/Chart.yaml
drwxr-xr-x mattbutcher/staff  0 2016-05-24 13:37 albatross/charts/
drwxr-xr-x mattbutcher/staff  0 2016-05-24 13:37 albatross/templates/
-rw-r--r-- mattbutcher/staff 19 2016-05-25 18:20 albatross/values.toml

Steps to Reproduce

  1. Download https://github.com/helm/helm/blob/25053e6adabd4d31edd036514b21527a384cea4f/pkg/chartutil/testdata/frobnitz-1.2.3.tgz
  2. Run TruffleHog: ./trufflehog/trufflehog filesystem frobnitz-1.2.3.tgz
  3. Observe a "max archive depth reached" error for the albatross-0.1.0.tgz file.
    2024-06-07T10:21:40-04:00       error   trufflehog      error unarchiving chunk.        {"source_manager_worker_id": "UkFZY", "unit": "frobnitz-1.2.3.tgz", "unit_kind": "unit", "timeout": 30, "error": "error extracting archive with format: .tar: handling file: frobnitz/charts/mariner-4.3.2.tgz: error extracting archive with format: .tar: handling file: mariner/charts/albatross-0.1.0.tgz: max archive depth reached"}
    

Environment

N/A

Additional Context

It appears that each archive gets passed to h.openArchive twice:

  1. Triggers archiver.Decompressor
  2. Triggers archiver.Extractor

Each time, the depth is incremented.

References

N/A

@rgmz rgmz added the bug label Jun 7, 2024
Contributor Author

rgmz commented Jun 7, 2024

This doesn't seem to occur if the order of archiver.Decompressor and archiver.Extractor is swapped, per #2928 (comment). However, that may be a symptom rather than the root cause.

Collaborator

ahrav commented Jun 9, 2024

We likely need to refine the depth incrementing logic to make it smarter. I'll look into this. Thanks for raising the issue.

Contributor Author

rgmz commented Jun 13, 2024

> This doesn't seem to occur if the order of archiver.Decompressor and archiver.Extractor is swapped, per #2928 (comment). However, that may be a symptom rather than the root cause.

This isn't a feasible solution because it breaks compressed zips.

func TestHandleCompressedZip(t *testing.T) {

baloo commented Aug 15, 2024

I'm hitting this issue.
According to git-bisect, this would have been caused by http://github.com/trufflesecurity/trufflehog/pull/2387
