Prevent symlinks causing duplicate package-file relationships #1168
Changes look good - I can add a test for this new behavior so we explicitly show future developers that duplicate IDs are supposed to be filtered out.
As symlinks are traversed as part of file resolution, a package that owns a file and its respective symlinks causes multiple relationships to be created between the package and the file (as the symlinks do not appear in the list of files in the output).

We prevent these files from being confused with each other by de-duplicating the files at the point of creating ownerships, and removing duplicate coordinates. This ensures we only get a single copy of each relationship.

Signed-off-by: Justin Chadwell <[email protected]>
6dab642 to 6481206
That would be awesome :) I'm definitely not as familiar with how syft does testing, so any help is massively appreciated 🎉
* main:
  Update syft bootstrap tools to latest versions. (anchore#1171)
  Fix update-bootstrap-tools workflow (anchore#1170)
  workflow to create automated PRs to update bootstrap tools (anchore#1167)
  feat: add support for licenses in package-lock json v2 (anchore#1164)
  External sources configuration (anchore#1158)
  feat: add support for pnpm (anchore#1166)
  Prevent symlinks causing duplicate package-file relationships (anchore#1168)
  Associate node package licenses from node_modules (anchore#1152)
  Give the contributing guide a substantial rework (anchore#1155)

Signed-off-by: Christopher Phillips <[email protected]>
* main:
  Update syft bootstrap tools to latest versions. (#1176)
  enhance development support on macOS ARM (#1163)
  Capture if a node module is private (#1161)
  Find version numbers from jars with different naming conventions (#1174)
  Update syft bootstrap tools to latest versions. (#1171)
  Fix update-bootstrap-tools workflow (#1170)
  workflow to create automated PRs to update bootstrap tools (#1167)
  feat: add support for licenses in package-lock json v2 (#1164)
  External sources configuration (#1158)
  feat: add support for pnpm (#1166)
  Prevent symlinks causing duplicate package-file relationships (#1168)
  Associate node package licenses from node_modules (#1152)
As symlinks are traversed as part of file resolution, a package that owns a file and its respective symlinks causes multiple relationships to be created between the package and the file (as the symlinks do not appear in the list of files in the output). This seems to have been introduced in #782.
For example, see the following scan of an alpine image that contains the `libz` package:

As you can see, we have two copies of the exact same relationship, one for the symlink `/lib/libz.so.1` and one for the regular file `libz.so.1.2.12` - which are both owned by the `libz` package. With #1156 merged, this also shows up as incorrect output for the `hasFiles` field. As of v0.3.0 the golang SPDX parser produces an incorrect result, and produces a `nil` value in the `Package.Files` field: see here.

We prevent these files from being confused with each other by de-duplicating the files at the point of creating ownerships, and removing duplicate coordinates. This ensures we only get a single copy of each relationship.
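
To illustrate the de-duplication described above, here is a minimal sketch. It is not the actual syft code: `Coordinates`, `Relationship`, `ownershipRelationships`, and the `resolve` callback are hypothetical names, and the full path `/lib/libz.so.1.2.12` for the regular file is an assumption made for the example. The idea is that every claimed path is resolved to the coordinates of the real file, and only the first occurrence of each set of coordinates produces a relationship.

```go
package main

import "fmt"

// Coordinates is a hypothetical stand-in for a resolved file location:
// the real path (after following symlinks) and the filesystem it lives in.
type Coordinates struct {
	RealPath     string
	FileSystemID string
}

// Relationship is a hypothetical package-to-file ownership edge.
type Relationship struct {
	PackageID string
	File      Coordinates
}

// ownershipRelationships resolves each path claimed by a package and emits
// at most one relationship per distinct set of coordinates. The resolve
// callback is an assumption standing in for the file resolver: it maps a
// path (possibly a symlink) to the coordinates of the real file.
func ownershipRelationships(pkgID string, claimed []string, resolve func(string) (Coordinates, bool)) []Relationship {
	seen := make(map[Coordinates]struct{})
	var rels []Relationship
	for _, p := range claimed {
		c, ok := resolve(p)
		if !ok {
			continue // path could not be resolved; nothing to own
		}
		if _, dup := seen[c]; dup {
			continue // a symlink and its target resolve to the same file
		}
		seen[c] = struct{}{}
		rels = append(rels, Relationship{PackageID: pkgID, File: c})
	}
	return rels
}

func main() {
	// Hypothetical layout mirroring the libz example: one symlink and
	// the regular file it points to.
	links := map[string]string{"/lib/libz.so.1": "/lib/libz.so.1.2.12"}
	resolve := func(p string) (Coordinates, bool) {
		if target, ok := links[p]; ok {
			p = target
		}
		return Coordinates{RealPath: p, FileSystemID: "layer-0"}, true
	}

	claimed := []string{"/lib/libz.so.1", "/lib/libz.so.1.2.12"}
	for _, r := range ownershipRelationships("libz", claimed, resolve) {
		fmt.Printf("%s -> %s\n", r.PackageID, r.File.RealPath)
	}
}
```

Using the coordinates struct itself as the map key works because Go structs made only of comparable fields are comparable, so no separate ID string is needed in this sketch. The real change may key on a different identifier.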