Instructions for Identifying Dependencies Unclear #631

Open
RespiteSage opened this issue Jun 6, 2017 · 9 comments
Labels
bug dependencies documentation live-online-scan (Anything that requires live, online network access and would not work in an isolated network) must have package scan

Comments

@RespiteSage

The project description lists dependencies as one of the main categories of items Scancode detects; however, it is not clear from the wiki, the readme, or the command-line help how one actually uses Scancode to detect dependencies or whether this functionality is still missing or in development. Please update the readme or wiki to accurately reflect the state of dependency-checking in Scancode.

@pombredanne
Member

@KinXer Thanks for chiming in; this is a fair point. This is not well documented: the dependency data comes from the package scan. I will make sure this is clear in the upcoming docs for release 2.0 (i.e. in the wiki, the CLI help and the readme).

@pombredanne
Member

@KinXer In case this was not clear: it comes with the default scan or with the --package option.

@RespiteSage
Author

RespiteSage commented Jun 18, 2017

@pombredanne Your previous reply made that clear; thank you for the replies. I do not, however, see how the packages section of the output is particularly useful for identifying dependencies. If that is a misunderstanding on my part, I ask that you make the documentation clear on that point.

@pombredanne
Member

pombredanne commented Jun 23, 2017

At the moment, the direct dependencies of npm (JS), composer (PHP) and maven (POMs) packages are collected in the "dependencies" subsection of a given package entry.

Others are in the works (such as Godeps and Rubygems in https://github.com/nexB/scancode-toolkit-contrib/tree/develop/src/packagedcode2)

BUT there is a bug: this is not wired correctly!
For a Maven POM you should see output like the following, and at the moment you do not:

$ ./scancode -p -f json-pp tests/packagedcode/data/m2/p6spy/p6spy/1.3/p6spy-1.3.pom
Scanning files for: packages with 1 process(es)...
Scanning files...
[####################] 1                       
Scanning done.
Scan statistics: 1 files scanned in 0s.
Scan options:    packages with 1 process(es).
Scanning speed:  2.39 files per sec.
Scanning time:   0s.
Indexing time:   0s.
Saving results.
{
  "scancode_notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
  "scancode_version": "2.0.0",
  "scancode_options": {
    "--package": true,
    "--license-score": 0,
    "--ignore": [],
    "--format": "json-pp"
  },
  "files_count": 1,
  "files": [
    {
      "path": "1.3/p6spy-1.3.pom",
      "scan_errors": [],
      "packages": [
        {
          "type": "Apache Maven",
          "name": "p6spy:p6spy",
          "version": "1.3",
          "primary_language": "Java",
          "packaging": "archive",
          "summary": "P6Spy",
          "description": "P6Spy is an open source framework for applications that intercept and optionally modify database statements.",
          "payload_type": null,
          "size": null,
          "release_date": null,
          "authors": [
            {
              "type": "person",
              "name": "Alan Arvesen",
              "email": "[email protected]",
              "url": null
            },
            {
              "type": "person",
              "name": "Bradley Johnson",
              "email": "[email protected]",
              "url": null
            },
            {
              "type": "person",
              "name": "Frank Quatro",
              "email": "[email protected]",
              "url": null
            },
            {
              "type": "person",
              "name": "Jeff Goke",
              "email": "[email protected]",
              "url": null
            },
            {
              "type": "person",
              "name": "thinknot",
              "email": "[email protected]",
              "url": null
            }
          ],
          "maintainers": [],
          "contributors": [],
          "owners": [],
          "packagers": [],
          "distributors": [],
          "vendors": [],
          "keywords": [],
          "keywords_doc_url": null,
          "metafile_locations": [],
          "metafile_urls": [],
          "homepage_url": "http://www.p6spy.com/",
          "notes": null,
          "download_urls": [],
          "download_sha1": null,
          "download_sha256": null,
          "download_md5": null,
          "bug_tracking_url": null,
          "support_contacts": [],
          "code_view_url": null,
          "vcs_tool": null,
          "vcs_repository": null,
          "vcs_revision": null,
          "copyright_top_level": null,
          "copyrights": [],
          "asserted_licenses": [
            {
              "license": "The P6Spy Software License, Version 1.1",
              "url": "http://cvs.sourceforge.net/viewcvs.py/*checkout*/p6spy/p6spy/license.txt?rev=HEAD",
              "text": null,
              "notice": null
            }
          ],
          "legal_file_locations": [],
          "license_expression": null,
          "license_texts": [],
          "notice_texts": [],
          "dependencies": {
            "compile": [
              {
                "name": "regexp:regexp",
                "version": null,
                "version_constraint": "1.3"
              },
              {
                "name": "gnu-regexp:gnu-regexp",
                "version": null,
                "version_constraint": "1.1.4"
              },
              {
                "name": "log4j:log4j",
                "version": null,
                "version_constraint": "1.2.8"
              },
              {
                "name": "ant:ant",
                "version": null,
                "version_constraint": "1.6.2"
              },
              {
                "name": "oracle:classes12",
                "version": null,
                "version_constraint": "9.2.0.5"
              },
              {
                "name": "jboss:jboss",
                "version": null,
                "version_constraint": "2.4.6"
              }
            ]
          },
          "related_packages": []
        }
      ]
    }
  ]
}

I need to fix this ASAP!
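
For readers who want to pull the dependencies out of a scan programmatically, here is a minimal sketch. It assumes the JSON layout shown in the 2.0.0 output above (files, then packages, then a "dependencies" mapping of scope to entries); other releases may lay this out differently. The function name `iter_dependencies` and the trimmed `SAMPLE_SCAN` data are illustrative and not part of ScanCode.

```python
import json


def iter_dependencies(scan):
    """Yield one (path, package, scope, dependency, constraint) tuple per dependency."""
    for scanned_file in scan.get("files", []):
        for package in scanned_file.get("packages") or []:
            # "dependencies" maps a scope such as "compile" to a list of entries.
            dependencies = package.get("dependencies") or {}
            for scope, entries in dependencies.items():
                for entry in entries:
                    yield (
                        scanned_file.get("path"),
                        package.get("name"),
                        scope,
                        entry.get("name"),
                        entry.get("version_constraint"),
                    )


# Trimmed copy of the scan output above; a real run would json.load() a saved scan file.
SAMPLE_SCAN = json.loads("""
{
  "files": [
    {
      "path": "1.3/p6spy-1.3.pom",
      "packages": [
        {
          "name": "p6spy:p6spy",
          "dependencies": {
            "compile": [
              {"name": "regexp:regexp", "version": null, "version_constraint": "1.3"},
              {"name": "log4j:log4j", "version": null, "version_constraint": "1.2.8"}
            ]
          }
        }
      ]
    }
  ]
}
""")

rows = list(iter_dependencies(SAMPLE_SCAN))
for path, package, scope, name, constraint in rows:
    print(f"{path}: {package} -> {name} {constraint} ({scope})")
```

Note that the progress lines printed before the opening brace in the console output are not part of the JSON, so they would need to be excluded before parsing.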

pombredanne added a commit that referenced this issue Jun 23, 2017
 * rename class to MavenPomPackage for clarity
 * fix the metafiles and the recognition for #631
 * add additional api and recognize tests
 * clean up the API doc and other minor refactorings

Signed-off-by: Philippe Ombredanne <[email protected]>
@pombredanne
Member

This latest commit fixes the lack of Maven package collection. The others should work OK. Out of curiosity, what are the package managers/formats you work with?

@RespiteSage
Author

Mostly Maven and Pip. Thank you for the Maven update. It should be very helpful.

@pombredanne
Member

@KinXer Thank you ++ for bringing it up ... I cannot fathom how the whole code was there but not wired in properly, and that there were no proper tests on the CLI and package recognition side :|

Note a couple of things:

  1. On the Python side, we need to improve the code for installed package detection (e.g. dist-info, egg-info, etc.) in "Recognize Python packages (wheels, eggs, various dist as archives or installed)" #253 and properly add the dependencies in "Add dependencies for Pypi packages" #653.

  2. Eventually there will be a tool to also resolve dependencies (including querying remote repos) in https://github.com/nexB/dependentcode/ : some details are in "Add draft conventions documentation for AboutCode Data (i.e. ABCD)" aboutcode#2 (comment) and there is some ongoing discussion in "Some ideas for package manager-independent dep resolution" dependency-inspector#1.

@pombredanne
Member

The Maven dependencies should be properly collected in develop now. We still need to add proper docs.

pombredanne removed this from the v3.3 milestone Sep 24, 2021
pombredanne added the live-online-scan label (Anything that requires live, online network access and would not work in an isolated network) Feb 2, 2022
@pombredanne
Member

Some updates on how we handle dependencies now, reposted from #3828 (comment):

Just a few updates:

  1. We now detect direct dependencies in manifests and lockfiles in ScanCode toolkit.
  2. deplock in https://github.com/nexB/dependency-inspector/ can generate missing dependency lockfiles, which can then be parsed as in item 1.
  3. PurlDB can scan the source and binaries of the packages and store the scan results.
  4. ScanCode.io can detect the dependencies like ScanCode toolkit does, including by parsing a lockfile generated by deplock.
  5. We can also match other non-documented dependencies using matchcode (backed by PurlDB signatures).
  6. ScanCode.io can also find "hidden" dependencies in binaries using the "map deploy to devel" pipeline.
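
To illustrate what "direct dependencies in manifests" means in the first item of the list above, here is a rough sketch that reads the dependencies declared in a Maven POM with the Python standard library. This is not ScanCode's implementation; the function name `direct_dependencies` and the sample POM are made up for illustration, and the sketch assumes the POM declares the standard Maven 4.0.0 XML namespace (older POMs without a namespace would need different lookups).

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven 4.0.0 POM files (assumed to be declared by the input).
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def direct_dependencies(pom_text):
    """Return {scope: [{"name": "group:artifact", "version_constraint": ...}]}."""
    root = ET.fromstring(pom_text)
    by_scope = {}
    # Only the project's own <dependencies> block: no parents, no transitive resolution.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
        artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
        version = dep.findtext("m:version", default=None, namespaces=POM_NS)
        # Maven treats a missing <scope> as "compile".
        scope = dep.findtext("m:scope", default="compile", namespaces=POM_NS)
        by_scope.setdefault(scope, []).append(
            {"name": f"{group}:{artifact}", "version_constraint": version}
        )
    return by_scope


# Hypothetical POM used only for this example.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>example</groupId>
  <artifactId>demo</artifactId>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>1.2.8</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>"""

deps = direct_dependencies(SAMPLE_POM)
for scope, entries in deps.items():
    for entry in entries:
        print(scope, entry["name"], entry["version_constraint"])
```

The sketch only reads what the manifest declares; resolving versions or transitive dependencies is the job of a lockfile or of a resolver, which is the gap the later items in the list address.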

A simple process to scan all the dependencies:

  1. run deplock
  2. then scan your project in ScanCode.io to detect the packages
  3. also add the populate purldb pipeline: this will trigger a full source and binary scan of all the dependencies
  4. enrich the scan results with a purldb lookup
