Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: audit assertions #422

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 70 additions & 0 deletions accepted/9999-npm-audit-assertions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,70 @@
# `npm audit` assertions

## Summary

A mechanism for module maintainers to assert that their modules are not impacted by advisories.

```
npm audit assert --module=<module-name> --id=<advisory identifier> --impactful=<boolean> --comment=<string with reasoning>
```

bnb marked this conversation as resolved.
Show resolved Hide resolved
## Motivation

It's relatively common that `npm audit` creates a painful experience for maintainers of transitive dependencies and for end-users, with a seemingly high likelihood to have a disproportionately large negative impact on beginners who don't actually know what's happening.

Further, the amount of noise that `npm audit` generates from transitive dependencies that rely on a "vulnerable" module where there is no world in which the "vulnerability" can be exploited leads to people ignoring `npm audit` rather than leveraging it to actually address vulnerabilities.

Enabling maintainers of dependencies - most importantly, dependencies that are most often transitive - to assert that their dependencies are not impacted by known vulnerabilities is a way to dramatically reduce the amount of pain that `npm audit` creates.

Further, this creates an additional cascading signal of _validity_ when the maintainers of a transitive dependency mark the advisory as impactful. As maintainers make assertions, we get more context about the impact of advisories across the ecosytem, which - if surfaced to end-users through `npm audit` - helps provide more context to make an informted decision about how to address advisories that they're seeing.

## Detailed Explanation

- `npm audit assert` will be added with the following arguments:
- `--module=<module-name>` **(required)** where `<module-name>` is the name of the module having an assertion made. If I was a maintainer of `fastify` and wanted to make an assertion about it, I'd use `--module=fastify`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Why module not package? I thought NPM calls them packages everywhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed; a module is a file; vulns are reported against packages.

- `--id=<advisory identifier>` **(required)** is the identifier of the advisory that impacts my module that I'm making an assertion about. As a hypothetical maintainer of `fastify` wanting to make to make an assertion about the npm advisory `1726`, I'd use `--id=1726`. This also potentially allows for future expansion into other identifiers. For example, if I wanted to make an assertion about `CVE-2021-28562` advisory, I'd use `--id=CVE-2021-28562`. I'd love to see namespacing, like `npm-1726`, but I could understand why people might be against that.
Copy link

@naugtur naugtur Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep dive on the granularity used for targeting vulnerabilities/dependencies.

TL;DR: module+id may not be enough for the most widespread usecase.

If assertions are only identified by Advisory ID and module name, it makes only one kind of assertions possible - counter-claims by the maintainer on CVEs reported against the package.
While this is simple to define and immensely valuable for the particular usecase, I'd argue it's not the majority usecase.

What seems more likely is a major package with active maintainers depends (directly or not) on a small package with hobby-level support. Now a CVE reported against the small package is more likely to be a result of a script scanning for potentially vulnerable code and on top of that - it's often the case that the small package is not used by the major package in a way that's exploitable (or at all). And last but not least, it's the major package that's likely to have the resources to investigate.

Specific example of a relationship with a vulnerable package:
Imagine a valid CVE for a small package A which is an indirect dependency of a large and popular package B.
The vulnerability is there and no valid counter-claim can be made for package A.
Package B though has a dependency Q in dependency W that only uses package A for one functionality that B never uses (Q is a toolbox style library)
Maintainer of A is not in a position to counter anything (report is valid). Maintainer of B needs to assert that the specific path in their dependencies: B:W>Q>A with specific CVE ID is not affecting the operations of package B.

The ability to declare a vulnerable dependency is not affecting a major package is what can make a noticeable dent in the number of vulnerabilities reported by npm audit even without counter-claims being widely adopted. And it's not my main usecase - it's the first thing I heard when I talked to Package Maintenance WG a while back.

I'm close to publishing the initial proposal for the format to store and exchange counter-claims (I call them decisions, because they're a bit more than a boolean with explanation).
openjs-foundation/pkg-vuln-collab-space#11

I suggest assertions accept:

  • module+id when conter-claiming a CVE reported for the package itself
  • module+id+path when counter-claiming a CVE reported for a dependency

- `--impactful=<boolean>` **(required)** this should just be `true` or `false`. If it's `true`, then an assertion is made that the advisory is impactful to the module passed to `--module`. If it's `false`, then an assertion is made that the advisory is not impactful to the module passed to `--module`.
- `--comment=<string>` **(required)** is a comment that, if false, explains why the vulnerability cannot be exploited and, if true, explains why the vulnerability can be exploited.
- `npm audit` will need to be updated to both consume additional data and provide additional filters.
- by default, `npm audit` should
- surface advisories that have no assertions or `--impactful=true` assertions
- ignore advisories that have `--impactful=false` assertions, **presuming** there are no alternative paths to the advisories that have no assertions or `--impactful=true` assertions
- display maintainer-provided comments when an assertion has been made.
- the `--assertions` flag should be added
- it should take the following values:
- `only`: only show the advisories that have have assertions, `true` or `false`
- `only-true`: only show the advisories that have `--impactful=true` assertions
- `only-false`: only show the advisories that have `--impactful=false` assertions
- `unfiltered`: show all the advisories, regardless of whether they have assertions, `true` or `false`
- these should probably be sent to the server to be filtered server-side and enable smaller responses.
- the UI should also display if an assertion has been made, and what that assertion is.
- `--json` should include assertions, if any exist.
- The addition of `npm audit assert` will require some kind of change to the server that serves the information that powers `npm audit`. Specifically, this information will need to be accepted when `npm audit assert` is run, and returned with `npm audit` information. Additionally, depending on how this is implemented, this might reduce the number of requests made and data returned to end-users. Specifically, if the CLI assumes by default that users want to ignore vulnerabilities that have been marked as `--impactful=false` by transitive (or direct) dependencies **and** that is sent to the server, the server can return only what's needed rather than the full set of data.
- This should hit a registry endpoint that is routed through the registry the user has defined. This should be done with both the explicit goal of allowing push/pull of _internal_ assertions, provided that the regsitry the user is talking to support this, and enabling third-party registries to cache the assertions on the public registry.

## Rationale and Alternatives

- Do nothing: The current state of the world that has caused [pain](https://overreacted.io/npm-audit-broken-by-design/).
- Allow reporters to mark which modules are impacted: not really scalable, especially since reporters won't be experts in the same way that maintainers are.
- Do something in a neutral space, run by an independent organization: There's no reason this couldn't eventually be implemented under this API. It is contingent on that work being done and being successfully adopted by others. That shouldn't necessarily prevent us from implementing this and mapping it over to that later.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what this gets at is that there may other tools that will want to use the "asserted" data. The npm repository seems like a reasonable option for the data as it already the source of truth for the package itself. A few ideas off the top of my head:

  • provide a way to get the asserted data for a module without using the npm client (existing UI?, REST enpoint?)
  • Include the data when a module is installed/updated, that would like the local copy be used by other tools.
    allow other tools to leverage the data.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume this data would be included with existing package metadata and therefore accessible how people already access package metadata.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here mostly. To get the broader industry support I think this needs to be successful, we will need a well documented schema and set of api's. This feedback is something I have heard from multiple people.

I think the schema and api to load these counter claims needs to be platform agnostic. I think that even "publish this to the npm registry" should be an implementation detail for npm. Preferably npm would just build a client for this contract, or at maybe we need an "extensions" to the api so that npm could say "instead of a GET request to your provider server, install package foo and read file claims.json". Again, this is just rough ideas.

This solution both begins to address the problem that exists presently, doesn't exclude potential external solutions/patterns from being adopted in the future, and allows an implementation to be implemented with _relative_ haste.

## Implementation

{{Give a high-level overview of implementation requirements and concerns. Be specific about areas of code that need to change, and what their potential effects are. Discuss which repositories and sub-components will be affected, and what its overall code effect might be.}}

{{THIS SECTION IS REQUIRED FOR RATIFICATION -- you can skip it if you don't know the technical details when first submitting the proposal, but it must be there before it's accepted}}

## Prior Art

I've not seen other tools doing this, but I would be surprised if there's not prior art. After some searching, I can't find anthing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naugtur links to your work as prior art would be good here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on a clean schema to PR to pkg collab space. Maybe I'll finish on Friday.
I thought this RFC is kept as a historical record. I should create a new one or rewrite this totally.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closest that I'm aware of is Snyk's "Runtime prioritization":

Prioritize your fixes based on an analysis of the vulnerabilities that are called at runtime of the application and bear a higher risk

One way of defining "impactful" could be "is an affected function called by the dependency: if it isn't, then it's presumably not an 'impactful' vulnerability".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, it's a different RFC 😅
I'll share links soon then. Thx

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explains plans and links to most of the stuff
https://dev.to/naugtur/do-you-need-help-with-your-npm-audit-3olf

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a direct link to the tool
https://www.npmjs.com/package/npm-audit-resolver

Copy link

@naugtur naugtur Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format for storing and exchanging assertions/decisions/resolutions/thingies

openjs-foundation/pkg-vuln-collab-space#11

## Unresolved Questions and Bikeshedding

- The potential values for the `--assertions` filter on `npm audit` could probably be bikeshedded a bit. I'm unattached to the names.
- UI in `npm audit` outpit could be bikeshedded as well.
- It would be cool if there was the ability to have multiple maintainers run `npm audit assert` for a given `id` and surface that _multiple_ people signed off on the assertion as a way to begin combatting some forms of social engineering attacks we've seen in the past, but that might be too complex for now.
- There's some need for granularity that is not presently defined by this RFC. Choosing how to address that is... likely necessary.
- semver ranges, dep type (direct, dev, optional, peer), binary impact vs. non-binary impact, and more.
- There's been an expressed desire for trusting third-parties *outside* of direct maintainers to provide asssertions. This should be discussed.