Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: audit assertions #422

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

RFC: audit assertions #422

wants to merge 9 commits into from

Conversation

bnb
Copy link

@bnb bnb commented Jul 28, 2021

There's been an interest expressed in the ecosystem of having some form of counterclaim for advisories surfaced by npm audit. There's been some discussion of a potential counterclaim mechanism for some time, but I've not seen a solution proposed.

The intent of this RFC is to do that - propose a solution. I do not expect that this solution will go through unanimously and unchanged, but I'd like to get something up that can be talked about and addressed both by the ecosystem and by those thinking about Security in the registry and CLI. If the answer to this is "no" with some set of reasons, that's a perfectly reasonable outcome.

I'm particularly interested in feedback from security researchers on why this is / is not a good solution, and what alternatives might be implemented or what tweaks might be made to make it a better solution. I appreciate feedback from developers who care, but also want folks to keep in mind that this has the potential for massively negative impact should a vulnerability get through that should not get through.

Copy link
Contributor

@ljharb ljharb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems awesome.

What would be most ideal, combined with this, is a way for maintainers to get disclosures that these advisories impact their packages before muggles do, so that there’s a window in which the assertion can be made before any noise has been unnecessarily spread.

accepted/9999-npm-audit-assertions.md Outdated Show resolved Hide resolved
@darcyclarke darcyclarke added the Agenda will be discussed at the Open RFC call label Jul 28, 2021
@wesleytodd
Copy link

A couple of initial thoughts, but I will want to think on this approach a bit more from a holistic sense as well.

  1. impactful=true|false might not be enough granularity
  2. we need a range specifier for sure (either a part of --module or a separate flag)
  3. we need a reason/explanation, otherwise we will loose valuable context. This should be required.
  4. we should differentiate between usage as a prod/dev/peer dep as well since in some cases that will matter
  5. on the audit/consumer side, we need a mechanism for defining trust.

I think num 5 is my largest concern, all the rest are really implementation details. My first thought on 5 is that these opinions need to be published as structured packages on the registry. Maybe the default one is provided by npm, and is what these claims are registered against, but I know for a fact we will need to override this trust. I think this will be really important to iron out before this lands.

- Do nothing: The current state of the world that has caused [pain](https://overreacted.io/npm-audit-broken-by-design/).
- Allow reporters to mark which modules are impacted: not really scalable, especially since reporters won't be experts in the same way that maintainers are.
- Do something in a neutral space, run by an independent organization: There's no reason this couldn't eventually be implemented under this API. It is contingent on that work being done and being successfully adopted by others. That shouldn't necessarily prevent us from implementing this and mapping it over to that later.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what this gets at is that there may other tools that will want to use the "asserted" data. The npm repository seems like a reasonable option for the data as it already the source of truth for the package itself. A few ideas off the top of my head:

  • provide a way to get the asserted data for a module without using the npm client (existing UI?, REST enpoint?)
  • Include the data when a module is installed/updated, that would like the local copy be used by other tools.
    allow other tools to leverage the data.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume this data would be included with existing package metadata and therefore accessible how people already access package metadata.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree here mostly. To get the broader industry support I think this needs to be successful, we will need a well documented schema and set of api's. This feedback is something I have heard from multiple people.

I think the schema and api to load these counter claims needs to be platform agnostic. I think that even "publish this to the npm registry" should be an implementation detail for npm. Preferably npm would just build a client for this contract, or at maybe we need an "extensions" to the api so that npm could say "instead of a GET request to your provider server, install package foo and read file claims.json". Again, this is just rough ideas.

## Prior Art

I've not seen other tools doing this, but I would be surprised if there's not prior art. After some searching, I can't find anthing.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@naugtur links to your work as prior art would be good here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working on a clean schema to PR to pkg collab space. Maybe I'll finish on Friday.
I thought this RFC is kept as a historical record. I should create a new one or rewrite this totally.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The closest that I'm aware of is Snyk's "Runtime prioritization":

Prioritize your fixes based on an analysis of the vulnerabilities that are called at runtime of the application and bear a higher risk

One way of defining "impactful" could be "is an affected function called by the dependency: if it isn't, then it's presumably not an 'impactful' vulnerability".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, it's a different RFC 😅
I'll share links soon then. Thx

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This explains plans and links to most of the stuff
https://dev.to/naugtur/do-you-need-help-with-your-npm-audit-3olf

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a direct link to the tool
https://www.npmjs.com/package/npm-audit-resolver

Copy link

@naugtur naugtur Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format for storing and exchanging assertions/decisions/resolutions/thingies

openjs-foundation/pkg-vuln-collab-space#11


- The potential values for the `--assertions` filter on `npm audit` could probably be bikeshedded a bit. I'm unattached to the names.
- UI in `npm audit` outpit could be bikeshedded as well.
- It would be cool if there was the ability to have multiple maintainers run `npm audit assert` for a given `id` and surface that _multiple_ people signed off on the assertion as a way to begin combatting some forms of social engineering attacks we've seen in the past, but that might be too complex for now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good if there was a way for third parties to make assertions, and then end users decide which third parties they would trust.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this, but I could see a potential argument for it.

It seems like a product space that could be explored by companies (for example, Sonatype Nexus or jFrog Artifactory) but adding in additional complexity of "trust external authorities" to users who are already innundated with options is complexity that I'm not sure I want to make users have to care about. What happens when someone trusts giithub rahter than github, synk rather than snyk, or IḄM rather than IBM? There are solutions here but it's just unnecessary complexity in a solution that's inteded to help solve complexity.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I was meaning in my number 5 point above. Having multiple sources of trust here is also important. I even think we might need to go so far as to provide a way to trust opinions in a "scope". For example to say "I trust the express team on modules provided by them, but not on the ones provided by the react team which might be shared".

And to go even one step further, I think you would want to only trust the express maintainers for transitives imported as part of their dep tree exclusivly. So if there was a shared dependency express > path-to-regexp and react-router > path-to-regexp then you would still need to report on path-to-regexp even if the express counter claim said it could be ignored.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who do you trust is a crucial point here.
In general maintainers don't want to include vulnerabilities in their packages, but they do, on accident. That's why it's great that there are neutral people looking at these packages and creating CVE-s.

In the same fashion, most maintainers don't want to sign off dependencies as safe when they are not, but again: mistakes happen. We are trusting the same person not to make a mistake, but this time there is no third party to verify the claim (of course, we could have CVE-s listing false claims, but that would recreate the situation we have today, or even worse).

Not to mention that maintainers have strong incentives to sign off packages with minimal time spent looking into all the possible ramifications: if your package produces lots of npm audit hits, you receive lots of issues. If you claim that all your dependencies are safe, you get less issues...

@thescientist13
Copy link

thescientist13 commented Jul 28, 2021

Apologies if I'm speaking out of turn here, but in regards this proposal at a high level, but I suppose specifically around this flag in particular (i think this is what @wesleytodd was getting at here maybe? - #422 (comment))

  • --impactful=<boolean> this should just be true or false. If it's true, then an assertion is made that the advisory is impactful to the module passed to --module. If it's false, then an assertion is made that the advisory is not impactful to the module passed to --module.

Isn't the nuance in Dan's post that it really all comes down to the runtime context of the vulnerability in question that matters the most? For example, if there is a critical vulnerability in lodash that is based on malicious user input; if Gatsby is using it for its dev server, then user input likely never enters the equation. But if I am using it in my client side and say supplying user input because i'm using it for a filter tab in my UI from user data, then the vulnerability would apply.

So I guess, who is the intended user / audience of this command? Is it even limited to a certain type of user?

  1. Package authors
  2. Direct package consumers
  3. Transitive package consumers

I would argue it's only likely users 2 + 3 that are in the best context to determine the scope of the impact, but it seems that this command is intended to be run by users 1, but aren't they limited by the lack of their awareness to the direct proximity of the actual target site, if it even applies to them in that case?

I guess I'm not sure if this RFC aims to reconcile that at all, or it is literally just for exposing a way to counter a claim, with the scope of who should run it, is not intended here. Or is the intent that this would be more of a crowd sourcing mechanism, so based on real user input, the severity of a vulnerability could be challenged? Or that the CVE itself would be updated for more context based on the incoming reports?

Apologies for the questions, or if this isn't the right forum, happy to post this somewhere else. 👍

@ljharb
Copy link
Contributor

ljharb commented Jul 28, 2021

Package authors absolutely can know - just not always.

@bnb
Copy link
Author

bnb commented Jul 28, 2021

What would be most ideal, combined with this, is a way for maintainers to get disclosures that these advisories impact their packages before muggles do, so that there’s a window in which the assertion can be made before any noise has been unnecessarily spread.

@ljharb I am extremely against a system like this. It creates a weird caste system that's incredibly easy to game. There's a company that's done a similar thing, and it could not have been more end-user hostile and easy to get around to get early access to information.

If security researchers disagree with me, I'm more than happy to concede to their recommendation.

@asciimike
Copy link

So I guess, who is the intended user / audience of this command? Is it even limited to a certain type of user?

  1. Package authors
  2. Direct package consumers
  3. Transitive package consumers

I would argue it's only likely users 2 + 3 that are in the best context to determine the scope of the impact, but it seems that this command is intended to be run by users 1, but aren't they limited by the lack of their awareness to the direct proximity of the actual target site, if it even applies to them in that case?

IMO the point of this command is to filter out the noise of "not-actionable" vulnerabilities for package consumers, so I view it as: 1 is the group best able to set the impactful flag (e.g. they will know best if the vulnerability is going to impact everyone, only appears in their dev server, doesn't appear at all, etc.), while 2 and 3 are in the best position to use it (combined with 4: tools that help resolve vulnerabilities [disclosure: I work on Dependabot]).

@thescientist13
Copy link

thescientist13 commented Jul 28, 2021

1 is the group best able to set the impactful flag (e.g. they will know best if the vulnerability is going to impact everyone, only appears in their dev server, doesn't appear at all, etc.)

But how would they know all that and / or communicate that without knowing the usage context? (is what I'm not understanding here atm it appears.) Unless there is more nuance in this reporting structure / mechanism that I am missing?

@asciimike
Copy link

What would be most ideal, combined with this, is a way for maintainers to get disclosures that these advisories impact their packages before muggles do, so that there’s a window in which the assertion can be made before any noise has been unnecessarily spread.

@ljharb I am extremely against a system like this. It creates a weird caste system that's incredibly easy to game. There's a company that's done a similar thing, and it could not have been more end-user hostile and easy to get around to get early access to information.

If security researchers disagree with me, I'm more than happy to concede to their recommendation.

I totally agree that we don't want to create a "these people are special and get to see vulnerabilities first", but I definitely want to see if we can get maintainers (ideally first of direct dependencies) to signal if the dependency impacts them ASAP.

I view the current noise problem as: $DEEP_DEPENDENCY_IN_POPULAR_PACKAGE is potentially affected by some vulnerability in $DEEPER_DEPENDENCY and today we notify everyone in $POPULAR_PACKAGE about things that often don't affect them. IMO, it's better to get the maintainer of $DEEP_DEPENDENCY to say "this does/doesn't affect my consumers" than to immediately send it to $POPULAR_PACKAGE's maintainers to try and sort out.

@asciimike
Copy link

asciimike commented Jul 28, 2021

1 is the group best able to set the impactful flag (e.g. they will know best if the vulnerability is going to impact everyone, only appears in their dev server, doesn't appear at all, etc.)

But how would they know all that and / or communicate that without knowing the usage context? (is what I'm not understanding here atm it appears.) Unless there is more nuance in this reporting structure / mechanism that I am missing?

Ah, I think I may be thinking of maintainer/direct/indirect differently, so I'll provide an example. Let's imagine that I maintain $PACKAGE (I think this is "direct"?) and you maintain $DEPENDENCY (I think this is "maintainer"), which I depend on. Let's also imagine that $POPULAR_PACKAGE (which is "indirect") depends on my $PACKAGE.

If someone reports a vulnerability on #DEPENDENCY, you are definitionally impacted by this, so impactful=true by default (though this probably brings up the "is a boolean the right choice" question), thus you won't really be using this tool at all (which I think is what you mentioned). It's now up to me, the maintainer of $PACKAGE to determine if it's impactful in my use case (which is what I meant by package maintainers of direct dependencies, apologies). So maintainers of directs would be the ones who set the flag, and consumers of dependencies where impactful=true would be the ones consuming it.

@bnb
Copy link
Author

bnb commented Jul 28, 2021

Just for clarification:

The goal is to get maintainers (in a previous comment, group 1) to run this command to provide a signal to their users about whether or not a (transitive) dependency is impactful to a given module.

@thescientist13
Copy link

thescientist13 commented Jul 28, 2021

Sorry, yeah, those labels weren't super explicit but I think we're mostly still talking about the same thing: some package (1) <- some meta framework (2) <- user of said meta framework (3)

But I think @bnb 's comment helps some I think

The goal is to get maintainers (in a previous comment, group 1) to run this command to provide a signal to their users about whether or not a (transitive) dependency is impactful to a given module.

Would be curious to know how this command gets scoped so that only that particular maintainer (user #2 in my fictional scenario) is indeed them and they are in the "right" to signal this claim.

@wesleytodd
Copy link

I am extremely against a system like this. It creates a weird caste system that's incredibly easy to game.

I think this already exists and is made up of the module authors already. This is also the same group @ljharb proposes to get the "early access" so I am not sure it this holds up. I totally understand where you are coming from and agree that a caste system is not what we want.

Maybe a middle ground would be a grace period after the cve is registered in the database where it is only reported on if a flag is added? Even a 1 day grace here could do a lot to avoid the storm that can happen when a cve report goes live.

@wesleytodd
Copy link

After thinking about this more and reading all of these comments I think that the npm RFC portion of this work should a client built on top of some other systems. That said, if we were to choose a shorter term approach of proving out a method we plan to up level to a more industry applicable solution that would be fine to. I still think that we would want to design the short term solution in a way to actually prove out the larger concerns though, and in that regard this RFC is not yet all we need.

So, to satisfy that I think we need a proposal for an agnostic api for a "source of trust" provider. This api should be simple to implement and scalable (think like how the plain documents work to form the registry api's we use). Additionally it should specify at least a starter schema for what a counter claim looks like. Once we have that, I think that an npm specific implementation were the source of trust could be a published package would be awesome.

Additionally, a command in the cli would act as a client for this api. For the filing of a counter claim we would want the interface to provide all the necessary schema parts, but other than that I imagine we could leave the handling of that up to the implementers. In this way, we would get npm native control over how it manages it's own default trust but also enable third parties and enterprise use cases to have a clear path to implement their own counter claim ingestion model.


A deep dive would be awesome to sort this all out, unfortunately I think I will have a difficult time making next week. I am on vacation and will be in Yellowstone where from what I hear any connectivity is hard to come by. If that timeline is what folks want, I can write up my thoughts as best I can, but I was hoping we could have a few discussions in the collab space before going too far on solutions.

@bnb
Copy link
Author

bnb commented Jul 28, 2021

Would be curious to know how this command gets scoped so that only that particular maintainer (user #2 in my fictional scenario) is indeed them and they are in the "right" to signal this claim.

presumably if you can npm publish to a module (in the RFC, I use the example that I am theoretically a maintainer of fastify), these rights would be granted to you as well. The authentication mechanism would be the same: npm login (the real name is npm adduser but lol).

@thescientist13
Copy link

Would be curious to know how this command gets scoped so that only that particular maintainer (user #2 in my fictional scenario) is indeed them and they are in the "right" to signal this claim.

presumably if you can npm publish to a module (in the RFC, I use the example that I am theoretically a maintainer of fastify), these rights would be granted to you as well. The authentication mechanism would be the same: npm login (the real name is npm adduser but lol).

Gotcha that makes sense. So if npm audit submits to NPM, then we should be good to go then. Is npm where these reports would be submitted? Or would this local check just be a step 1 in the process?


Also, I think I understand a bit better the mechanism you are proposing now after your replies, so effectively from your example, running

$ npm audit assert --module=fastify --id=CVE-xxxx --impactful=<boolean>

The CVE in question then would be against one of fastify's dependencies (e.g. something from its package.json), not itself.

If so, I wonder if there's any additional context / heuristics that could be gleamed from looking at package.json or package-lock.json / yarn.lock to provide additional context at the time npm audit is run?

  • is said vulnerability in dependencies / devDependencies / peerDependencies
  • if not directly defined in a manifest then, "mark" it as a transitive dependency (of some sort of top level) and factor that in perhaps

Some of that info could be useful for helping others become more aware of in what specific way fastify (in this case) is using said dependency, and thus if it even falls into the scope of the CVE (or at least help paint a more accurate picture of usage). Maybe with some sort of confidence score, much like dependabot now when submitting a PR to a repo.

@iamwillbar
Copy link

I've been doing some work with the National Telecommunications and Information Administration (NTIA) around Software Bill of Materials, and they've also been working on some requirements for sharing vulnerability exploitability which aligns with some of the goals of this RFC.

They've been working with the CSAF 2.0 project in OASIS to contribute a VEX profile that allows you to communicate whether a particular vulnerability is exploitable in a particular software product.

Looking at the information they capture the only thing missing from this RFC would be the assertion specifying why the vulnerability can't be exploited: "SHALL contain a a description why the vulnerability cannot be exploited". So as @wesleytodd suggested above I recommend that this information be captured as a required field.

@ljharb
Copy link
Contributor

ljharb commented Jul 28, 2021

@bnb i'm not sure what you mean. there's already such a system - dependency maintainers versus end users. This is both correct and necessary.

fwiw, CVE disclosure already works this way for the vulnerable package. I was only suggesting that this exclusivity be slightly loosened to include authors that depend on the vulnerable package.

@bnb
Copy link
Author

bnb commented Jul 28, 2021

fwiw, CVE disclosure already works this way for the vulnerable package.

yes, the vulnerable package maintainers need to know their package is vulnerable. Not everyone can sign up randomly to be a maintainer of that package and get early access to the information before it's publish, presumably?

I was only suggesting that this exclusivity be slightly loosened to include authors that depend on the vulnerable package.

how do you implement this in a way that isn't effectively the exact same as publishing publicly?

@bnb
Copy link
Author

bnb commented Jul 28, 2021

Looking at the information they capture the only thing missing from this RFC would be the assertion specifying why the vulnerability can't be exploited: "SHALL contain a a description why the vulnerability cannot be exploited". So as @wesleytodd suggested above I recommend that this information be captured as a required field.

@iamwillbar I've added --comment=<string> as a required flag. Glad to know that I've already gotten pretty far along what those other grpups are already working towards <3

@asciimike
Copy link

asciimike commented Aug 12, 2021

I think some of the concerns expressed earlier could be mitigated if there was more choice in terms of which asserts to trust. For example there are many maintainers that I would trust to make good assertions even if I would be more cautious with others. I might also trust come organizations etc.

IIRC this came up in the Open RFC meeting discussion last week w.r.t. https://github.com/naugtur/npm-audit-resolver and #18.

I have been thinking about this in the context of "I trust GitHub's security lab", "I trust my organization's infosec team", or even "I trust the maintainers of sufficiently large packages", but "I don't trust any arbitrary maintainer."

IMO, the tricky thing is that the average npm user probably hasn't thought deeply about who they trust, so having sane defaults here is going to be key to actually reducing the noise. Potentially something like "maintainers of direct dependencies" (or even "trust maintainers by default" since you're already trusting them by default by using their packages, as brought up earlier in this discussion), top N consumers of a package, certain companies, etc.

Individuals could override it with something like npm audit --trust-assertions-from=npmjs.org,[email protected] or via the presence of a file like audit-resolve.json (note the flag's name is terrible, but for the sake of clarity it's included in full).

@Tofandel
Copy link

Tofandel commented Aug 13, 2021

What about adding a categorization of vulnerabilities?

Having categories for vulnerabilities affecting nodejs servers, affecting the browser, affecting code build tool, etc

And then the users specify in their local package what exactly the app does and npm audit should only show the corresponding vulnerabilites eg: build a bundle for the browser locally => only show vulnerabilites affecting the browser

Run a nodejs server => show those vulnerabilities

Run a nodejs server where users can submit and build code dynamically => show all the vulnerabilites

There isn't a ton of use cases for npm so I think it could easily fit in less than 5 categories to be defined

@ljharb
Copy link
Contributor

ljharb commented Aug 13, 2021

There are innumerable use cases for npm, I can think of dozens of categories off the top of my head.

@bnb
Copy link
Author

bnb commented Aug 20, 2021

It's on/off. You have to trust all maintainers or none. I think some of the concerns expressed earlier could be mitigated if there was more choice in terms of which asserts to trust. For example there are many maintainers that I would trust to make good assertions even if I would be more cautious with others. I might also trust some organizations etc.

@mhdawson I get concerned about reducing the viability of this feature by extending it to be configurable in this way. For example, if the ecosystem starts going down the path of recommending to ignore all maintainers' assertions then we're back in the situation we're in now.

Further, thinking about this in the context of scalability I'm not sure if this is actually helpful: you and I may know that someone like Sindre is probably trustworthy/knowledgeable and coolGuyMaintainer420 who happens to maintain one module with 400k downloads maybe isn't. I don't think it's reasonable to expect a random infrastructure team at JetBlue to be able to reasonably make that trust judgement.

This is explicitly why the unfiltered option exists: it still provides the information that a maintainer made an assertion, but shows it to the end-user and allows them to do what they want with that information.

My personal preference would be to ship with just unfiltered and allow people to play with building on top of that to see what solutions come up that we might not be thinking about, as I very much imagine there's going to be people who want to do that. If there's additional information we could include in the response to help with that (perhaps a list of maintainers and whether or not they each made an assertion?), I think it would be very reasonable to include that.

Individuals could override it with something like npm audit --trust-assertions-from=npmjs.org,[email protected] or via the presence of a file like audit-resolve.json (note the flag's name is terrible, but for the sake of clarity it's included in full).

@asciimike what would you see the model for identity being here? How do I register npmjs.org with assertions? What happens if I let package.com lapse and it gets picked up by a malicious third-party?

I think this could be good but I'd want to have an airtight definition of the model we use for identification.

One thing I have liked for a long time is the idea of publishing meta information that can be consumed by npm - so for example, npm audit --trust=@npmjs/assertions where @npmjs/assertions is a module that returns a JSON object in a certain format. If we require scopes for assertions this makes things pretty nice imo. If you really want to get into it, it also enables the following in package.json:

{
  ...
  "assertions": {
    "trust": {
      "@npmjs/assertions": "^1.2.0"
    },
    "distrust": {
      "@evilcorp/assertions": "^6.8.11"
    }
  } 
  ...
}

This also has the benefit of routing through the same registry mechanisms that npm already uses, so you'd be able to privately publish assertions and have them be usable internally with the same registry setup you already have.

I do, however, think that shipping without this and then adding it in a later RFC would be the most ideal path both for the sake of complexity/velocity and for the sake of validating that we're going down the right path.

@bnb
Copy link
Author

bnb commented Aug 20, 2021

There isn't a ton of use cases for npm so I think it could easily fit in less than 5 categories to be defined

This isn't really true. It is by far the largest programming ecosystem in the world, serving many different use cases. Static Analysis would be the most useful thing here, but unfortunately that is Very Hard in JavaScript and imo out of scope for the npm CLI.

@mhdawson
Copy link

mhdawson commented Aug 23, 2021

I get concerned about reducing the viability of this feature by extending it to be configurable in this way. For example, if the ecosystem starts going down the path of recommending to ignore all maintainers' assertions then we're back in the situation we're in now.

I don't quite follow how making it configurable causes this problem versus making it better. If you only have the choice between accepting all or ignoring all, then there is a higher chance that the recommendation could end up being to ignore all.

In terms of scalability I would not expect every individual company to build up their own list, but instead rely on shared experience. For example maybe company X publishes what they use and other smaller companies rely on that. I will agree that it makes things more complicated. The question in my mind is if security organizations will adopt the "trust all" if the answer is yes, then that is the easiest/lowest friction. I would agree that it should be easy/low friction to say "trust all" if that is what you want.

@asciimike
Copy link

@asciimike what would you see the model for identity being here? How do I register npmjs.org with assertions? What happens if I let package.com lapse and it gets picked up by a malicious third-party?

I think this could be good but I'd want to have an airtight definition of the model we use for identification.

Domains may be a bad option due to difficulty of proving ownership :/

On some further thought, tying it to existing user accounts may be feasible. If someone asked me to build a system for GitHub, starting with GitHub GPG keys (e.g. https://github.com/<user>.gpg) feels reasonable. It could be extended to allow "trust an attestation from any maintainer of a repo, or member of an org" (e.g. --trust org or --trust org/repo). Each attestation would be signed with the key associated with the user account making the attestation, and could thus be attributable to that user.

I assume that npm could provide a similar mechanism (users upload their own keys, or keys are generated like how GitHub does it), and you could provide per-user, per-package, and per-organization attestation filtering.

@bnb
Copy link
Author

bnb commented Sep 1, 2021

I don't quite follow how making it configurable causes this problem versus making it better. If you only have the choice between accepting all or ignoring all, then there is a higher chance that the recommendation could end up being to ignore all.

It makes the problem significantly less clear for end-users. Trust vs. no trust is an easy concept. Trust vs. find random sources to trust that are influenced by social capital and personalities vs. no trust is a weird space that massively complicates things. For example:

  • Maintainers
    • Should maintainers be publishing those who they trust?
    • Are maintainers required to maintain something entirely new?
    • Are maintainers going to be shamed into maintaining something new?
  • Companies
    • How do they configure trust consistently at scale?
    • How do they know who to trust?
    • What happens when their sources of trust are compromised or out of date?
    • Who maintains trust lists?
    • What happens when the people who maintain trust lists leave?
    • What happens if someone sets up trust that they shouldn't?
    • What happens when the previous point causes fear and thus they trust nobody?
  • Individuals
    • How do I know who to trust?
    • What happens when I forget I trusted someone that I shouldn't have?
    • Will npm tell me when someone I trust is untrustworthy?
    • Twitter Influencer tells me I shouldn't trust someone - do I believe them?

Compared to:

  • I can (transitively) trust the maintainers of my direct dependencies and their judgement
  • I can't (transitively) trust the maintainers of my direct dependencies

I think that for an initial feature, the latter is far more beneficial. It accomplishes multiple things:

  • Immediately reduces unnecessary and unhelpful noise for the maintainers of vital infrastructure
  • Introduces an option for those who really care about this kind of thing to have high fidelity input from maintainers on what the maintainers think, despite those who really care not necessarily being SMEs in any given module.
    • Allows faster triage and private patching.
  • Limits the additional context ecosystem members need around such a feature to be comfortable, allowing adoption to be smoother and grow the collective common knowledge more rapidly.
    • This is an approach that's often taken in project management, game design, and other fields: make incremental changes to validate and ensure you're headed in the right direction.
  • Is not one-way: shipping this feature with third-party trust will require significantly more effort and is not something that can be easily undone if we realize it's a mistake. Shipping this feature without third-party trust allows us to have the feature and expand on it later if that's something that we independently identify as being useful.
  • The set of people talking here is relatively limited. It would be trivially easy to pipe out the unfiltered output as JSON and build additional tooling on top of it. IMO doing that and seeing what the ecosystem does (if anything) is ideal, since there may well be better ideas and approaches on third-party trust that we've not considered.

@mhdawson
Copy link

mhdawson commented Sep 3, 2021

As a last comment I will agree that capturing and getting information from the maintainers is a good first step and don't want to complicate initial steps from being taken :)

@stefee
Copy link

stefee commented Sep 4, 2021

One way of framing the problem that has been discussed so far in this thread is to think of each npm user as being in one of the following categories:

  1. I trust the maintainers of the packages I depend on.
  2. I do not trust the maintainers of the packages I depend on.
  3. I have never thought about whether or not I trust the maintainers of the packages I depend on.
  4. I have a much more nuanced view than that.

For users in category 1 and 2, the proposed changes have no adverse effect either way, since I already made a conscious choice to opt in or opt out of the npm registry (i.e. consent).

If I imagine that I am in category 3, the proposed changes cause a subtle change for me, as previously I did not realise consciously that I was giving implicit consent to package maintainers to make code changes on my behalf, and now the scope of that consent is being increased to also encompass making good decisions about security on my behalf (?) - and this change in scope may happen without my knowledge when I start using the latest version of npm.

If I am in category 4, then it's possible I am happy to trust maintainers under the terms of how npm currently works, but I would not be happy to trust maintainers under the terms of how npm works if the proposed changes were to go into effect.

Does this pose a security risk or a significant ethical problem? I really don't think so.

I think what it might mean is, if the changes are accepted, and there is general consensus that the scope of consent is in fact changing as a result, npm has some measure of responsibility to tell its users that it is happening so that they can make informed choices.

@bnb
Copy link
Author

bnb commented Sep 7, 2021

If I imagine that I am in category 3, the proposed changes cause a subtle change for me, as previously I did not realise consciously that I was giving implicit consent to package maintainers to make code changes on my behalf, and now the scope of that consent is being increased to also encompass making good decisions about security on my behalf (?) - and this change in scope may happen without my knowledge when I start using the latest version of npm.

I generally think this is an accurate representation. The point I'd make here is that maintainers are already implicitly making decisions about security on your behalf. For example, if I think there's a vulnerable path in module and PR it to module and the maintainers of module accept it, they are making a security decision about my application. While the proposed avenue to make such decisions is more direct, it's also more directly auditable.

npm has some measure of responsibility to tell its users that it is happening so that they can make informed choices.

I generally agree with this and would expect quite a bit of communication through various avenues around it.

@lil5
Copy link

lil5 commented Oct 27, 2021

As I have mentioned in this issue
npm/cli#3930 (comment)

a. Maybe it's better to add something like the audit level, a boolean flag to specify if the issue will only effect a developer's machine.

b. Or just advise that the audit errors in question are labelled by a low audit level (e.g. info) and that by default npm audit does not show errors with info level.

@ljharb
Copy link
Contributor

ljharb commented Oct 27, 2021

That's not something that can be known by a package author. Nothing's always a dev dep, even eslint.

@Thayaruban
Copy link

Is this proposal to be implemented soon?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.