Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle vulnerabilities which don't have any vulnerability ids #259

Merged
merged 21 commits into from
Feb 23, 2021

Conversation

sbs2001
Copy link
Collaborator

@sbs2001 sbs2001 commented Sep 26, 2020

Fixes #232

  • Get the Vulnerability model and it's methods right. Use a timestamp as a custom id for now.
  • Refactor all the codebase to comply to the new Vulnerability model.
  • Create a repo which would contain all vulnerabilities with custom ids.
  • Add option to VulnerableCode to either assign these ids or sync the ids from the repo.
  • Automate dumping the vulnerabilities with custom ids to the repo
  • Decide the custom id and implement a data structure to make it feasible.

Signed-off-by: Shivam Sandbhor [email protected]

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Sep 26, 2020

For the models the requirements as we decided upon at #232 are (this is a partial repaste of @pombredanne 's comment)

  • a. It has a CVE, then we reuse the CVE-xxxx as its identifier
  • b. It does not have a CVE, then we create the VULCODE-xxxx as its identifier
  • c. later if that VULCODE-xxx vulnerability gets a CVE, we will replace the id with the CVE-xxx id and move the VULCODE-xxx id as a reference for that vulnerability.

From a usage standpoint, this means that we should be able to search a vulnerability not only based on its identifier (that may change over time) but also based on its references.

Alternatively to c. above we could have a dedicated field to store the previous VULCODE-xxx when this is replaced by an assigned CVE-xxxid.

The model could look like this

  • Vulnerability
    • identifier: VULCODE-xxx or CVE-xxx
    • vc_identifier: empty or VULCODE-xxx if identifier is a CVE-xxx

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Sep 26, 2020

The current models gives me this

In [1]: from vulnerabilities import models                                                                                                                                                                           

In [2]: z = models.Vulnerability.objects.create(summary="tar ball")                                                                                                                                                  

In [3]: z.identifier                                                                                                                                                                                                 
Out[3]: '2020-09-26 06:32:18.832135'

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Sep 26, 2020

I'm using https://github.com/sbs2001/vulcodes as the repo for dev purpose.

@sbs2001 sbs2001 force-pushed the custom_vuln_ids branch 2 times, most recently from 705b859 to 020788c Compare September 26, 2020 13:26
@pombredanne
Copy link
Member

I am wondering ..."VULCODE-XXX" feels a little bit too generic, and I am kinda warming up to a the explicit albeit longer "VULNERABLECODE-XXX" prefix even if a tad long... it is explicit and meant to be replaced eventually by a CVE (and it is also very clear where it is coming from)?

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!
Looking good except for the "vulcode/VULCODE" name that I would like to discuss more.

vulnerabilities/management/commands/push.py Outdated Show resolved Hide resolved
@pombredanne
Copy link
Member

The model could look like this

Vulnerability
    identifier: VULCODE-xxx or CVE-xxx
    vc_identifier: empty or VULCODE-xxx if identifier is a CVE-xxx

I wonder if we should not instead just treat the "past" vc_identifier as just an external reference?

@sbs2001 sbs2001 changed the title [WIP] Handle vulnerabilities which don't have any vulnerability ids Handle vulnerabilities which don't have any vulnerability ids Oct 7, 2020
@pombredanne
Copy link
Member

I revisited https://cve.mitre.org/data/refs/index.html and I suggest this instead prefixVULCOID as a prefix which is rather unique and would be short for Vulnerable Code Id.
We could have ids in this form then where the left part is an ISO-like time stamp e.g. VULCOID-YYYY-MM-DD-HH-MM-SS
The id would case insensitive with an uppercase canonical form.
Would this work?

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Nov 27, 2020

@pombredanne VULCOID-YYYY-MM-DD-HH-MM-SS works, have you considered VULCOID-YYYY-<incremental id> ie something like CVE . FWIW It's easier to remember than the timestamp thus easy to reference.

@pombredanne
Copy link
Member

@sbs2001 re #259 (comment)

VULCOID-YYYY-MM-DD-HH-MM-SS works, have you considered VULCOID-YYYY-<incremental id> ie something like CVE . FWIW It's easier to remember than the timestamp thus easy to reference.

incremental id works too but requires more coordination than a context free timestamp. I am not sure a sequential number is easier to remember than a time stamp though it can be shorter.
We could go with:
VULCOID-YYYY-MM-DD-<incremental id> as a compromise?

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Nov 29, 2020

@pombredanne

incremental id works too but requires more coordination than a context free timestamp

How so ?

The timestamp is dependent on when the instance encountered the vulnerability. coordination is needed in that case too if we want 2 instances to "have a common language" .

IMHO coordination is unavoidable.

@pombredanne
Copy link
Member

IMHO VULCOID-YYYY-MM-DD-HH-MM-SS is the scheme that requires no coordination and creates the least risk of collision and is still somewhat readable by a human. If there are collision across DBs that's OK.

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Dec 17, 2020

@pombredanne It made sense to me about using timestamps instead of incremental ids after the chat. Main reason being, when using incremental id's it is guaranteed that collisions would occur ( between different instances) and hence they would then be needed to be resolved. This won't happen when using timestamps.

@sbs2001
Copy link
Collaborator Author

sbs2001 commented Feb 5, 2021

@pombredanne I am stashing the code to extract vulcoids to some other branch. This PR won't add that, that feature seems pre-mature to me.

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! See my comments inline. IMHO the main issue we have is the name "identifier"
Either we use that everywhere, of may be we use vulnerability_id instead which would be more explicit and work in all the contexts.

vulnerabilities/api.py Outdated Show resolved Hide resolved
vulnerabilities/api.py Show resolved Hide resolved
vulnerabilities/data_source.py Outdated Show resolved Hide resolved
vulnerabilities/import_runner.py Outdated Show resolved Hide resolved
vulnerabilities/import_runner.py Outdated Show resolved Hide resolved
vulnerabilities/importer_yielder.py Show resolved Hide resolved
vulnerabilities/importers/safety_db.py Show resolved Hide resolved
# meaning if cve_ids is not [''] but either ['CVE-123'] or ['CVE-123, CVE-124']
if len(cve_ids[0]):
cve_ids = [s.strip() for s in cve_ids.split(",")]
if advisory["cve"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be create a variable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing the safetydb changes in other pr

Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good ... I have a few minor nit pickings but this is good to merge!
Thanks!

This reduces the dependence on CVE ID. Cases
where vulnerability don't have CVE can be
handled

Signed-off-by: Shivam Sandbhor <[email protected]>
This command takes as input a remote repo's url.
Upon  invoking the command all the vulnerabilities
which were id'd by vulnerablecode will be pushed to  this repo.

Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
* Added incremental time id in import_runner.py
  to prevent vulnerability id conflicts

Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
* In model Vulnerability "identifier" -> "vulnerability_id"
* In Advisory dataclass "identifier" -> "vulnerability_id"

Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
Signed-off-by: Shivam Sandbhor <[email protected]>
@sbs2001 sbs2001 merged commit 9fe5864 into aboutcode-org:main Feb 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

On the identification of vulnerabilities
3 participants