[FEATURE] Deserialize(denormalize) JSON/XML documents to Python objects(data model) #185

Closed
garethr opened this issue Mar 3, 2022 · 17 comments
Labels: enhancement (New feature or request)

@garethr

garethr commented Mar 3, 2022

I'd like to be able to take existing CycloneDX documents, and work with them as Python objects. This library has the models, but I didn't see a from_xml or from_json method or similar.

This would make building tooling that consumes SBOMs much easier.
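For illustration, the kind of entry point being requested could look roughly like the sketch below. The `Bom` and `Component` classes here are hypothetical stand-ins written for this example, not the library's models, and `from_json` is an assumed method name taken from the request above. Only a handful of CycloneDX fields are handled.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Component:
    # Hypothetical stand-in for a library model class.
    name: str
    version: str = ""


@dataclass
class Bom:
    # Hypothetical stand-in; covers only a few CycloneDX fields.
    spec_version: str
    components: List[Component] = field(default_factory=list)

    @classmethod
    def from_json(cls, text: str) -> "Bom":
        """Build a Bom from a CycloneDX JSON string (no schema validation)."""
        data: Dict[str, Any] = json.loads(text)
        if data.get("bomFormat") != "CycloneDX":
            raise ValueError("not a CycloneDX document")
        components = [
            Component(name=c["name"], version=c.get("version", ""))
            for c in data.get("components", [])
        ]
        return cls(spec_version=data["specVersion"], components=components)


doc = (
    '{"bomFormat": "CycloneDX", "specVersion": "1.4", '
    '"components": [{"type": "library", "name": "requests", "version": "2.27.1"}]}'
)
bom = Bom.from_json(doc)
print(bom.spec_version, bom.components[0].name)
```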

jkowalleck changed the title from "Support for deserialising JSON/XML documents into Python objects" to "Support for deserialising(denormalize) JSON/XML documents into Python objects" on Mar 3, 2022
jkowalleck added the "enhancement" (New feature or request) label on Mar 3, 2022

@jkowalleck
Member

Thanks for the feature request, @garethr .

Most important question for now: do you plan on implementing the feature yourself, or are you requesting that it be implemented?


Initial thoughts that we could discuss, @madpah:

Feasibility

This library is feature complete regarding data models, having everything in place from 1.0 to 1.4 spec.
Feasibility: check ✔️

Input validation

When it comes to deserializing/denormalizing, it is preferable to validate the input document (JSON/XML) first. Therefore, schema detection and a JSON/XML schema validator are required.
These features are currently non-public in the library. We might need to improve them before making them public.
(Possible discussion: do we depend on additional 3rd-party libs, or are they optional dependencies, and we switch off deserializing if the deps are missing?)
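To make the optional-dependency question concrete, here is a minimal sketch of one way it could work for JSON input: detect the spec version from the document itself, then validate only if a validator is installed. The function names and the set of accepted versions are assumptions made for this example, not the library's internals. `jsonschema` is a real third-party package, treated here as optional.

```python
import json
from typing import Any, Dict, Optional

try:
    import jsonschema  # optional third-party validator
except ImportError:
    jsonschema = None

# Assumption: the spec versions this sketch accepts for JSON input.
SUPPORTED_JSON_VERSIONS = {"1.2", "1.3", "1.4"}


def detect_spec_version(document: Dict[str, Any]) -> str:
    """Return the declared spec version, or raise if it is unusable."""
    if document.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX JSON document")
    version = document.get("specVersion")
    if version not in SUPPORTED_JSON_VERSIONS:
        raise ValueError(f"unsupported specVersion: {version!r}")
    return version


def validate_if_possible(
    document: Dict[str, Any], schema: Optional[Dict[str, Any]]
) -> bool:
    """Validate when both a schema and the optional dependency are present.

    Returns True if validation ran and passed, False if it was skipped.
    Raises jsonschema.ValidationError if the document is invalid.
    """
    if jsonschema is None or schema is None:
        return False
    jsonschema.validate(instance=document, schema=schema)
    return True


document = json.loads('{"bomFormat": "CycloneDX", "specVersion": "1.4"}')
print(detect_spec_version(document))
print(validate_if_possible(document, schema=None))
```

Returning `False` for "skipped" keeps deserialization usable without the extra dependency. The alternative raised above, refusing to deserialize when the dependency is missing, would replace that early return with an exception.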

possible implementation details

to be discussed: architecture, design
keywords: (some answers may be obvious)

  • where would a Python module be placed/named in the project - folder/file path?
  • do we call them deserializer or denormalizer?
  • to what level are tests required, what test coverage is acceptable?
  • do we intend to work with abstract base classes (protocols) that will be implemented in an XML and a JSON denormalizer - or go with bare implementations?
  • what public (helper) functions/methods/interfaces should exist?
  • how will schema detection and schema validation happen? are they optional? are they additional functionality or an internal part of the deserialization process?
  • what will happen if a deserialization input is insufficient/invalid according to spec?
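As one possible answer to the base-class question, the sketch below defines a shared protocol and two small implementations, one for JSON and one for XML. All class and method names are hypothetical. To stay short, the "model" is reduced to a list of component names.

```python
import json
import xml.etree.ElementTree as ET
from typing import List, Protocol


class Denormalizer(Protocol):
    """Hypothetical common interface for all input formats."""

    def component_names(self, text: str) -> List[str]: ...


class JsonDenormalizer:
    def component_names(self, text: str) -> List[str]:
        data = json.loads(text)
        return [c["name"] for c in data.get("components", [])]


class XmlDenormalizer:
    def component_names(self, text: str) -> List[str]:
        root = ET.fromstring(text)
        # Derive the namespace from the root tag so any spec version works.
        ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""
        prefix = f"{{{ns}}}" if ns else ""
        # root.iter() walks all descendants, so nested components are included.
        return [
            el.findtext(f"{prefix}name", default="")
            for el in root.iter(f"{prefix}component")
        ]


def load_names(denormalizer: Denormalizer, text: str) -> List[str]:
    # Callers depend only on the protocol, not on a concrete format.
    return denormalizer.component_names(text)


json_doc = '{"bomFormat": "CycloneDX", "specVersion": "1.4", "components": [{"name": "requests"}]}'
xml_doc = (
    '<bom xmlns="http://cyclonedx.org/schema/bom/1.4">'
    '<components><component type="library"><name>requests</name></component></components>'
    "</bom>"
)
print(load_names(JsonDenormalizer(), json_doc))
print(load_names(XmlDenormalizer(), xml_doc))
```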

jkowalleck changed the title from "Support for deserialising(denormalize) JSON/XML documents into Python objects" to "[FEATURE] Deserialize(denormalize) JSON/XML documents to Python objects(data model)" on Mar 3, 2022

@madpah
Collaborator

madpah commented Mar 8, 2022

@garethr - great idea, we should look into providing this.

@jkowalleck - we're not 100% complete on schema coverage yet - see the following known open feature requests:

@iandh

iandh commented Apr 20, 2022

I'm part of a team that also needs this feature.

We started working on a parser, and would be happy to contribute something back that was better aligned to this team's vision.

The biggest hurdle we experienced is inconsistent naming between model __init__ method arguments and model fields (and some not matching the JSON schema).

I'm a big fan of pydantic for handling serialization, but I realize that's a bit of an architectural change.
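To illustrate the naming hurdle, here is a minimal sketch of a generic builder that bridges JSON keys and constructor arguments. The `Component` class, its argument names, and the `OVERRIDES` table are hypothetical and chosen only to show the mismatch; they are not taken from the library.

```python
import inspect
import re
from typing import Any, Dict, Type, TypeVar

T = TypeVar("T")


def camel_to_snake(name: str) -> str:
    """Convert 'externalReferences' to 'external_references'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class Component:
    # Hypothetical model: 'component_type' does not match the JSON key 'type'.
    def __init__(self, name: str, component_type: str, bom_ref: str = "") -> None:
        self.name = name
        self.component_type = component_type
        self.bom_ref = bom_ref


# Explicit table for keys that no naming rule can derive.
OVERRIDES = {"type": "component_type"}


def build(cls: Type[T], data: Dict[str, Any], overrides: Dict[str, str]) -> T:
    """Map JSON keys onto constructor arguments, skipping unknown keys."""
    params = inspect.signature(cls).parameters
    kwargs = {}
    for key, value in data.items():
        arg = overrides.get(key, camel_to_snake(key.replace("-", "_")))
        if arg in params:
            kwargs[arg] = value
    return cls(**kwargs)


raw = {"type": "library", "name": "requests", "bom-ref": "ref-1", "unknownKey": 1}
component = build(Component, raw, OVERRIDES)
print(component.component_type, component.bom_ref)
```

Every entry in a table like `OVERRIDES` is a place where the model and the schema disagree, which is why consistent naming matters for any generic parser.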

@madpah
Collaborator

madpah commented Apr 21, 2022

Thanks for joining the conversation @iandh. Let us take a look at your work and pydantic and we'll come back to you shortly.

Thanks again!

@alexwahl

I would also be very happy if that feature were available. Is there any progress on this issue?

@jkowalleck
Member

Hello @alexwahl ,

There is no implementation so far. Are you interested in contributing?
If so, feel free to get in contact with @madpah; he has considered this feature for version 3, which is currently in preparation.

@nettrino

I would be interested in contributing / helping build towards such functionality - I think it should be doable to auto-generate the python classes based on the spec right? (e.g., https://github.com/CycloneDX/specification/blob/1.4/schema/bom-1.4.schema.json) @madpah happy to sync offline if you have started working on this already

@jkowalleck
Member

re: #185 (comment)

I think it should be doable to auto-generate the python classes based on the spec right

Totally. If the input file validates against the applicable schema, then generic/auto-generated code should work 👍
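As a rough illustration of generating classes from a schema, the sketch below builds a dataclass from a tiny hand-written fragment in JSON Schema style. The fragment is an assumption made for this example and is not copied from the CycloneDX schema; a real generator would also need to handle `$ref`, enums, arrays, and nested objects.

```python
from dataclasses import field, make_dataclass
from typing import Any, Dict, Optional

# Hand-written fragment for illustration only.
SCHEMA: Dict[str, Any] = {
    "title": "Component",
    "type": "object",
    "required": ["type", "name"],
    "properties": {
        "type": {"type": "string"},
        "name": {"type": "string"},
        "version": {"type": "string"},
    },
}

TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}


def class_from_schema(schema: Dict[str, Any]) -> type:
    """Create a dataclass whose fields mirror the schema's properties."""
    required = set(schema.get("required", []))
    props = schema["properties"]
    # Required fields go first: fields without defaults must precede defaulted ones.
    ordered = sorted(props, key=lambda name: name not in required)
    fields = []
    for name in ordered:
        py_type = TYPE_MAP.get(props[name].get("type"), Any)
        if name in required:
            fields.append((name, py_type))
        else:
            fields.append((name, Optional[py_type], field(default=None)))
    return make_dataclass(schema["title"], fields)


Component = class_from_schema(SCHEMA)
component = Component(**{"type": "library", "name": "requests"})
print(component)
```

Passing the parsed JSON straight to the constructor only works here because the input is assumed to have been validated already, which is the precondition stated above.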

@idunbarh

We did a similar exercise too and released it as the hoppr_cyclonedx_models package to support Hoppr.

There were some minor changes from what pydantic created, but otherwise it was straightforward. It loads JSON great.

@alexwahl

I would also be happy to contribute. I already created a loader, cyclonedx_fileparser, which is used in a private project. This code is not complete, though; please see it as work in progress.

@madpah
Collaborator

madpah commented Aug 1, 2022

@alexwahl @idunbarh @nettrino @iandh @garethr - by way of update, the team are working on this and we'll be seeking feedback / testing on this in the coming weeks.

I'll post more details here shortly.

@pombredanne

Any status update on this?

@pombredanne

FWIW I found and looked at https://gitlab.com/hoppr/hoppr-cyclonedx-models/ by @iandh and @kirse and it looks like it fits the bill nicely for this feature.

@iandh

iandh commented Jan 19, 2023

@pombredanne feel free to reach out if you experience any issues with hoppr-cyclonedx-models. We've been using it pretty heavily over the last couple of months and it's stable.

@pombredanne

@iandh Thanks! FWIW we need to support this issue: aboutcode-org/scancode.io#583

@jheck88

jheck88 commented Jan 20, 2023

The purpose of hoppr-cyclonedx-models was to solve the same issue as the OP's.

@madpah
Collaborator

madpah commented Mar 20, 2023

Included in 4.0.0 - now released.

madpah closed this as completed on Mar 20, 2023
9 participants