Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CT-91] Separate serialization for internal use and for artifacts with published schemas #4617

Closed
gshank opened this issue Jan 24, 2022 · 6 comments
Labels
artifacts tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@gshank
Copy link
Contributor

gshank commented Jan 24, 2022

Right now we use the same serialization for creating the partial_parse.msgpack file, and the manifest.json (and other artifacts). In some cases we need an attribute in internal processing but it's not necessary for the manifest.json, but implementing the attribute means we have to update the published json schema.

Investigate ways to have different serialization. One possibility: we use an 'omit_none' flag that gets passed along for serialization. Perhaps we could implement a similar flag for artifacts.

The goal would be to implement this for 1.1.0.

@github-actions github-actions bot changed the title Separate serialization for internal use and for artifacts with published schemas [CT-91] Separate serialization for internal use and for artifacts with published schemas Jan 24, 2022
@kwigley
Copy link
Contributor

kwigley commented Jan 24, 2022

Sorry for adding my 2 cents. I think these should be different data classes entirely. I don't think the same data class that is used for internal purposes should be used to write external artifacts. We can using mapping to go from internal representation <--> external representation and only the external representation needs to care about serialization. I think there is a path forward here that does not involve the (in)famous "hologram" package as well. Happy to provide examples when thinking about this more!

@jtcohen6 jtcohen6 added this to the v1.1.0 milestone Jan 25, 2022
@jtcohen6 jtcohen6 added artifacts tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality labels Jan 25, 2022
@jtcohen6
Copy link
Contributor

Moving away from hologram, and toward a better dataclass -> JSONSchema serialization package, may also resolve some of the residual issues with the schemas we're producing today: #4657

@jtcohen6
Copy link
Contributor

The original prompt/motivation for this issue:

  • We made a tiny change to the manifest schema in [#2479] Allow unique_id to take a list #4618 (just a new default)
  • By our current practice, we'll need to bump the manifest version to v5, creating work for everyone who parses metadata from the manifest

I'm interested in any small, tactical lifts we can make so that we're not bumping the manifest schema version in every single new minor version. At the same time, we're eager to provide other interfaces into manifest information, over which we have more control: #4807

@github-actions
Copy link
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the stale Issues that have gone stale label Sep 12, 2022
@github-actions
Copy link
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest; add a comment to notify the maintainers.

@leahwicz leahwicz reopened this Sep 27, 2022
@leahwicz leahwicz removed the stale Issues that have gone stale label Sep 27, 2022
@leahwicz
Copy link
Contributor

leahwicz commented Oct 3, 2022

Closing this one because things have changed and we are looking to create artifacts off of the protobuf schemas

@leahwicz leahwicz closed this as completed Oct 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
artifacts tech_debt Behind-the-scenes changes, with little direct impact on end-user functionality
Projects
None yet
Development

No branches or pull requests

4 participants