Change
This change seeks to reduce the size of the index in two ways:
Better schema design
This is achieved by the map table having no rowid and using a primary key with the value first. This makes the table already sorted by the value, so the reverse lookups are fast. Removing the rowid also drops a fair amount of the data in the table itself, given that it was ~1/3 of the rows.
Dropping map data
We don't actually use the fact that we know that different versions have different tags (or any other data). Thus, we can simply have one manifest entry per package identifier hold all of the values and maintain the same functionality. There is a slight loss of fidelity if one is reading through the values via API, but this is deemed acceptable given the large data savings. I explicitly left the product codes alone, as they do have value when kept per version (even if we are not using that currently).
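The two ideas above can be sketched together using Python's built-in sqlite3 module. This is only an illustration under stated assumptions, not the schema from this change: the table and column names (manifest, tags, tags_map) and the helper functions are hypothetical, and the choice of the first manifest row seen for a package as the one that carries all folded tags is an assumption made for the example.

```python
import sqlite3

# Hypothetical schema for illustration; names are not taken from the real index.
SCHEMA = """
CREATE TABLE manifest (
    id INTEGER PRIMARY KEY,
    package_id TEXT NOT NULL,
    version TEXT NOT NULL
);
CREATE TABLE tags (
    id INTEGER PRIMARY KEY,
    tag TEXT NOT NULL UNIQUE
);
-- Value first in the primary key and no rowid: the table's own b-tree is
-- ordered by tag, so "which manifests have this tag?" needs no extra index.
CREATE TABLE tags_map (
    tag INTEGER NOT NULL,
    manifest INTEGER NOT NULL,
    PRIMARY KEY (tag, manifest)
) WITHOUT ROWID;
"""


def build_index(conn, versions):
    """Insert (package_id, version, tags) tuples, folding tags per package."""
    conn.executescript(SCHEMA)
    carrier = {}  # package_id -> manifest id that holds all folded tags
    for package_id, version, tags in versions:
        cur = conn.execute(
            "INSERT INTO manifest (package_id, version) VALUES (?, ?)",
            (package_id, version),
        )
        # Assumption: the first manifest row seen for a package carries its tags.
        manifest_id = carrier.setdefault(package_id, cur.lastrowid)
        for tag in tags:
            conn.execute("INSERT OR IGNORE INTO tags (tag) VALUES (?)", (tag,))
            (tag_id,) = conn.execute(
                "SELECT id FROM tags WHERE tag = ?", (tag,)
            ).fetchone()
            # Duplicate (tag, manifest) pairs from other versions are ignored.
            conn.execute(
                "INSERT OR IGNORE INTO tags_map (tag, manifest) VALUES (?, ?)",
                (tag_id, manifest_id),
            )


def packages_with_tag(conn, tag):
    """Reverse lookup: package identifiers that have the given tag."""
    rows = conn.execute(
        """
        SELECT DISTINCT m.package_id
        FROM tags t
        JOIN tags_map tm ON tm.tag = t.id
        JOIN manifest m ON m.id = tm.manifest
        WHERE t.tag = ?
        ORDER BY m.package_id
        """,
        (tag,),
    )
    return [row[0] for row in rows]


def map_row_count(conn):
    """Number of rows stored in the map table."""
    return conn.execute("SELECT COUNT(*) FROM tags_map").fetchone()[0]
```

In this sketch, a package with two versions that share a tag stores that tag once in the map table rather than once per version, while a lookup by tag still finds the package. What is lost is the ability to tell which version contributed which tag, which matches the loss of fidelity described above.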
Size comparison
As a bonus, we plan to also shorten the file names of the manifests, but this is a service only change.
Validation
Tests are added to verify the behavior is as expected for the various folded data.
The regression tests should help ensure that the schema changes are not functionally impactful.