Package registry redesign #670

Closed
ycombinator opened this issue Jan 11, 2021 · 8 comments

@ycombinator
Contributor

ycombinator commented Jan 11, 2021

This issue is a placeholder to capture requirements (current pain points and desired enhancements) for a redesigned package registry. A number of discussions have happened on this topic privately; this issue is the place to centralize them as well as to bring more visibility and discoverability to them.

Summarizing from various private discussions:

  • The current package registry works quite well but we foresee some potential scalability, performance, and reliability issues as the number of packages continues to grow.
    • We need to come up with a design that allows the package registry process to start up in a low, constant amount of time.
    • Ideally the size of the package registry image and the memory used by the package registry process would not grow proportionally with the number of packages being served by the registry.
    • The current package registry bears two distinct responsibilities under the hood: that of providing a search API for packages and that of acting as a static file server for packages. There might be an opportunity to decouple these responsibilities from each other to address the performance and scalability concerns mentioned above, but also to increase the reliability by delegating each responsibility to appropriate technologies.
  • There will be a need for private / air-gapped package registries, especially for deployments that are subject to regulatory approval with very strict security/access requirements. See "Provide self-managed package-registry that backs integrations UI" (integrations#1178).
  • The registry will ideally support packages from multiple sources, not just Elastic-developed packages.
  • The ability to serve as a source of truth for our marketing integrations page https://www.elastic.co/integrations
    • The ability to render landing pages and docs
    • Web-scale through a CDN if not already
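
To make the decoupling idea above more concrete, here is a minimal sketch of the two responsibilities as separate components: a search index that holds only small metadata records, and a locator that maps package file requests to URLs on external storage. It is written in Python for brevity. All names (`PackageMeta`, `SearchIndex`, `StaticFileLocator`), the `/package/<name>/<version>/<path>` layout, and the `storage.example.invalid` host are assumptions for illustration, not the registry's actual code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PackageMeta:
    """Hypothetical compact metadata record; no package files are held in memory."""
    name: str
    version: str
    title: str


class SearchIndex:
    """Search responsibility: answers queries from metadata only."""

    def __init__(self, packages: Iterable[PackageMeta]) -> None:
        self._packages = list(packages)

    def search(self, term: str) -> List[PackageMeta]:
        # Case-insensitive substring match on name or title.
        term = term.lower()
        return [
            p for p in self._packages
            if term in p.name.lower() or term in p.title.lower()
        ]


class StaticFileLocator:
    """File-serving responsibility: resolves a package file to a URL on
    external storage (for example a bucket behind a CDN) instead of
    serving the bytes from the registry process."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")

    def url_for(self, name: str, version: str, path: str) -> str:
        # Assumed layout: <base>/package/<name>/<version>/<path>
        return f"{self._base_url}/package/{name}/{version}/{path.lstrip('/')}"


index = SearchIndex([PackageMeta("nginx", "1.0.0", "Nginx logs and metrics")])
locator = StaticFileLocator("https://storage.example.invalid/")
print([p.name for p in index.search("ngin")])
print(locator.url_for("nginx", "1.0.0", "/docs/README.md"))
```

With a split like this, the memory used by the registry process would depend on the size of the metadata rather than on the size of the packages themselves.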
@ycombinator
Contributor Author

@mtojek @ruflin @masci @andresrc please feel free to edit the issue description and add more requirements if I've missed any.

Next step is to come up with a proposal doc for a registry redesign that would address all the requirements (and link to it from this issue) so we can all review and iterate on it.

@ruflin
Collaborator

ruflin commented Jan 13, 2021

Adding to the list:

  • API to publish packages: This partially falls under the multiple-sources requirement, but having an API would remove the need to support multiple sources.
  • Search backend: You have it scoped under the separate-responsibility / split-up registry point. Let's not make an early decision on this, but what we need for sure is a search backend, as currently we load everything into memory to get quick results. This will not work forever.
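
As a rough illustration of the first point, a publish API gives every producer of packages the same single ingestion path, so the registry no longer needs to know where a package came from. The sketch below is in Python for brevity; the `Registry` class, its method names, and the rule that an already published version cannot be overwritten are all assumptions made for the example, not a decided design.

```python
from typing import Dict, List, Tuple


class PackageAlreadyExists(Exception):
    """Raised when a (name, version) pair has already been published."""


class Registry:
    """Hypothetical registry with a single publish entry point."""

    def __init__(self) -> None:
        # (name, version) -> raw package archive bytes
        self._archives: Dict[Tuple[str, str], bytes] = {}

    def publish(self, name: str, version: str, archive: bytes) -> None:
        # Assumption: published versions are immutable.
        key = (name, version)
        if key in self._archives:
            raise PackageAlreadyExists(f"{name}-{version} is already published")
        self._archives[key] = archive

    def versions(self, name: str) -> List[str]:
        # Plain string sort; real version ordering would need semver parsing.
        return sorted(v for (n, v) in self._archives if n == name)


registry = Registry()
# Any source, Elastic-developed or not, goes through the same call.
registry.publish("nginx", "1.0.0", b"zip bytes from source A")
registry.publish("nginx", "1.1.0", b"zip bytes from source B")
print(registry.versions("nginx"))
```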

@ycombinator
Contributor Author

I'm going to start working on a proposal draft for this one. I'll post the link here (Google doc) as soon as I have something to share and review.

In the meantime, if there are more use cases / requirements for the package registry that cannot currently be satisfied, please keep adding them here as comments. Thanks!

@ycombinator
Contributor Author

As I started working on a proposal for this issue, I realized it's quite a broad scope covering everything from scalability of the package registry to security aspects.

So I decided to initially just focus on the scalability aspects. I think we can roll out a solution for these aspects first and then return to the other aspects after that. To that end, I've put together a proposal for scaling the package registry here: https://docs.google.com/document/d/1tVUx00xjinSGw9dYyBiNuNui6a3bZsruyZwpkP30x2U/edit#. Please review and leave comments in the document. Thank you!

@jsoriano
Member

We recently added APM instrumentation to the package registry (#702), and here are a couple of early conclusions:

  • The /package "catch all" endpoint, used to request individual files (mostly images and READMEs), is by far the most used one, and we have to take this into account for any redesign. We probably need to do something about this if we separate the storage from the registry.
  • The /search endpoint seems to be fast enough: average and median response times are under a millisecond, and the 99th percentile is around 6 ms. The search itself is in most cases under 4 ms, and generally under 1 ms. This is probably faster than querying any external service over the network, so I think we are good here (as long as the registry can keep the index locally).

@jlind23
Contributor

jlind23 commented Sep 23, 2021

Redesign should also include elastic/package-spec#162

jsoriano added a commit that referenced this issue Oct 7, 2021
Add support to serve packages stored directly as zip files,
so users receive the same packages as stored in the backend.

An "Indexer" interface is introduced to allow different
implementations for sources of packages; it abstracts how packages
are indexed and queried.
The list of packages is not global anymore, allowing better isolation of
test cases.
All handlers use the indexers to find packages and files.

This is done in context of #670, to decouple package storage from
the registry.
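
The indexer idea described in this commit message can be sketched roughly as follows. This is not the interface from the commit (which may differ in names and signatures); it is a small Python illustration in which `Indexer`, `StaticIndexer`, `CombinedIndexer`, and `list_package_names` are hypothetical names. Handlers depend only on the interface, so a directory-backed source and a zip-backed source can be swapped or combined without changing them.

```python
from typing import Dict, List, Optional, Protocol


class Indexer(Protocol):
    """Abstracts how packages from one source are indexed and queried."""

    def get(self, name: Optional[str] = None) -> List[Dict[str, str]]:
        ...


class StaticIndexer:
    """Stand-in for one source of packages (e.g. a directory or zip files)."""

    def __init__(self, packages: List[Dict[str, str]]) -> None:
        self._packages = list(packages)

    def get(self, name: Optional[str] = None) -> List[Dict[str, str]]:
        # No name means "all packages from this source".
        return [p for p in self._packages if name is None or p["name"] == name]


class CombinedIndexer:
    """Queries several indexers and concatenates their results."""

    def __init__(self, *indexers: Indexer) -> None:
        self._indexers = indexers

    def get(self, name: Optional[str] = None) -> List[Dict[str, str]]:
        found: List[Dict[str, str]] = []
        for indexer in self._indexers:
            found.extend(indexer.get(name))
        return found


def list_package_names(indexer: Indexer) -> List[str]:
    """Example handler: it only knows about the Indexer interface."""
    return sorted({p["name"] for p in indexer.get()})


from_directory = StaticIndexer([{"name": "nginx", "version": "1.0.0"}])
from_zips = StaticIndexer([{"name": "apache", "version": "0.9.0"}])
print(list_package_names(CombinedIndexer(from_directory, from_zips)))
```

Because each indexer instance carries its own package list, tests can build small, isolated indexers instead of relying on a global list.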
@jlind23
Contributor

jlind23 commented Nov 18, 2021

@mtojek @jsoriano @akshay-saraswat
Can we close this issue as we have already created issues for the four milestones?

@mtojek
Contributor

mtojek commented Nov 18, 2021

Yes, I think we can do this. Thanks for the housekeeping!
