
Serve locally stored fbc content via a server #113

Closed · everettraven opened this issue Jul 17, 2023 · 6 comments · Fixed by #144 or #148

@everettraven (Collaborator) commented Jul 17, 2023

As part of implementing the RFC for catalogd's content storage and serving mechanism, we should import the RukPak storage library (dependent on operator-framework/rukpak#656). Once imported, we will need to create a custom implementation of the Storage interface.

The custom Storage implementation should:

  • Store all FBC JSON blobs in a singular {catalogName}/all.json file
  • Serve the {catalogName}/all.json file via a Go http.FileServer handler

Note: If operator-framework/rukpak#656 is not accepted by the RukPak maintainers, it would be acceptable to copy the Storage interface into a catalogd package. This is not ideal, but rather a last resort if the RukPak maintainers deny the request to externalize the storage package.

Acceptance Criteria

  • Storage interface implementation that:
    • Stores all FBC JSON blobs in a singular {catalogName}/all.json file
    • Serves the {catalogName}/all.json file via a Go http.FileServer handler (see the sketch after this list)
  • Unit tests
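A minimal sketch of the serving piece described above, assuming content has already been written to {storageDir}/{catalogName}/all.json; the storage root, URL prefix, and listen address below are illustrative choices, not catalogd's actual configuration:

package main

import "net/http"

func main() {
	// Hypothetical root under which each catalog's content is stored,
	// e.g. /var/cache/catalogd/operatorhubio/all.json.
	storageDir := "/var/cache/catalogd"

	// http.FileServer serves GET /catalogs/<catalogName>/all.json
	// straight from disk.
	mux := http.NewServeMux()
	mux.Handle("/catalogs/", http.StripPrefix("/catalogs/", http.FileServer(http.Dir(storageDir))))
	_ = http.ListenAndServe(":8083", mux)
}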
@everettraven (Collaborator, Author) commented

/assign @anik120

anik120 added commits to anik120/catalogd that referenced this issue Aug 3, 2023
@anik120 (Collaborator) commented Aug 3, 2023

@everettraven what's the motivation for implementing rukpak's storage interface again?

type Storage interface {
	Loader
	Storer
}

type Loader interface {
	Load(ctx context.Context, owner client.Object) (fs.FS, error)
}

type Storer interface {
	Store(ctx context.Context, owner client.Object, bundle fs.FS) error
	Delete(ctx context.Context, owner client.Object) error

	http.Handler
	URLFor(ctx context.Context, owner client.Object) (string, error)
}

Looks like rukpak's storage interface is specialized for storing/loading a filesystem (like an operator bundle, which has multiple files containing multiple Kubernetes CRs), but for catalogd we want something like:

type Storage interface {
	Loader
	Storer
}

type Loader interface {
	Load(ctx context.Context, owner client.Object) (*declarativeConfig, error)
}

type Storer interface {
	Store(ctx context.Context, owner client.Object, fbc *declarativeConfig) error
	Delete(ctx context.Context, owner client.Object) error

	http.Handler
	URLFor(ctx context.Context, owner client.Object) (string, error)
}

I.e. that ☝🏽 is an interface for loading and storing FBC files, whereas the first one is an interface for storing and loading bundles (two different interfaces with separate purposes).

If we were to store the catalog as a tar.gz file, it'd probably still make sense to implement the rukpak storage. But it looks like we decided to store the catalogs as JSON files, and not compress them like we have to for bundle content.
It's doable if there's a very good reason why we must implement rukpak's interface, but thinking through the Store implementation, it's evident that it's not exactly a natural fit:

func Store(ctx context.Context, owner client.Object, catalog fs.FS) error {
	// Load the entire fs.FS into a DeclarativeConfig just to write it back out.
	fbc, err := declcfg.LoadFS(catalog)
	if err != nil {
		return err
	}
	return declcfg.WriteJSON(*fbc, writer)
}

There's an extra step there of loading the fs.FS into a DeclarativeConfig, done just to fit into the rukpak bundle storing/loading interface:

declcfg.LoadFS

@everettraven (Collaborator, Author) commented

> what's the motivation for implementing rukpak's storage interface again?
>
> [rukpak's Storage/Loader/Storer interface, quoted above]
>
> Looks like rukpak's storage interface is specialized for storing/loading a filesystem

This right here is the motivation. A file-based catalog is a filesystem of YAML/JSON files that represent catalog objects. Additionally, the unpacking process we currently use returns an fs.FS as part of the result. Both of these things led me to the conclusion that reusing rukpak's Storage interface would be sufficient for what we wanted to do.

> [the proposed FBC-specific Storage/Loader/Storer interface, quoted above]
>
> I.e. that ☝🏽 is an interface for loading and storing FBC files, whereas the first one is an interface for storing and loading bundles (two different interfaces with separate purposes).

I agree that this would be a better interface if we wanted to use the declcfg.DeclarativeConfig type, but doing so would require changes to the unpacking implementation, which ideally deviates minimally from rukpak so we can coordinate a way to reuse rukpak's implementation.

> If we were to store the catalog as a tar.gz file, it'd probably still make sense to implement the rukpak storage. But it looks like we decided to store the catalogs as JSON files, and not compress them like we have to for bundle content. It's doable if there's a very good reason why we must implement rukpak's interface, but thinking through the Store implementation, it's evident that it's not exactly a natural fit:
>
> [the Store example, quoted above]
>
> There's an extra step there of loading the fs.FS into a DeclarativeConfig, done just to fit into the rukpak bundle storing/loading interface:
>
> declcfg.LoadFS

I don't think we should use the declcfg.LoadFS function here. We could instead use the declcfg.WalkMetasFS function, which was created specifically to walk an fs.FS and parse out the declcfg.Meta type. We already use this in the syncCatalogMetadata function that currently exists in the catalog controller:

err := declcfg.WalkMetasFS(fsys, func(path string, meta *declcfg.Meta, err error) error {
	if err != nil {
		return fmt.Errorf("error in parsing catalog content files in the filesystem: %w", err)
	}
	packageOrName := meta.Package
	if packageOrName == "" {
		packageOrName = meta.Name
	}
	var objName string
	if objName, err = generateCatalogMetadataName(ctx, catalog.Name, meta); err != nil {
		return fmt.Errorf("error in generating catalog metadata name: %w", err)
	}
	catalogMetadata := &v1alpha1.CatalogMetadata{
		TypeMeta: metav1.TypeMeta{
			APIVersion: v1alpha1.GroupVersion.String(),
			Kind:       "CatalogMetadata",
		},
		ObjectMeta: metav1.ObjectMeta{
			Name: objName,
			Labels: map[string]string{
				"catalog":       catalog.Name,
				"schema":        meta.Schema,
				"package":       meta.Package,
				"name":          meta.Name,
				"packageOrName": packageOrName,
			},
			OwnerReferences: []metav1.OwnerReference{{
				APIVersion:         v1alpha1.GroupVersion.String(),
				Kind:               "Catalog",
				Name:               catalog.Name,
				UID:                catalog.UID,
				BlockOwnerDeletion: pointer.Bool(true),
				Controller:         pointer.Bool(true),
			}},
		},
		Spec: v1alpha1.CatalogMetadataSpec{
			Catalog: corev1.LocalObjectReference{Name: catalog.Name},
			Name:    meta.Name,
			Schema:  meta.Schema,
			Package: meta.Package,
			Content: meta.Blob,
		},
	}
	newCatalogMetadataObjs[catalogMetadata.Name] = catalogMetadata
	return nil
})

Using declcfg.WalkMetasFS allows us to lower memory usage by not duplicating the entire fs.FS as a declcfg.DeclarativeConfig object, and instead just appending the marshalled JSON from each declcfg.Meta object to the all.json file that corresponds to the Catalog resource passed in as the "owner".
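A hedged sketch of that approach, streaming each declcfg.Meta blob into {catalogName}/all.json without materializing a full DeclarativeConfig. LocalFSStorage and its root field are hypothetical names, not catalogd's actual implementation:

import (
	"context"
	"io/fs"
	"os"
	"path/filepath"

	"github.com/operator-framework/operator-registry/alpha/declcfg"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// LocalFSStorage is a hypothetical Storage implementation rooted at a directory.
type LocalFSStorage struct {
	root string
}

func (s *LocalFSStorage) Store(ctx context.Context, owner client.Object, catalog fs.FS) error {
	dir := filepath.Join(s.root, owner.GetName())
	if err := os.MkdirAll(dir, 0700); err != nil {
		return err
	}
	f, err := os.Create(filepath.Join(dir, "all.json"))
	if err != nil {
		return err
	}
	defer f.Close()
	// WalkMetasFS parses one FBC object at a time; meta.Blob is its raw JSON,
	// so each blob is appended to all.json without building a DeclarativeConfig.
	return declcfg.WalkMetasFS(catalog, func(path string, meta *declcfg.Meta, err error) error {
		if err != nil {
			return err
		}
		_, err = f.Write(meta.Blob)
		return err
	})
}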

@anik120 (Collaborator) commented Aug 3, 2023

> Looks like rukpak's storage interface is specialized for storing/loading a filesystem
>
> This right here is the motivation. A file-based catalog is a filesystem of YAML/JSON files that represent catalog objects.

What I meant was that rukpak's storage interface looks specialized for compressing/decompressing the contents of a filesystem. Bundles can have an arbitrary number and type of files, so it makes total sense to compress and store them. And it looks like we reached the same conclusion in the RFC: it makes more sense to store the FBCs as JSON files.

> Additionally, the unpacking process we currently use returns an fs.FS as part of the result.

IIRC, there have been discussions about reworking that unpacking interface; we just haven't had the chance to get to it. So if I'm not wrong, that interface is already up for debate in rukpak (due to the fidelity of pod logs). We may want to actually write an unpacker that returns a declcfg.DeclarativeConfig anyway.
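For illustration, such an unpacker's contract could look like the following. This is a hypothetical variant sketched from the discussion (reusing the imports from the sketch above), not rukpak's actual API:

// Unpacker is a hypothetical interface whose Unpack returns parsed FBC
// content directly, instead of the fs.FS that rukpak's unpacker returns.
type Unpacker interface {
	Unpack(ctx context.Context, catalog client.Object) (*declcfg.DeclarativeConfig, error)
}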

> We could instead use the declcfg.WalkMetasFS function, which was created specifically to walk an fs.FS and parse out the declcfg.Meta type.

That's a good point. Although it's just another way that Store has to load the content into memory before it can actually append to any file. Passing around an fs.FS feels like tying ourselves to rukpak, which again specializes in a different type of content.

(notation: input:component:output)

current path:
Unpacker:fs.FS -> fs.FS:controller -> fs.FS:Storer:declcfg.DeclarativeConfig -> stored in file

instead, if we do:
Unpacker:declcfg.DeclarativeConfig -> declcfg.DeclarativeConfig:controller -> declcfg.DeclarativeConfig:Storer -> stored in file

there's only one chunk being loaded into memory (the declcfg.DeclarativeConfig) instead of two (the fs.FS and the declcfg.DeclarativeConfig).

At the very least, it feels like we should start with the simplest thing, which is a Storer that takes a declcfg.DeclarativeConfig. That way, we keep the option of deciding "nope, we'll just use the rukpak interface, so we'll change this implementation to be a rukpak storage", which is much easier than untying things.
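A minimal sketch of that simplest thing, under the same hypothetical LocalFSStorage naming (and imports) as the sketch above; declcfg.WriteJSON does the serialization:

func (s *LocalFSStorage) Store(ctx context.Context, owner client.Object, fbc *declcfg.DeclarativeConfig) error {
	if err := os.MkdirAll(filepath.Join(s.root, owner.GetName()), 0700); err != nil {
		return err
	}
	// Write the already-parsed FBC straight to all.json; no fs.FS involved.
	f, err := os.Create(filepath.Join(s.root, owner.GetName(), "all.json"))
	if err != nil {
		return err
	}
	defer f.Close()
	return declcfg.WriteJSON(*fbc, f)
}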

> which ideally deviates minimally from rukpak so we can coordinate a way to reuse rukpak's implementation.

This is saying there are two components (catalogd and rukpak) that do the same thing but are located in different places. That's hard to reconcile when bundle content (unstructured, from the perspective of rukpak) is very different from catalog content (structured FBC).

anik120 added a commit to anik120/catalogd that referenced this issue Aug 4, 2023
@anik120 (Collaborator) commented Aug 9, 2023

Further discussion in this slack thread: https://kubernetes.slack.com/archives/C0181L6JYQ2/p1691082152778479

Check back later for a TL;DR and next steps.

anik120 added commits to anik120/catalogd that referenced this issue Aug 10–14, 2023
@joelanford joelanford modified the milestones: v0.5.0, v0.6.0 Aug 15, 2023
anik120 added commits to anik120/catalogd that referenced this issue Aug 16–30, 2023
github-merge-queue bot pushed a commit that referenced this issue Aug 30, 2023
@everettraven (Collaborator, Author) commented

I think this was incorrectly closed. The original intent of this issue was to implement the filesystem storage and the HTTP file server for serving the stored content. With the way things are split up, this issue will be appropriately closed once #148 is merged.

@everettraven everettraven reopened this Aug 31, 2023
@anik120 anik120 changed the title Import and implement a RukPak Storage interface Serve locally stored fbc content via a server Sep 6, 2023
anik120 added commits to anik120/catalogd that referenced this issue Sep 6–8, 2023
github-merge-queue bot pushed a commit that referenced this issue Sep 8, 2023
@github-project-automation github-project-automation bot moved this to Done in OLM v1 Sep 8, 2023