Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add kernelKeeper index to efficiently track the importing vats of each kernel object #3223

Open
warner opened this issue May 31, 2021 · 5 comments
Labels
enhancement New feature or request performance Performance related issues SwingSet package: SwingSet vaults_triage DO NOT USE

Comments

@warner
Copy link
Member

warner commented May 31, 2021

What is the Problem Being Solved?

As part of #3109, I need a way to build a list of all the vats that have imported a given kernel object. This is a lot like the p.subscribers list for a kernel promise, but it's changed by import (receiving a delivery or syscall.invoke response which cites the kref in an argument), rather than the result of a syscall.subscribe() call. The import set is always exactly equal to the vats which have a c-list entry for the kref, whose matching vref is o-NN (indicating an import) rather than o+NN (indicating an export).

To implement #3109, I made a quick-and-dirty implementation of kernelKeeper.getImporters(), which does exactly that: it walks all vatIDs, looks at their c-list entries, and finds the ones which 1: include the kref and 2: map it to an import. This is O(number of vats).

Instead, we should maintain an index: add an additional key per kref (koNN.importers), record a comma-separated (or JSON-stringified array) of vatIDs, add to it each time a c-list import entry is added, and remove from it when the c-list entry is removed.

That will speed up getImporters() to O(1), at the cost of increasing DB storage (one key per kernel object) and adding an extra kernelDB read (and sometimes write) to clist addition/removal operations.

Description of the Design

Security Considerations

Test Plan

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels May 31, 2021
@warner warner self-assigned this May 31, 2021
warner added a commit that referenced this issue Jun 1, 2021
This function takes an object kref and returns a list of vatIDs which have
imported that object. As mentioned in #3223, this would be more efficient
with an index.
warner added a commit that referenced this issue Jun 1, 2021
This function takes an object kref and returns a list of vatIDs which have
imported that object. As mentioned in #3223, this would be more efficient
with an index.
warner added a commit that referenced this issue Jun 1, 2021
This function takes an object kref and returns a list of vatIDs which have
imported that object. As mentioned in #3223, this would be more efficient
with an index.
warner added a commit that referenced this issue Jun 1, 2021
This function takes an object kref and returns a list of vatIDs which have
imported that object. As mentioned in #3223, this would be more efficient
with an index.
warner added a commit that referenced this issue Jun 3, 2021
This function takes an object kref and returns a list of vatIDs which have
imported that object. As mentioned in #3223, this would be more efficient
with an index.
warner added a commit that referenced this issue Jun 19, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 19, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 19, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 19, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 20, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 20, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 20, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 20, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
warner added a commit that referenced this issue Jun 20, 2021
To support `dispatch.retireImport` (the "downstream" retirement that happens
when a Remotable is dropped, and downstream importers need to be informed of
the loss), the comms vat needs to know a list of the kernel and/or remotes
which have imported an object that the comms vat has exported.

The kernel currently implements this in a simple and non-performant way, by
reading every vat's c-list for a matching entry. This makes lookups (which
happen only when an upstream export is deleted) `O(numVats)`, but has no
space penalty. #3223 is about making this faster.

For now, the comms vat uses a differently-non-ideal approach: for each lref,
it maintains a key with a JSON-serialized list of remotes. Adding, removing,
and querying the importers are all is `O(numImporters)`, with a space penalty
of one key per object, and `O(numImporters)` space for the values.

A faster approach is implemented, but commented out, which uses distinct keys
for each (remoteID, lref) pair. This is `O(1)` for all operations, with a
space penalty of one key per import (and a constant-size value). However,
this approach would require a non-existent `syscall.vatstoreGetKeys()` API,
to iterate over vatstore keys. This wouldn't be hard to add, but we should
consider our storage model more carefully before deciding to commit to that
feature.
@warner
Copy link
Member Author

warner commented Jan 31, 2022

Hrm, I guess this really ought to be in MN-1. The performance improvement it makes isn't huge until we have a significant number of vats, which might not be until MN-2, but it would extra expensive to upgrade an existing kernel in-place.

@Tartuffo Tartuffo added MN-1 and removed MN-1 labels Feb 2, 2022
@warner warner assigned FUDCo and unassigned warner Feb 22, 2022
@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
@Tartuffo
Copy link
Contributor

@warner Can we revisit if this is strictly MN-1, or whether it can wait until MN-1.1? @FUDCo thinks this could possibly wait to 1.1

@Tartuffo Tartuffo removed this from the Mainnet 1 milestone Apr 5, 2022
@ivanlei ivanlei added the vaults_triage DO NOT USE label Jan 3, 2023
@warner
Copy link
Member Author

warner commented Jan 3, 2023

This is probably easier to do now that we've got SQL available, so it could be part of #6677

@warner
Copy link
Member Author

warner commented Jan 3, 2023

This got into "In Progress" for weird reasons, but my instinct is that it's worth doing for the Vaults release (although it's probably not impossible to migrate to it afterwards). I'm going to move it to Vaults backlog for now.

@dckc dckc added the performance Performance related issues label Jan 30, 2023
@warner
Copy link
Member Author

warner commented Jan 30, 2023

We can defer this one for later.

The current cost is O(N) in the number of vats: for each kref retired by a vat's syscall.retireExports(), we must check every vat's c-list to see if that was was importing the kref, and if so we'll send them a dispatch.retireImports(). As long as the number of vats doesn't get out of hand before we improve it to use a better scheme, we can defer it.

The migration process will involve introducing a swing-store c-list handler (with a schema of (vatID, kref, vref)) and indexes on (vatID, kref), (vatID, vref), and (kref) (where the last one is specifically for getImporters), and then walking all c-lists to move their contents into the new table. We've listed this as a feature to pursue as part of #6677 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request performance Performance related issues SwingSet package: SwingSet vaults_triage DO NOT USE
Projects
None yet
Development

No branches or pull requests

5 participants