Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace userspace WeakMap/WeakSet with vref-aware versions #2993

Closed
warner opened this issue Apr 28, 2021 · 8 comments
Closed

replace userspace WeakMap/WeakSet with vref-aware versions #2993

warner opened this issue Apr 28, 2021 · 8 comments
Labels
enhancement New feature or request SwingSet package: SwingSet

Comments

@warner
Copy link
Member

warner commented Apr 28, 2021

What is the Problem Being Solved?

To satisfy the requirements of #2724 , and allow userspace to use Remotables, Presences, and virtual-object Representatives as if they were normal objects, we must replace the userspace WeakMap and WeakSet objects with versions that are aware of vref-based identity.

The normal Weak tools provide a recognition predicate for a specific JavaScript Object. Unless prevented, a supposedly-deterministic userspace could use this to sense the difference between two (non-overlapping) Presences or Representatives for the same underlying object. Since the lifetime of a Presence/Representative depends upon GC behavior, which is generally not a deterministic function of vat inputs, this would expose nondeterminism to vat code, enabling adversarial vats to break consensus. Therefore we must prevent vat code from getting access to the native WeakMap/WeakSet.

However WeakMap/WeakSet form a vital part of the JavaScript toolbox, and it would be barbaric to deny userspace code access to them entirely. We want users to be able to treat Presences and Representatives as if they were normal objects. To support this, we need versions that understand the shared identity between two (non-overlapping) Presences or Representatives for the same vref.

We must create a WeakMap object that provides the same API as the native WeakMap: build it with new WeakMap(), manipulate it with set/get/has/delete, and subclass it (we don't do this, but userspace might, and the goal is to be indistinguishable from the native version). We'll do the same with WeakSet (which, although technically implementable with a dummy-value wrapper around WeakMap, can be more efficient if we create a dedicated version).

The Agoric platform provides a different tool for contract code named Store, with a slightly different API (init/get/set/has/delete, which throws errors on access to undefined keys), built on top of the native WeakMap. For now, the WeakStore provided by @agoric/store will build on top of the replacement WeakMap we're building here (necessarily so, because @agoric/store is not privileged in any way, so our vref-aware WeakMap will be the only tool it has to work with). In the future, we may create a more specialized facility that restricts the keys and values to ones we can manage efficiently (e.g. requiring serializable values, and keeping them on disk instead of in memory).

Description of the Design

@FUDCo has built much of this in the makeWeakStore() function of the virtual object manager, so we'll start with that.

The basic design is a hybrid table: a normal WeakMap for keys which are ordinary objects, and a parallel Map keyed by vrefs for keys which were provided as Presences or Representatives.

The changes we need to makeWeakStore() include:

  • change the API: it is written with the Store API, rather than the native one
  • make it handle Presences, and not just Representatives
    • note that Remotables are treated like normal objects: we do not use their vrefs
  • integrate it with the GC hooks (e.g. liveslots needs to know when a given vref remains recognizable, which we can do by checking all the parallel Map for the vref)
  • make it safely subclassable

We'll also create WeakSet, which will share most of the same code.

Once done, we'll use liveslots and the inescapableGlobals feature of importBundle to replace the native WeakMap with the replacement.

Security Considerations

We must be absolutely sure that the native WeakMap and WeakSet are unavailable to adversarial vat code, else it would be able to create a consensus failure.

We must make sure the set of syscalls each vat makes (modulo the "sufficiently" of "sufficiently-deterministic" GC syscalls) is strictly a function of userspace behavior. That means we cannot read data from the secondary store (syscall.vatstoreGet) in response to e.g. merely creating a Representative. This is more of a concern for the virtual object manager and its provideRepresentative() function, but might influence the WeakMap design in some places.

Performance Considerations

WeakMap and WeakSet are not about reducing RAM usage for live objects. They must not prevent their keys from being collected. They must not continue to use storage for entries that have gone away (explicitly deleted), nor for entries whose keys have become unrecognizable. But while those keys are still recognizable, they are allowed to use as much memory as a regular Map would.

Therefore our WeakMap is not a suitable tool for high-cardinality rights-amplification patterns that require some secret data to be associated with each key. For instance, the usual Sealer/Unsealer pattern uses a WeakMap to map the Box to the sealed contents, where both the Sealer and Unsealer hold close access to the WeakMap. If the WeakMap lives in the same vat as the Box objects, the Boxes can't be Remotables, because those are stored as normal objects, so they'd have to be virtual-object Representatives. However their values are held strongly (in RAM) by the WeakMap as long as the key is reachable, which means you'd have high cardinality of the sealed contents (RAM usage is proportional to the number of Boxes, which is not what you'd want). The same would be true if the Boxes were created in some different vat: the WeakMap keys would be Presences, but the values are still pinned in RAM.

Our new WeakSet may be suitable for high-cardinality rights-amplification patterns that merely need to recognize a given object, for example a set of Purses or Payments recognized as valid by an Issuer. If we chose to save the WeakSet vref keys in secondary storage, we would have flat RAM usage independent of the number of vref keys. We need to decide whether this should be a property of our WeakSet, or if we defer this to some form of virtual collections.

The key issue is what performance properties do we want, vs the complexity of implementing them. There are a half-dozen-ish functional and performance properties to consider, and each of the tools we provide will mix-and-match a subset of them:

  • is the collection iterable? specifically can you list the keys? WeakMap/WeakSet no, Map/Set/Array yes
  • can you filter or sort the items in some more-efficient way?
  • are the keys held strongly?
    • if not, you probably can't iterate them
  • does RAM usage scale with the number of entries?
  • are the values pinned to RAM, or can they be virtualized/offloaded somehow?

Our replacement WeakMap and WeakSet are primarily intended to 1: behave like the native versions, 2: be safe in the face of vref-based identity. It would be nice if they could minimize RAM usage in some circumstances, but that's not the main goal.

cc @FUDCo @erights

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Apr 28, 2021
@katelynsills
Copy link
Contributor

A few questions from the consumer side of this:

First, are we sure that we must provide consumers with a WeakMap and WeakSet for virtual objects compared to a different library that we provide? It seems like a lot of effort, when it might be sufficient from a product perspective to provide the different library and use ESLint rules and documentation to discourage usage of the native WeakMap and WeakSet. Certainly for objects that are internal to Zoe (no real user can access them) this would be sufficient.

Second, the main usage of a WeakSet/WeakStore in ERTP is for the payment ledger, which maps payments to amounts. Payments are currently virtual objects for the sole purpose of saving on RAM. So I was surprised to read that "Therefore our WeakMap is not a suitable tool for high-cardinality rights-amplification patterns..." I think I must be missing something, as this design would therefore go against one of our main requirements. Are we gaining anything from payments being virtual objects?

Thanks!

@erights
Copy link
Member

erights commented Apr 29, 2021

A few questions from the consumer side of this:

I can answer this from the perspective of the Store semantics taking shape at https://github.com/Agoric/agoric-sdk/blob/markm-many-stores/packages/marshal/src/store-types.md , for which we did a nice recorded kernel meeting today. (@warner, Please let me know where I can find the recording.) Thus, my answers here may be very different than what @warner had in mind before today's meeting.

First, are we sure that we must provide consumers with a WeakMap and WeakSet for virtual objects compared to a different library that we provide?

We would leave the semantics of the native JS collections unmodified: Map, Set, WeakMap, WeakSet. All of our use of virtual collections would be only in terms of our Stores, which has a distinct semantics. But these do include WeakStores, both in memory and virtual (on disk).

To preserve the semantics of the language native collections unmodified, we do need to patch WeakMap and WeakSet.

It seems like a lot of effort, when it might be sufficient from a product perspective to provide the different library and use ESLint rules and documentation to discourage usage of the native WeakMap and WeakSet.

I don't trust eslint rules on this, because unpatched JS WeakMap and WeakSet would expose non-determinism. But we could patch them so that they reject --- with a thrown error --- an attempt to use a Remotable as a key. However, we also need our own WeakMapStore and WeakSetStore that only use Remotables as keys, so all the work needed for the full patch is work we'll need to do anyway. In fact, it helps the layering. The in-memory WeakStores are naturally build on top of the JS WeakMaps and WeakSets. So if these are repaired first, our in-memory weak stores just inherit these repairs.

Certainly for objects that are internal to Zoe (no real user can access them) this would be sufficient.

My assumption is that we want Zoe to eventually support a huge number of outstanding idle invitations as well as a huge number of outstanding seats. Both would require virtual weak stores. So I don't think we can escape this.

Second, the main usage of a WeakSet/WeakStore in ERTP is for the payment ledger, which maps payments to amounts. Payments are currently virtual objects for the sole purpose of saving on RAM. So I was surprised to read that "Therefore our WeakMap is not a suitable tool for high-cardinality rights-amplification patterns..."

As of today's clarification, we would store the payment ledger only in virtual weak stores. The JS native collections would remain only in memory, and should not be used for any of this.

I think I must be missing something, as this design would therefore go against one of our main requirements. Are we gaining anything from payments being virtual objects?

Do the new answers make more sense?

Thanks!

@katelynsills
Copy link
Contributor

Thanks for the explanation! That makes a lot of sense. One part that I'm still processing:

Second, the main usage of a WeakMap/WeakStore in ERTP is for the payment ledger, which maps payments to amounts. Payments are currently virtual objects for the sole purpose of saving on RAM. So I was surprised to read that "Therefore our WeakMap is not a suitable tool for high-cardinality rights-amplification patterns..."

As of today's clarification, we would store the payment ledger only in virtual weak stores. The JS native collections would remain only in memory, and should not be used for any of this.

Maybe the part I was missing is that there are two independent choices: 1) whether or not Presences and Representatives can be used as keys, and 2) whether external storage is used to reduce RAM. The JS native collections would be patched to act correctly when Presences and Representatives are used as keys, but they wouldn't otherwise be changed, so there's no saving on RAM. Is that right?

@erights
Copy link
Member

erights commented Apr 29, 2021

Correct. The patching of WeakMap and WeakSet itself does not save on RAM. It only rescues their determinism if someone happens to use a virtual object as a key in one of them. For us, we would avoid doing so directly because we would prefer our stores. But we can't stop our customers from doing so. And to make that work correctly, the work we need to do is substantial, but it is work we need to do anyway, even if we don't fix WeakMap and WeakSet.

@warner
Copy link
Member Author

warner commented Apr 29, 2021

Right. If the platform exposed both native WeakMap/WeakSet and performed GC of unreferenced Presences, then adversarial userspace code would have access to non-determinism (enabling it to break consensus), and simply uncautious userspace code (which used Presences as WeakMap keys) would see data mysteriously disappear at random. So we certainly can't afford to leave WeakMap alone, and it would be too weird to delete them entirely. So once we're committed to replacing them, the question is what programming model should we provide for the replacements.

@erights the meeting recording is in the Agoric google drive, under the path "Engineering > kernel/SwingSet > 2021-04-28 virtual collections by MarkM".

@warner
Copy link
Member Author

warner commented May 19, 2021

This is mostly addressed by the closing of #1968 via PR #2996, since that overrides the WeakMap/WeakSet that userspace can reach.

The remaining issues are:

@warner
Copy link
Member Author

warner commented May 21, 2021

Now that #3115 is done, I think we can close this.

@FUDCo
Copy link
Contributor

FUDCo commented May 21, 2021

Closed by #3115

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet
Projects
None yet
Development

No branches or pull requests

4 participants