Soft check for WeakRef with reclaimed referent? #188
Comments
Just FWIW, here's an idea what
(Tried to keep the same style.)

[Edit 2020/03/05 - My updated understanding -- thank you @littledan!! -- is that a scan like the one below is not a good idea, so don't copy this code or the idea behind it.]

function makeWeakCached(f) {
const cache = new Map();
let itHoover = cache.entries();
const hasIdle = typeof requestIdleCallback !== "undefined";
const cleanup = new FinalizationRegistry(iterator => {
hoover = itHoover = null; // Saw a finalization callback, we don't need the hoover
for (const key of iterator) {
// See note below on concurrency considerations.
const ref = cache.get(key);
if (ref && !ref.deref()) cache.delete(key);
}
});
let hoover = () => {
if (!hoover) return;
let result = itHoover.next();
if (result.done) {
// Restart
itHoover = cache.entries();
result = itHoover.next();
}
if (!result.done) {
const [key, ref] = result.value;
if (ref.isReclaimed()) cache.delete(key);
}
if (hasIdle) requestIdleCallback(hoover);
};
if (hasIdle) requestIdleCallback(hoover);
return key => {
if (hoover && !hasIdle) hoover();
const ref = cache.get(key);
if (ref) {
const cached = ref.deref();
// See note below on concurrency considerations.
if (cached !== undefined) return cached;
}
const fresh = f(key);
cache.set(key, new WeakRef(fresh));
cleanup.register(fresh, key, fresh);
return fresh;
};
}

Obviously it's more complicated. It also adds a bit more concurrency between the program and GC, but in the same sort of class as the existing concurrency. There's more one could do (not running the hoover if the cache is empty, only running it periodically rather than constantly when idle, etc.) but for an example, it seems sufficient. |
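For comparison, here is a minimal sketch of the same cache without the hoover scan, written against the FinalizationRegistry API as it was later standardized, where the cleanup callback receives a single held value per call rather than an iterator. The name makeWeakCachedNoScan is hypothetical, and the sketch assumes f returns objects (WeakRef targets cannot be primitives).

```javascript
// Hypothetical name; an illustrative sketch, not the explainer's code.
function makeWeakCachedNoScan(f) {
  const cache = new Map(); // key -> WeakRef(value)
  // In the standardized API the callback gets one held value per call
  // (here, the cache key passed to register below).
  const cleanup = new FinalizationRegistry(key => {
    const ref = cache.get(key);
    // Only delete if the entry is still dead; it may have been replaced
    // with a fresh value since the callback was queued.
    if (ref && ref.deref() === undefined) cache.delete(key);
  });
  return key => {
    const ref = cache.get(key);
    if (ref) {
      const cached = ref.deref();
      if (cached !== undefined) return cached;
    }
    const fresh = f(key);
    cache.set(key, new WeakRef(fresh));
    cleanup.register(fresh, key);
    return fresh;
  };
}
```

As in the original example, entries are only removed when a finalization callback actually runs, so this sketch still relies on the engine eventually calling it.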
There are definitely possible improvements that we could make to the weak cache (for example, strongly hold an LRU list, which would probably be necessary to make this useful). Maybe it should be deleted from the README since it's not a great idea in its current state. However, we cannot add any APIs to check whether something is being referenced--they would be too inefficient. The WeakRef and FinalizationRegistry APIs don't give guarantees of precision because this would imply inefficient traversals over the heap, but you can expect implementations to make the tradeoff they feel is appropriate in terms of nulling things out in a timely manner, so |
@littledan - Thanks for the reply! I'm a bit confused, though:
That's not what I'm suggesting/asking about -- or at least, if it is I'm afraid I don't understand. :-D
class WeakRef {
// ...
isReclaimed() {
return this.deref() === undefined;
}
}

...but where the GC is not supposed to infer from that check that we actually wanted the referent (e.g., if it prioritizes based on recent usage). That's its only purpose, checking if something has already been reclaimed (so the WeakRef is pointless). It's not asking if something is referenced, it's asking if it's already known to no longer exist. |
I think the "peeking without impact" issue has been raised a couple times before. I've personally wondered if that API could be useful as well. This would allow for the equivalent of
I'm not sure why we'd need this API be stable for the Edit: mixed up my |
@mhofman -
It's the |
I'm definitely -1 on this proposal. WeakRefs make GC observable, which is hugely problematic, and the proposal goes to some effort to reduce that observability (by restricting it to the end of the current synchronous execution, for example). This proposal, if implemented usefully, would reintroduce that observability unless I'm missing something.

I haven't really worked through the details of WeakRefs + caching, but I'm sympathetic to the use case. You could have the weakCache use a Map from key to (metadata, WeakRef(value)) tuples, where the metadata is sufficient to compute the eviction priority of an item. But the most straightforward time to decide to evict something is during finalization, at which point it's too late to not evict the entry being finalized.

I guess you could evict during insertion: e.g., for LRU eviction, you maintain an LRU list of strong refs to the values. When you insert something that pushes you above your desired cache size, you remove some number of LRU list entries. Deallocation is not very prompt, though. You'd probably want to do the eviction before the allocation of the new item if possible, because its allocation would be a signal to the engine to do a GC if things are getting tight. If you did it the other way around, that GC would happen while the LRU list was still holding the doomed values, and the LRU list removal isn't enough to tell the engine that there is an opportunity to clean some big things up.

I don't know what the minimal API would be to handle this more promptly. You kind of want a whole new API entry for the engine to ask "would you like to keep this?" before discarding registered targets. But that in itself would probably need to be delayed to the end of a turn (synchronous JS execution) to avoid leaking timing and internals, and would probably delay the cleanup until the following turn anyway.

Err... I don't think it works. Cleanup would be delayed at least until the following GC, which is even less prompt.
The "would you like to keep this?" code requires a GC collection in order to be called in the first place. And then you need another GC to finalize the target for real, since the target and everything reachable from the target would have to be traced in that first GC. The engine would first do a full GC treating all WeakRefs as weak, at which time it would accumulate a list of dead stuff. Then it would take the subset of that stuff that is keepalive-able, trace it, and run the "would you like to keep this?" keepalive callbacks. Anything that returns "no" would then go into an internal "condemned" set that would be ineligible for keepalive callbacks in the next GC. Bleh. |
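The evict-during-insertion idea from this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested design: makeLruWeakCache, maxStrong, and strongKeys are hypothetical names, one Map holds every value weakly, and a second Map (relying on Map insertion order) strongly holds only the most recently used values. On a miss, the strong list is trimmed before f allocates the new value, as suggested above.

```javascript
// Hypothetical names throughout; assumes f returns objects.
function makeLruWeakCache(f, maxStrong = 2) {
  const weak = new Map();   // key -> WeakRef(value), every known value
  const strong = new Map(); // key -> value, least recently used first
  const cleanup = new FinalizationRegistry(key => {
    const ref = weak.get(key);
    if (ref && ref.deref() === undefined) weak.delete(key);
  });
  // Move (or add) a key to the most-recently-used end of the strong list.
  const touch = (key, value) => {
    strong.delete(key);
    strong.set(key, value);
  };
  // Drop least recently used strong refs until at most `limit` remain.
  const evictTo = limit => {
    while (strong.size > Math.max(limit, 0)) {
      strong.delete(strong.keys().next().value);
    }
  };
  const get = key => {
    const ref = weak.get(key);
    const cached = ref && ref.deref();
    if (cached !== undefined) {
      touch(key, cached);
      evictTo(maxStrong);
      return cached;
    }
    // Make room first, so a GC triggered by the allocation in f
    // can already reclaim the evicted values.
    evictTo(maxStrong - 1);
    const fresh = f(key);
    weak.set(key, new WeakRef(fresh));
    cleanup.register(fresh, key);
    touch(key, fresh);
    return fresh;
  };
  get.strongKeys = () => [...strong.keys()]; // exposed for inspection only
  return get;
}
```

A value evicted from the strong list stays in the weak map until it is actually collected, so a later lookup may still hit. That is the non-prompt deallocation the comment describes.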
I don't think this introduces any observability that isn't already there via |
Making |
If that's the case, then yes there would be no way to make this API not fully equivalent to However, maybe we could add a note that mentions something like:
Maybe implementations would be able to avoid moving objects between generation buckets if they are only in the |
That isn't the impact I'm talking about. That's the [[KeptAlive]] impact (which would be fine). I'm talking about any impact on the garbage collector: You're right about multiple
I'd said smarter people than me needed to decide whether a
|
Note: All of this is predicated on the idea that garbage collectors may take recent usage of a referent as an input to their processing (now, or at some point in the future, since weakly held references are a new design input). If they don't and won't, if calling Separately:
Here's prior art for an API providing this kind of information: C#'s
(Again, please note those return values are the inverse of what I'd originally suggested for That "has not been garbage collected" is a stronger statement than I'd originally suggested for |
It sounds like we're down to a version which wouldn't affect observability of GC, as described in #188 (comment) .
I can't rule out that some GCs might prioritize things this way, but I haven't heard of it. Can we consider adding this API in a follow-on proposal if there is interest from implementers in the future? Right now, we're in a phase where we're considering removing some possibly-overengineered APIs that try to give more signals to implementations, e.g., #187 . |
@littledan - Thanks Dan. FWIW, that seems reasonable to me, perhaps with the caveat that if implementers are taking I ran some tests with the current support in V8 and SpiderMonkey that suggest neither of them cares whether you use |
You are correct, SpiderMonkey currently only uses For scheduling and tuning to be relevant, we would have to be holding back and/or reordering finalization callbacks. Right now, when anything dies, we immediately queue up any relevant callbacks and then yield/invoke all of them in the order they were discovered. I suspect all the other implementations are at the same point, though the ordering may vary depending on the underlying data structures used. (eg, I think we're iterating over a set for part of this.) I'd guess we're a ways off from any form of scheduling. |
OK, given #188 (comment) , I'm closing this issue. |
Apologies if I'm retreading trodden ground here, as seems likely. I see this note in this PDF in the history:
which seems like it might be relevant, but I'm not sure what "Fallback: No" means.
The explainer's makeWeakCached example leaks memory in the absence of finalizer callbacks from GC, which implementations can skip "...for any reason or no reason..." That made me wonder if a cache like that should do proactive cleanup in some way, removing WeakRefs whose referent has been reclaimed. But the only way to know if a referent has been reclaimed is to use deref, which makes the referent strongly reachable again and keeps it alive until the end of the current job. (I also wonder if the GC might take that "use" [which isn't really a use] of the referent as evidence it should keep the referent in preference to other reclaimable objects. [It's probably painfully obvious that I know very little about modern GC techniques used in JavaScript engines.])

Would some kind of soft check for a reclaimed referent (isReclaimed) make sense, something that doesn't make the referent strongly reachable, and isn't taken as evidence of use? Then caches like the makeWeakCached example could remove entries for which ref.isReclaimed() returns true, perhaps in an idle callback if the host supports them. (Not necessarily in a big loop, it could be incremental.)

It would be important to emphasize that only a true return actually tells you anything useful. A false return is no guarantee that the referent won't be GC'd milliseconds later.

Semantics:

- If isReclaimed returns true, it will always return true in the future for that same WeakRef.
- If it returns false, smarter people than me would need to decide whether it would consistently return false for that same WeakRef until the end of the current job (e.g., isReclaimed adds the referent to the [[KeptAlive]] list) or if it might return true if called a second time.