
downloadprogress event clarification #20

Open
fergald opened this issue Oct 15, 2024 · 14 comments
Comments

fergald commented Oct 15, 2024

I'd like us to make a few things clear about downloadprogress:

  • It's possible to receive 0 events, so never rely on the first event to create a download UI. For example:
    • if the model was already downloaded
    • if it was just very quick to download
    • if there's a network problem
  • Reaching 100% does not mean that the model will become available; there may be installation steps afterwards, and those can fail.

Part of me wants to say that if the create call's promise is not resolvable immediately, then we should queue one downloadprogress event immediately. It could be 0%, it could be 100%. Without this, a page has to pop up the download progress UI, defaulting to 0%, until it receives the first event.

It's an edge case, but let's say something already triggered a model download, the download has finished, and installation is happening (which may be slow), or installation already failed for some reason and will fail again. There's no download to do, just waiting for installation. The page should show the download at 100%. If the page defaults to 0% and displays that in the UI even though no download is occurring, that is confusing. If the installation fails, it looks to the user like the download failed without making any progress.
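The "never rely on the first event" advice above can be sketched as page-side code. Here `monitor` is assumed to be an EventTarget that fires "downloadprogress" events carrying `loaded`/`total`; the exact API shape is illustrative, not taken from the spec:

```js
// Sketch of page-side handling that tolerates receiving zero
// downloadprogress events. The `monitor` argument is assumed to be an
// EventTarget firing "downloadprogress" events with { loaded, total }.
function attachProgressUI(monitor, ui) {
  let shown = false;
  monitor.addEventListener("downloadprogress", (e) => {
    // Create the download UI lazily, on the first event. If no event ever
    // fires (model already downloaded, instant download, network failure),
    // no spurious 0% UI appears.
    if (!shown) {
      ui.show();
      shown = true;
    }
    ui.update(e.loaded / e.total);
  });
}
```

The key point is that the UI is created inside the event handler, not before the first event arrives.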

domenic (Collaborator) commented Oct 15, 2024

webmachinelearning/prompt-api#4 is related.

Some precedent from XHR:

  • Always fires a 0% progress event, although it's named loadstart specifically
  • Always fires a 100% progress event (actually 3 of them, progress, load, and loadend)
  • Otherwise tries to fire every time a new chunk is received, as long as >=50 ms have passed since the last chunk.

I think this precedent is reasonable. What do you think of the following pseudo-spec?

If the state is not "after-download", then:

  • Queue a global task on the networking task source to fire a 0% downloadprogress event
  • Every time "download progress is made", and at least 50 ms have passed since the last downloadprogress event, queue a global task on the networking task source to fire a downloadprogress event with the appropriate values.
  • Once the model is completely downloaded, queue a global task on the networking task source to fire a 100% downloadprogress event

If the state is "readily", then never fire any downloadprogress events.

This doesn't take care of the edge case you mention but I think that is handled best by expanding the monitor class to have some way of monitoring installation / loading progress, per webmachinelearning/prompt-api#4.
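The firing rules above, including the 50 ms throttle, can be sketched like this. The class and method names are illustrative (not from any spec), and the clock is injectable purely to make the throttle testable:

```js
// Sketch of the pseudo-spec's firing rules. `fire` receives a fraction in
// [0, 1]; `now` is an injectable millisecond clock.
class DownloadProgressEmitter {
  constructor(fire, now = () => Date.now()) {
    this.fire = fire;
    this.now = now;
    this.lastFired = -Infinity;
  }
  start() {
    // Always fire the initial 0% event.
    this.fire(0);
    this.lastFired = this.now();
  }
  chunk(loaded, total) {
    // Intermediate events are throttled: fire only if >= 50 ms have
    // passed since the last fired event.
    if (this.now() - this.lastFired >= 50) {
      this.fire(loaded / total);
      this.lastFired = this.now();
    }
  }
  finish() {
    // Always fire the final 100% event, regardless of the throttle.
    this.fire(1);
  }
}
```

This mirrors the XHR precedent: the 0% and 100% events are unconditional, and only the intermediate chunk-driven events are rate-limited.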

fergald (Author) commented Oct 15, 2024

Queue a global task on the networking task source to fire a 0% downloadprogress event

This might not be 0% if something else caused the download to begin already. XHR doesn't have this possibility. The first event should just reflect the current state of the download.

If state is "readily", then never fire any downloadprogress events.

Is there a race here?

```js
const languageDetectorCapabilities = await ai.languageDetector.capabilities();

if (languageDetectorCapabilities.available === "after-download") {
  const detector = await ai.languageDetector.create();
}
```

Will the state for the create call definitely be "after-download"? I think we can say one of:

  1. it must be; but then what happens if I call capabilities() again and get a different answer? How can it be the same as both?
  2. the state should be consistent throughout a single JS task
  3. the state is recalculated on calling create

Option 2 is the only version that avoids a race, but it's awkward to implement because the span of a JS task is only known to the renderer process.

It would be simpler to say that you will get a 100% event. What happens if the XHR response was cached? Do you immediately get a 100% event?

domenic (Collaborator) commented Oct 15, 2024

This might not be 0% if something else caused the download to begin already. XHR doesn't have this possibility. The first event should just reflect the current state of the download.

I think it might still be worthwhile, for predictability, to always fire the 0 immediately.

This helps very slightly with the cross-context fingerprinting issues: it makes it harder to tell the difference between "the user at 5.52% through the download" vs. "the user who downloaded 5.52% + whatever in 50 ms".

Is there a race here?

My intention (which has not yet met implementation reality, so thanks for engaging) is that we keep a copy of the model's state in the renderer process, and only update it via queued tasks. In particular the task that resolves the promise returned by capabilities(), or the download progress tasks. This includes "readily" vs. "after-download".

Note that things that happen due to the action of one renderer process should not broadcast updates to all other renderer processes. You should only get downloadprogress updates once you've "tuned in" by calling create().

I think this design avoids all races. It matches your (2), I guess? I don't understand what's hard to implement it as I would assume the usual way to update state is via the browser queuing tasks, which ensures that any JS tasks running between those browser tasks always see consistent state.

What happens if the XHR response was cached? Do you immediately get a 100% event?

You get both the 0% and 100% events. (In separate queued tasks, although that would be hard to observe.)

fergald (Author) commented Oct 15, 2024

I think this design avoids all races. It matches your (2), I guess? I don't understand what's hard to implement it as I would assume the usual way to update state is via the browser queuing tasks, which ensures that any JS tasks running between those browser tasks always see consistent state.

It's not hard, but it requires extra state tracking. It might not be what you meant, but what I meant by 2 was that

```js
const translatorCapabilities1 = await ai.translator.capabilities();
// do something for 5 seconds (yeah I know that's bad)
const translatorCapabilities2 = await ai.translator.capabilities();

console.log(translatorCapabilities1.available == translatorCapabilities2.available);
```

will always log true.

The browser cannot naturally tell that both calls to capabilities() are from the same JS task. We would have to have some kind of task-bound cache of responses. It's not rocket science but it's not pleasant either.

domenic (Collaborator) commented Oct 15, 2024

I don't think we should have a task-bound cache to handle such cases. Any time you have an await, that's an async boundary, and it's understood that state might change between the suspension from JS (before the await) and the return to JS (after the await).
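The async-boundary point can be shown with plain JS, independent of this API. Anything that happens between the suspension and the resumption of an `await` can change the state the function observes:

```js
// Tiny illustration: state read before an `await` can differ from state
// read after it, because other code runs at the suspension point.
let availability = "after-download";

async function readTwice() {
  const before = availability;
  await Promise.resolve(); // suspension point: other code runs here
  const after = availability;
  return [before, after];
}

const result = readTwice();
availability = "readily"; // runs before the awaited continuation resumes
```

Here `readTwice()` resolves to `["after-download", "readily"]`: the synchronous assignment runs before the microtask that resumes the function after the `await`.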

fergald (Author) commented Oct 15, 2024

Oh yeah! But you still have this weird thing where, when you call create(), it has to remember whether you got "readily" or "after-download" and then produce the right event behaviour. This was what I was originally thinking of when I said it was awkward.

domenic (Collaborator) commented Oct 16, 2024

I feel like that falls out naturally if you only update the renderer-process state via tasks. Is there an alternative architecture you were thinking of where it's awkward?

fergald (Author) commented Oct 18, 2024

When you get a result from capabilities(), we would have to stash it somewhere, e.g. on the factory object (since no other object needs to live beyond JS holding a reference). But when the JS task ends and a new one begins, nothing will reset this value. So there's no way to "forget" the result at the end of the task.

domenic (Collaborator) commented Oct 19, 2024

Yes, I think the factory object is a natural place to store the state when you update the renderer process via posted tasks in my model.

fergald (Author) commented Oct 19, 2024

The problem is that the response has to be invalidated at the end of the task (otherwise consistency has to be maintained indefinitely). There is no mechanism for that. Or I guess you just maintain consistency for the duration of the context, with the data only being updated when create's promise resolves.

I don't see a problem speccing or implementing that but it seems odd and avoidable if we just promise to always deliver an initial progress event instead.

domenic (Collaborator) commented Oct 20, 2024

Or I guess you just maintain consistency for the duration of the context, with the data only being updated when create's promise resolves.

Yes, that's the intent. To be clear, the data is updated whenever either capabilities() or create() goes and gathers info from the browser process.

if we just promise to always deliver an initial progress event instead

I think that is what we should do, per #20 (comment), where I said

I think it might still be worthwhile, for predictability, to always fire the 0 immediately.

Let me try to spell out the case that I think we're talking about, with my proposed model:

  1. The model is initially not downloaded.
  2. In Windows A, B, and C, the web developer calls ai.translator.capabilities() and gets back an object whose value for available is "after-download". (Let's ignore how exactly this object gets created for now, as it's similar to what's shown in step 4.)
  3. In Window A, the model download starts and finishes. (Let's ignore how this happens for now as it's somewhat similar to what's shown in step 5.)
  4. In Window B:
    1. There have been no updates broadcast, so the cached AITranslatorCapabilities object's available still returns "after-download".
    2. The web developer calls ai.translator.capabilities(). This goes in parallel to:
      1. Check the status of the model. It's downloaded!
      2. So, queue a task to Window B to:
        1. Update the model availability from "after-download" to "readily".
        2. Resolve the capabilities() promise.
    3. The web developer checks the return value of this latest capabilities() call's available. Because Window B's model availability information now says "readily", the getter returns "readily".
    4. The web developer checks the cached-in-step-2 AITranslatorCapabilities's available. Because Window B's model availability information now says "readily", the getter now returns "readily".
  5. In Window C:
    1. There have been no updates broadcast, so the cached AITranslatorCapabilities object's available still returns "after-download".
    2. The web developer calls ai.translator.create(). This goes in parallel to:
      1. Start the download of the model. Oh wait, it's already downloaded! So,
      2. Queue a task to window C to:
        1. Update the model availability from "after-download" to "readily".
        2. Fire a downloadprogress event for 0%.
        3. Fire a downloadprogress event for 100%.
        4. Resolve the promise returned by create().
        5. (Or maybe this should be three separate tasks? Four separate tasks? The difference is observable by downloadprogress event handler code, and I'm not sure which is best.)
      3. The web developer checks the cached-in-step-2 AITranslatorCapabilities's available. Because Window C's model availability information now says "readily", the getter now returns "readily".
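The mechanism that makes steps 4 and 5 race-free can be modelled in a few lines: the renderer's cached availability is mutated only by tasks queued from the browser-process side, never directly. All names here are illustrative, not from any spec or implementation:

```js
// Minimal model of the proposal: renderer-side state is updated only via
// queued tasks, so JS running between tasks always observes a consistent
// value for availability.
class RendererModelState {
  constructor() {
    this.availability = "after-download"; // renderer-side cached copy
    this.taskQueue = [];
  }
  // The "browser process" never mutates state directly; it queues a task.
  queueTask(fn) {
    this.taskQueue.push(fn);
  }
  // The event loop drains tasks one at a time; any JS running between
  // drains sees the cached value unchanged.
  runOneTask() {
    const fn = this.taskQueue.shift();
    if (fn) fn();
  }
}
```

Because the update and the promise resolution ride in the same (or consecutive) queued tasks, both the freshly returned capabilities object and any cached one read the same underlying value once the task runs.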

fergald (Author) commented Oct 23, 2024

I think I agree with all of that although 4.ii and 5.ii are odd because it says "This goes in parallel to" but then a and b are sequential.

Also, we're always firing the events even when it's already downloaded. I thought you were arguing against that. If you're not then I'm happy.

domenic (Collaborator) commented Oct 23, 2024

"In parallel to" means "on a background thread" or "in the browser process": https://html.spec.whatwg.org/#in-parallel

Also, we're always firing the events even when it's already downloaded. I thought you were arguing against that. If you're not then I'm happy.

In all my above examples the model's state, as seen by the Window, is "after-download". In those cases we should fire the event.

Let's add the following scenario:

  1. In Window C, after step 5 has fully completed:
    1. The cached AITranslatorCapabilities object's available is now returning "readily".
    2. The web developer calls ai.translator.create(). Window C knows this doesn't require a download, so it goes in parallel to:
      1. Load the model
      2. Queue a task to Window C to:
        1. Resolve the promise returned by create().

In this case I think not firing any downloadprogress makes sense.

domenic (Collaborator) commented Oct 24, 2024

As I write the spec for this, I'm having second thoughts. It doesn't really make sense to consult the cached availability for the model in create() since we're going to have to go to the browser process anyway.

Let's put this on hold until I can get a reasonable spec written, hopefully over the next few days.
