Should validate MLGraph.[[context]] in MLContext.compute() and MLContext.computeSync() steps #341

Closed
huningxin opened this issue Feb 13, 2023 · 7 comments · Fixed by #584

huningxin commented Feb 13, 2023

This issue was raised in a Chromium WebNN CL review by @wacky6, thanks Jiewei!

MLGraphBuilder takes a context, why do we define compute() on MLContext instead
of the built graph?

The change of moving the compute methods from MLGraph to MLContext was introduced by #257. As @wchao1115 mentioned, the main intention is to avoid adding more compute methods for different execution modes, such as the queued execution mode of MLCommandEncoder for WebGPU interop, to the single MLGraph interface. The design choice was to fold the execution methods into the MLContext and MLCommandEncoder interfaces, hopefully "making it easier for developer to associate execution method to the context created with different options."

This allows passing an MLGraph built for a different context. This sounds like the spec is allowing misuse? I'd imagine a CPU graph can't be used with a GPU context.

This is not the intention; it looks like an issue in the WebNN spec. The current Synchronous Execution and Asynchronous Execution steps should validate MLGraph.[[context]] against the MLContext instance that the compute methods are called on.
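For illustration, here is a minimal sketch of the misuse case being described. The API shape follows the spec at the time of this discussion; the device options, operand shapes, and buffer sizes are made up for the example.

```js
// Build a graph on one context, then compute it on another.
const cpuContext = await navigator.ml.createContext({ deviceType: 'cpu' });
const gpuContext = await navigator.ml.createContext({ deviceType: 'gpu' });

const builder = new MLGraphBuilder(cpuContext);
const a = builder.input('a', { type: 'float32', dimensions: [2, 2] });
const b = builder.input('b', { type: 'float32', dimensions: [2, 2] });
const graph = await builder.build({ c: builder.add(a, b) });

const inputs = { a: new Float32Array(4), b: new Float32Array(4) };
const outputs = { c: new Float32Array(4) };

// graph.[[context]] is cpuContext, but compute() is invoked on gpuContext.
// Without a [[context]] check, the execution steps would not reject this.
await gpuContext.compute(graph, inputs, outputs);
```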

wacky6 commented Feb 13, 2023

Checking for a context match is one option, though ideally the API design should avoid getting into such failure cases.

AFAIK, a built graph should already have been initialized (i.e., a built GPU graph should have initialized GPU resources to avoid a cold start on the first compute(), or have JIT compilation completed). So the graph is strongly tied to its context and can't really be used with a different context.

Alternatively, we could make MLGraph a context-independent concept and introduce a new MLBuiltGraph concept that represents a graph initialized on a device, ready to start compute() immediately.

MLGraph (context independent) + MLContext (resources) -> MLBuiltGraph (can do compute).

Though in this case MLGraph becomes a "Web standardized" way to describe an ML model, and the MLContext + MLBuiltGraph combination effectively becomes what we have in ML Model Loader.
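A rough sketch of how that split could look from script, purely hypothetical: MLBuiltGraph, the context-free MLGraphBuilder constructor, and the context.build(graph) method below are invented for illustration and are not part of the spec.

```js
// MLGraph describes the model independently of any context; building it
// against an MLContext yields an MLBuiltGraph that owns device resources
// (initialized weights, compiled kernels) and can start computing immediately.
const builder = new MLGraphBuilder();                 // hypothetical: no context here
const x = builder.input('x', { type: 'float32', dimensions: [4] });
const graph = builder.build({ y: builder.relu(x) });  // context-independent description

const gpuContext = await navigator.ml.createContext({ deviceType: 'gpu' });
const builtGraph = await gpuContext.build(graph);     // hypothetical: initialize on the GPU
await builtGraph.compute(
  { x: new Float32Array(4) },
  { y: new Float32Array(4) });
```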

huningxin (Contributor, Author) commented:

@wacky6

AFAIK, a built graph should already have been initialized (i.e., a built GPU graph should have initialized GPU resources to avoid a cold start on the first compute(), or have JIT compilation completed). So the graph is strongly tied to its context and can't really be used with a different context.

I'd agree with you. The initialization should be context-dependent, and it is critical for optimal execution.

According to the WebNN programming model, an MLGraph actually represents a compiled graph for optimal execution. The compilation, optimization, and initialization of the computation graph are context-dependent.

For the default MLContext, the build methods (build() or buildSync()) of MLGraphBuilder compile and optimize the computation graph. For example, the Chromium WebNN XNNPACK backend calls xnn_create_runtime() in its MLGraphBuilder.build() implementation. Internally, XNNPACK performs subgraph optimizations (such as op fusion), JIT compilation (if it is enabled), weights repacking (e.g., for conv2d), and so on. After this step, the MLGraph XNNPACK implementation (MLGraphXnnpack) uses the optimized xnn_runtime object for optimal execution by calling xnn_invoke_runtime().
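To make this mapping concrete, here is a small sketch of the default-context flow from the script side; the XNNPACK calls named in the comments are the ones mentioned above, while the operand shapes are arbitrary.

```js
const context = await navigator.ml.createContext();   // default MLContext
const builder = new MLGraphBuilder(context);
const input = builder.input('input', { type: 'float32', dimensions: [1, 3, 224, 224] });
const output = builder.relu(input);

// In the Chromium XNNPACK backend, build() is where xnn_create_runtime()
// runs: subgraph optimization (e.g. op fusion), JIT compilation (if enabled),
// and weights repacking happen once, for this context.
const graph = await builder.build({ output });

// Each compute() then reuses the optimized xnn_runtime object via
// xnn_invoke_runtime(); no recompilation is needed.
await context.compute(graph,
  { input: new Float32Array(1 * 3 * 224 * 224) },
  { output: new Float32Array(1 * 3 * 224 * 224) });
```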

For MLCommandEncoder, its initializeGraph() method serves this purpose and focuses on WebGPU queue submission. Implementation experience for this interface is still to be established.

So I guess the MLGraph interface is close to the MLBuiltGraph concept in your comment. In the current WebNN spec, there is no interface that represents a context-independent computation graph. Are you proposing that we should introduce such an interface?

wacky6 commented Feb 14, 2023

I guess the key issue is to provide a task description (for GPU) so that the client can decide when the task is executed along with other non-WebNN tasks.

From the current spec, it's unclear where GPU-related state is stored (on MLCommandEncoder, or on MLGraph itself).

I feel some example code would help with the discussion (I'm not entirely familiar with WebGPU usage): what's the difference between CPU MLGraph.compute() vs. GPU MLGraph.compute() vs. GPU command encoder + WebGPU interop?

I'd imagine GPU MLGraph.compute() just does things under the hood, while WebGPU interop provides fine-grained control.

Maybe we should subclass MLGraph based on the context that creates it. For example, a CPU context returns an MLCpuGraph with compute(), while a GPU context returns an MLGpuGraph with compute() and GPU interop methods (commandBuffer, dispatch, etc.).

We should probably also see whether prior art exists (like canvas's rendering contexts).
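For illustration, a hypothetical sketch of that subclassing idea; MLCpuGraph, MLGpuGraph, the commandBuffer() method, and the way createContext() is fed a GPUDevice below are illustrative, not spec'd behavior.

```js
// Hypothetical: the concrete graph type depends on the context that built it.
const cpuContext = await navigator.ml.createContext({ deviceType: 'cpu' });
let builder = new MLGraphBuilder(cpuContext);
let x = builder.input('x', { type: 'float32', dimensions: [4] });
const cpuGraph = await builder.build({ y: builder.relu(x) });   // -> MLCpuGraph
await cpuGraph.compute({ x: new Float32Array(4) }, { y: new Float32Array(4) });

const gpuDevice = await (await navigator.gpu.requestAdapter()).requestDevice();
const gpuContext = await navigator.ml.createContext(gpuDevice);
builder = new MLGraphBuilder(gpuContext);
x = builder.input('x', { type: 'float32', dimensions: [4] });
const gpuGraph = await builder.build({ y: builder.relu(x) });   // -> MLGpuGraph
// Either compute "under the hood" ...
await gpuGraph.compute({ x: new Float32Array(4) }, { y: new Float32Array(4) });
// ... or hand a command buffer to WebGPU so the caller decides when it runs.
gpuDevice.queue.submit([gpuGraph.commandBuffer()]);
```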

huningxin (Contributor, Author) commented:

@wacky6, thanks for your suggestions and proposals. I am filing new issues so that we can follow up on each of them separately.

From the current spec, it's unclear where GPU-related state is stored (on MLCommandEncoder, or on MLGraph itself).

#342

I feel some example code would help with the discussion (I'm not entirely familiar with WebGPU usage): what's the difference between CPU MLGraph.compute() vs. GPU MLGraph.compute() vs. GPU command encoder + WebGPU interop?

#343

Maybe we should subclass MLGraph based on the context that creates it. For example, a CPU context returns an MLCpuGraph with compute(), while a GPU context returns an MLGpuGraph with compute() and GPU interop methods (commandBuffer, dispatch, etc.).

#344

anssiko commented Feb 15, 2023

If we add MLGraph.[[context]] validation to compute() and computeSync(), we should see whether that suggests changes to the validate MLContext algorithm that is called from the createContext() and createContextSync() methods and the MLGraphBuilder constructor. We hit that "validate MLContext" algorithm before we get to compute() or computeSync().

I think we also want to review that all the validation checks in place are necessary, that there are no duplicates, and that we bail out as soon as possible as an optimization strategy.

We can always rename these internal algorithms and that is encouraged if it improves spec readability. More specific names are generally better than very generic names.

inexorabletash commented Feb 21, 2024

Unless #303 results in a big reorganization of the API, I think this issue boils down to what it says in the title. Basically, add this to the compute() method steps:

  1. If graph.[[context]] is not this, then reject promise with a "DataError" DOMException.

The Chromium prototype implementation has this check. Shall we just add it and close out this issue, and let the other issues track broader API shape considerations?
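From script, the effect of that step would be a rejection instead of a silent cross-context compute. A minimal sketch, reusing cpuContext, gpuContext, graph, inputs, and outputs from the first sketch in this thread and assuming the "DataError" rejection proposed above:

```js
try {
  // graph.[[context]] is cpuContext, so this no longer proceeds.
  await gpuContext.compute(graph, inputs, outputs);
} catch (e) {
  console.log(e instanceof DOMException, e.name);  // true "DataError"
}
```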

anssiko commented Feb 26, 2024

@inexorabletash, adding that line to the compute() method steps would be the missing piece needed to close this issue, I think.

(The spec has evolved since this was initially discussed. E.g., the *Sync() methods and the dedicated MLContext validation algorithm have been removed, and the relevant validation steps have been inlined into the method steps, so my earlier comments have already been addressed.)

inexorabletash added a commit to inexorabletash/webnn that referenced this issue Feb 26, 2024
This matches the Chromium prototype implementation.

Fixes webmachinelearning#341
inexorabletash self-assigned this Feb 26, 2024
anssiko pushed a commit that referenced this issue Feb 27, 2024
This matches the Chromium prototype implementation.

Fixes #341