Should validate MLGraph.[[context]] in MLContext.compute() and MLContext.computeSync() steps #341
Checking for a context match is one option, though ideally the API design should avoid getting into such failure cases. AFAIK, a built graph should already have been initialized (i.e. a built GPU graph should have initialized GPU resources to avoid a cold start in the first compute(), or have JIT compilation completed), so the graph is strongly tied to its context and can't really be used with a different context. Alternatively, we could make MLGraph a context-independent concept and introduce a new MLBuiltGraph concept that represents a graph initialized on a device and able to start computing immediately: MLGraph (context independent) + MLContext (resources) -> MLBuiltGraph (can do compute). Though in this case, MLGraph becomes a "Web standardized" way to describe an ML model, and MLContext + MLBuiltGraph effectively become what we have in ML Model Loader.
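To make the MLGraph/MLBuiltGraph split above concrete, here is a minimal sketch in plain JavaScript. None of these classes exist in the WebNN spec; the names and shapes are purely illustrative of the proposed two-step flow.

```javascript
// Hypothetical sketch only: a context-independent MLGraph that must be
// bound to an MLContext to produce a device-initialized MLBuiltGraph.
class MLGraph {
  constructor(description) {
    this.description = description; // portable model description
  }
}

class MLContext {
  // Binding a portable graph to this context yields a graph whose
  // device resources are initialized up front (no cold start later).
  build(graph) {
    return new MLBuiltGraph(graph, this);
  }
}

class MLBuiltGraph {
  constructor(graph, context) {
    this.graph = graph;
    this.context = context;
  }
  compute(inputs) {
    // Resources were initialized at build time, so compute can
    // start immediately; the result here is just a placeholder.
    return { outputs: inputs };
  }
}

const graph = new MLGraph('a portable model description');
const builtGraph = new MLContext().build(graph);
console.log(builtGraph.compute({ x: 1 }).outputs.x); // 1
```

Because the built graph holds a reference to the context that created it, a context mismatch cannot arise by construction.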
I'd agree with you. The initialization should be context dependent, and it would be critical for optimal execution. According to the WebNN programming model, the … For default … For … So, I guess the …
I guess the key issue is providing a task description (for GPU) so that the client can decide when the task is executed along with other non-WebNN tasks. From the current spec, it's unclear where GPU-related state is stored (is it on MLCommandEncoder, or on MLGraph itself?). I feel some example code would help with the discussion (I'm not entirely familiar with WebGPU usage): what's the difference between CPU MLGraph.compute() vs. GPU MLGraph.compute() vs. GPU command encoder + WebGPU interop? I'd imagine GPU MLGraph.compute() just does things under the hood, while WebGPU interop provides fine-grained control. Maybe we should subclass MLGraph based on the context that creates it; for example, a CPU context returns a … We probably should also see if prior art exists (like Canvas's rendering context)?
@wacky6, thanks for your suggestions and proposal. I am filing new issues so that we can follow up on each of them respectively.
If we add MLGraph.[[context]] validation into compute() and computeSync(), we should see whether that suggests changes to the "validate MLContext" algorithm that is called from the createContext() and createContextSync() methods and the MLGraphBuilder constructor. We hit that "validate MLContext" algorithm before we get to compute() or computeSync(). I think we should also review that all the validation checks in place are necessary, that there are no duplicates, and that we bail out as soon as possible as an optimization strategy. We can always rename these internal algorithms, and that is encouraged if it improves spec readability. More specific names are generally better than very generic names.
Unless #303 results in a big reorganization of the API, I think this issue boils down to what it says in the title. Basically, add this to the compute() method steps:
The Chromium prototype implementation has this check. Shall we just add it and close out this issue, and let the other issues track the broader API shape considerations?
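The proposed validation step can be sketched as follows. This is a minimal stand-in, not the spec's actual prose: plain objects play the roles of MLContext and MLGraph, and `compute` is written as a free function taking the context explicitly.

```javascript
// Minimal sketch of the proposed [[context]] validation step.
// In the real spec, this would be a step in MLContext.compute().
function compute(context, graph, inputs, outputs) {
  // Proposed step: if graph.[[context]] is not this MLContext,
  // fail (the real method would reject the returned promise).
  if (graph.context !== context) {
    throw new TypeError('graph was not built from this MLContext');
  }
  // ... continue with the existing compute() steps ...
  return { inputs, outputs };
}

const contextA = {};
const contextB = {};
const graph = { context: contextA }; // stand-in: [[context]] = contextA

compute(contextA, graph, {}, {}); // ok: contexts match
try {
  compute(contextB, graph, {}, {}); // mismatched context
} catch (e) {
  console.log(e instanceof TypeError); // true
}
```

The check is cheap (one identity comparison), so placing it first in the method steps costs nothing and fails fast.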
@inexorabletash that line added to the compute() method steps would be the missing piece to close this issue, I think. (The spec has evolved since this was initially discussed; e.g. the *Sync() methods and the dedicated MLContext validation algorithm have been removed, with the relevant validation steps inlined into the method steps, so my earlier comments have been addressed already.)
This matches the Chromium prototype implementation. Fixes #341
This issue was raised in a Chromium WebNN CL review by @wacky6, thanks Jiewei!
The change of moving the compute methods from `MLGraph` to `MLContext` was introduced by #257. As @wchao1115 mentioned, the main intention is to avoid adding more compute methods for different execution modes, such as the queued execution mode of `MLCommandEncoder` for WebGPU interop, into the single `MLGraph` interface. The design choice was made to fold the execution methods into the `MLContext` and `MLCommandEncoder` interfaces, hopefully "making it easier for developer to associate execution method to the context created with different options."

This is not the intention. This looks like an issue of WebNN: the current steps of Synchronous Execution and Asynchronous Execution should validate `MLGraph.[[context]]` against the `MLContext` instance.