Add MLDeviceType npu (#696)
* Add MLDeviceType npu
* Minor wording
* Add note about fallback
* Remove 'can' regarding performance
* Review feedback
* Fix funky line break in issue text
* Review feedback
* Review feedback from Joshua
fdwr authored Jun 18, 2024
1 parent 073ce7e commit 5c64074
Showing 1 changed file with 11 additions and 4 deletions.
15 changes: 11 additions & 4 deletions index.bs
@@ -613,7 +613,9 @@ Unlike WebGPU, this API does not intrinsically support custom shader authoring;

The WebGPU API identifies <a href="https://gpuweb.github.io/gpuweb/#privacy-machine-artifacts">machine-specific artifacts</a> as a privacy consideration. Similarly, the WebNN API's compute unit scheduling may under certain circumstances introduce a fingerprint. However, similarly to WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.

- The WebNN API defines two developer-settable preferences to help inform [[#programming-model-device-selection]] and allow the implementation to better select the most appropriate underlying execution device for the workload. An {{MLDeviceType}} normatively indicates the kind of device and is either {{MLDeviceType/"cpu"}} or {{MLDeviceType/"gpu"}}. If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown, thus this type can in some cases add two bits of entropy to the fingerprint. An {{MLPowerPreference}} indicates preference as related to the power consumption and is considered a hint only and as such does not increase entropy of the fingerprint.
+ The WebNN API defines two developer-settable preferences to help inform [[#programming-model-device-selection]] and allow the implementation to better select the most appropriate underlying execution device for the workload. An {{MLDeviceType}} normatively indicates the kind of device and is one of: {{MLDeviceType/"cpu"}}, {{MLDeviceType/"gpu"}}, {{MLDeviceType/"npu"}}. If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown, thus this type can in some cases add two bits of entropy to the fingerprint. An {{MLPowerPreference}} indicates preference as related to the power consumption and is considered a hint only and as such does not increase entropy of the fingerprint.

+ Issue(623): {{MLContextOptions}} is under active development, and the design is expected to change, informed by further implementation experience and new use cases from the wider web community.

If a future version of this specification introduces support for a new {{MLDeviceType}} that can only support a subset of {{MLOperandDataType}}s, that may introduce a new fingerprint.

@@ -702,7 +704,8 @@ WorkerNavigator includes NavigatorML;
<script type=idl>
enum MLDeviceType {
"cpu",
- "gpu"
+ "gpu",
+ "npu"
};

enum MLPowerPreference {
@@ -725,12 +728,16 @@ interface ML {

### {{MLContextOptions}} ### {#api-mlcontextoptions}

+ Issue(623): {{MLContextOptions}} is under active development, and the design is expected to change, informed by further implementation experience and new use cases from the wider web community. The Working Group is considering additional API controls to allow the definition of a fallback device, multiple devices in a preferred order, or an exclusion of a specific device. Other considerations under discussion include error handling, ultimate fallback, and quantized operators. Feedback is welcome on any of these design considerations from web developers, library authors, OS and hardware vendors, and other stakeholders via GitHub:

The <dfn dfn-for=MLContextOptions dfn-type=dict-member>deviceType</dfn> option is an <dfn dfn-type=enum>MLDeviceType</dfn> and indicates the application's preference for the kind of device used for the context. It is one of the following:
<dl dfn-for="MLDeviceType">
<dt>"<dfn enum-value>cpu</dfn>"</dt>
<dd>Provides the broadest compatibility and usability across all client devices with varying degrees of performance.</dd>
<dt>"<dfn enum-value>gpu</dfn>"</dt>
- <dd>Provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations.</dd>
+ <dd>Provides the broadest range of achievable performance across graphics hardware platforms from consumer devices to professional workstations. The underlying platform implementation may fall back to other devices for certain operators and parts of the graph.</dd>
+ <dt>"<dfn enum-value>npu</dfn>"</dt>
+ <dd>Provides power efficiency for sustained workloads across hardware platforms with purpose-built accelerators. The underlying platform implementation may fall back to other devices for certain operators and parts of the graph.</dd>
</dl>

The <dfn dfn-for=MLContextOptions dfn-type=dict-member>powerPreference</dfn> option is an <dfn dfn-type=enum>MLPowerPreference</dfn> and indicates the application's preference as related to power consumption. It is one of the following:
@@ -891,7 +898,7 @@ When the {{MLContext/[[contextType]]}} is set to [=context type/default=] with t
</details>

### {{MLContext/compute()}} ### {#api-mlcontext-compute}
- Asynchronously carries out the computational workload of a compiled graph {{MLGraph}} on a separate timeline, either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue. The asynchronous nature of this call avoids blocking the calling thread while the computation for result is ongoing. This method of execution requires an {{MLContext}} created with {{MLContextOptions}}. Otherwise, it [=exception/throws=] an "{{OperationError}}" {{DOMException}}.
+ Asynchronously carries out the computational workload of a compiled graph {{MLGraph}} on a separate timeline, either on a worker thread for the CPU execution, or on a GPU/NPU timeline for submitting a workload onto the command queue. The asynchronous nature of this call avoids blocking the calling thread while the computation for result is ongoing. This method of execution requires an {{MLContext}} created with {{MLContextOptions}}. Otherwise, it [=exception/throws=] an "{{OperationError}}" {{DOMException}}.

<div class="note">
In accordance with the [=ArrayBufferView/write|Web IDL warning=], to prevent the calling thread from modifying the input and output resources while the computation is ongoing, this method [=MLNamedArrayBufferViews/transfer|transfers=] the input and output {{MLNamedArrayBufferViews}} to new views that share the same backing memory allocations. The transferred views are returned to the caller via the promise fulfillment with the computation result written into the backing memory of the output views.
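Note on usage: the changed text says that an "OperationError" DOMException is thrown when the requested MLDeviceType cannot be satisfied, and Issue(623) lists "multiple devices in a preferred order" as a control still under discussion. Until such a control exists, an application could approximate it in script. The following is a minimal sketch under stated assumptions, not spec text: `createContextWithFallback`, `MLLike` and `SelectedContext` are hypothetical names, the default order `npu`, `gpu`, `cpu` is only an illustration, and the sketch assumes `createContext()` returns a promise that rejects with the "OperationError" described in the diff.

```typescript
// Hypothetical helper; none of these names come from the WebNN specification
// except MLDeviceType, MLContextOptions, deviceType and createContext.
type MLDeviceType = "cpu" | "gpu" | "npu";

interface MLContextOptions {
  deviceType?: MLDeviceType;
}

// Minimal stand-in for the ML interface (navigator.ml), so the sketch
// type-checks without WebNN typings. Assumption: createContext() is async.
interface MLLike {
  createContext(options?: MLContextOptions): Promise<unknown>;
}

interface SelectedContext {
  context: unknown;
  deviceType: MLDeviceType; // the device type that was requested successfully
}

async function createContextWithFallback(
  ml: MLLike,
  preferred: readonly MLDeviceType[] = ["npu", "gpu", "cpu"],
): Promise<SelectedContext> {
  let lastError: unknown = new Error("no device types were given");
  for (const deviceType of preferred) {
    try {
      const context = await ml.createContext({ deviceType });
      return { context, deviceType };
    } catch (error) {
      // Only "OperationError" signals an unsatisfiable device type;
      // anything else is rethrown unchanged.
      const name = (error as { name?: string } | null | undefined)?.name;
      if (name !== "OperationError") throw error;
      lastError = error;
    }
  }
  // Every preferred device type was rejected.
  throw lastError;
}
```

In a page this might be called as `await createContextWithFallback(navigator.ml)`, given suitable typings for `navigator.ml`. The returned `deviceType` is only the type that was requested: per the new `gpu` and `npu` definitions, the underlying platform may still fall back to other devices for certain operators and parts of the graph.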