From b722192c8b6ba8653e1c063c6b536ef3331b03d3 Mon Sep 17 00:00:00 2001 From: Joshua Bell Date: Wed, 24 Jul 2024 10:22:36 -0700 Subject: [PATCH 1/2] Complain about RFC2119 terms, and fix usage --- index.bs | 53 ++++++++++++++++++++++++++++++----------------------- 1 file changed, 30 insertions(+), 23 deletions(-) diff --git a/index.bs b/index.bs index e3831e13..176b6875 100644 --- a/index.bs +++ b/index.bs @@ -21,6 +21,7 @@ Markup Shorthands: css no Logo: https://webmachinelearning.github.io/webmachinelearning-logo.png Deadline: 2023-10-01 Assume Explicit For: yes +Complain About: accidental-2119 yes Status Text:

Since the initial Candidate Recommendation Snapshot the Working Group has gathered further implementation experience and added new operations and data types needed for well-known transformers to support generative AI use cases. In addition, informed by this implementation experience, the group removed MLCommandEncoder, support for synchronous execution, and higher-level operations that can be expressed in terms of lower-level primitives in a performant manner. The group has also updated the specification to use modern authoring conventions to improve interoperability and precision of normative definitions. The group is developing a new feature, a backend-agnostic storage type, to improve performance and interoperability between the WebNN, WebGPU APIs and purpose-built hardware for ML and expects to republish this document as a Candidate Recommendation Snapshot when ready for implementation. @@ -401,7 +402,7 @@ This section illustrates application-level use cases for neural network inference hardware acceleration. All applications in those use cases can be built on top of pre-trained deep neural network (DNN) [[models]]. -Note: Please be aware that some of the use cases described here, are by their very nature, privacy-invasive. Developers who are planning to use the API for such use cases should ensure that the API is being used to benefit users, for purposes that users understand, and approve. They should apply the Ethical Principles for Web Machine Learning [[webmachinelearning-ethics]] and implement appropriate privacy risk mitigations such as transparency, data minimisation, and users controls. +Note: Please be aware that some of the use cases described here, are by their very nature, privacy-invasive. Developers who are planning to use the API for such use cases should ensure that the API is being used to benefit users, for purposes that users understand, and approve. 
They should apply the Ethical Principles for Web Machine Learning [[webmachinelearning-ethics]] and implement appropriate privacy risk mitigations such as transparency, data minimisation, and user controls. ### Person Detection ### {#usecase-person-detection} @@ -630,6 +631,10 @@ Purpose-built Web APIs for measuring high-resolution time mitigate against timin ## Guidelines for new operations ## {#security-new-ops} +*This section is non-normative.* + +

+ To ensure operations defined in this specification are shaped in a way they can be implemented securely, this section includes guidelines on how operations are expected to be defined to reduce potential for implementation problems. These guidelines are expected to evolve over time to align with industry best practices: - Prefer simplicity of arguments @@ -637,15 +642,17 @@ To ensure operations defined in this specification are shaped in a way they can - If an operation can be decomposed to low level primitives: - Add an informative emulation path - Prefer primitives over new high level operations but consider performance consequences -- Operations should follow a consistent style for inputs and attributes -- Operation families such as pooling and reduction should share API shape and options +- Follow a consistent style for operation inputs and attributes +- Share API shape and options for operation families such as pooling and reduction - Formalize failure cases into test cases whenever possible -- When in doubt, leave it out: API surface should be as small as possible required to satisfy the use cases, but no smaller +- When in doubt, leave it out: keep the API surface as small as possible to satisfy the use cases, but no smaller - Try to keep the API free of implementation details that might inhibit future evolution, do not overspecify - Fail fast: the sooner the web developer is informed of an issue, the better In general, always consider the security and privacy implications as documented in [[security-privacy-questionnaire]] by the Technical Architecture Group and the Privacy Interest Group when adding new features. +
+ Privacy Considerations {#privacy} =================================== @@ -665,7 +672,7 @@ The WebNN API defines two developer-settable preferences to help inform [[#progr Issue(623): {{MLContextOptions}} is under active development, and the design is expected to change, informed by further implementation experience and new use cases from the wider web community. -If a future version of this specification introduces support for a new {{MLDeviceType}} that can only support a subset of {{MLOperandDataType}}s, that may introduce a new fingerprint. +If a future version of this specification introduces support for a new {{MLDeviceType}} that can only support a subset of {{MLOperandDataType}}s, that could introduce a new fingerprint. In general, implementers of this API are expected to apply WebGPU Privacy Considerations to their implementations where applicable. @@ -951,7 +958,7 @@ Schedules the computational workload of a compiled {{MLGraph}} on the {{MLContex **Returns:** {{undefined}}. -Note: `dispatch()` itself provides no signal that graph execution has completed. Rather, callers should await the results of reading back the output tensors. See [[#api-mlcontext-dispatch-examples]] below. +Note: `dispatch()` itself provides no signal that graph execution has completed. Rather, callers can `await` the results of reading back the output tensors. See [[#api-mlcontext-dispatch-examples]] below.
@@ -1108,7 +1115,7 @@ Bring-your-own-buffer variant of {{MLContext/readTensor(tensor)}}. Reads back th 1. Otherwise, [=queue an ML task=] with |global| and the following steps: 1. If |outputData| is [=BufferSource/detached=], [=reject=] |promise| with a {{TypeError}}, and abort these steps. - Note: [=Validating buffer with descriptor=] above will fail if |outputData| is detached, but it's possible |outputData| may detach between then and now. + Note: [=Validating buffer with descriptor=] above will fail if |outputData| is detached, but |outputData| could still detach between then and now. 1. [=ArrayBuffer/Write=] |bytes| to |outputData|. 1. [=Resolve=] |promise| with {{undefined}}. @@ -1145,7 +1152,7 @@ Writes data to the {{MLTensor/[[data]]}} of an {{MLTensor}} on the {{MLContext}} 1. Return {{undefined}}.
-Note: Similar to `dispatch()`, `writeTensor()` itself provides no signal that the write has completed. To inspect the contents of a tensor, callers should await the results of reading back the tensor. +Note: Similar to `dispatch()`, `writeTensor()` itself provides no signal that the write has completed. To inspect the contents of a tensor, callers can `await` the results of reading back the tensor. ### {{MLContext/opSupportLimits()}} ### {#api-mlcontext-opsupportlimits} The {{MLContext/opSupportLimits()}} exposes level of support that differs across implementations at operator level. Consumers of the WebNN API are encouraged to probe feature support level by using {{MLContext/opSupportLimits()}} to determine the optimal model architecture to be deployed for each target platform. @@ -1325,7 +1332,7 @@ Issue(391): Should 0-size dimensions be supported? An {{MLOperand}} represents an intermediary graph being constructed as a result of compositing parts of an operation into a fully composed operation. -For instance, an {{MLOperand}} may represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also [[#programming-model]]. +For instance, an {{MLOperand}} can represent a constant feeding to an operation or the result from combining multiple constants together into an operation. See also [[#programming-model]].
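The `dispatch()`/`readTensor()` contract the revised notes describe can be sketched as follows. This is an illustrative helper, not part of the WebNN API: `runGraph` is a hypothetical name, and the `context`, `graph`, and tensor objects are assumed to have been obtained via `navigator.ml.createContext()` and `MLGraphBuilder.build()` as in the spec's other examples.

```javascript
// Hypothetical helper sketching the contract described in the notes above:
// dispatch() and writeTensor() provide no completion signal, so callers
// learn that execution finished by awaiting readTensor() on the outputs.
async function runGraph(context, graph, inputs, outputs) {
  // Queue the computational workload; this call returns immediately.
  context.dispatch(graph, inputs, outputs);

  // Reading back an output tensor resolves only once the dispatched
  // work that produces it has completed.
  const results = {};
  for (const [name, tensor] of Object.entries(outputs)) {
    results[name] = await context.readTensor(tensor);
  }
  return results; // maps output name -> ArrayBuffer of tensor contents
}
```

Note that `await`ing the read, rather than a (nonexistent) completion promise from `dispatch()`, is what this patch's note changes emphasize.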