1.17.3 cherry-picks for ORT Web changes #19926

Merged on Mar 29, 2024 (51 commits)
a24273e
[js/webgpu] Add HardSigmoid support (#19215)
qjia7 Jan 22, 2024
c44d497
[js/webgpu] set query type in onRunStart (#19202)
qjia7 Jan 23, 2024
254b543
[js/webgpu] Add FusedConv clip test case (#18900)
axinging Jan 23, 2024
7282e23
[JS/WebGPU] Added Uniforms to SkipLayerNorm. (#18788)
satyajandhyala Jan 24, 2024
6bd8586
[js/webgpu] Fix issue of timestamp query (#19258)
Jan 24, 2024
4eafe73
[WebNN EP] Support WebNN async API with Asyncify (#19145)
Honry Jan 24, 2024
b03a1c5
[js/webgpu] Fix Tanh explosion (#19201)
hujiajie Jan 25, 2024
f02accb
[js/webgpu] Support uniforms for conv, conv transpose, conv grouped (…
axinging Jan 25, 2024
3abc3db
[js/webgpu] Support f16 uniform (#19098)
axinging Jan 26, 2024
5ef244a
fix f16 for attention, enable slice and flatten for more types (#19262)
guschmue Jan 29, 2024
55cede9
[js/webgpu] Remove enableShapesUniforms (#19279)
axinging Jan 30, 2024
c61a8e5
[js/webgpu] Add hardSigmoid activation for fusedConv (#19233)
qjia7 Jan 31, 2024
43b95b0
[js/webgpu] Support capture and replay for jsep (#18989)
qjia7 Jan 31, 2024
d2db872
[js/webgpu] Use DataType as uniform cpu type (#19281)
axinging Jan 31, 2024
e305794
[js/webgpu] resolve codescan alert (#19343)
fs-eire Jan 31, 2024
fcefc67
[js/webgpu] Refactor createTensorShapeVariables (#18883)
axinging Feb 2, 2024
bff2f5b
[js/webgpu] Fix the undefined push error (#19366)
qjia7 Feb 2, 2024
8cf59d2
[js/webgpu] Add LeakyRelu activation for fusedConv (#19369)
qjia7 Feb 2, 2024
257cf5e
[js/webgpu] support customop FastGelu (#19392)
fs-eire Feb 6, 2024
2eb5f3b
[js/webgpu] allow uint8 tensors for webgpu (#19545)
fs-eire Feb 17, 2024
0f97b5b
[JS/WebGPU] Add MatMulNBits (#19446)
satyajandhyala Feb 17, 2024
ed9d178
[js/webgpu] Create Split indices helpers by rank, not by shape (#19554)
hujiajie Feb 20, 2024
9641002
[js] small fix to workaround formatter (#19400)
fs-eire Feb 21, 2024
a750c39
[js/common] upgrade tsc in common from 4.9.5 to 5.2.2 (#19317)
fs-eire Feb 21, 2024
a092546
[js] changes to allow Float16Array if any polyfill is available (#19305)
fs-eire Feb 21, 2024
4d0a685
[js/web] Fix fused-conv is not included in npm test (#19581)
axinging Feb 21, 2024
fb9d285
Misspelling in README.md (#19433)
martholomew Feb 21, 2024
c8e7e8b
Bump ip from 1.1.8 to 1.1.9 in /js/react_native (#19582)
dependabot[bot] Feb 21, 2024
12101de
[js/webgpu] Fix Conv2DTransposeMatMul f16 compilation failure (#19596)
axinging Feb 22, 2024
bb51cea
Bump ip from 1.1.8 to 1.1.9 in /js/react_native/e2e (#19583)
dependabot[bot] Feb 22, 2024
6768326
[node] Switch to setImmediate to avoid starving the Node.js event loo…
segevfiner Feb 23, 2024
846f445
[JS/WebGPU] Fix Split and Where to handle corner cases. (#19613)
satyajandhyala Feb 23, 2024
0eff983
[js/webgpu] allows a ProgramInfo's RunData to use zero sized output (…
fs-eire Feb 23, 2024
16c5546
[js/webgpu] minor fixes to make tinyllama work (#19564)
guschmue Feb 23, 2024
d6a7bd6
[js/web] fix suite test list for zero sized tensor (#19638)
fs-eire Feb 24, 2024
d61ef6f
[js/common] move 'env.wasm.trace' to 'env.trace' (#19617)
fs-eire Feb 27, 2024
ffe1583
[js/webgpu] use Headless for webgpu test by default (#19702)
fs-eire Feb 29, 2024
ce612c7
[js/web] transfer input buffer back to caller thread (#19677)
fs-eire Mar 1, 2024
4ccd620
[JS/WebGPU] Preserve zero size input tensor dims. (#19737)
satyajandhyala Mar 8, 2024
18027be
[js/webgpu] expose a few properties in WebGPU API (#19857)
fs-eire Mar 13, 2024
b54dd28
[js/webgpu] Enable GroupedConvVectorize path (#19791)
Mar 13, 2024
1df9911
[JS/WebGPU] Optimize MatMulNBits (#19852)
satyajandhyala Mar 13, 2024
648cc42
[js/web] rewrite backend resolve to allow multiple EPs (#19735)
fs-eire Mar 15, 2024
0208b1e
Fix #19931 broken Get Started link of "ONNX Runtime JavaScript API" p…
ibelem Mar 16, 2024
0ece4ef
[js/common] fix typedoc warnings (#19933)
fs-eire Mar 16, 2024
8f03331
Bump follow-redirects from 1.15.4 to 1.15.6 in /js/web (#19949)
dependabot[bot] Mar 17, 2024
bdb3b01
Bump follow-redirects from 1.15.4 to 1.15.6 in /js/node (#19951)
dependabot[bot] Mar 17, 2024
b2c48b9
accumulate in fp32 for Reduce* (#19868)
guschmue Mar 18, 2024
333efa0
[js/webgpu] Fix NAN caused by un-initialized buffer in instance-norm …
axinging Mar 19, 2024
53377a3
[js/webgpu] allow setting env.webgpu.adapter (#19940)
fs-eire Mar 19, 2024
310c099
[js/webgpu] fix maxpool / fp16 (#19981)
guschmue Mar 19, 2024
121 changes: 90 additions & 31 deletions js/common/lib/backend-impl.ts
@@ -2,6 +2,7 @@
// Licensed under the MIT License.

import {Backend} from './backend.js';
import {InferenceSession} from './inference-session.js';

interface BackendInfo {
backend: Backend;
@@ -10,6 +11,7 @@ interface BackendInfo {
initPromise?: Promise<void>;
initialized?: boolean;
aborted?: boolean;
error?: string;
}

const backends: Map<string, BackendInfo> = new Map();
@@ -60,43 +62,100 @@ export const registerBackend = (name: string, backend: Backend, priority: number
};

/**
* Resolve backend by specified hints.
* Try to resolve and initialize a backend.
*
* @param backendHints - a list of execution provider names to lookup. If omitted use registered backends as list.
* @returns a promise that resolves to the backend.
* @param backendName - the name of the backend.
* @returns the backend instance if resolved and initialized successfully, or an error message if failed.
*/
const tryResolveAndInitializeBackend = async(backendName: string): Promise<Backend|string> => {
const backendInfo = backends.get(backendName);
if (!backendInfo) {
return 'backend not found.';
}

if (backendInfo.initialized) {
return backendInfo.backend;
} else if (backendInfo.aborted) {
return backendInfo.error!;
} else {
const isInitializing = !!backendInfo.initPromise;
try {
if (!isInitializing) {
backendInfo.initPromise = backendInfo.backend.init(backendName);
}
await backendInfo.initPromise;
backendInfo.initialized = true;
return backendInfo.backend;
} catch (e) {
if (!isInitializing) {
backendInfo.error = `${e}`;
backendInfo.aborted = true;
}
return backendInfo.error!;
} finally {
delete backendInfo.initPromise;
}
}
};

/**
* Resolve execution providers from the specific session options.
*
* @param options - the session options object.
* @returns a promise that resolves to a tuple of an initialized backend instance and a session options object with
* filtered EP list.
*
* @ignore
*/
export const resolveBackend = async(backendHints: readonly string[]): Promise<Backend> => {
const backendNames = backendHints.length === 0 ? backendsSortedByPriority : backendHints;
const errors = [];
for (const backendName of backendNames) {
const backendInfo = backends.get(backendName);
if (backendInfo) {
if (backendInfo.initialized) {
return backendInfo.backend;
} else if (backendInfo.aborted) {
continue; // current backend is unavailable; try next
}
export const resolveBackendAndExecutionProviders = async(options: InferenceSession.SessionOptions):
Promise<[backend: Backend, options: InferenceSession.SessionOptions]> => {
// extract backend hints from session options
const eps = options.executionProviders || [];
const backendHints = eps.map(i => typeof i === 'string' ? i : i.name);
const backendNames = backendHints.length === 0 ? backendsSortedByPriority : backendHints;

const isInitializing = !!backendInfo.initPromise;
try {
if (!isInitializing) {
backendInfo.initPromise = backendInfo.backend.init(backendName);
// try to resolve and initialize all requested backends
let backend: Backend|undefined;
const errors = [];
const availableBackendNames = new Set<string>();
for (const backendName of backendNames) {
const resolveResult = await tryResolveAndInitializeBackend(backendName);
if (typeof resolveResult === 'string') {
errors.push({name: backendName, err: resolveResult});
} else {
if (!backend) {
backend = resolveResult;
}
if (backend === resolveResult) {
availableBackendNames.add(backendName);
}
}
await backendInfo.initPromise;
backendInfo.initialized = true;
return backendInfo.backend;
} catch (e) {
if (!isInitializing) {
errors.push({name: backendName, err: e});
}

// if no backend is available, throw error.
if (!backend) {
throw new Error(`no available backend found. ERR: ${errors.map(e => `[${e.name}] ${e.err}`).join(', ')}`);
}

// for each explicitly requested backend, if it's not available, output warning message.
for (const {name, err} of errors) {
if (backendHints.includes(name)) {
// eslint-disable-next-line no-console
console.warn(`removing requested execution provider "${
name}" from session options because it is not available: ${err}`);
}
backendInfo.aborted = true;
} finally {
delete backendInfo.initPromise;
}
}
}

throw new Error(`no available backend found. ERR: ${errors.map(e => `[${e.name}] ${e.err}`).join(', ')}`);
};
const filteredEps = eps.filter(i => availableBackendNames.has(typeof i === 'string' ? i : i.name));

return [
backend, new Proxy(options, {
get: (target, prop) => {
if (prop === 'executionProviders') {
return filteredEps;
}
return Reflect.get(target, prop);
}
})
];
};
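The new `resolveBackendAndExecutionProviders` returns the original options wrapped in a `Proxy` so that only `executionProviders` is overridden while every other property reads through. A minimal sketch of that pattern follows. `SessionOptionsLike`, `EpConfig`, `epName`, and `withFilteredEps` are hypothetical, simplified stand-ins and not part of the onnxruntime API.

```typescript
// Hypothetical, simplified stand-ins for the real session option types.
type EpConfig = string | { readonly name: string };

interface SessionOptionsLike {
  executionProviders?: readonly EpConfig[];
  logSeverityLevel?: number;
}

// Normalize an EP entry (string or object form) to its name.
const epName = (ep: EpConfig): string => (typeof ep === 'string' ? ep : ep.name);

// Return a read-through view of `options` whose `executionProviders`
// only lists EPs contained in `available`. The original object is not mutated.
function withFilteredEps(options: SessionOptionsLike, available: ReadonlySet<string>): SessionOptionsLike {
  const filtered = (options.executionProviders ?? []).filter(ep => available.has(epName(ep)));
  return new Proxy(options, {
    get: (target, prop) => (prop === 'executionProviders' ? filtered : Reflect.get(target, prop)),
  });
}

// Example: 'webgpu' was requested but is assumed to be unavailable here.
const original: SessionOptionsLike = {
  executionProviders: ['webgpu', { name: 'wasm' }],
  logSeverityLevel: 2,
};
const view = withFilteredEps(original, new Set(['wasm']));
```

Wrapping instead of copying keeps the caller's options object untouched, and any property other than `executionProviders` still reflects later changes to the original.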
6 changes: 3 additions & 3 deletions js/common/lib/backend.ts
@@ -58,7 +58,7 @@ export interface TrainingSessionHandler extends SessionHandler {
options: InferenceSession.RunOptions): Promise<SessionHandler.ReturnType>;

getParametersSize(trainableOnly: boolean): Promise<number>;
loadParametersBuffer(array: Uint8Array, trainableOnly: boolean): Promise<void>;
loadParametersBuffer(buffer: Uint8Array, trainableOnly: boolean): Promise<void>;
getContiguousParameters(trainableOnly: boolean): Promise<OnnxValue>;
}

@@ -77,8 +77,8 @@ export interface Backend {
Promise<InferenceSessionHandler>;

createTrainingSessionHandler?
(checkpointStateUriOrBuffer: TrainingSession.URIorBuffer, trainModelUriOrBuffer: TrainingSession.URIorBuffer,
evalModelUriOrBuffer: TrainingSession.URIorBuffer, optimizerModelUriOrBuffer: TrainingSession.URIorBuffer,
(checkpointStateUriOrBuffer: TrainingSession.UriOrBuffer, trainModelUriOrBuffer: TrainingSession.UriOrBuffer,
evalModelUriOrBuffer: TrainingSession.UriOrBuffer, optimizerModelUriOrBuffer: TrainingSession.UriOrBuffer,
options: InferenceSession.SessionOptions): Promise<TrainingSessionHandler>;
}

50 changes: 49 additions & 1 deletion js/common/lib/env.ts
@@ -36,6 +36,7 @@ export declare namespace Env {
/**
* set or get a boolean value indicating whether to enable trace.
*
* @deprecated Use `env.trace` instead. If `env.trace` is set, this property will be ignored.
* @defaultValue `false`
*/
trace?: boolean;
@@ -142,13 +143,52 @@ export declare namespace Env {
*/
ondata?: (data: WebGpuProfilingData) => void;
};
/**
* Set or get the power preference.
*
* Setting this property only has effect before the first WebGPU inference session is created. The value will be
* used as options for `navigator.gpu.requestAdapter()`.
*
* See {@link https://gpuweb.github.io/gpuweb/#dictdef-gpurequestadapteroptions} for more details.
*
* @defaultValue `undefined`
*/
powerPreference?: 'low-power'|'high-performance';
/**
* Set or get the force fallback adapter flag.
*
* Setting this property only has effect before the first WebGPU inference session is created. The value will be
* used as options for `navigator.gpu.requestAdapter()`.
*
* See {@link https://gpuweb.github.io/gpuweb/#dictdef-gpurequestadapteroptions} for more details.
*
* @defaultValue `undefined`
*/
forceFallbackAdapter?: boolean;
/**
* Set or get the adapter for WebGPU.
*
* Setting this property only has effect before the first WebGPU inference session is created. The value will be
* used as the GPU adapter for the underlying WebGPU backend to create GPU device.
*
* If this property is not set, it will be available to get after the first WebGPU inference session is created. The
* value will be the GPU adapter that created by the underlying WebGPU backend.
*
* When use with TypeScript, the type of this property is `GPUAdapter` defined in "@webgpu/types".
* Use `const adapter = env.webgpu.adapter as GPUAdapter;` in TypeScript to access this property with correct type.
*
* see comments on {@link Tensor.GpuBufferType}
*/
adapter: unknown;
/**
* Get the device for WebGPU.
*
* This property is only available after the first WebGPU inference session is created.
*
* When use with TypeScript, the type of this property is `GPUDevice` defined in "@webgpu/types".
* Use `const device = env.webgpu.device as GPUDevice;` in TypeScript to access this property with correct type.
*
* see comments on {@link GpuBufferType} for more details about why not use types defined in "@webgpu/types".
* see comments on {@link Tensor.GpuBufferType} for more details about why not use types defined in "@webgpu/types".
*/
readonly device: unknown;
/**
@@ -167,13 +207,21 @@ export interface Env {
* @defaultValue `'warning'`
*/
logLevel?: 'verbose'|'info'|'warning'|'error'|'fatal';

/**
* Indicate whether run in debug mode.
*
* @defaultValue `false`
*/
debug?: boolean;

/**
* set or get a boolean value indicating whether to enable trace.
*
* @defaultValue `false`
*/
trace?: boolean;

/**
* Get version of the current package.
*/
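The doc comments in this file describe two behaviors: `env.trace` takes precedence over the deprecated `env.wasm.trace`, and `powerPreference` / `forceFallbackAdapter` are passed as options to `navigator.gpu.requestAdapter()`. The sketch below illustrates both using local stand-in types. `EnvLike`, `WebGpuFlagsLike`, `isTraceEnabled`, and `toRequestAdapterOptions` are hypothetical names for illustration, and the exact way the backend builds the adapter options is an assumption.

```typescript
// Hypothetical stand-ins that mirror only the properties discussed in the diff.
interface WebGpuFlagsLike {
  powerPreference?: 'low-power' | 'high-performance';
  forceFallbackAdapter?: boolean;
}

interface EnvLike {
  trace?: boolean;
  wasm: { trace?: boolean };
  webgpu: WebGpuFlagsLike;
}

// `env.trace` wins when it is set; otherwise fall back to the deprecated flag,
// then to the documented default of `false`.
function isTraceEnabled(env: EnvLike): boolean {
  return env.trace ?? env.wasm.trace ?? false;
}

// Assumed mapping: copy only the flags that were explicitly set into the
// options object that would be handed to `requestAdapter()`.
function toRequestAdapterOptions(flags: WebGpuFlagsLike): WebGpuFlagsLike {
  const options: WebGpuFlagsLike = {};
  if (flags.powerPreference !== undefined) {
    options.powerPreference = flags.powerPreference;
  }
  if (flags.forceFallbackAdapter !== undefined) {
    options.forceFallbackAdapter = flags.forceFallbackAdapter;
  }
  return options;
}

const sampleEnv: EnvLike = {
  trace: false,
  wasm: { trace: true }, // ignored because `trace` is set at the top level
  webgpu: { powerPreference: 'high-performance' },
};
const traceEnabled = isTraceEnabled(sampleEnv);
const adapterOptions = toRequestAdapterOptions(sampleEnv.webgpu);
```

Per the doc comments, these WebGPU flags only take effect when set before the first WebGPU inference session is created.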
5 changes: 4 additions & 1 deletion js/common/lib/index.ts
@@ -11,7 +11,7 @@
* - [onnxruntime-react-native](https://www.npmjs.com/package/onnxruntime-react-native)
*
* See also:
* - [Get Started](https://onnxruntime.ai/docs/get-started/with-javascript.html)
* - [Get Started](https://onnxruntime.ai/docs/get-started/with-javascript/)
* - [Inference examples](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/js)
*
* @packageDocumentation
@@ -21,6 +21,9 @@ export * from './backend.js';
export * from './env.js';
export * from './inference-session.js';
export * from './tensor.js';
export * from './tensor-conversion.js';
export * from './tensor-factory.js';
export * from './trace.js';
export * from './onnx-model.js';
export * from './onnx-value.js';
export * from './training-session.js';
10 changes: 4 additions & 6 deletions js/common/lib/inference-session-impl.ts
@@ -1,7 +1,7 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

import {resolveBackend} from './backend-impl.js';
import {resolveBackendAndExecutionProviders} from './backend-impl.js';
import {InferenceSessionHandler} from './backend.js';
import {InferenceSession as InferenceSessionInterface} from './inference-session.js';
import {OnnxValue} from './onnx-value.js';
@@ -195,11 +195,9 @@ export class InferenceSession implements InferenceSessionInterface {
throw new TypeError('Unexpected argument[0]: must be \'path\' or \'buffer\'.');
}

// get backend hints
const eps = options.executionProviders || [];
const backendHints = eps.map(i => typeof i === 'string' ? i : i.name);
const backend = await resolveBackend(backendHints);
const handler = await backend.createInferenceSessionHandler(filePathOrUint8Array, options);
// resolve backend, update session options with validated EPs, and create session handler
const [backend, optionsWithValidatedEPs] = await resolveBackendAndExecutionProviders(options);
const handler = await backend.createInferenceSessionHandler(filePathOrUint8Array, optionsWithValidatedEPs);
TRACE_FUNC_END();
return new InferenceSession(handler);
}
51 changes: 42 additions & 9 deletions js/common/lib/inference-session.ts
@@ -111,7 +111,7 @@ export declare namespace InferenceSession {
optimizedModelFilePath?: string;

/**
* Wether enable profiling.
* Whether enable profiling.
*
* This setting is a placeholder for a future use.
*/
@@ -154,6 +154,12 @@ export declare namespace InferenceSession {
*/
preferredOutputLocation?: OnnxValueDataLocation|{readonly [outputName: string]: OnnxValueDataLocation};

/**
* Whether enable graph capture.
* This setting is available only in ONNXRuntime Web for WebGPU EP.
*/
enableGraphCapture?: boolean;

/**
* Store configurations for a session. See
* https://github.com/microsoft/onnxruntime/blob/main/include/onnxruntime/core/session/
@@ -180,22 +186,22 @@ export declare namespace InferenceSession {
// #region execution providers

// Currently, we have the following backends to support execution providers:
// Backend Node.js binding: supports 'cpu' and 'cuda'.
// Backend Node.js binding: supports 'cpu', 'dml' (win32), 'coreml' (macOS) and 'cuda' (linux).
// Backend WebAssembly: supports 'cpu', 'wasm', 'webgpu' and 'webnn'.
// Backend ONNX.js: supports 'webgl'.
// Backend React Native: supports 'cpu', 'xnnpack', 'coreml' (iOS), 'nnapi' (Android).
interface ExecutionProviderOptionMap {
coreml: CoreMLExecutionProviderOption;
cpu: CpuExecutionProviderOption;
coreml: CoreMlExecutionProviderOption;
cuda: CudaExecutionProviderOption;
dml: DmlExecutionProviderOption;
nnapi: NnapiExecutionProviderOption;
tensorrt: TensorRtExecutionProviderOption;
wasm: WebAssemblyExecutionProviderOption;
webgl: WebGLExecutionProviderOption;
xnnpack: XnnpackExecutionProviderOption;
webgpu: WebGpuExecutionProviderOption;
webnn: WebNNExecutionProviderOption;
nnapi: NnapiExecutionProviderOption;
xnnpack: XnnpackExecutionProviderOption;
}

type ExecutionProviderName = keyof ExecutionProviderOptionMap;
@@ -213,10 +219,6 @@ export declare namespace InferenceSession {
readonly name: 'cuda';
deviceId?: number;
}
export interface CoreMlExecutionProviderOption extends ExecutionProviderOption {
readonly name: 'coreml';
coreMlFlags?: number;
}
export interface DmlExecutionProviderOption extends ExecutionProviderOption {
readonly name: 'dml';
deviceId?: number;
@@ -247,8 +249,39 @@ export declare namespace InferenceSession {
}
export interface CoreMLExecutionProviderOption extends ExecutionProviderOption {
readonly name: 'coreml';
/**
* The bit flags for CoreML execution provider.
*
* ```
* COREML_FLAG_USE_CPU_ONLY = 0x001
* COREML_FLAG_ENABLE_ON_SUBGRAPH = 0x002
* COREML_FLAG_ONLY_ENABLE_DEVICE_WITH_ANE = 0x004
* COREML_FLAG_ONLY_ALLOW_STATIC_INPUT_SHAPES = 0x008
* COREML_FLAG_CREATE_MLPROGRAM = 0x010
* ```
*
* See include/onnxruntime/core/providers/coreml/coreml_provider_factory.h for more details.
*
* This flag is available only in ONNXRuntime (Node.js binding).
*/
coreMlFlags?: number;
/**
* Specify whether to use CPU only in CoreML EP.
*
* This setting is available only in ONNXRuntime (react-native).
*/
useCPUOnly?: boolean;
/**
* Specify whether to enable CoreML EP on subgraph.
*
* This setting is available only in ONNXRuntime (react-native).
*/
enableOnSubgraph?: boolean;
/**
* Specify whether to only enable CoreML EP for Apple devices with ANE (Apple Neural Engine).
*
* This setting is available only in ONNXRuntime (react-native).
*/
onlyEnableDeviceWithANE?: boolean;
}
export interface NnapiExecutionProviderOption extends ExecutionProviderOption {
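The `coreMlFlags` option documented above is a bit field, so several flags are combined with bitwise OR. A minimal sketch follows, using the flag values listed in the doc comment. The constant declarations and the `hasFlag` helper are local to this example (the diff does not show the package exporting them), and the options object is a plain literal, not a call into the library.

```typescript
// Flag values copied from the doc comment in the diff; declared locally for illustration.
const COREML_FLAG_USE_CPU_ONLY = 0x001;
const COREML_FLAG_ENABLE_ON_SUBGRAPH = 0x002;
const COREML_FLAG_ONLY_ENABLE_DEVICE_WITH_ANE = 0x004;
const COREML_FLAG_ONLY_ALLOW_STATIC_INPUT_SHAPES = 0x008;
const COREML_FLAG_CREATE_MLPROGRAM = 0x010;

// Combine flags with bitwise OR.
const coreMlFlags = COREML_FLAG_ENABLE_ON_SUBGRAPH | COREML_FLAG_CREATE_MLPROGRAM;

// Hypothetical helper: test whether a single flag is set in a combined value.
const hasFlag = (flags: number, flag: number): boolean => (flags & flag) === flag;

// Shape of the session options entry for the Node.js binding, per the doc comment.
const sessionOptions = {
  executionProviders: [{ name: 'coreml' as const, coreMlFlags }],
};
```

According to the doc comments, `coreMlFlags` applies to the Node.js binding, while the boolean fields `useCPUOnly`, `enableOnSubgraph`, and `onlyEnableDeviceWithANE` apply to react-native.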