[WebNN EP] Cache MLTensors between runs (#22278)
### Description
This change enables caching `MLTensor`s between inference runs. This is
done by keeping a reference to `MLTensor`s alive after they have been
released. `MLTensor`s are only destroyed once the session goes out of
scope.

### Motivation and Context
Creating and destroying `MLTensor`s on every run carries a non-trivial
performance penalty. This penalty shows up when using `ort.Tensor`s with
`location=cpu` for inputs/outputs, or when using the CPU EP as a
fallback EP for unsupported operators. The former can be mitigated by
developers using `ort.Tensor`s with `location=ml-tensor`; the latter
cannot be mitigated by developers.
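The caching idea described above can be sketched as a simple free pool: released tensors are parked for reuse instead of destroyed, and only torn down when the owner goes out of scope. This is a minimal illustrative sketch, not ONNX Runtime's actual implementation; `FakeMLTensor` and `TensorCache` are hypothetical names standing in for the real `MLTensor` and tensor-manager types.

```typescript
// Hypothetical stand-in for a WebNN MLTensor (illustration only).
interface FakeMLTensor {
  byteLength: number;
  destroyed: boolean;
}

// Sketch of the caching strategy: keep released tensors alive in a free
// pool so the next run can reuse them instead of reallocating.
class TensorCache {
  private freePool: FakeMLTensor[] = [];

  // Acquire a tensor, reusing a pooled one of matching size if available.
  acquire(byteLength: number): FakeMLTensor {
    const i = this.freePool.findIndex((t) => t.byteLength === byteLength);
    if (i !== -1) {
      return this.freePool.splice(i, 1)[0]; // cache hit: no allocation
    }
    return { byteLength, destroyed: false }; // cache miss: "allocate"
  }

  // Release after a run: park the tensor for future reuse, do not destroy.
  release(tensor: FakeMLTensor): void {
    this.freePool.push(tensor);
  }

  // Destroy all cached tensors once the owning session/context goes away.
  destroyAll(): void {
    for (const t of this.freePool) {
      t.destroyed = true;
    }
    this.freePool.length = 0;
  }
}
```

Reusing a pooled tensor avoids the allocate/destroy round trip on every inference run, which is the performance penalty this commit targets.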
egalli authored Oct 18, 2024
1 parent b4cb937 commit 1e5bda8
Showing 2 changed files with 166 additions and 128 deletions.
2 changes: 1 addition & 1 deletion js/web/lib/wasm/jsep/backend-webnn.ts
@@ -91,12 +91,12 @@ export class WebNNBackend {
       // Current session is not a WebNN session.
       return;
     }
-    this.tensorManager.releaseTensorsForSession(sessionId);
     this.mlContextBySessionId.delete(sessionId);
     const sessionIds = this.sessionIdsByMLContext.get(mlContext)!;
     sessionIds.delete(sessionId);
     if (sessionIds.size === 0) {
       this.sessionIdsByMLContext.delete(mlContext);
+      this.tensorManager.releaseTensorsForContext(mlContext);
     }
   }

