[WebNN EP] Cache MLTensors between runs
### Description
This change enables caching `MLTensor`s between inference runs. This is done by keeping references to `MLTensor`s alive after they have been released. `MLTensor`s are only destroyed once the session goes out of scope.

### Motivation and Context
Creating and destroying `MLTensor`s on every run carries a non-trivial performance penalty. This penalty materializes when using `ort.Tensor`s with `location=cpu` for inputs/outputs, or when using the CPU EP as a fallback EP for unsupported operators. The former could be mitigated by developers using `ort.Tensor`s with `location=ml-tensor`; the latter cannot be mitigated by developers.
egalli committed Sep 30, 2024
1 parent d069475 commit 240d2b2
Showing 2 changed files with 166 additions and 128 deletions.
2 changes: 1 addition & 1 deletion js/web/lib/wasm/jsep/backend-webnn.ts
```diff
@@ -91,12 +91,12 @@ export class WebNNBackend {
       // Current session is not a WebNN session.
       return;
     }
-    this.tensorManager.releaseTensorsForSession(sessionId);
     this.mlContextBySessionId.delete(sessionId);
     const sessionIds = this.sessionIdsByMLContext.get(mlContext)!;
     sessionIds.delete(sessionId);
     if (sessionIds.size === 0) {
       this.sessionIdsByMLContext.delete(mlContext);
+      this.tensorManager.releaseTensorsForContext(mlContext);
     }
   }
```
