[js/webgpu] Fix attention shader recompilation issue (#21770)
### Description

This PR fixes the `AttentionProbsSoftmax` shader recompilation issue that occurs when executing the phi3 model. Avoiding the recompilation further improves phi3 performance.
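The recompilation can be illustrated with a small standalone sketch (hypothetical code, not the actual ONNX Runtime cache): before this fix, the workgroup size `WG` was derived from the runtime dimension via `Math.ceil(dComp / 8)`, so the cache hint `` `${WG};${dataType};${components}` `` kept changing as the sequence length grew during decoding, and each distinct hint triggered a fresh shader compilation.

```typescript
// Hypothetical standalone sketch (not the real ONNX Runtime cache): the old
// workgroup-size formula folds the runtime dimension into the cache hint,
// while the fixed formula keeps the hint stable across sequence lengths.

// Old logic: WG depends on dComp = d / components via Math.ceil(dComp / 8).
const oldWG = (dComp: number): number => {
  let WG = 64;
  if (dComp < WG) {
    WG = 1;
  } else if (dComp / 8 < 64) {
    WG = Math.ceil(dComp / 8);
  }
  return WG;
};

// Fixed logic: WG is one of two constants, independent of the exact size.
const newWG = (dComp: number): number => {
  let WG = 64;
  if (dComp < WG) {
    WG = 32;
  }
  return WG;
};

// Simulate a decode loop where the softmax row length grows by one per token.
const dComps = Array.from({ length: 400 }, (_, i) => 65 + i);
const oldHints = new Set(dComps.map((c) => `${oldWG(c)};f16;4`));
const newHints = new Set(dComps.map((c) => `${newWG(c)};f16;4`));

// Each distinct hint means a fresh shader compilation.
console.log(oldHints.size); // 50 distinct hints -> 50 compilations
console.log(newHints.size); // 1 hint -> shader compiled once
```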

### Motivation and Context
The softmax program's shader cache hint included a workgroup size derived from the runtime dimension `d`, so growing input shapes during decoding produced new hints and forced a shader recompilation. Clamping `WG` to a constant and declaring the input dependency as `'type'` keeps the cache key stable across shapes.
qjia7 authored Aug 18, 2024
1 parent 49fc168 commit c4ade79
Showing 1 changed file with 3 additions and 5 deletions.
8 changes: 3 additions & 5 deletions js/web/lib/wasm/jsep/webgpu/ops/attention.ts
```diff
@@ -262,9 +262,7 @@ const createInPlaceSoftmaxProgramInfo = (_context: ComputeContext, input: Tensor
   let WG = 64;
   const dComp = d / components;
   if (dComp < WG) {
-    WG = 1;
-  } else if (dComp / 8 < 64) {
-    WG = Math.ceil(dComp / 8);
+    WG = 32;
   }
   const elementsPerThread = Math.ceil(d / components / WG);
   const programUniforms: ProgramUniform[] = [
@@ -274,7 +272,7 @@ const createInPlaceSoftmaxProgramInfo = (_context: ComputeContext, input: Tensor
   ];
   const dataType = tensorTypeToWsglStorageType(input.dataType, components);
   const f32Type = tensorTypeToWsglValueType(DataType.float, components);
-
+  const inputDependencies: ProgramInputTensorInfoDependency[] = ['type'];
   const getShaderSource = (shaderHelper: ShaderHelper) => {
     const inputHelper = outputVariable('x', input.dataType, input.dims, components);
     const elemValueType = tensorTypeToWsglValueType(input.dataType);
@@ -353,7 +351,7 @@ const createInPlaceSoftmaxProgramInfo = (_context: ComputeContext, input: Tensor
 
   return {
     name: 'AttentionProbsSoftmax',
-    shaderCache: { hint: `${WG};${dataType};${components}` },
+    shaderCache: { hint: `${WG};${dataType};${components}`, inputDependencies },
     getShaderSource,
     getRunData: () => ({ outputs: [], dispatchGroup: { x: n }, programUniforms }),
   };
```
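A hedged sketch of the mechanism behind `inputDependencies: ['type']` (the `cacheKey` helper and type names below are illustrative assumptions, not ONNX Runtime's actual API): when only an input's data type feeds the cache key, a shape change no longer invalidates the compiled shader, because the shape reaches the shader through `programUniforms` instead.

```typescript
// Hypothetical sketch of input-dependency keying (names are illustrative
// assumptions, not ONNX Runtime's actual API): with 'type', only the input's
// data type contributes to the cache key; the shape travels via uniforms.
type Dep = 'type' | 'dims';
interface TensorInfo { dataType: string; dims: number[]; }

const cacheKey = (hint: string, inputs: TensorInfo[], deps: Dep[]): string =>
  hint + '|' + inputs
    .map((t, i) => (deps[i] === 'dims' ? `${t.dataType}:${t.dims.join('x')}` : t.dataType))
    .join(',');

// Two decode steps: same dtype, different sequence length.
const step1: TensorInfo = { dataType: 'float16', dims: [1, 32, 64, 64] };
const step2: TensorInfo = { dataType: 'float16', dims: [1, 32, 65, 65] };

const hitWithDims = cacheKey('64;f16;4', [step1], ['dims']) === cacheKey('64;f16;4', [step2], ['dims']);
const hitWithType = cacheKey('64;f16;4', [step1], ['type']) === cacheKey('64;f16;4', [step2], ['type']);
console.log(hitWithDims, hitWithType); // false true
```

Keying on `'dims'` would rebuild the pipeline every decode step; keying on `'type'` reuses it for the whole generation.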
