Commit

Merge branch 'main' into temporary_decompose
dan-garvey authored Oct 29, 2024
2 parents f0a4e5f + d8f39a9 commit b98d61b
Showing 17 changed files with 76 additions and 187 deletions.
6 changes: 6 additions & 0 deletions .github/workflows/ci-llama.yaml
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  name: Llama Benchmarking Tests

  on:
2 changes: 1 addition & 1 deletion .github/workflows/ci-sdxl.yaml
@@ -56,7 +56,7 @@ jobs:
  repository: iree-org/iree
  path: ${{ env.IREE_REPO_DIR }}
  submodules: false
- ref: 67ba1c45424d5cedc7baf7bfe8a998ee86e510af
+ ref: candidate-20241029.1062

  - name: Initalize IREE submodules
  working-directory: ${{ env.IREE_REPO_DIR }}
6 changes: 6 additions & 0 deletions .github/workflows/ci-sharktank.yml
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  name: CI - sharktank

  on:
6 changes: 6 additions & 0 deletions .github/workflows/ci-tuner.yml
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  name: CI - Tuner

  on:
6 changes: 6 additions & 0 deletions .github/workflows/ci_eval.yaml
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  name: Evaluation Tests

  on:
9 changes: 6 additions & 3 deletions .github/workflows/ci_linux_x64-libshortfin.yml
@@ -38,6 +38,9 @@ jobs:
  build-and-test:
  name: Build and test
  runs-on: ubuntu-24.04
+ strategy:
+ matrix:
+ python-version: ["3.11", "3.12"]

  steps:
  - name: Install dependencies
@@ -56,7 +59,7 @@ jobs:
  repository: iree-org/iree
  path: ${{ env.IREE_REPO_DIR }}
  submodules: false
- ref: 67ba1c45424d5cedc7baf7bfe8a998ee86e510af
+ ref: candidate-20241029.1062

  - name: Initalize IREE submodules
  working-directory: ${{ env.IREE_REPO_DIR }}
@@ -67,10 +70,10 @@ jobs:
  git submodule update --init --depth 1 -- third_party/googletest
  git submodule update --init --depth 1 -- third_party/hip-build-deps/
- - name: Setup Python
+ - name: Setup Python ${{ matrix.python-version }}
  uses: actions/setup-python@39cd14951b08e74b54015e9e001cdefcf80e669f # v5.1.1
  with:
- python-version: "3.12"
+ python-version: ${{ matrix.python-version }}
  cache: "pip"
  - name: Install Python packages
  # TODO: Switch to `pip install -r requirements.txt -e shortfin/`.
2 changes: 1 addition & 1 deletion .github/workflows/ci_linux_x64_asan-libshortfin.yml
@@ -109,7 +109,7 @@ jobs:
  repository: iree-org/iree
  path: ${{ env.IREE_SOURCE_DIR }}
  submodules: false
- ref: 67ba1c45424d5cedc7baf7bfe8a998ee86e510af
+ ref: candidate-20241029.1062

  - name: Initalize IREE submodules
  working-directory: ${{ env.IREE_SOURCE_DIR }}
2 changes: 1 addition & 1 deletion .github/workflows/ci_linux_x64_nogil-libshortfin.yml
@@ -57,7 +57,7 @@ jobs:
  repository: iree-org/iree
  path: ${{ env.IREE_REPO_DIR }}
  submodules: false
- ref: 67ba1c45424d5cedc7baf7bfe8a998ee86e510af
+ ref: candidate-20241029.1062

  - name: Initalize IREE submodules
  working-directory: ${{ env.IREE_REPO_DIR }}
6 changes: 6 additions & 0 deletions .github/workflows/test.yaml
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  name: Integration Tests

  on:
4 changes: 2 additions & 2 deletions sharktank/sharktank/layers/kv_cache.py
@@ -191,8 +191,8 @@ def write_timestep(
  update_count = len(cache_partitions)

  for b in range(bs):
- row_index = torch.tensor(b, dtype=torch.int64)
- row_start_pos = seq_positions[row_index]
+ row_index = torch.tensor([b], dtype=torch.int64)
+ row_start_pos = seq_positions[row_index].unsqueeze(0)

  for i, update in enumerate(cache_partitions):
  cache = state[transformer_block_index * update_count + i]
2 changes: 1 addition & 1 deletion shortfin/CMakeLists.txt
@@ -183,7 +183,7 @@ elseif (SHORTFIN_BUNDLE_DEPS)
  FetchContent_Declare(
  shortfin_iree
  GIT_REPOSITORY https://github.com/iree-org/iree.git
- GIT_TAG 67ba1c45424d5cedc7baf7bfe8a998ee86e510af
+ GIT_TAG candidate-20241029.1062
  GIT_SUBMODULES ${IREE_SUBMODULES}
  GIT_SHALLOW TRUE
  SYSTEM
6 changes: 4 additions & 2 deletions shortfin/python/shortfin_apps/llm/components/generate.py
@@ -54,10 +54,12 @@ async def run(self):
  token = sfnp.argmax(exec.result_logits)
  token_int = token.items[0]

+ self.append_token(token_int)
  # Decode loop.
- # TODO: Use correct eot token from config.
  exec.start_position = len(self.input_token_ids) - 1
- while token_int != 128001:
+ # TODO: Use correct eot token from config.
+ # while token_int != 128001:
+ for i in range(15):
  exec.reset(InferencePhase.DECODE)
  exec.input_token_ids = [token_int]
  exec.start_position += 1
25 changes: 17 additions & 8 deletions shortfin/python/shortfin_apps/llm/components/service.py
@@ -337,21 +337,30 @@ async def run(self):
  m.items = self.exec_requests[i].input_token_ids
  tokens_host.copy_to(tokens)

- # Populate seq_lens.
- seq_lens_host = seq_lens.for_transfer()
- with seq_lens_host.map(discard=True) as m:
- m.fill(0)
- m.items = [len(req.input_token_ids) for req in self.exec_requests]
- seq_lens_host.copy_to(seq_lens)
-
- # For decode, populate start_positions.
+ # For prefill, populate seq_lens
+ if self.phase == InferencePhase.PREFILL:
+ seq_lens_host = seq_lens.for_transfer()
+ with seq_lens_host.map(discard=True) as m:
+ m.fill(0)
+ m.items = [len(req.input_token_ids) for req in self.exec_requests]
+ seq_lens_host.copy_to(seq_lens)
+
+ # For decode, populate start_positions and seq_lens.
+ # paged_llm_v1 and export_paged_llm_v1 do some funky things with start_positions and seq_lens
+ # TODO: make them not so funky
  if self.phase == InferencePhase.DECODE:
  start_positions_host = start_positions.for_transfer()
  with start_positions_host.map(discard=True) as m:
  m.fill(0)
  m.items = [req.start_position for req in self.exec_requests]
  start_positions_host.copy_to(start_positions)
+
+ seq_lens_host = seq_lens.for_transfer()
+ with seq_lens_host.map(discard=True) as m:
+ m.fill(0)
+ m.items = [req.start_position + 1 for req in self.exec_requests]
+ seq_lens_host.copy_to(seq_lens)

  # Populate cache pages.
  seq_block_ids_host = seq_block_ids.for_transfer()
  for i in range(bs):
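
Reviewer note on the service.py hunk: seq_lens now depends on the phase. During prefill it is the prompt length; during decode it is start_position + 1. A minimal sketch of that rule in plain Python, assuming nothing about the shortfin device-array API; `Phase`, `Request`, and `seq_lens_for` are hypothetical stand-ins, not names from the codebase:

```python
from dataclasses import dataclass
from enum import Enum, auto


class Phase(Enum):
    """Hypothetical stand-in for InferencePhase."""
    PREFILL = auto()
    DECODE = auto()


@dataclass
class Request:
    """Hypothetical stand-in for an exec request."""
    input_token_ids: list
    start_position: int = 0


def seq_lens_for(phase, requests):
    """Per-request sequence lengths, following the rule in the hunk above."""
    if phase is Phase.PREFILL:
        # Prefill sees the whole prompt at once.
        return [len(r.input_token_ids) for r in requests]
    # Decode feeds one token at index start_position, so the
    # sequence covers start_position + 1 positions.
    return [r.start_position + 1 for r in requests]


prompt = Request(input_token_ids=[11, 12, 13, 14])
prefill_lens = seq_lens_for(Phase.PREFILL, [prompt])

# Mirrors generate.py: start_position begins at len(prompt) - 1 and is
# incremented once before each decode step.
step = Request(input_token_ids=[99], start_position=len(prompt.input_token_ids))
decode_lens = seq_lens_for(Phase.DECODE, [step])
print(prefill_lens, decode_lens)
```

With a four-token prompt, the first decode step therefore reports a length one greater than the prefill length.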
3 changes: 2 additions & 1 deletion shortfin/requirements-iree-compiler.txt
@@ -1,3 +1,4 @@
  # Keep in sync with IREE_REF in CI and GIT_TAG in CMakeLists.txt
  -f https://iree.dev/pip-release-links.html
- iree-compiler==20240904.1006
+ iree-compiler==20241029.1062
+ iree-runtime==20241029.1062
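
Reviewer note on the "Keep in sync" comment: the check it asks for could be automated. A rough sketch, assuming (as this commit's values suggest, but nothing here confirms in general) that a `candidate-YYYYMMDD.N` tag and the pip version `YYYYMMDD.N` share the same suffix. Both helper functions are hypothetical and not part of the repository:

```python
import re


def iree_version_from_ref(ref):
    """Return 'YYYYMMDD.N' from a 'candidate-YYYYMMDD.N' tag, else None."""
    match = re.fullmatch(r"candidate-(\d{8}\.\d+)", ref)
    return match.group(1) if match else None


def pins_in_sync(git_ref, requirement_lines):
    """True if every 'iree-*==X' pin equals the version encoded in git_ref."""
    version = iree_version_from_ref(git_ref)
    pins = [
        line.split("==", 1)[1].strip()
        for line in requirement_lines
        if line.startswith("iree-") and "==" in line
    ]
    return version is not None and bool(pins) and all(p == version for p in pins)


requirements = [
    "-f https://iree.dev/pip-release-links.html",
    "iree-compiler==20241029.1062",
    "iree-runtime==20241029.1062",
]
print(pins_in_sync("candidate-20241029.1062", requirements))
```

A raw commit SHA carries no version suffix, so this sketch treats it as out of sync rather than guessing.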
10 changes: 5 additions & 5 deletions shortfin/setup.py
@@ -111,12 +111,12 @@ def is_cpp_prebuilt():

  # Due to a quirk of setuptools, that package_dir map must only contain
  # paths relative to the directory containing setup.py. Why? No one knows.
- REL_SOURCE_DIR = SOURCE_DIR.relative_to(SETUPPY_DIR, walk_up=True)
- REL_BINARY_DIR = BINARY_DIR.relative_to(SETUPPY_DIR, walk_up=True)
- REL_CMAKE_DEFAULT_BUILD_DIR = CMAKE_DEFAULT_BUILD_DIR.relative_to(
- SETUPPY_DIR, walk_up=True
+ REL_SOURCE_DIR = Path(os.path.relpath(SOURCE_DIR, SETUPPY_DIR))
+ REL_BINARY_DIR = Path(os.path.relpath(BINARY_DIR, SETUPPY_DIR))
+ REL_CMAKE_DEFAULT_BUILD_DIR = Path(
+ os.path.relpath(CMAKE_DEFAULT_BUILD_DIR, SETUPPY_DIR)
  )
- REL_CMAKE_TRACY_BUILD_DIR = CMAKE_TRACY_BUILD_DIR.relative_to(SETUPPY_DIR, walk_up=True)
+ REL_CMAKE_TRACY_BUILD_DIR = Path(os.path.relpath(CMAKE_TRACY_BUILD_DIR, SETUPPY_DIR))


  class CMakeExtension(Extension):
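
Reviewer note on the setup.py hunk: the `walk_up` parameter of `relative_to` only exists from Python 3.12 onward, which is presumably why this commit switches to `os.path.relpath` alongside the new 3.11 CI matrix entry. A small sketch of the replacement pattern, using `posixpath` and `PurePosixPath` so the result is the same on every platform; the directory layout and the `rel_to` helper are made up for illustration:

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical layout, chosen only for illustration.
SETUPPY_DIR = PurePosixPath("/repo/shortfin")
BINARY_DIR = PurePosixPath("/repo/build/shortfin")


def rel_to(target, base):
    """Relative path from base to target, allowing '..' segments."""
    return PurePosixPath(posixpath.relpath(target, base))


REL_BINARY_DIR = rel_to(BINARY_DIR, SETUPPY_DIR)
print(REL_BINARY_DIR)

# Without walk_up, relative_to refuses paths outside the base directory.
try:
    BINARY_DIR.relative_to(SETUPPY_DIR)
    plain_relative_to_ok = True
except ValueError:
    plain_relative_to_ok = False
```

Here the binary directory is a sibling of the setup.py directory, so the result needs one `..` segment, which plain `relative_to` cannot produce.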
162 changes: 0 additions & 162 deletions shortfin/tests/apps/llm/test_llm_server.py

This file was deleted.

6 changes: 6 additions & 0 deletions shortfin/tests/apps/sd/e2e_test.py
@@ -1,3 +1,9 @@
+ # Copyright 2024 Advanced Micro Devices, Inc.
+ #
+ # Licensed under the Apache License v2.0 with LLVM Exceptions.
+ # See https://llvm.org/LICENSE.txt for license information.
+ # SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
  import json
  import requests
  import time
