Merge branch 'main' into ay/issue-5389-repeat-interleave
ayerofieiev-tt authored Jun 5, 2024
2 parents 2890cce + 11822c5 commit 2b268ac
Showing 597 changed files with 17,286 additions and 8,435 deletions.
1 change: 0 additions & 1 deletion .github/actions/install-metal-deps/dependencies.json
@@ -5,7 +5,6 @@
"python3.8-venv=3.8.10-0ubuntu1~20.04.9",
"libgoogle-glog-dev=0.4.0-1build1",
"libyaml-cpp-dev=0.6.2-4ubuntu1",
-"libboost-all-dev=1.71.0.0ubuntu2",
"libsndfile1=1.0.28-7ubuntu0.2",
"libhwloc-dev",
"graphviz",
4 changes: 2 additions & 2 deletions .github/workflows/build-artifact.yaml
@@ -32,11 +32,11 @@ jobs:
git submodule update --init --recursive
- name: Build tt-metal and libs
run: |
-cmake -B build -G Ninja -DCMAKE_CXX_COMPILER=clang++-17
+cmake -B build -G Ninja
cmake --build build --target tests
cmake --build build --target install
- name: 'Tar files'
-run: tar -cvf ttm_${{ matrix.arch }}.tar build/hw build/lib tt_eager/tt_lib/*.so ttnn/ttnn/*.so build/programming_examples build/test build/tools
+run: tar -cvf ttm_${{ matrix.arch }}.tar build/hw build/lib tt_eager/tt_lib/*.so ttnn/ttnn/*.so build/programming_examples build/test build/tools runtime
- name: 'Upload Artifact'
uses: actions/upload-artifact@v4
with:
2 changes: 1 addition & 1 deletion .github/workflows/build.yaml
@@ -34,7 +34,7 @@ jobs:
echo "TT_METAL_HOME=$(pwd)" >> $GITHUB_ENV
- name: Build tt-metal libraries
run: |
-cmake -B build -G Ninja -DCMAKE_CXX_COMPILER=clang++-17
+cmake -B build -G Ninja
cmake --build build
- name: Build tt-metal C++ tests
run: |
6 changes: 0 additions & 6 deletions .github/workflows/eager-package-main.yaml
@@ -78,12 +78,6 @@ jobs:
source env/bin/activate
python3 -m tt_metal.scripts.get_home_dir --short
echo "TT_METAL_HOME=$(python3 -m tt_metal.scripts.get_home_dir --short)" >> $GITHUB_ENV
-- name: Set up kernel builds
-working-directory: tests/end_to_end_tests
-run: |
-echo $TT_METAL_HOME
-source env/bin/activate
-python3 -m tt_metal.scripts.set_up_kernels prepare
- name: Activate env and run release tests - silicon
timeout-minutes: 2
shell: bash
11 changes: 6 additions & 5 deletions .github/workflows/fast-dispatch-full-regressions-and-models.yaml
@@ -19,13 +19,14 @@ jobs:
matrix:
test-group:
[
-{ name: "Common models GS", arch: grayskull, cmd: tests/scripts/nightly/run_common_models.sh, timeout: 40 },
-{ name: "Common models N300 WH B0", arch: wormhole_b0, cmd: tests/scripts/nightly/run_common_models.sh, timeout: 40 },
-{ name: "GS-only ttnn nightly", arch: grayskull, cmd: tests/scripts/nightly/run_ttnn.sh, timeout: 40 },
-{ name: "GS-only models", arch: grayskull, cmd: tests/scripts/nightly/run_gs_only.sh, timeout: 40 },
-{ name: "N300 WH-only models", arch: wormhole_b0, cmd: tests/scripts/nightly/run_wh_b0_only.sh, timeout: 60 },
+{ name: "Common models GS", arch: grayskull, cmd: tests/scripts/single_card/nightly/run_common_models.sh, timeout: 40 },
+{ name: "Common models N300 WH B0", arch: wormhole_b0, cmd: tests/scripts/single_card/nightly/run_common_models.sh, timeout: 40 },
+{ name: "GS-only ttnn nightly", arch: grayskull, cmd: tests/scripts/single_card/nightly/run_ttnn.sh, timeout: 40 },
+{ name: "GS-only models", arch: grayskull, cmd: tests/scripts/single_card/nightly/run_gs_only.sh, timeout: 40 },
+{ name: "N300 WH-only models", arch: wormhole_b0, cmd: tests/scripts/single_card/nightly/run_wh_b0_only.sh, timeout: 30 },
{ name: "API tests GS", arch: grayskull, cmd: ./tests/scripts/run_tests.sh --tt-arch grayskull --pipeline-type frequent_api --dispatch-mode fast, timeout: 40 },
{ name: "API tests N300 WH B0", arch: wormhole_b0, cmd: ./tests/scripts/run_tests.sh --tt-arch wormhole_b0 --pipeline-type frequent_api --dispatch-mode fast, timeout: 40 },
+{ name: "[Unstable] N300 models", arch: wormhole_b0, cmd: tests/scripts/single_card/nightly/run_wh_b0_unstable.sh, timeout: 35 },
]
name: FD ${{ matrix.test-group.name }} ${{ matrix.test-group.arch }}
env:
64 changes: 64 additions & 0 deletions .github/workflows/single-card-demo-tests.yaml
@@ -0,0 +1,64 @@
name: "[Single-card] Demo tests"

on:
workflow_dispatch:
schedule:
- cron: "0 0 * * 1,2,3,4,5"
- cron: "0 */4 * * 0,6"

jobs:
build-artifact:
uses: ./.github/workflows/build-artifact.yaml
with:
arch: '["wormhole_b0"]'
secrets: inherit
t3000-demo-tests:
needs: build-artifact
strategy:
fail-fast: false
matrix:
test-group: [
{
name: "N150",
arch: wormhole_b0,
runs-on: ["wormhole_b0", "multi-chip-num-pcie-1", "multi-chip-num-chips-1"],
cmd: './tests/scripts/run_tests.sh --tt-arch wormhole_b0 --pipeline-type demos_single_card_n150 --dispatch-mode ""'
},
{
name: "N300",
arch: wormhole_b0,
runs-on: ["wormhole_b0", "multi-chip-num-pcie-1", "multi-chip-num-chips-2"],
cmd: './tests/scripts/run_tests.sh --tt-arch wormhole_b0 --pipeline-type demos_single_card_n300 --dispatch-mode ""'
}
]
name: ${{ matrix.test-group.name }}
env:
TT_METAL_ENV: ${{ vars.TT_METAL_ENV }}
ARCH_NAME: ${{ matrix.test-group.arch }}
LOGURU_LEVEL: INFO
LD_LIBRARY_PATH: ${{ github.workspace }}/build/lib
environment: dev
runs-on: ${{ matrix.test-group.runs-on }}
steps:
- uses: tenstorrent-metal/metal-workflows/.github/actions/[email protected]
- name: Ensure weka mount is active
run: |
sudo systemctl restart mnt-MLPerf.mount
sudo /etc/rc.local
ls -al /mnt/MLPerf/bit_error_tests
- name: Set up dynamic env vars for build
run: |
echo "TT_METAL_HOME=$(pwd)" >> $GITHUB_ENV
- uses: actions/download-artifact@v4
with:
name: TTMetal_build_${{ matrix.test-group.arch }}
- name: Extract files
run: tar -xvf ttm_${{ matrix.test-group.arch }}.tar
- uses: ./.github/actions/install-python-deps
- name: Run demo regression tests
timeout-minutes: 150
run: |
source ${{ github.workspace }}/python_env/bin/activate
cd $TT_METAL_HOME
export PYTHONPATH=$TT_METAL_HOME
${{ matrix.test-group.cmd }}
2 changes: 1 addition & 1 deletion .github/workflows/t3000-demo-tests.yaml
@@ -3,7 +3,7 @@ name: "[T3K] T3000 demo tests"
on:
workflow_dispatch:
schedule:
-- cron: '0 0 * * 6' # This cron schedule runs the workflow every Saturday at 12am UTC
+- cron: '0 0 * * 1,3,5' # This cron schedule runs the workflow every Monday/Wednesday/Friday at 12am UTC

jobs:
build-artifact:
4 changes: 4 additions & 0 deletions .gitignore
@@ -1,6 +1,7 @@
.cache
.vscode
.idea
+runtime
*.log
*.csv
*.xlsx
@@ -118,3 +119,6 @@ compile_commands.json
# rpath_check
tt_eager/tt_lib/.rpath_checked*
ttnn/ttnn/.rpath_checked
+
+# exclude packages brough in from CPM
+.cpmcache
37 changes: 23 additions & 14 deletions CMakeLists.txt
@@ -5,14 +5,15 @@ cmake_policy(VERSION 3.16)
# Project setup
############################################

-### Uncomment this if you don't want to manually pass Clang-17 compiler through CLI ###
-# find_program(CLANG_17 clang++-17)
-# if(CLANG_17)
-# message(STATUS "Found Clang-17 here: ${CLANG_17}")
-# set(CMAKE_CXX_COMPILER "${CLANG_17}")
-# else()
-# message(WARNING "Clang++-17 not found, recommended to build with Clang-17 > GCC for better perf")
-# endif()
+# Use Clang-17 by default until we upgrade to Ubuntu version that supports higher GCC
+# No longer support GCC-9 as it does not support C++20
+find_program(CLANG_17 clang++-17)
+if(CLANG_17)
+message(STATUS "Found Clang-17 here: ${CLANG_17}")
+set(CMAKE_CXX_COMPILER "${CLANG_17}")
+else()
+message(WARNING "Clang++-17 not found!!!")
+endif()

if(${CMAKE_SOURCE_DIR} STREQUAL ${CMAKE_BINARY_DIR})
message(FATAL_ERROR "CMake generation is not allowed within source directory!! Please set a build folder with '-B'!!")
@@ -31,7 +32,7 @@ CHECK_COMPILERS()
############################################################################################################################
# Find all required libraries to build
############################################################################################################################
-find_package(Boost REQUIRED COMPONENTS thread filesystem system regex)
+include(${CMAKE_SOURCE_DIR}/cmake/CPM_boost.cmake)
find_package(GTest REQUIRED)
find_package (Python3 COMPONENTS Interpreter Development)
find_library(NUMA_LIBRARY NAMES numa)
@@ -57,8 +58,9 @@ set(CMAKE_CXX_FLAGS_DEBUG "-O0 -g -DDEBUG=DEBUG")
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "-O3 -g -DDEBUG=DEBUG")
set(CMAKE_CXX_FLAGS_CI "-O3 -DDEBUG=DEBUG")

-set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
+set(CMAKE_CXX_EXTENSIONS OFF)

# Set default values for variables/options
set(UMD_HOME "${CMAKE_SOURCE_DIR}/tt_metal/third_party/umd")
@@ -88,8 +90,7 @@ set(CMAKE_INSTALL_DATAROOTDIR "${CMAKE_BINARY_DIR}/tmp/share")
############################################################################################################################
add_library(metal_common_libs INTERFACE)
target_link_libraries(metal_common_libs INTERFACE
-dl z pthread atomic stdc++ numa # system libraries
-Boost::thread Boost::filesystem Boost::system Boost::regex hwloc # hwloc has no cmake support, find_package won't find it
+dl z pthread atomic stdc++ hwloc numa # system libraries, hwloc has no cmake support, find_package won't find it
)

# Note on flags:
@@ -121,7 +122,11 @@ if($ENV{ENABLE_TRACY})
endif()

add_library(metal_header_directories INTERFACE)
-target_include_directories(metal_header_directories INTERFACE tt_metal/hw/inc)
+target_include_directories(metal_header_directories INTERFACE ${CMAKE_SOURCE_DIR}/tt_metal/hw/inc)
+foreach(lib ${BoostPackages})
+target_include_directories(metal_header_directories INTERFACE ${Boost${lib}_SOURCE_DIR}/include)
+endforeach()
+
if ("$ENV{ARCH_NAME}" STREQUAL "wormhole_b0")
target_include_directories(metal_header_directories INTERFACE tt_metal/hw/inc/wormhole
tt_metal/hw/inc/wormhole/wormhole_b0_defines
@@ -161,7 +166,6 @@ add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/ttnn)

add_subdirectory(${CMAKE_CURRENT_SOURCE_DIR}/tests EXCLUDE_FROM_ALL)


############################################################################################################################
# Install targets for build artifacts and pybinds
# If built with Tracy, cannot install 'all' since it will pick up install targets from Tracy
@@ -201,3 +205,8 @@ install(FILES ${CMAKE_BINARY_DIR}/lib/_C.so
DESTINATION ${CMAKE_SOURCE_DIR}/tt_eager/tt_lib
COMPONENT tt_pybinds
)
+
+# Temporary workaround for Issue #8767
+install(DIRECTORY ${CMAKE_BINARY_DIR}/hw/toolchain
+DESTINATION ${CMAKE_SOURCE_DIR}/runtime/hw
+)
1 change: 1 addition & 0 deletions CODEOWNERS
@@ -35,6 +35,7 @@ setup_hugepages.py @tt-rkim
scripts/build_scripts/ @tt-rkim @vtangTT @TT-billteng
scripts/build_scripts/build_with_profiler_opt.sh @mo-tenstorrent @tt-rkim
cmake/ @tt-rkim @vtangTT @TT-billteng
+build_metal.sh @tt-rkim @vtangTT @TT-billteng

Makefile @tt-rkim
/module.mk @tt-rkim
10 changes: 6 additions & 4 deletions INSTALLING.md
@@ -4,6 +4,8 @@ These instructions will guide you through the installation of Tenstorrent system

---

+## Installation Steps
+
### Step 1. Driver & Firmware

Follow the Software Setup instructions for your specific board or system provided on our [general docs](https://docs.tenstorrent.com/tenstorrent).
@@ -26,7 +28,7 @@ Note the current compatability matrix:
sudo apt update
sudo apt install software-properties-common=0.99.9.12 build-essential=12.8ubuntu1.1 python3.8-venv=3.8.10-0ubuntu1~20.04.9 libgoogle-glog-dev=0.4.0-1build1 libyaml-cpp-dev=0.6.2-4ubuntu1 libboost-all-dev=1.71.0.0ubuntu2 libsndfile1=1.0.28-7ubuntu0.2 libhwloc-dev graphviz

-# Install Clang-17: Recommended to use Clang-17 as that's what is officially supported and tested on CI.
+# Install Clang-17 for C++20 support!!
wget https://apt.llvm.org/llvm.sh
chmod u+x llvm.sh
sudo ./llvm.sh 17
@@ -111,18 +113,18 @@ are less familiar with Python and its various environment tools, just use

5. Start coding

-You are all set! Visit the [TT-NN Basic examples page](https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/usage.html#basic-examples) or get started with [simple kernels on TT-Metalium](https://github.com/tenstorrent/tt-metal/blob/main/README.md)
+You are all set! Visit the [TT-NN Basic examples page](https://tenstorrent.github.io/tt-metal/latest/ttnn/ttnn/usage.html#basic-examples) or get started with [simple kernels on TT-Metalium](https://tenstorrent.github.io/tt-metal/latest/tt-metalium/tt_metal/examples/index.html).

---

### Step 5. Software dependencies for codebase contributions

-Please follow the next additional steps if you want to contribute to the codebase
+Please follow the next additional steps if you want to contribute to the codebase.

1. Install dependencies

```sh
-sudo apt install clang-6.0=1:6.0.1-14 git git-lfs cmake=3.16.3-1ubuntu1.20.04.1 pandoc libtbb-dev libcapstone-dev pkg-config ninja-build patchelf
+sudo apt install git git-lfs cmake=3.16.3-1ubuntu1.20.04.1 pandoc libtbb-dev libcapstone-dev pkg-config ninja-build patchelf
```

2. Download and install [Doxygen](https://www.doxygen.nl/download.html), (v1.9 or higher, but less than v1.10)
2 changes: 2 additions & 0 deletions MANIFEST.in
@@ -38,3 +38,5 @@ prune tt_metal/**/.github/
prune docs/doxygen_build/
prune docs/build/
exclude .pre-commit-config.yaml
+
+recursive-include runtime *
32 changes: 21 additions & 11 deletions README.md
@@ -24,10 +24,10 @@

| Model | Batch | End-to-end throughput [1] | Device throughput [2] | Target |
|---------------------------------------------------------- |---------------------|------------------------------|-----------------------------|-------------------------------------|
-| [ResNet-50](./models/demos/resnet) (fps) | 20 | 2,850 | 7,200 | 10,000 |
+| [ResNet-50](./models/demos/resnet) (fps) | 20 | 4,400 | 7,700 | 10,000 |
| [BERT-Large](./models/demos/bert) (sen/s) | 12 | 362 | 406 | 410 |
| [Falcon7B-decode](./models/demos/ttnn_falcon7b) (t/s) | 32 | 135 | 135 | 140 |
-| [ViT](./models/demos/grayskull/vit) (fps) | 8 | 480 | 1570 | 2000 |
+| [ViT](./models/demos/grayskull/vit) (fps) | 8 | 860 | 1570 | 2000 |
| [T5 small](.models/demos/grayskull/t5) (sen/s) | | 140 | | |
| [Bloom](.models/demos/grayskull/functional_bloom) (sen/s) | | 70 | | |
| U-Net | coming soon | | | |
@@ -38,15 +38,25 @@

## Wormhole (WH) Models

-| Model | Gen. Token [3] | Batch | End-to-end throughput [1] | Device throughput [2] | Target |
-|-------------------------------------------------------------|--------------------|----------------------|------------------------------|-----------------------------|----------------|
-| [Falcon7B-decode](./models/demos/wormhole/falcon7b) | 129th | 32 | 11.6 t/s/u - 371 t/s | 15.4 t/s/u - 493 t/s | 21 t/s/u |
-| [Mistral-7B-decode](./models/demos/wormhole/mistral7b) | 33rd | 32 | 10.9 t/s/u - 349 t/s | 13.3 t/s/u - 426 t/s | 21 t/s/u |
-| [Mamba-2.8B-decode](./models/demos/mamba) | any | 32 | 9.2 t/s/u - 295 t/s | 13.1 t/s/u - 419 t/s | 22 t/s/u |
-| [BERT-Large](./models/demos/metal_BERT_large_11/) (sen/s) | any | 8 | 270 | 340 | 400 |
-| [Stable Diffusion 1.4](./models/demos/wormhole/stable_diffusion) 512x512 (sec/img) | | 1 | 8s | 5s | |
+> [!NOTE]
+>
+> All model demos in this table function on both N150 and N300 Wormhole cards, unless otherwise stated.
-[3] - Generating the i'th token in a sequence while the kv_cache is filled with i-1 rows.
+| Model | Gen. Token [3] | Batch | End-to-end throughput [1] | Device throughput [2] | Target |
+|--------------------------------------------------------------------------------------|--------------------|----------------------|------------------------------|-----------------------------|----------------|
+| [Falcon7B-decode](./models/demos/wormhole/falcon7b) | 129th | 32 | 11.6 t/s/u - 371 t/s | 15.4 t/s/u - 493 t/s | 21 |
+| [Mistral-7B-decode](./models/demos/wormhole/mistral7b) | 33rd | 32 | 10.9 t/s/u - 349 t/s | 13.3 t/s/u - 426 t/s | 21 |
+| [Mamba-2.8B-decode](./models/demos/mamba) | any | 32 | 9.6 t/s/u - 307 t/s | 15.8 t/s/u - 506 t/s | 22 |
+| [BERT-Large](./models/demos/metal_BERT_large_11/) (sen/s) [4] | | 8 | 270 | 340 | 400 |
+| [Stable Diffusion 1.4](./models/demos/wormhole/stable_diffusion) 512x512 (sec/img) | | 1 | 8 | 5 | |
+
+[1] - Observed from the host. Includes dispatch overhead and kernel execution time.
+
+[2] - Ignoring host overhead. Kernel execution time only.
+
+[3] - Generating the `i`'th token in a sequence while the kv_cache is filled with `i-1` rows.
+
+[4] - This model demo does not work on N150. It does work on N300.

## T3000 (2x4 mesh of WHs) Models

@@ -56,7 +66,7 @@
| [LLaMA-2-70B-decode](./models/demos/t3000/llama2_70b) | Tensor Parallel | 129th | 32 | 8.5 t/s/u - 272 t/s | 13.9 t/s/u - 445 t/s | 20 t/s/u |
| [LLaMA-3-70B-decode](./models/demos/t3000/llama3_70b) | Tensor Parallel | 129th | 32 | 8.1 t/s/u - 257 t/s | 13.9 t/s/u - 445 t/s | 20 t/s/u |
| [Falcon40B-decode](./models/demos/t3000/falcon40b) | Tensor Parallel | 129th | 32 | 1.5 t/s/u - 48 t/s | 14.0 t/s/u - 448 t/s | 30 t/s/u |
-| [Mixtral7Bx8-decode](./models/demos/t3000/mixtral8x7b) | Tensor Parallel | 129th | 32 | 3.6 t/s/u - 114 t/s | 23.5 t/s/u - 752 t/s | 28 t/s/u |
+| [Mixtral7Bx8-decode](./models/demos/t3000/mixtral8x7b) | Tensor Parallel | 129th | 32 | 7.0 t/s/u - 225 t/s | 27.0 t/s/u - 864 t/s | 28 t/s/u |
| ResNet50 | Data Parallel | coming soon | | | | |

## Using TT-NN ops and tensors