Add MacOS platform support to onnxruntime-c pod #18334

Closed
Wants to merge 161 commits
Commits
c4460c5
add macos to build xcframework and other minor changes
Nov 7, 2023
e008e56
add more changes for configuring optional macos build in pod framework
Nov 7, 2023
3544456
enable in ios packaging pipeline and check artifacts
Nov 8, 2023
9dfbfae
fix input argument
Nov 8, 2023
46e44c6
fix
Nov 8, 2023
cf27c8d
adjust build settings json file for ios/macos framework
Nov 8, 2023
2c01018
adjust ios packaging pipeline
Nov 8, 2023
ad013e2
add back ios archs
Nov 8, 2023
af3209f
update c.podsepc.template
Nov 8, 2023
013b520
update framework_info.json.template
Nov 8, 2023
74a8b35
fix
Nov 8, 2023
85e9a69
remove two framework_info.json changes
Nov 8, 2023
2c69e12
minor updates
Nov 8, 2023
700a40d
add changes for build.py script to add optional --macosx behavior
Nov 8, 2023
3cfaaf7
minor update
Nov 8, 2023
b65bfab
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
Nov 9, 2023
4dc6369
Add FlattenAndUnpad Op (#17845)
guyang3532 Nov 9, 2023
7a3da45
add bfloat16 support for CUDA Neg kernel (#18306)
prathikr Nov 9, 2023
55c19d6
[QNN EP] Enable option to set QNN context priority (#18315)
HectorSVC Nov 9, 2023
55efb83
add softmax op support-coreml
Nov 9, 2023
b0a0588
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
Nov 9, 2023
01e746b
update
Nov 9, 2023
8d50313
[Migraphx EP] Static int8 QDQ support (#17931)
TedThemistokleous Nov 9, 2023
2c22b49
Fix rust compile issues and add GH action to run build validations an…
devigned Nov 9, 2023
25fbc2b
fix fused relu activation (#18303)
guschmue Nov 9, 2023
bafb0b6
comment unused params
Nov 9, 2023
f237b0b
[QNN EP/Quantization] Add MinimumRealRange extra option to quantizati…
adrianlizarraga Nov 9, 2023
829d802
[js/webgpu] Support uniform for softmax (#18345)
axinging Nov 9, 2023
1ff8948
Bump actions/stale from 4.1.1 to 8.0.0 (#18149)
dependabot[bot] Nov 9, 2023
59262df
Add cuda context headers to zip (#18330)
RandySheriffH Nov 9, 2023
d955885
Update stale.yml to fix start-date bug (#18376)
sophies927 Nov 10, 2023
3e32fc5
add split op support
Nov 10, 2023
74705ca
updates
Nov 10, 2023
8f2fc30
minor change
Nov 10, 2023
dd1bb76
[js/webgpu] Fix scalar uniform (#18318)
axinging Nov 10, 2023
87744e5
fix reference to Microsoft.GSL::GSL in CMake build scripts when enabl…
bverhagen Nov 10, 2023
2d23b4e
Update min macos version (#18251)
snnn Nov 10, 2023
a6f8ec6
refine and fix the split op builder logic
Nov 10, 2023
28c23ae
[js/webgpu] Fix conv2d with activation (#18388)
qjia7 Nov 10, 2023
64c91d7
Fix ability to use patch on Windows CI machines (#18356)
skottmckay Nov 10, 2023
8dba6ef
[js/webgpu] Add uniforms support to concat op (#18238)
axinging Nov 10, 2023
aee52cb
minor update
Nov 10, 2023
eb3518b
minor update
Nov 10, 2023
6b0c97b
[js/web] fix typescript type check (#18343)
fs-eire Nov 11, 2023
0c8c001
[js/webgpu] Use builtin num_workgroups to fix shader key conflict (#1…
axinging Nov 11, 2023
d87d480
Remove deprecated vscode settings (#18349)
justinchuby Nov 11, 2023
646f77a
Align context virtuals (#18396)
RandySheriffH Nov 11, 2023
a46c79d
fix llama2-70b bug, add document (#18398)
frank-dong-ms Nov 11, 2023
8d298f6
Fix xnnpack compile error on arm32 (#18291)
skottmckay Nov 11, 2023
cd4b487
add support for softmax 13- cases
Nov 11, 2023
af462fe
unit test
Nov 11, 2023
f375d83
update
Nov 12, 2023
5dba904
update
Nov 12, 2023
cbf0cf0
[WebNN EP] Disable clamp fusion for WebNN GPU (#18386)
Honry Nov 12, 2023
73ed34a
[WebNN EP] Support numThreads option for WebNN CPU device (#18054)
Honry Nov 13, 2023
4a82030
[ORTModule] Symbolic Shape Support for Triton Codegen (#18317)
centwang Nov 13, 2023
949ac4b
[js/webgpu] Support uniforms for gather (#18312)
axinging Nov 13, 2023
0d22d64
Update SDXL demo and documents (#18395)
tianleiwu Nov 13, 2023
4f2bd38
[QNN EP] Ensure QDQ Split input/output quant params are equal (#18332)
adrianlizarraga Nov 13, 2023
c3b5479
Remove extra CUDA version flag (#18397)
snnn Nov 13, 2023
a62a500
[ROCm] Update CK version (#17628)
PeixuanZuo Nov 13, 2023
f19c673
If Branch Constant Folding (#18105)
yuslepukhin Nov 14, 2023
888ef95
updates
Nov 14, 2023
37d8bed
[ROCm] add migraphx into onnxruntime-training-rocm package (#18339)
PeixuanZuo Nov 14, 2023
8ff41ae
Fix 4 more bad delegates missing the attribute that cause iOS AOT err…
skottmckay Nov 14, 2023
897c1c1
Set DML package name correctly in CI (#18405)
skottmckay Nov 14, 2023
0b16185
build wasm with linux (#18106)
mszhanyi Nov 14, 2023
a09099f
Remove XNNPack from web pipelines (#18419)
snnn Nov 14, 2023
3e1cf71
[TensorRT EP] Fix bug for handling outer scope values in GetCapabilit…
chilo-ms Nov 14, 2023
5aeed62
Bump axios from 1.3.4 to 1.6.1 in /js/node (#18400)
dependabot[bot] Nov 14, 2023
c9d5345
[QNN EP] Clean-up todo for OnnxInputInfo (#18416)
adrianlizarraga Nov 14, 2023
a6b515f
[QNN EP] Update Where Op UT to include the issue relate to data layou…
HectorSVC Nov 14, 2023
cbde30c
update build.py args
Nov 14, 2023
d22b1af
[js/web] add CI steps to log info for test failure investigating (#18…
fs-eire Nov 14, 2023
27d0685
Remove Node.js tool installer task from web ci pipeline (#18434)
snnn Nov 14, 2023
2e55138
update objc pod script
Nov 14, 2023
fc7926a
update framework_info path for ios/macos to make it consistent
Nov 14, 2023
d30ffb4
add arm64 arch in config
Nov 15, 2023
47ff783
modify c.podspec template
Nov 15, 2023
f9af940
onboard MoE (#18279)
wangyems Nov 15, 2023
05526b3
Adding new yaml file for downloading cuda, and trt from azure blob (#…
jchen351 Nov 15, 2023
d738ff1
SDXL demo: consistent opt shape and seed (#18445)
tianleiwu Nov 15, 2023
b0699d9
Support Graph Input and Initializer for GatherToSplit Fusion (#18412)
centwang Nov 15, 2023
ed89ca5
[ORTModule] Support User Config for Triton Codegen, Bugfix for Reduce…
centwang Nov 15, 2023
586f06f
[js/web] set noUnusedParameters to true and fix a few bugs (#18404)
fs-eire Nov 15, 2023
0a4d76d
MLAS AArch64 quantized int4 Gemm kernel (#18031)
edgchen1 Nov 15, 2023
83dcadd
updates config
Nov 15, 2023
b653bd6
update
Nov 15, 2023
3e79055
fix split op logic and enable tests
Nov 15, 2023
cc840c5
Fix a bug in SaveInputOutputNamesToNodeMapping function (#18456)
snnn Nov 15, 2023
6f863ae
Allow optional axes tensor to be null and ignore it as optional (#18423)
yuslepukhin Nov 16, 2023
6f9f653
[wasm] increase test max memory from 2G to 4G (#18459)
fs-eire Nov 16, 2023
18a3675
[TensorRT EP] Only instantiate TRT builder once (#18100)
chilo-ms Nov 16, 2023
751aa8d
fix axis of layernorm for UpstreamReshape (#18425)
guyang3532 Nov 16, 2023
16d7f55
lora conv1d replacement (#16643)
zhijxu-MS Nov 16, 2023
e31fe55
fix softmax handling
Nov 16, 2023
119e86e
SDXL demo: Add Option to disable refiner (#18455)
tianleiwu Nov 16, 2023
999752a
[WebNN EP] Support GreaterOrEqual and LessOrEqual ops (#18411)
Honry Nov 16, 2023
b291b20
[JS/Web]Added uniforms support to Slice op. (#18422)
satyajandhyala Nov 16, 2023
3588fba
[TensorRT EP] Fix memory leak for cudnn/cublas (#18467)
chilo-ms Nov 16, 2023
b6b9aff
Allow empty shapes and do not validate them for inputs/outputs (#18442)
yuslepukhin Nov 16, 2023
e7a524f
Update to allow large models to be checked for mobile support. (#18357)
skottmckay Nov 16, 2023
6a4e448
[QNN EP] Support Qnn MatMul with 2 dynamic inputs which are uint16 qu…
HectorSVC Nov 16, 2023
adb56df
Aciddelgado/gqa local (#18375)
aciddelgado Nov 16, 2023
2273c98
merge nested framework_info.json
Nov 16, 2023
f1cb9e1
update naming
Nov 16, 2023
1448f36
update json templates
Nov 17, 2023
d8dd54b
simplier impl with softmaxnd
Nov 17, 2023
f17b6af
[TensorRT EP] Fix bug for no nodes in subgraph at GetCapability (#18449)
chilo-ms Nov 17, 2023
1f1119f
minor updates
Nov 17, 2023
d73073d
remove full protobuf requirement for tensorrt ep (#18413)
jywu-msft Nov 17, 2023
5eb5056
Always run emsdk_env.sh before build.py, even when ccache is disabled…
snnn Nov 17, 2023
1a29460
rope support 4D input tensor (#18454)
kailums Nov 17, 2023
a5537f2
[WebNN Ep] Slice's axes and steps inputs should be constant initializ…
Honry Nov 17, 2023
fac3e33
[js/web] JSEP Attention & MultiHeadAttention (#17742)
dakenf Nov 17, 2023
3a68692
build only mac arch for testing on packaging pipeline temp
Nov 17, 2023
06418c6
test config
Nov 17, 2023
41f9379
Update NDK version to 26.1.10909125 (#18493)
snnn Nov 17, 2023
e4acd53
expand time out
Nov 17, 2023
0bd59d6
address pr comments
Nov 18, 2023
2e89ac4
fix
Nov 18, 2023
0d96ebe
update
Nov 18, 2023
cbb85b4
[CoreML] Adapt to `MLMultiArray.dataPointer` deprecation (#17726)
NickLucche Nov 18, 2023
0233329
Removed all the deprecated python training code and related tests and…
askhade Nov 18, 2023
34c5424
[js] update a few packages (#18499)
fs-eire Nov 18, 2023
53ea59d
add macos test package target in the test app
Nov 18, 2023
9364c05
Update web-ci.yml: remove depth=1 (#18500)
snnn Nov 18, 2023
20cf553
update license comments
Nov 18, 2023
84be6eb
podfile.template
Nov 18, 2023
172ab19
enable macos package test in script
Nov 18, 2023
53917a3
Move up members in Lite Custom Op hierarchy for possible memleaks. (#…
RandySheriffH Nov 18, 2023
97cc40d
Add fusion patterns for conformer-transducer model (#18461)
apsonawane Nov 19, 2023
dc9ab4f
Update setup.py: replace libcudart.so.12.0 with libcudart.so.12 (#18501)
snnn Nov 20, 2023
3bcc137
Tiny change to trigger the update of DORT's CI image (#18507)
wschin Nov 20, 2023
d97fc18
Create a new Python Package pipeline for CUDA 12 (#18348)
jchen351 Nov 20, 2023
1af0681
Bfloat16 support for MatMulBnb4, Training support bitsandbytes>=0.41.…
jambayk Nov 20, 2023
1dd9bf5
Remove setup_env_azure.bat (#18482)
jchen351 Nov 20, 2023
247ce21
[js] optimize eslint config (#18460)
fs-eire Nov 20, 2023
65356e6
configure optional exclude macos target for not pod onnxruntime-c
Nov 20, 2023
cc54202
Create edges with arg positons correctly accounting for non-existing …
yuslepukhin Nov 20, 2023
abdf8b7
[js/webgpu] Optimize broadcast binary. (#18185)
qjia7 Nov 21, 2023
9630aa7
run pod deintegrater
Nov 21, 2023
33f5010
rename the test project
Nov 21, 2023
0e5c660
rename folder to apple_package_test
Nov 21, 2023
e306561
modify path names/xcworkspace name in pipelines, etc.
Nov 21, 2023
5634893
address pr comments
Nov 21, 2023
c7fd930
[js/web] unify resolve rules for "Clip" (#18527)
fs-eire Nov 21, 2023
d8d5b29
address pr comments
Nov 21, 2023
a608c00
fix past-kv in general LLM exporter (#18529)
wejoncy Nov 21, 2023
29a409a
Add missing flags DISABLE_FLOAT8_TYPES in GemmFloat8 custom operator …
xadupre Nov 21, 2023
2a01622
Hide NPU Adapter selection behind macro (#18515)
smk2007 Nov 21, 2023
bfa68b2
minor updates and fix ci
Nov 21, 2023
680a526
Training packaging pipeline for cuda12 (#18524)
ajindal1 Nov 21, 2023
81a763a
Make TensorShapeVector to use InlinedVector<Int64_t> to reduce on tem…
yuslepukhin Nov 21, 2023
ac8598a
[js/webgpu] enable f16 for concat (#18528)
qjia7 Nov 21, 2023
7497316
update file names
Nov 21, 2023
8a3e2dc
address pr comments
Nov 22, 2023
a86fbbd
Merge branch 'yguo/coreml-softmax-op-support' of https://github.com/m…
YUNQIUGUO Nov 22, 2023
0a69346
linting
YUNQIUGUO Nov 22, 2023
bc30357
Merge branch 'yguo/macos-pod-support' of https://github.com/microsoft…
YUNQIUGUO Nov 22, 2023
b7ab82e
linting
YUNQIUGUO Nov 22, 2023
44 changes: 44 additions & 0 deletions .github/actions/rust-toolchain-setup/action.yml
@@ -0,0 +1,44 @@
+# yaml-language-server: $schema=https://json.schemastore.org/github-action.json
+
+name: 'Rust toolchain setup'
+description: 'Common setup steps for GitHub workflows for Rust projects'
+
+runs:
+  using: composite
+  steps:
+    - uses: dtolnay/[email protected]
+      with:
+        components: clippy, rustfmt
+    - uses: extractions/setup-just@v1
+      with:
+        just-version: '1.15.0' # optional semver specification, otherwise latest
+
+    ###
+    ### Linux setup
+    ###
+    - name: rustup
+      # We need to use the nightly rust toolchain to enable registry-auth / to connect to ADO feeds.
+      if: ${{ (runner.os == 'Linux') }}
+      run: |
+        rustup set profile minimal
+        rustup install
+      shell: bash
+    # - name: Cargo login
+    #   if: ${{ (runner.os == 'Linux') }}
+    #   run: just cargo-login-ci
+    #   shell: bash
+
+    ###
+    ### Windows setup
+    ###
+    - name: rustup
+      # We need to use the nightly rust toolchain to enable registry-auth / to connect to ADO feeds.
+      if: ${{ (runner.os == 'Windows') }}
+      run: |
+        rustup set profile minimal
+        rustup install
+      shell: pwsh
+    # - name: Cargo login
+    #   if: ${{ (runner.os == 'Windows') }}
+    #   run: just cargo-login-ci-windows
+    #   shell: pwsh
132 changes: 132 additions & 0 deletions .github/workflows/rust-ci.yml
@@ -0,0 +1,132 @@
+name: Rust
+
+on: [pull_request]
+
+env:
+  CARGO_TERM_COLOR: always
+  RUST_LOG: onnxruntime=debug,onnxruntime-sys=debug
+  RUST_BACKTRACE: 1
+  MANIFEST_PATH: ${{ github.workspace }}/rust/Cargo.toml
+
+jobs:
+  fmt:
+    name: Rustfmt
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/actions/rust-toolchain-setup
+      - name: vendor onnxruntime source
+        run: just vendor
+      - name: fmt
+        run: cargo fmt --all -- --check
+
+  download:
+    name: Download prebuilt ONNX Runtime archive from build.rs
+    runs-on: ubuntu-latest
+    env:
+      ORT_RUST_STRATEGY: download
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/actions/rust-toolchain-setup
+      - run: rustup target install x86_64-unknown-linux-gnu
+      - run: rustup target install x86_64-apple-darwin
+      - run: rustup target install i686-pc-windows-msvc
+      - run: rustup target install x86_64-pc-windows-msvc
+      # ******************************************************************
+      - name: Download prebuilt archive (CPU, x86_64-unknown-linux-gnu)
+        run: cargo build --target x86_64-unknown-linux-gnu --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (CPU, x86_64-unknown-linux-gnu)
+        run: ls -lh target/x86_64-unknown-linux-gnu/debug/build/onnxruntime-sys-*/out/onnxruntime-linux-x64-1.*.tgz
+      # ******************************************************************
+      - name: Download prebuilt archive (CPU, x86_64-apple-darwin)
+        run: cargo build --target x86_64-apple-darwin --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (CPU, x86_64-apple-darwin)
+        run: ls -lh target/x86_64-apple-darwin/debug/build/onnxruntime-sys-*/out/onnxruntime-osx-x64-1.*.tgz
+      # ******************************************************************
+      - name: Download prebuilt archive (CPU, i686-pc-windows-msvc)
+        run: cargo build --target i686-pc-windows-msvc --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (CPU, i686-pc-windows-msvc)
+        run: ls -lh target/i686-pc-windows-msvc/debug/build/onnxruntime-sys-*/out/onnxruntime-win-x86-1.*.zip
+      # ******************************************************************
+      - name: Download prebuilt archive (CPU, x86_64-pc-windows-msvc)
+        run: cargo build --target x86_64-pc-windows-msvc --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (CPU, x86_64-pc-windows-msvc)
+        run: ls -lh target/x86_64-pc-windows-msvc/debug/build/onnxruntime-sys-*/out/onnxruntime-win-x64-1.*.zip
+      # ******************************************************************
+      - name: Download prebuilt archive (GPU, x86_64-unknown-linux-gnu)
+        env:
+          ORT_USE_CUDA: "yes"
+        run: cargo build --target x86_64-unknown-linux-gnu --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (GPU, x86_64-unknown-linux-gnu)
+        run: ls -lh target/x86_64-unknown-linux-gnu/debug/build/onnxruntime-sys-*/out/onnxruntime-linux-x64-gpu-1.*.tgz
+      # ******************************************************************
+      - name: Download prebuilt archive (GPU, x86_64-pc-windows-msvc)
+        env:
+          ORT_USE_CUDA: "yes"
+        run: cargo build --target x86_64-pc-windows-msvc --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Verify prebuilt archive downloaded (GPU, x86_64-pc-windows-msvc)
+        run: ls -lh target/x86_64-pc-windows-msvc/debug/build/onnxruntime-sys-*/out/onnxruntime-win-gpu-x64-1.*.zip
+
+  test:
+    name: Test Suite
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        target:
+          [
+            x86_64-unknown-linux-gnu,
+            x86_64-apple-darwin,
+            x86_64-pc-windows-msvc,
+            i686-pc-windows-msvc,
+          ]
+        include:
+          - target: x86_64-unknown-linux-gnu
+            os: ubuntu-latest
+          - target: x86_64-apple-darwin
+            os: macos-latest
+          - target: x86_64-pc-windows-msvc
+            os: windows-latest
+          - target: i686-pc-windows-msvc
+            os: windows-latest
+    env:
+      CARGO_BUILD_TARGET: ${{ matrix.target }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/actions/rust-toolchain-setup
+      - name: vendor onnxruntime source
+        run: just vendor
+      - run: rustup target install ${{ matrix.target }}
+      - name: Install additional packages (macOS)
+        if: contains(matrix.target, 'x86_64-apple-darwin')
+        run: brew install libomp
+      - name: Build (cargo build)
+        run: cargo build --all --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Build tests (cargo test)
+        run: cargo test --no-run --manifest-path ${{ env.MANIFEST_PATH }}
+      - name: Build onnxruntime with 'model-fetching' feature
+        run: cargo build --manifest-path ${{ env.MANIFEST_PATH }} --features model-fetching
+      - name: Test onnxruntime-sys
+        run: cargo test --package onnxruntime-sys -- --test-threads=1 --nocapture
+      - name: Test onnxruntime
+        run: cargo test --manifest-path ${{ env.MANIFEST_PATH }} --features model-fetching -- --test-threads=1 --nocapture
+
+  clippy:
+    name: Clippy
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/actions/rust-toolchain-setup
+      - name: vendor onnxruntime source
+        run: just vendor
+      - run: cargo clippy --all-features --manifest-path ${{ env.MANIFEST_PATH }} -- -D warnings
+
+  package-sys:
+    name: Package onnxruntime-sys
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/actions/rust-toolchain-setup
+      - name: vendor onnxruntime source
+        run: just vendor
+      - run: cargo package --allow-dirty --package onnxruntime-sys
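The verification steps of the download job imply a mapping from Rust target triple (plus the `ORT_USE_CUDA` switch) to the name of the prebuilt archive. The following is a minimal Python sketch of that mapping, reconstructed only from the `ls` patterns in this workflow. The names `ARCHIVE_STEMS` and `expected_archive_glob` are hypothetical; the actual download logic lives in the crate's build.rs, which is not shown in this diff.

```python
# Hypothetical helper: mirrors the `ls` globs used by the workflow's
# "Verify prebuilt archive downloaded" steps. Not part of the PR.
ARCHIVE_STEMS = {
    # (target triple, use_cuda) -> (archive stem, extension)
    ("x86_64-unknown-linux-gnu", False): ("onnxruntime-linux-x64", "tgz"),
    ("x86_64-apple-darwin", False): ("onnxruntime-osx-x64", "tgz"),
    ("i686-pc-windows-msvc", False): ("onnxruntime-win-x86", "zip"),
    ("x86_64-pc-windows-msvc", False): ("onnxruntime-win-x64", "zip"),
    ("x86_64-unknown-linux-gnu", True): ("onnxruntime-linux-x64-gpu", "tgz"),
    ("x86_64-pc-windows-msvc", True): ("onnxruntime-win-gpu-x64", "zip"),
}


def expected_archive_glob(target: str, use_cuda: bool = False) -> str:
    """Return the glob the workflow checks for a given target and CUDA setting."""
    try:
        stem, ext = ARCHIVE_STEMS[(target, use_cuda)]
    except KeyError:
        # The workflow only covers the six combinations listed above.
        raise ValueError(
            f"no archive pattern known for {target!r} (cuda={use_cuda})"
        ) from None
    return f"target/{target}/debug/build/onnxruntime-sys-*/out/{stem}-1.*.{ext}"
```

Note that the workflow has no GPU check for the macOS or 32-bit Windows targets, so the sketch raises `ValueError` for those combinations rather than guessing a name.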
5 changes: 3 additions & 2 deletions .github/workflows/stale.yml
@@ -13,14 +13,15 @@ jobs:
       issues: write
       pull-requests: write
     steps:
-      - uses: actions/stale@v4.1.1
+      - uses: actions/stale@v8.0.0
         with:
           # Comma separated list of labels that can be assigned to issues to exclude them from being marked as stale
           exempt-issue-labels: contributions welcome, feature request, regression
           # Override exempt-all-assignees but only to exempt the issues with an assignee to be marked as stale automatically
           exempt-all-issue-assignees: true
           # Used to ignore the issues and pull requests created before the start date
-          start-date: 20220419
+          # Start date should be April 19, 2022 - corresponds to the day previous stale bot stopped working
+          start-date: '2022-04-19T00:00:00Z'
           # Number of days without activity before the actions/stale action labels an issue
           days-before-issue-stale: 30
           # Number of days without activity before the actions/stale action closes an issue
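The `start-date` change swaps a bare number for a quoted ISO 8601 timestamp. As a rough illustration of the difference, the Python sketch below parses the new value and shows that the old one does not fit the same timestamp format. The helper name `parse_start_date` is hypothetical, and how actions/stale itself interpreted the old value is an assumption not confirmed by this diff.

```python
from datetime import datetime, timezone


def parse_start_date(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' timestamp into an aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


new_value = "2022-04-19T00:00:00Z"  # value added in stale.yml
old_value = "20220419"              # value removed in stale.yml

# The new value is an unambiguous instant: midnight UTC on April 19, 2022.
start = parse_start_date(new_value)

# The old value is just eight digits; it is not a timestamp in this format.
try:
    parse_start_date(old_value)
    old_is_timestamp = True
except ValueError:
    old_is_timestamp = False
```

Quoting the new value also keeps YAML from reading it as anything other than a string.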
19 changes: 0 additions & 19 deletions .vscode/settings.json
@@ -16,25 +16,6 @@
     },
     // Enable Python linting and Pylance type checking
     "python.analysis.typeCheckingMode": "basic",
-    "python.formatting.provider": "black",
-    "python.formatting.blackArgs": [
-        "--line-length",
-        "120"
-    ],
-    "python.sortImports.args": [
-        "--profile",
-        "black",
-        "--line-length",
-        "120"
-    ],
-    "python.linting.enabled": true,
-    "python.linting.flake8Enabled": true,
-    "python.linting.pylintEnabled": true,
-    "python.linting.pydocstyleEnabled": true,
-    "python.linting.pydocstyleArgs": [
-        "--convention=google"
-    ],
-    "python.linting.banditEnabled": true,
     "cpplint.lineLength": 120,
     "cpplint.filters": [
         "-build/include_subdir",
4 changes: 2 additions & 2 deletions cgmanifests/generated/cgmanifest.json
@@ -286,7 +286,7 @@
       "component": {
         "type": "git",
         "git": {
-          "commitHash": "c4f6b8c6bc94ff69048492fb34df0dfaf1983933",
+          "commitHash": "6f47420213f757831fae65c686aa471749fa8d60",
           "repositoryUrl": "https://github.com/NVIDIA/cutlass.git"
         },
         "comments": "cutlass"
@@ -316,7 +316,7 @@
       "component": {
         "type": "git",
         "git": {
-          "commitHash": "d52ec01652b7d620386251db92455968d8d90bdc",
+          "commitHash": "a4f72a314a85732ed67d5aa8d1088d207a7e0e61",
           "repositoryUrl": "https://github.com/ROCmSoftwarePlatform/composable_kernel.git"
         },
         "comments": "composable_kernel"
18 changes: 15 additions & 3 deletions cmake/CMakeLists.txt
@@ -114,9 +114,7 @@ option(onnxruntime_ENABLE_LTO "Enable link time optimization" OFF)
 option(onnxruntime_CROSS_COMPILING "Cross compiling onnx runtime" OFF)
 option(onnxruntime_GCOV_COVERAGE "Compile with options necessary to run code coverage" OFF)
 option(onnxruntime_DONT_VECTORIZE "Do not vectorize operations in Eigen" OFF)
-
-#It's preferred to turn it OFF when onnxruntime is dynamically linked to PROTOBUF. But Tensort always required the full version of protobuf.
-cmake_dependent_option(onnxruntime_USE_FULL_PROTOBUF "Link to libprotobuf instead of libprotobuf-lite when this option is ON" OFF "NOT onnxruntime_USE_TENSORRT" ON)
+option(onnxruntime_USE_FULL_PROTOBUF "Link to libprotobuf instead of libprotobuf-lite when this option is ON" OFF)
 option(tensorflow_C_PACKAGE_PATH "Path to tensorflow C package installation dir")
 option(onnxruntime_ENABLE_LANGUAGE_INTEROP_OPS "Enable operator implemented in language other than cpp" OFF)
 option(onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS "Dump debug information about node inputs and outputs when executing the model." OFF)
@@ -526,7 +524,21 @@ if(NOT WIN32 AND NOT CMAKE_SYSTEM_NAME STREQUAL "Android")
   find_package(Iconv REQUIRED)
   set(ICONV_LIB Iconv::Iconv)
 endif()
+
 find_package(Patch)
+if (WIN32 AND NOT Patch_FOUND)
+  # work around CI machines missing patch from the git install by falling back to the binary in this repo.
+  # replicate what happens in https://github.com/Kitware/CMake/blob/master/Modules/FindPatch.cmake but without
+  # the hardcoded suffixes in the path to the patch binary.
+  find_program(Patch_EXECUTABLE NAMES patch PATHS ${PROJECT_SOURCE_DIR}/external/git.Win32.2.41.03.patch)
+  if(Patch_EXECUTABLE)
+    set(Patch_FOUND 1)
+    if (NOT TARGET Patch::patch)
+      add_executable(Patch::patch IMPORTED)
+      set_property(TARGET Patch::patch PROPERTY IMPORTED_LOCATION ${Patch_EXECUTABLE})
+    endif()
+  endif()
+endif()
 if(Patch_FOUND)
   message("Patch found: ${Patch_EXECUTABLE}")
 endif()
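The second CMakeLists.txt hunk adds a two-stage lookup: try the normal `find_package(Patch)` search first, and only on Windows fall back to the `patch` binary checked into the repository. A minimal Python sketch of the same control flow follows, using `shutil.which` in place of CMake's search. The function name `find_patch` and its `system_path` parameter are hypothetical and exist only to make the sketch testable; this is not how CMake resolves programs internally.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_patch(bundled_dir: Path, is_windows: bool,
               system_path: Optional[str] = None) -> Optional[str]:
    """Locate a `patch` executable, falling back to a bundled copy on Windows.

    system_path: search path for the first attempt; None means use $PATH.
    """
    # Stage 1: the ordinary search, analogous to find_package(Patch).
    found = shutil.which("patch", path=system_path)

    # Stage 2: only on Windows, and only if stage 1 found nothing,
    # look in the directory that ships with the repository.
    if found is None and is_windows:
        found = shutil.which("patch", path=str(bundled_dir))

    return found
```

As in the CMake change, the bundled binary never overrides a `patch` that is already installed; it is only a last resort for CI machines that lack one.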
4 changes: 2 additions & 2 deletions cmake/deps.txt
@@ -51,7 +51,7 @@ pytorch_cpuinfo;https://github.com/pytorch/cpuinfo/archive/959002f82d7962a473d8b
 re2;https://github.com/google/re2/archive/refs/tags/2022-06-01.zip;aa77313b76e91b531ee7f3e45f004c6a502a5374
 safeint;https://github.com/dcleblanc/SafeInt/archive/refs/tags/3.0.28.zip;23f252040ff6cb9f1fd18575b32fa8fb5928daac
 tensorboard;https://github.com/tensorflow/tensorboard/archive/373eb09e4c5d2b3cc2493f0949dc4be6b6a45e81.zip;67b833913605a4f3f499894ab11528a702c2b381
-cutlass;https://github.com/NVIDIA/cutlass/archive/refs/tags/v3.0.0.zip;0f95b3c1fc1bd1175c4a90b2c9e39074d1bccefd
+cutlass;https://github.com/NVIDIA/cutlass/archive/refs/tags/v3.1.0.zip;757f90a795034a89d4f48a79d1f009f7a04c8dee
 utf8_range;https://github.com/protocolbuffers/utf8_range/archive/72c943dea2b9240cd09efde15191e144bc7c7d38.zip;9925739c9debc0efa2adcb194d371a35b6a03156
 extensions;https://github.com/microsoft/onnxruntime-extensions/archive/94142d8391c9791ec71c38336436319a2d4ac7a0.zip;4365ac5140338b4cb75a39944a4be276e3829b3c
-composable_kernel;https://github.com/ROCmSoftwarePlatform/composable_kernel/archive/d52ec01652b7d620386251db92455968d8d90bdc.zip;6b5ce8edf3625f8817086c194fbf94b664e1b0e0
+composable_kernel;https://github.com/ROCmSoftwarePlatform/composable_kernel/archive/a4f72a314a85732ed67d5aa8d1088d207a7e0e61.zip;f57357ab6d300e207a632d034ebc8aa036a090d9
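Each deps.txt entry has three semicolon-separated fields: a dependency name, an archive URL, and a hash that the CMake files in this diff consume as `URL_HASH SHA1=${DEP_SHA1_<name>}`. Bumping a dependency therefore means changing the URL and the hash together. The following Python sketch parses that format and checks a downloaded payload against the recorded hash. The names `Dep`, `parse_deps`, and `sha1_matches` are hypothetical, and skipping `#` comment lines is an assumption about the file format.

```python
import hashlib
from typing import Dict, NamedTuple


class Dep(NamedTuple):
    name: str
    url: str
    sha1: str


def parse_deps(text: str) -> Dict[str, Dep]:
    """Parse `name;url;sha1` lines into a dict keyed by dependency name."""
    deps: Dict[str, Dep] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments carry no entries.
        if not line or line.startswith("#"):
            continue
        name, url, sha1 = line.split(";")
        deps[name] = Dep(name, url, sha1)
    return deps


def sha1_matches(data: bytes, expected: str) -> bool:
    """Return True if the SHA1 of `data` equals the recorded hex digest."""
    return hashlib.sha1(data).hexdigest() == expected.lower()
```

A caller would download `deps["cutlass"].url` and pass the bytes to `sha1_matches` with `deps["cutlass"].sha1`, which is roughly the check that `URL_HASH` asks CMake to perform.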
1 change: 0 additions & 1 deletion cmake/external/abseil-cpp.natvis
@@ -30,7 +30,6 @@
       <Intrinsic Name="_capacity" Expression="_commonfields().capacity_"/>
       <Intrinsic Name="_control" Expression="_commonfields().control_"/>
       <Intrinsic Name="_slots" Expression="(slot_type*)(_commonfields().slots_)"/>
-      <DisplayString Condition="_size() == 0">empty</DisplayString>
       <DisplayString IncludeView="noparens">size={ _size() }</DisplayString>
       <DisplayString ExcludeView="noparens">size=({_size()})</DisplayString>
       <Expand>
3 changes: 2 additions & 1 deletion cmake/external/composable_kernel.cmake
@@ -12,13 +12,14 @@ if(NOT composable_kernel_POPULATED)
   FetchContent_Populate(composable_kernel)
   set(BUILD_DEV OFF CACHE BOOL "Disable -Weverything, otherwise, error: 'constexpr' specifier is incompatible with C++98 [-Werror,-Wc++98-compat]" FORCE)
   # Exclude i8 device gemm instances due to excessive long compilation time and not being used
-  set(DTYPES fp32 fp16 bf16)
+  set(DTYPES fp32 fp16 bf16 fp8)
   set(INSTANCES_ONLY ON)
   add_subdirectory(${composable_kernel_SOURCE_DIR} ${composable_kernel_BINARY_DIR} EXCLUDE_FROM_ALL)
 
   add_library(onnxruntime_composable_kernel_includes INTERFACE)
   target_include_directories(onnxruntime_composable_kernel_includes INTERFACE
     ${composable_kernel_SOURCE_DIR}/include
+    ${composable_kernel_BINARY_DIR}/include
     ${composable_kernel_SOURCE_DIR}/library/include)
   target_compile_definitions(onnxruntime_composable_kernel_includes INTERFACE __fp32__ __fp16__ __bf16__)
 endif()
1 change: 0 additions & 1 deletion cmake/external/cutlass.cmake
@@ -4,7 +4,6 @@ if (onnxruntime_USE_FLASH_ATTENTION OR onnxruntime_USE_MEMORY_EFFICIENT_ATTENTIO
     cutlass
     URL ${DEP_URL_cutlass}
     URL_HASH SHA1=${DEP_SHA1_cutlass}
-    PATCH_COMMAND ${Patch_EXECUTABLE} --binary --ignore-whitespace -p1 < ${PROJECT_SOURCE_DIR}/patches/cutlass/cutlass.patch
   )
 
   FetchContent_GetProperties(cutlass)
21 changes: 6 additions & 15 deletions cmake/external/eigen.cmake
@@ -1,23 +1,14 @@
-
 if (onnxruntime_USE_PREINSTALLED_EIGEN)
     add_library(eigen INTERFACE)
     file(TO_CMAKE_PATH ${eigen_SOURCE_PATH} eigen_INCLUDE_DIRS)
     target_include_directories(eigen INTERFACE ${eigen_INCLUDE_DIRS})
 else ()
-    if (onnxruntime_USE_ACL)
-        FetchContent_Declare(
-            eigen
-            URL ${DEP_URL_eigen}
-            URL_HASH SHA1=${DEP_SHA1_eigen}
-            PATCH_COMMAND ${Patch_EXECUTABLE} --ignore-space-change --ignore-whitespace < ${PROJECT_SOURCE_DIR}/patches/eigen/Fix_Eigen_Build_Break.patch
-        )
-    else()
-        FetchContent_Declare(
-            eigen
-            URL ${DEP_URL_eigen}
-            URL_HASH SHA1=${DEP_SHA1_eigen}
-        )
-    endif()
+    FetchContent_Declare(
+        eigen
+        URL ${DEP_URL_eigen}
+        URL_HASH SHA1=${DEP_SHA1_eigen}
+    )
+
     FetchContent_Populate(eigen)
     set(eigen_INCLUDE_DIRS "${eigen_SOURCE_DIR}")
 endif()
Binary file not shown.
Binary file not shown.
Binary file added cmake/external/git.Win32.2.41.03.patch/patch.exe
Binary file not shown.
1 change: 1 addition & 0 deletions cmake/external/onnxruntime_external_deps.cmake
@@ -335,6 +335,7 @@ if(onnxruntime_USE_CUDA)
     URL ${DEP_URL_microsoft_gsl}
     URL_HASH SHA1=${DEP_SHA1_microsoft_gsl}
     PATCH_COMMAND ${Patch_EXECUTABLE} --binary --ignore-whitespace -p1 < ${PROJECT_SOURCE_DIR}/patches/gsl/1064.patch
+    FIND_PACKAGE_ARGS 4.0 NAMES Microsoft.GSL
   )
 else()
   FetchContent_Declare(
6 changes: 1 addition & 5 deletions cmake/onnxruntime.cmake
@@ -282,11 +282,7 @@ endif()
 
 # Assemble the Apple static framework (iOS and macOS)
 if(onnxruntime_BUILD_APPLE_FRAMEWORK)
-  if(${CMAKE_SYSTEM_NAME} STREQUAL "iOS")
-    set(STATIC_FRAMEWORK_OUTPUT_DIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_BUILD_TYPE}-${CMAKE_OSX_SYSROOT})
-  else() # macOS
-    set(STATIC_FRAMEWORK_OUTPUT_DIR ${CMAKE_CURRENT_BINARY_DIR})
-  endif()
+  set(STATIC_FRAMEWORK_OUTPUT_DIR ${CMAKE_CURRENT_BINARY_DIR}/${CMAKE_BUILD_TYPE}-${CMAKE_OSX_SYSROOT})
 
   # Setup the various directories required. Remove any existing ones so we start with a clean directory.
   set(STATIC_LIB_DIR ${CMAKE_CURRENT_BINARY_DIR}/static_libraries)
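The onnxruntime.cmake hunk removes the iOS/macOS branch so that both platforms write the static framework to a `<build type>-<sysroot>` subdirectory of the binary directory. A small Python sketch of the before and after path logic is shown below. The function names are hypothetical, and the sysroot values in the usage note (`iphoneos`, `macosx`) are illustrative assumptions; `CMAKE_OSX_SYSROOT` can also hold a full SDK path, which this sketch does not handle.

```python
from pathlib import PurePosixPath


def framework_output_dir_before(binary_dir: str, build_type: str,
                                sysroot: str, system_name: str) -> PurePosixPath:
    """Old behavior: only iOS builds got a per-sysroot subdirectory."""
    base = PurePosixPath(binary_dir)
    if system_name == "iOS":
        return base / f"{build_type}-{sysroot}"
    # macOS builds wrote the framework straight into the binary dir.
    return base


def framework_output_dir_after(binary_dir: str, build_type: str,
                               sysroot: str) -> PurePosixPath:
    """New behavior: every Apple build uses <build type>-<sysroot>."""
    return PurePosixPath(binary_dir) / f"{build_type}-{sysroot}"
```

For example, under these assumptions a Release macOS build in `/build` moves from `/build` to `/build/Release-macosx`, matching the layout iOS builds already used.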
3 changes: 3 additions & 0 deletions cmake/onnxruntime_mlas.cmake
@@ -33,6 +33,7 @@ onnxruntime_add_static_library(onnxruntime_mlas
   ${MLAS_SRC_DIR}/qpostprocessor.cpp
   ${MLAS_SRC_DIR}/qlgavgpool.cpp
   ${MLAS_SRC_DIR}/qdwconv_kernelsize.cpp
+  ${MLAS_SRC_DIR}/sqnbitgemm.cpp
 )
 
 if (NOT onnxruntime_ORT_MINIMAL_BUILD)
@@ -68,6 +69,7 @@ function(setup_mlas_source_for_windows)
       ${MLAS_SRC_DIR}/qgemm_kernel_neon.cpp
       ${MLAS_SRC_DIR}/qgemm_kernel_udot.cpp
       ${MLAS_SRC_DIR}/qgemm_kernel_sdot.cpp
+      ${MLAS_SRC_DIR}/sqnbitgemm_kernel_neon.cpp
     )
 
     set(mlas_platform_preprocess_srcs
@@ -334,6 +336,7 @@ else()
       ${MLAS_SRC_DIR}/qgemm_kernel_neon.cpp
       ${MLAS_SRC_DIR}/qgemm_kernel_udot.cpp
       ${MLAS_SRC_DIR}/qgemm_kernel_sdot.cpp
+      ${MLAS_SRC_DIR}/sqnbitgemm_kernel_neon.cpp
     )
     if (NOT APPLE)
       set(mlas_platform_srcs