Fix dynamic linking issues with prebuilt pip packages #3049

dbort · 2024-04-16T00:09:07Z

We can now prebuild pip packages, but we're hitting issues when importing pybindings and torch at runtime.

More context in T185860350

Test Plan

Ensure that the PR is against release/0.2

PR must touch one of the files in the pull_request.paths entry for a given build-wheels workflow:

executorch/.github/workflows/build-wheels-m1.yml

Lines 5 to 8 in 90d0c1a

    
           pull_request: 
        
             paths: 
        
               - build/packaging/** 
        
               - .github/workflows/build-wheels-m1.yml

This is necessary to run the build-wheel jobs in the test channel (using the pytorch release pip package) instead of the nightly channel

Add the label ciflow/binaries/all to the PR so that it builds for all versions of python and not just 3.8 (which doesn't work for executorch)
After the artifacts are built, follow the instructions in the comment below: Fix dynamic linking issues with prebuilt pip packages #3049 (comment)
Also, this PR adds a smoke test that will test the wheels during the build job

pytorch-bot · 2024-04-16T00:09:10Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3049

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 5 New Failures

As of commit 358a03f with merge base d3326a2 ():

NEW FAILURES - The following jobs have failed:

Build Linux Wheels / pytorch/executorch / upload / manywheel-py3_12-cpu (gh)
Unable to find an artifact with the name: pytorch_executorch__3.12_cpu_x86_64
Build Linux Wheels / pytorch/executorch / upload / manywheel-py3_8-cpu (gh)
Unable to find an artifact with the name: pytorch_executorch__3.8_cpu_x86_64
Build M1 Wheels / pytorch/executorch / upload / wheel-py3_12-cpu (gh)
Unable to find an artifact with the name: pytorch_executorch__3.12_cpu_
Build M1 Wheels / pytorch/executorch / upload / wheel-py3_8-cpu (gh)
Unable to find an artifact with the name: pytorch_executorch__3.8_cpu_
trunk / test-coreml-delegate / macos-job (gh)
RuntimeError: Command bash /Users/runner/work/_temp/exec_script failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

dbort · 2024-04-16T00:13:03Z

Building wheels:

macos: https://github.com/pytorch/executorch/actions/runs/8697807901
- Triggered from https://github.com/pytorch/executorch/actions/workflows/build-wheels-m1.yml
linux: https://github.com/pytorch/executorch/actions/runs/8697825002
- Triggered from https://github.com/pytorch/executorch/actions/workflows/build-wheels-linux.yml

dbort · 2024-04-18T23:37:08Z

To test out a prebuilt wheel on Linux or macOS:

Ensure that you have conda installed
Click a link to download a zipped wheel file:
- macOS: pytorch_executorch__3.10_cpu_.zip
- Linux: pytorch_executorch__3.10_cpu_x86_64.zip
  - If you're downloading this on your mac you may need to scp the .zip file to your devserver, like scp ~/Downloads/pytorch_executorch__3.10_cpu_x86_64.zip MYDEVERSVER:/tmp/
Download the test script: curl -o /tmp/test_wheel_install.sh https://raw.githubusercontent.com/pytorch/executorch/1f3c8ff6a6eae81edcb7d192dbe982b59165e3ff/build/test_wheel_install.sh
cd to the root of a relatively recent executorch repo. Exact version doesn't matter as long as it can run the xnnpack examples.
Run the script against the zip file you downloaded: bash /tmp/test_wheel_install.sh /path/to/pytorch_executorch__xyz.zip | tee /tmp/wheel-install-logs
- NOTE: This will create a new conda environment called executorch-temp that you can delete with conda remove -y --name executorch-temp --all

If you see PASS at the end, it worked! If not, please create a paste of /tmp/wheel-install-logs and let me know.

+ python /tmp/test-wheel-install-lWkozN8xUl/test_xnnpack_e2e.py
I 00:00:00.003848 executorch:program.cpp:129] InternalConsistency verification requested but not available
E 00:00:00.023344 executorch:method.cpp:936] Overriding output data pointer allocated by memory plan is not allowed.
I 00:00:00.023371 executorch:pybindings.cpp:196] Cannot set_output_data_ptr(): this likely means the outputs were MemoryPlanned inspect the error code to know for sure, but likely this is not an issue. 0x2
PASS <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
+ rm -rf /tmp/test-wheel-install-lWkozN8xUl

kirklandsign · 2024-04-19T21:46:54Z

PASS
Apple M1 Pro
Darwin Kernel Version 23.4.0

larryliu0820 · 2024-04-21T16:05:41Z

I ran into some issues on my Mac m1: https://gist.github.com/larryliu0820/7f95f711c2a40835c3e39680a7a6e44d

From @dbort:
@larryliu0820 tracked this down to an issue in the test script. python3 was falling back to the system python, which didn't have torch installed. Running python was able to find the installed torch package.

dbort · 2024-04-22T19:24:10Z

Added a smoke test script to the wheel-build job: succeeding for the expected-compatible python versions, like 3.10 for linux and 3.10 for mac

libtorch.so builds with the old glibc ABI, so we need to as well, for any source files that include torch headers.

pip wheels will need to be able to find the torch libraries. On Linux, the .so has non-absolute dependencies on libs like "libtorch.so" without paths; as long as we `import torch` first, those dependencies will work. But Apple dylibs do not support non-absolute dependencies, so we need to tell the loader where to look for its libraries. The LC_LOAD_DYLIB entries for the torch libraries will look like "@rpath/libtorch.dylib", so we can add an LC_RPATH entry to look in a directory relative to the installed location of our _portable_lib.so file. To see these LC_* values, run `otool -l _portable_lib*.so`.

The executorch build system will ensure that .dylib/.so files have LC_LOAD_DYLIB and LC_RPATH entries that will work when they're installed. Delocating (i.e., making copies of the .dylibs that ET's libs depend on) will break any libs that depend on the torch libraries if users ever import both `torch` and the executorch library. Both import paths must load exactly the same file, not just a copy of it.

This script is run by CI after building the executorch wheel. Before running this, the job will install the matching torch package as well as the newly-built executorch package and its dependencies. For now we test the export of a simple model, and try executing it using the runtime pybindings. Test Plan: ``` ./install_requirements.sh python build/packaging/smoke_test.py ```

larryliu0820

Thanks for adding the test

dbort · 2024-04-22T22:47:41Z

Successfully ran the test script under WSL running Ubuntu 20.04.4 LTS on Windows 10 22H2, AMD Ryzen 7 2700 Eight-Core Processor 3.20 GHz, 32GB RAM

guangy10 · 2024-04-23T00:46:17Z

@dbort We expect the prebuilt works for 3.10 only right?

dbort · 2024-04-23T16:23:38Z

@dbort We expect the prebuilt works for 3.10 only right?

That's the main version that we need to work. Currently there's no way to disable the python versions that we don't expect to work (like 3.8 and 3.12).

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 16, 2024

dbort added the ciflow/binaries/all label Apr 16, 2024

seemethere added ciflow/binaries/all and removed ciflow/binaries/all labels Apr 16, 2024

dbort force-pushed the sandbox/dbort/pip-linking branch from 516ba48 to ae1a658 Compare April 16, 2024 20:31

dbort added ciflow/binaries and removed ciflow/binaries/all labels Apr 16, 2024

dbort force-pushed the sandbox/dbort/pip-linking branch from de14f34 to ae1a658 Compare April 16, 2024 22:15

dbort added ciflow/binaries/all Release PRs with this label will build wheels for all python versions and removed ciflow/binaries labels Apr 17, 2024

dbort force-pushed the sandbox/dbort/pip-linking branch from ba3b0a4 to f99f0cf Compare April 17, 2024 18:46

dbort added ciflow/binaries and removed ciflow/binaries labels Apr 17, 2024

dbort force-pushed the sandbox/dbort/pip-linking branch 2 times, most recently from aa9965d to 0a7566d Compare April 18, 2024 23:25

dbort force-pushed the sandbox/dbort/pip-linking branch 2 times, most recently from 52afdb2 to aca550b Compare April 19, 2024 20:56

dbort force-pushed the sandbox/dbort/pip-linking branch from aca550b to 5cf96f5 Compare April 20, 2024 01:07

dbort force-pushed the sandbox/dbort/pip-linking branch from 5cf96f5 to 2a58198 Compare April 22, 2024 17:52

dbort added 4 commits April 22, 2024 14:57

Build pybindings with -D_GLIBCXX_USE_CXX11_ABI=0 to match libtorch.so

f4e5086

libtorch.so builds with the old glibc ABI, so we need to as well, for any source files that include torch headers.

dbort force-pushed the sandbox/dbort/pip-linking branch from 9eed1f6 to 358a03f Compare April 22, 2024 21:57

larryliu0820 approved these changes Apr 22, 2024

View reviewed changes

dbort added the ciflow/trunk label Apr 22, 2024

dbort mentioned this pull request Apr 22, 2024

[v0.2.0] Release Tracker #2666

Closed

guangy10 approved these changes Apr 23, 2024

View reviewed changes

guangy10 merged commit 1fd5562 into release/0.2 Apr 23, 2024
97 of 106 checks passed

dbort deleted the sandbox/dbort/pip-linking branch April 23, 2024 20:26

mergennachin mentioned this pull request Apr 25, 2024

fix llama readme #3339

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix dynamic linking issues with prebuilt pip packages #3049

Fix dynamic linking issues with prebuilt pip packages #3049

dbort commented Apr 16, 2024 •

edited

Loading

pytorch-bot bot commented Apr 16, 2024 •

edited

Loading

dbort commented Apr 16, 2024 •

edited

Loading

dbort commented Apr 18, 2024 •

edited

Loading

kirklandsign commented Apr 19, 2024

larryliu0820 commented Apr 21, 2024 •

edited by dbort

Loading

dbort commented Apr 22, 2024

larryliu0820 left a comment

dbort commented Apr 22, 2024

guangy10 commented Apr 23, 2024

dbort commented Apr 23, 2024

	pull_request:
	paths:
	- build/packaging/**
	- .github/workflows/build-wheels-m1.yml

Fix dynamic linking issues with prebuilt pip packages #3049

Fix dynamic linking issues with prebuilt pip packages #3049

Conversation

dbort commented Apr 16, 2024 • edited Loading

Test Plan

pytorch-bot bot commented Apr 16, 2024 • edited Loading

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3049

❌ 5 New Failures

dbort commented Apr 16, 2024 • edited Loading

dbort commented Apr 18, 2024 • edited Loading

kirklandsign commented Apr 19, 2024

larryliu0820 commented Apr 21, 2024 • edited by dbort Loading

dbort commented Apr 22, 2024

larryliu0820 left a comment

Choose a reason for hiding this comment

dbort commented Apr 22, 2024

guangy10 commented Apr 23, 2024

dbort commented Apr 23, 2024

dbort commented Apr 16, 2024 •

edited

Loading

pytorch-bot bot commented Apr 16, 2024 •

edited

Loading

dbort commented Apr 16, 2024 •

edited

Loading

dbort commented Apr 18, 2024 •

edited

Loading

larryliu0820 commented Apr 21, 2024 •

edited by dbort

Loading