Erroneous Conda Windows MKL dependency #126279

Open
baszalmstra opened this issue May 15, 2024 · 5 comments · May be fixed by pytorch/builder#1830
Assignees
Labels
module: binaries Anything related to official binaries that we release to users module: intel Specific to x86 architecture module: mkl Related to our MKL support module: windows Windows support for PyTorch triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Comments

@baszalmstra
Contributor

baszalmstra commented May 15, 2024

pytorch/setup.py

Line 1112 in a0aaf56

'mkl>=2021.1.1,<=2021.4.0; platform_system == "Windows"',

This line makes the Python package depend on mkl >=2021.1.1,<=2021.4.0 on Windows. However, the built conda packages do not match this requirement.

For instance, take the conda package win-64_pytorch-2.3.0-py3.12_cuda11.8_cudnn8_0:

The PKG_INFO inside the package lists:

Requires-Dist: mkl<=2021.4.0,>=2021.1.1; platform_system == "Windows"

However the index.json has:

  "depends": [
    ...
    "mkl 2023.1.*",
    ...
  ],

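Running pip-check will fail, since the mkl 2023.1.* that the conda package depends on falls outside the <=2021.4.0 bound declared in PKG_INFO.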

However, the conda package win-64_pytorch-2.3.0-py3.9_cuda11.8_cudnn8_0 (note the py3.9) does specify:

  "depends": [
    ...
    "mkl 2021.4.*",
    ...
  ],
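The mismatch between the two packages above can be illustrated with a small check. This is only a sketch: it compares plain dotted numeric versions rather than implementing full PEP 440 rules, the helper names are hypothetical, and the concrete versions 2023.1.0 and 2021.4.0 are assumed examples of what the `mkl 2023.1.*` and `mkl 2021.4.*` pins could resolve to.

```python
# Simplified sketch: handles dotted numeric versions only, not full PEP 440.

def parse_version(text):
    """Turn '2023.1.0' into (2023, 1, 0) so versions compare as tuples."""
    return tuple(int(part) for part in text.split("."))


def satisfies_mkl_requirement(version, lower="2021.1.1", upper="2021.4.0"):
    """Check lower <= version <= upper, mirroring 'mkl>=2021.1.1,<=2021.4.0'."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


# Assumed concrete version for the py3.12 package's "mkl 2023.1.*" pin.
print(satisfies_mkl_requirement("2023.1.0"))  # False
# Assumed concrete version for the py3.9 package's "mkl 2021.4.*" pin.
print(satisfies_mkl_requirement("2021.4.0"))  # True
```

Under these assumptions, only the py3.9 package can end up with an mkl version that the Requires-Dist line accepts.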

I think it's because of these lines in the meta.yaml:

https://github.com/pytorch/builder/blob/a4b440ec0f687c84d089310d68ae3345d6eca7fc/conda/pytorch-nightly/meta.yaml#L26-27

Regardless, I think this dependency should depend on the BLAS implementation chosen. If BLAS is not set to mkl, I think this dependency shouldn't be there in the first place.
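A rough sketch of what a BLAS-dependent requirement could look like is shown here. It is not the actual setup.py logic: the helper `build_install_requires` is hypothetical, and both reading the choice from a `BLAS` environment variable and treating an unset variable as MKL are assumptions made for illustration.

```python
import os

# The requirement string quoted from setup.py in this issue.
MKL_WINDOWS_REQUIREMENT = 'mkl>=2021.1.1,<=2021.4.0; platform_system == "Windows"'


def build_install_requires(base_requirements, blas=None):
    """Return a requirements list, adding the MKL pin only when BLAS is MKL.

    Hypothetical helper. `blas` falls back to the BLAS environment variable;
    defaulting to "MKL" when it is unset is an assumption of this sketch.
    """
    if blas is None:
        blas = os.environ.get("BLAS", "MKL")
    requirements = list(base_requirements)
    if blas.upper() == "MKL":
        requirements.append(MKL_WINDOWS_REQUIREMENT)
    return requirements


# With a non-MKL BLAS the pin is left out; with MKL it is appended.
print(build_install_requires(["filelock"], blas="OpenBLAS"))
print(build_install_requires(["filelock"], blas="MKL"))
```

The environment marker still restricts the pin to Windows; the extra condition only decides whether the pin is emitted at all.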

cc @seemethere @malfet @osalpekar @atalman @peterjc123 @mszhanyi @skyline75489 @nbcsm @vladimir-aubrecht @iremyux @Blackhex @cristianPanaite @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

atalman added the module: binaries label May 15, 2024
atalman added this to the 2.3.1 milestone May 15, 2024
cpuhrsch added the module: windows and triaged labels May 16, 2024
malfet added the module: mkl and module: intel labels May 17, 2024
@malfet
Contributor

malfet commented May 17, 2024

I think conda and wheel packages should depend on the same version of mkl across all x86 platforms.

Regardless, I think this dependency should depend on the BLAS implementation chosen. If BLAS is not set to mkl, I think this dependency shouldn't be there in the first place.

Unfortunately, one cannot replace MKL with OpenBLAS without sacrificing some functionality (not to mention performance). Namely, MKL implements some FFT functions as well as sparse functionality. If you know of any plans for open alternatives with replaceable interfaces (for example, we build with PocketFFT on Arm machines), please do not hesitate to share those ideas here.

@jgong5 do you know if mkldnn necessarily depends on MKL?

@atalman
Contributor

atalman commented May 17, 2024

Yes, let me try setting the same dependency for our conda package as we have in the Windows wheels.

atalman self-assigned this May 17, 2024
@atalman
Contributor

atalman commented May 22, 2024

Hi @baszalmstra, this requires more investigation. Simply aligning the dependency does not work with Python 3.12:
https://github.com/pytorch/pytorch/actions/runs/9193698377/job/25287599393?pr=121979#step:13:443

C:\actions-runner\_work\pytorch\pytorch\builder\windows>conda.bat activate testenv 
C:\actions-runner\_work\_temp\artifacts\pytorch-2.4.0.dev20240522-py3.12_cpu_0.tar.bz2
1 File(s) copied
2024-05-22T19:19:52 Subdir: noarch Gathering repodata
2024-05-22T19:19:52 noarch Writing pre-patch repodata
2024-05-22T19:19:52 noarch Applying patch instructions
2024-05-22T19:19:52 noarch Writing patched repodata
2024-05-22T19:19:52 noarch Building current_repodata subset
2024-05-22T19:19:52 noarch Writing current_repodata subset
2024-05-22T19:19:52 noarch Writing index HTML
Channels:
 - file:///C:/actions-runner/_work/_temp/artifacts
 - pytorch
 - numba/label/dev
 - nvidia
 - defaults
Platform: win-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\envs\testenv

  added / updated specs:
    - pytorch==2.4.0.dev20240522


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    blas-1.0                   |              mkl           6 KB
    intel-openmp-2023.1.0      |   h59b6b97_46320         2.7 MB
    libuv-1.44.2               |       h2bbff1b_0         288 KB
    mkl-2023.1.0               |   h6b88ed4_46358       155.9 MB
    mpmath-1.3.0               |  py312haa95532_0         989 KB
    networkx-3.1               |  py312haa95532_0         2.9 MB
    pytorch-2.4.0.dev20240522  |     py3.12_cpu_0       146.5 MB  file:///C:/actions-runner/_work/_temp/artifacts
    pytorch-mutex-1.0          |              cpu           3 KB  pytorch
    sympy-1.12                 |  py312haa95532_0        14.0 MB
    tbb-2021.8.0               |       h59b6b97_0         149 KB
    typing_extensions-4.11.0   |  py312haa95532_0          75 KB
    ------------------------------------------------------------
                                           Total:       323.4 MB

The following NEW packages will be INSTALLED:

  blas               pkgs/main/win-64::blas-1.0-mkl 
  filelock           pkgs/main/win-64::filelock-3.13.1-py312haa95532_0 
  intel-openmp       pkgs/main/win-64::intel-openmp-2023.1.0-h59b6b97_46320 
  jinja2             pkgs/main/win-64::jinja2-3.1.3-py312haa95532_0 
  libuv              pkgs/main/win-64::libuv-1.44.2-h2bbff1b_0 
  markupsafe         pkgs/main/win-64::markupsafe-2.1.3-py312h2bbff1b_0 
  mkl                pkgs/main/win-64::mkl-2023.1.0-h6b88ed4_46358 
  mpmath             pkgs/main/win-64::mpmath-1.3.0-py312haa95532_0 
  networkx           pkgs/main/win-64::networkx-3.1-py312haa95532_0 
  pytorch            artifacts/win-64::pytorch-2.4.0.dev20240522-py3.12_cpu_0 
  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cpu 
  pyyaml             pkgs/main/win-64::pyyaml-6.0.1-py312h2bbff1b_0 
  sympy              pkgs/main/win-64::sympy-1.12-py312haa95532_0 
  tbb                pkgs/main/win-64::tbb-2021.8.0-h59b6b97_0 
  typing_extensions  pkgs/main/win-64::typing_extensions-4.11.0-py312haa95532_0 
  yaml               pkgs/main/win-64::yaml-0.2.5-he774522_0 


Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
subdir mismatch
subdir mismatch
Channels:
 - defaults
 - file:///C:/actions-runner/_work/_temp/artifacts
 - pytorch
Platform: win-64
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done

## Package Plan ##

  environment location: C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\envs\testenv

  added / updated specs:
    - numpy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    mkl-service-2.4.0          |  py312h2bbff1b_1          55 KB
    mkl_fft-1.3.8              |  py312h2bbff1b_0         160 KB
    mkl_random-1.2.4           |  py312h59b6b97_0         196 KB
    numpy-1.26.4               |  py312hfd52020_0          11 KB
    numpy-base-1.26.4          |  py312h4dde369_0         6.6 MB
    ------------------------------------------------------------
                                           Total:         7.0 MB

The following NEW packages will be INSTALLED:

  mkl-service        pkgs/main/win-64::mkl-service-2.4.0-py312h2bbff1b_1 
  mkl_fft            pkgs/main/win-64::mkl_fft-1.3.8-py312h2bbff1b_0 
  mkl_random         pkgs/main/win-64::mkl_random-1.2.4-py312h59b6b97_0 
  numpy              pkgs/main/win-64::numpy-1.26.4-py312hfd52020_0 
  numpy-base         pkgs/main/win-64::numpy-base-1.26.4-py312h4dde369_0 


Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
subdir mismatch
subdir mismatch
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\envs\testenv\Lib\site-packages\torch\__init__.py", line 144, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\actions-runner\_work\pytorch\pytorch\builder\windows\conda\envs\testenv\Lib\site-packages\torch\lib\shm.dll" or one of its dependencies.
Error: Process completed with exit code 1.

Looks like the issue may be related to numpy:

    numpy-1.26.4               |  py312hfd52020_0          11 KB
    numpy-base-1.26.4          |  py312h4dde369_0         6.6 MB
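When reading a log like the one above, it can help to pull the resolved versions out of the "will be INSTALLED" lines to see which mkl and numpy builds the solver picked. The following is a minimal sketch with a hypothetical helper, assuming each relevant line has the form `name  channel::name-version-build` as in this log; it is not part of conda or the PyTorch CI tooling.

```python
def parse_installed(lines):
    """Map package name -> (version, build) from 'name  channel::dist' lines."""
    installed = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or "::" not in parts[1]:
            continue  # skip headers, blank lines, and unrelated log output
        name, spec = parts
        dist = spec.split("::", 1)[1]
        # Dist names look like <name>-<version>-<build>; the name itself may
        # contain '-', so split from the right.
        pieces = dist.rsplit("-", 2)
        if len(pieces) != 3:
            continue
        installed[name] = (pieces[1], pieces[2])
    return installed


# Three lines copied from the log in this comment.
log_lines = [
    "  mkl                pkgs/main/win-64::mkl-2023.1.0-h6b88ed4_46358 ",
    "  pytorch-mutex      pytorch/noarch::pytorch-mutex-1.0-cpu ",
    "  numpy-base         pkgs/main/win-64::numpy-base-1.26.4-py312h4dde369_0 ",
]
installed = parse_installed(log_lines)
print(installed["mkl"])  # ('2023.1.0', 'h6b88ed4_46358')
```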

atalman removed this from the 2.3.1 milestone May 22, 2024
@atalman
Contributor

atalman commented May 22, 2024

Removing this from the milestone for now. Will follow up on this.

@jgong5
Collaborator

jgong5 commented May 22, 2024

@jgong5 do you know if mkldnn necessarily depends on MKL?

Just noticed this question. No, mkldnn doesn't depend on MKL.


5 participants