Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Built library is ~5x slower on recent Intel Macs #56

Open
Tomcc opened this issue Sep 24, 2023 · 5 comments
Open

Built library is ~5x slower on recent Intel Macs #56

Tomcc opened this issue Sep 24, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@Tomcc
Copy link

Tomcc commented Sep 24, 2023

Hi,
I realize that this may be intentional & not a bug, but by default, I found out that the dylib that comes with whisper.unity is 5x slower than one built with the default AVX1=1, AVX2=1 and F16C=1 settings on my MacBook Pro 2019.

I noticed that build_cpp.sh goes out of its way to turn them off, I guess because they aren't supported on ARM Macs.
If that's the case, maybe it could make sense to build each architecture separately and then merge them with lipo?

Eg.

  cmake -DCMAKE_OSX_ARCHITECTURES="x86_64" -DWHISPER_BUILD_TESTS=OFF -DWHISPER_BUILD_EXAMPLES=OFF
  make

  mv "$build_path/libwhisper.dylib" "$build_path/libwhisper.x86_64.dylib"

  cmake -DCMAKE_OSX_ARCHITECTURES="arm64" -DWHISPER_BUILD_TESTS=OFF -DWHISPER_BUILD_EXAMPLES=OFF
  make

  lipo "$build_path/libwhisper.x86_64.dylib" "$build_path/libwhisper.dylib" -create -output "$build_path/libwhisper.dylib"

It's more verbose, but it would use whatever ggerganov thinks are good defaults for each arch.

I'd make a PR, but I don't have access to a ARM Mac to make sure that this code works :/

@Macoron
Copy link
Owner

Macoron commented Sep 24, 2023

I noticed that build_cpp.sh goes out of its way to turn them off, I guess because they aren't supported on ARM Macs.
If that's the case, maybe it could make sense to build each architecture separately and then merge them with lipo?

The problem is in whisper.cpp Cmake. If you try to build it for x86_64 on ARM Mac, it won't include AVX instructions. It seems cmake checks not target architecture, but compiler CPU architecture.

if (${CMAKE_SYSTEM_PROCESSOR} MATCHES "arm" OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64")
    message(STATUS "ARM detected")
else()
    message(STATUS "x86 detected")
    if (MSVC)
            if(NOT WHISPER_NO_AVX2)
                set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
                set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /arch:AVX2")
                set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} /arch:AVX2")
            else()
                if(NOT WHISPER_NO_AVX)
                    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX")
                    set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS_RELEASE} /arch:AVX")
                    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} /arch:AVX")
                endif()
            endif()
    else()
        if (EMSCRIPTEN)
            set(CMAKE_C_FLAGS   "${CMAKE_C_FLAGS}   -pthread")
            set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pthread")
        else()
            if(NOT WHISPER_NO_AVX)
                set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mavx")
            endif()
            if(NOT WHISPER_NO_AVX2)
                set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mavx2")
            endif()
            if(NOT WHISPER_NO_FMA)
                set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mfma")
            endif()
            if(NOT WHISPER_NO_F16C)
                set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mf16c")
            endif()
        endif()
    endif()
endif()

I recompiled library for x64 Mac with AVX enabled on Github Runner. @Tomcc could you please test it? On my Mac it runs fine on ARM version of Unity, but crashes if I use Rosetta. Not sure if problem in Rosetta AVX support or something else.

I put updated library here #57

@Tomcc
Copy link
Author

Tomcc commented Sep 24, 2023

Thanks for the help @Macoron !
I can't try it out for now, but I'll check it ASAP.

Off topic: I noticed that ggerganov is adding Metal support, so maybe this issue won't matter when all Macs can use either Metal or CoreML.

@Tomcc
Copy link
Author

Tomcc commented Sep 25, 2023

@Macoron I downloaded the lib and patched it in and the performance is the same as a locally built one!

Feel free to close the issue, thanks.

@Macoron
Copy link
Owner

Macoron commented Sep 25, 2023

I need to think how to handle this. It means that output of cmake is depended on CPU used to build it (ARM and Intel will produce different libs). It also means that I lose support for older versions of Unity running on my ARM Mac (because they use Rosetta which crashes with enabled AVX). For now, I'll keep this issue and PR open.

Off topic: I noticed that ggerganov is adding Metal support, so maybe this issue won't matter when all Macs can use either Metal or CoreML.

Yeah, not sure about CoreML, but Metal sounds fun. I will check when whisper.cpp makes new stable release.

@Macoron Macoron added the bug Something isn't working label Sep 25, 2023
@Tomcc
Copy link
Author

Tomcc commented Sep 25, 2023

Oh I thought your PR fixed it for all platforms... I didn't realize it was just a fix you did for me.

Yeah this is an annoying issue... and also I think it's a bug in whisper.cpp then, not in your repo.
Tbh I don't know if it's worth fixing, either, the newest Intel Macs are almost 4 years old now and very deprecated. It may just make sense to put a warning in the README and call it a day.
Also, Metal support may sidestep the issue when it's released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants