-
Notifications
You must be signed in to change notification settings - Fork 283
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
let CMakeMake
easyblock also set Python_EXECUTABLE
option, as well as Python3_EXECUTABLE
and Python2_EXECUTABLE
derivatives (when appropriate)
#3463
Conversation
@Flamefire You could probably give the most informed review of this. I would be inclined to accept it. |
Python3_FIND_VIRTUALENV=STANDARD
to CMakeMake
I don't see any problem with this in general besides increasing the number of (potentially superflous) parameters we pass to configure. However: I'd prefer to pass the Python binary path explicitly as done in this PR This not only avoids it finding a virtualenv Python ("fixed" here) but also covers many other cases we might not found yet. @Crivella What do you think about that? Can you try #3282 with your #3373 not setting the |
@Flamefire Started the test it will take a couple hours to finish, but since stage 1 passed i don't think there would be a problem. Just ran a minimal test with
and
which results in
Was there some weirder reason that did not make GROMACS work with #3282 ? |
I'll try to compile my findings on the order of paths considered. The documentation states:
The relevant file is https://gitlab.kitware.com/cmake/cmake/-/blob/master/Modules/FindPython/Support.cmake which gets "called" by "FindPython*" with We can ignore the "STANDARD" for now because when Which leads to:
I excluded all of the ones mentioned in the docs that are excluded by the explicit So maybe we need to also set the |
By looking at the CMake source you linked (never realized it was so messy XD) and experimenting a bit more, I am wondering if we should just cut to the source and set It seems to also find the correct components when specifying test.sh
CMakeLists.txt
CMakeCache.txt
The fact that |
Yes I guess that would also work although I don't know if there is some (old?) CMake user code out there that relies on I also guess there is no trouble in using |
aca3df7
to
47f21b9
Compare
I've been browsing through the CMake source code previous versions. I think the logic was properly implemented in 3.16, using
This is still working since we are manually setting the result to what we want and I also tested that there is no problem in case the python version does not matches the requirements since the version is later validated even if we are forcing the I've made a new commit with the proposed changes (47f21b9). I would argue that unless we want to analyze the CMakeLists.txt we have to set both |
Python3_FIND_VIRTUALENV=STANDARD
to CMakeMakePython[X/2/3]_EXECUTABLE
to CMakeMake
I guess there is also an argument to be made to actually set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this works. https://gitlab.kitware.com/cmake/cmake/-/blob/d0ad8fd49c2d211081d32cdc2a7f96990947c340/Modules/FindPython/Support.cmake?page=2#L1877-1878 checks for an absolute path in that variable and always keeps it so the block starting at https://gitlab.kitware.com/cmake/cmake/-/blob/d0ad8fd49c2d211081d32cdc2a7f96990947c340/Modules/FindPython/Support.cmake?page=2#L1909 gets skipped.
However https://github.com/easybuilders/easybuild-easyblocks/pull/3282/files should work for CMake >= 3.12 while this only works for >= 3.16 which includes https://gitlab.kitware.com/cmake/cmake/-/commit/cea2010b5c7 that honors this variable
I've also tested up to CMake v3.12 where FindPython was introduced and Python_EXECUTABLE works for all of them.
What exactly have you tested? Did you test 3.12 and below or 3.12 and up? Because how I understand the change in 3.16 the *_EXECUTABLE
isn't honored before.
However checking the code for 3.15 it eventuall runs
find_program (${_PYTHON_PREFIX}_EXECUTABLE
NAMES ${_${_PYTHON_PREFIX}_NAMES}
${_${_PYTHON_PREFIX}_IRON_PYTHON_NAMES}
NAMES_PER_DIR
PATHS ${_${_PYTHON_PREFIX}_REGISTRY_PATHS}
PATH_SUFFIXES bin ${_${_PYTHON_PREFIX}_IRON_PYTHON_PATH_SUFFIXES}
NO_DEFAULT_PATH)
_python_validate_interpreter (${${_PYTHON_PREFIX}_FIND_VERSION})
So unless the python cmd set is invalid it should be kept because find_program
does nothing if the variable is already set.
Hence I think this change is OK
Co-authored-by: Alexander Grund <[email protected]>
Checked 3.12 and above since before there is only
Yes this is what i meant in my previous comment, even if it is not explicitly the behavior of |
Test report by @Flamefire Overview of tested easyconfigs (in order)
Build succeeded for 12 out of 15 (7 easyconfigs in total) |
GROMACS 2023.x fails during the installation of the Python extension which uses CMake through I'm not sure how to solve this. The extension in this EasyConfig is a "regular" We could likely also set However: Any ideas how to pass either the arguments or the env variables to the extension? I guess the gromacs easyblock could set the env variables before installing the extensions but then it would be good to have some way to get the values from the cmake easyblock to avoid duplication |
Have not worked much with extensions yet, but by looking at the code I think the easiest solution would be to modify the
or possibly defining an hook for the extension (https://github.com/easybuilders/easybuild-framework/blob/8ff6ba0d426b79a9b3da1248a7359d922b50596d/easybuild/framework/easyblock.py#L1916) I am not sure if there is a way to modify the extension steps in some other way
I guess this would be a design choice. I can think of two way:
Also I would say the 2 are not mutually exclusive and both could benefit also from implementing a scoped version of the |
That specific variable is only for GROMACS: https://github.com/easybuilders/easybuild-easyconfigs/blob/ef56e11c4066828fa4af98b4709ba40728fc4cc6/easybuild/easyconfigs/g/GROMACS/GROMACS-2024.1-foss-2023b.eb#L78 Hence the idea for that specific easyblock to get required env vars from cmakemake (e.g.
Yes that is what I had in mind: I'm bringing this up here to have a method that can be used in all similar places. And GROMACS was the only example I know of where the virtual env caused problems before. I was thinking if you could access the |
I tested the snippet in the previous comment (added to the gromacs easyblock) and it works also by passing
I've also this problem in the LLVM easyblock, since I have to do a multistage build while changing CMake variables across the stages. I ended up just reimplementing part of the CMakeMake easyblock and working with a new dictionary, but also there it might help to be able to inherit all/selectively the options from the CMakeMake.configure |
I'm not sure how that works from within the easyblock as the variable is set in the easyconfig. So as far as I see even if you set it in the easyblock the easyconfig will overwrite this. Or what exactly did you change and which EC did you test?
That might be a valuable datapoint: If it works easily for GROMACS base, the extension, and LLVM we have quite a few cases covered. But I guess for LLVM setting either the env variable or the I'm just wondering if using the env variables is more flexible as we won't need to figure out how to pass flags to all the cmake invocation, such as in the |
Sorry forgot to mention i also removed the
I am not sure, I think the env is not preserved (or reinitialized) across steps so it is possible the environment variable will have to be set in the needed step on a case by case basis? |
IIRC the environment is (partially?) reset when loading the fake module for the extension installation. But I think we can use either
That would be slightly more generic than only getting the |
8ffffde
to
606533b
Compare
I've added both the storing of the generated Concerning the usage of the environment variables one thing to be careful of: i could also see the edge case where the buldsystem itself reset/starts from a new environment before making the call to CMake, but at that point I do not think there is much that can be done beside abiding to the buildsystem way of passing args to CMake or patching. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added both the storing of the generated
options
inside the class and the suggested convenience function. Indeed when calling CMake from within a build process there is no defined way to pass it arguments and it will dependent on the implementation.
Yes storing the options is a good idea.
Concerning the usage of the environment variables one thing to be careful of: i could also see the edge case where the buldsystem itself reset/starts from a new environment before making the call to CMake
Yes, Bazel is one buildsystem that does reset the environment. However that doesn't call CMake AFAIK. I'm not aware of anything that calls CMake but resets the environment.
I do not think there is much that can be done beside abiding to the buildsystem way of passing args to CMake or patching.
Fully agreed.
I'm not sure if we should use 2 different approaches here (args and env variables). We might end up using almost always one approach and never notice when the rarely used one fails or becomes outdated. Using just the environment variables, not the *_EXECUTABLES
args might be enough:
Before CMake 3.12 there was only FindPythonInterp
which uses a plain find_program
Hence that will always pick up the one we have first in PATH which is ours.
After 3.12 it should pick up the one from the env variable according to my earlier analysis. However only when Python*_FIND_STRATEGY
is set to LOCATION
or the policy is set to new. If that is not the case (e.g. in the GROMACS python extension case) I think it searches for the python3.13
binary first in this loop and if it finds it in the virtualenv after not finding it in our module (e.g. Python 3.10 is the dependency) it will still use the virtualenv python.
So after writing all this I see that we can't reliably control the chosen Python binary via the environment. Hence for direct cmake invocations we should use *_EXECUTABLE
and remove the policy workaround. The env variables might still be useful in some cases. E.g. IIRC GROMACS changes the search strategy to LOCATION so it will pick up the env variables first.
@@ -320,6 +336,22 @@ def configure_step(self, srcdir=None, builddir=None): | |||
|
|||
return out | |||
|
|||
def set_cmake_env_vars_python(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit confused by this one, what's the plan with this?
As is, it can only be used in easyblocks that derive from CMakeMake
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I would make 2 free functions out of that:
- Return a configure string with
-DPython*_EXECUTABLE=...
, maybe optionally only as a dict - Return these env variables
The first function is the safe way for every situation we can pass CMake variables. We can use that to enhance the GROMACS easyblock to move the CMake config options out from the EasyConfig to the easyblock and add those options.
The 2nd function can be used whenever we cannot use the first. It might still pick up a python
from a virtualenv when Python*_FIND_STRATEGY
isn't set to LOCATION
("older" codes default this to LOCATION if unset) but at least we did a best effort to avoid that which might work for most cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the 2 separate functions to return the options as a dict or as a string:
get_cmake_python_config_dict
get_cmake_python_config_str
and changed the name of the function to set the env vars to set_cmake_python_env_hints
(9a31f79)
There might be an argument of making that function a context manager to be used with with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are taking the convenience function route, i am wondering if we should (probably another PR) also change the detection of all other options into separate/grouped convenience functions to make them reusable by other easyblocks than need to implement their own more complex configure step (eg the new LLVM one)
Python[X/2/3]_EXECUTABLE
to CMakeMakeCMakeMake
easyblock also set Python_EXECUTABLE
option, as well as Python3_EXECUTABLE
and Python2_EXECUTABLE
derivatives (when appropriate)
5e9bf93
to
f963962
Compare
f963962
to
9a31f79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd do 2 more changes:
- Make the 2 functions freestanding. They don't use anything from the class instance so other easyblocks not deriving from CMakeMake directly should be able to use them too
- Remove the policy code at
easybuild-easyblocks/easybuild/easyblocks/generic/cmakemake.py
Lines 276 to 280 in bd7fe7e
# make sure that newer CMAKE picks python based on location, not just the newest python # Avoids issues like e.g. https://github.com/EESSI/software-layer/pull/370#issuecomment-1785594932 if LooseVersion(self.cmake_version) >= '3.15': options['CMAKE_POLICY_DEFAULT_CMP0094'] = 'NEW'
def get_cmake_python_config_dict(self): | ||
"""Get a dictionary with the CMake configuration options for Python hints.""" | ||
return self._get_cmake_python_config() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a need for this function if it does nothing but forwarding, so they are the same function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented with fa30b9c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But there are still 2 functions where get_cmake_python_config_dict == _get_cmake_python_config
. I'd just use one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was to have them more modularized in case in the future there is need for more separation between the dict and str functions, but i guess for now they can just be one
c88446f
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok almost done, thanks for bearing with me so far!
If you want to keep the cmake_version
in the function I'd suggest to include it also in the others: For CMake < 3.12 we only need PYTHON_EXECUTABLE
And some suggestions to improve the wording of the docstrings as I'd like to convey the following:
- The options/argument-str should be preferred
- That is not a hint but (pretty much) forces CMake to use that.
- All functions only return some non-empty value when a Python module is loaded
- Avoid suggesting that setting the env variables fully avoids CMake picking up the wrong Python. Due to the policies it can still happen although it is less likely
Besides that I wouldn't mention "FindPython" which likely isn't clear to anyone not familiar with CMake internals. If at all we could mention find_package(Python*)
as that is what is visible from reading the CML. Also "Convenience function" sounds a bit to verbose for describing what the function does.
Check if you like the suggestions or if there's a better wording.
Co-authored-by: Alexander Grund <[email protected]>
Co-authored-by: Alexander Grund <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Until @Flamefire goes the last mile and becomes a maintainer, I'll need to merge this!
@boegelbot please test @ generoso |
@ocaisa: Request for testing this PR well received on login1 PR test command '
Test results coming soon (I hope)... - notification for comment with ID 2407211069 processed Message to humans: this is just bookkeeping information for me, |
Test report by @boegelbot Overview of tested easyconfigs (in order)
Build succeeded for 12 out of 12 (4 easyconfigs in total) |
Test report by @boegel Overview of tested easyconfigs (in order)
Build succeeded for 56 out of 58 (58 easyconfigs in total) |
|
EDIT
Following the thread discussion this has changed from setting
Python3_FIND_VIRTUALENV
to settingPython_EXECUTABLE
and all its variants in order to force the CMakeMake easyblock to pick up the correct python when loaded as a module.This would replace:
OLD
This is related to:
-DPython3_FIND_VIRTUALENV=STANDARD
#3283LLVM
easyblock for compilation of clang/flang + other llvm-projects #3373FindPython3
viafind_package(Python3)
Python3_FIND_VIRTUALENV
controls how CMake finds the python3 interpreter in case avirtualenv
orconda
environment is activated, by default prioritizing the virtual environment.Setting it to
STANDARD
switch the behavior to looking for python3 inPATH
which allows the correct python to be found when invokingeb
from within a virtual environmentNo version check has been added as the flag is supported since CMake 3.15 which now is tied to an archived toolchain (GCC 8.3.0 https://docs.easybuild.io/policies/toolchains/)
I think this should always be on by default.
Even if a build process tries to work with virtual environments (eg creating a venv, activating it and tha running cmake trying to use the python from the venv) i think that activating the virtualenv inside the build process would prepend to the PATH and still allow finding the correct python