Integrating with language-native installation practises #483

Open
multimeric opened this issue Jan 10, 2023 · 11 comments

Comments

@multimeric

multimeric commented Jan 10, 2023

Is your feature request related to a problem? Please describe.

Currently the programming-language wrapper scripts (Python, R) are awkward to run. They often require eval-ing the wrapper script, which is widely considered bad practice or dangerous, so users may not feel comfortable doing it. It is also confusingly different from the native "load a library" syntax that most languages have. In addition, loading them often requires knowing the precise location of the wrapper script on the filesystem. For example, on my HPC system Modules is installed at /usr/local/modules/5.2.0, which differs from the documentation, which tells us to use /usr/share/Modules. I will use Python as a motivating example. In the docs we are told to do this to enable modules in Python:

import os
exec(open('/usr/share/Modules/init/python.py').read())

As noted, this does not work with my Modules installation path.

Describe the solution you'd like

A nice solution would be to build actual native language packages as part of the build process. For example, in Python we could build a package containing roughly the function output by modulecmd python autoinit, install it into a given (configurable) directory, and then export PYTHONPATH=/path/to/module/script:$PYTHONPATH so that Python can find it.

With this done, a user could simply import modules or perhaps import env_modules to distinguish this package from Python's own modules system. This would avoid the issues mentioned above. Similar arguments apply to R and other languages that I understand less well.
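
As a rough illustration, such an importable package could look like the sketch below. It is hypothetical and not part of Modules: the file name env_modules.py and the helper build_command are made up for this example. It assumes that MODULES_CMD points at modulecmd.tcl, that the script is run through tclsh, and that the Python code it prints sets a _mlstatus variable to report success.

```python
"""env_modules.py: hypothetical importable wrapper (a sketch, not part of Modules)."""
import os
import subprocess


def build_command(args, environ=None, tclsh="tclsh"):
    """Return the argv used to ask modulecmd for Python code."""
    environ = os.environ if environ is None else environ
    modules_cmd = environ.get("MODULES_CMD")
    if not modules_cmd:
        raise RuntimeError("MODULES_CMD is not set; is Modules initialised?")
    # Assumption: MODULES_CMD points at modulecmd.tcl, which is run through tclsh.
    return [tclsh, modules_cmd, "python", *args]


def module(*args, run=subprocess.run):
    """Run a module sub-command and apply the environment changes it prints."""
    proc = run(build_command(args), stdout=subprocess.PIPE, text=True)
    namespace = {}
    # modulecmd emits Python source on stdout; execute it in a private
    # namespace so that no helper names leak into the caller's scope.
    exec(proc.stdout, namespace)
    # Assumption: the emitted code sets _mlstatus to report success or failure.
    return namespace.get("_mlstatus", True)
```

With this file somewhere on sys.path, a script could write from env_modules import module and then call module('load', 'modulefile'), with no exec() in user code and no knowledge of the installation path.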

Describe alternatives you've considered

The current system of evaling specific files does work, but has disadvantages as described above.

@xdelaruelle
Member

Thanks for your report.

Regarding locations in documentation, this is set when building Modules. Public documentation describes a commonly used location (/usr/share/Modules) but what you build on your side should reflect the installation path you have chosen.

Would it be possible to copy the Python initialization script of Modules in a 'Python path' location and make the import env_modules work?

@multimeric
Author

> Regarding locations in documentation, this is set when building Modules. Public documentation describes a commonly used location (/usr/share/Modules) but what you build on your side should reflect the installation path you have chosen.

Thanks. If I run man module, I get a correct, localised script:

import os
exec(open('/usr/local/modules/5.2.0/init/python.py').read())
module('load', 'modulefile', 'modulefile', '...')

This is helpful, although I think my arguments above still apply in that it would be nice to use each language's native import process.

> Would it be possible to copy the Python initialization script of Modules in a 'Python path' location and make the import env_modules work?

Are you asking if I could do it? Probably, yes: if I edited the script and put it on the PYTHONPATH, I could set it up for my own user. But I think it would be helpful if this were configured as part of the standard Environment Modules installation, so that all users get easy module initialisation.

@xdelaruelle
Member

I agree that what you suggest would be nice to have. Are you willing to provide a pull request for such an enhancement? (Maybe not for all languages, but for some of them.)

@multimeric
Author

Sure, I can look into it. I guess I would appreciate some guidance though. There are several ways I can envisage this working:

  1. At compile time, we create a Python package with the paths MODULES_CMD and MODULESHOME hardcoded into it, making it non-portable. This would require some mechanism for editing the user's PYTHONPATH variable, which tells Python where to find packages.
  2. As above, but instead of hardcoding these variables, they are read from the environment and used to locate the modules infrastructure.
  3. We create a Python package that we publish to PyPI, rather than distributing it with Modules. This decouples the Python package from the Modules installation, which has some advantages, but the downside is that users will have to install it manually.
  4. We compile the Python modules wrapper as a module, which users can then module load. This has the advantage of utilising the existing infrastructure without much effort, but on the other hand, if users are able to load the wrapper module via their shell then they can probably load all the other modules that way, making this approach a bit redundant.
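
To make the difference between options 1 and 2 concrete, here is a minimal sketch of the build-time step that option 1 implies: a generator that writes a small settings file with the installation paths baked in. The names write_settings and _settings.py are hypothetical and chosen only for illustration. Under option 2 no such file would be generated, and the package would read MODULES_CMD and MODULESHOME from os.environ at import time instead.

```python
from pathlib import Path

# Template for the generated file; the two paths are filled in at build time.
TEMPLATE = '''\
"""Generated at build time; do not edit."""
MODULES_CMD = {modules_cmd!r}
MODULESHOME = {moduleshome!r}
'''


def write_settings(dest_dir, modules_cmd, moduleshome):
    """Write a _settings.py file with the installation paths hardcoded."""
    dest = Path(dest_dir) / "_settings.py"
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_text(
        TEMPLATE.format(modules_cmd=modules_cmd, moduleshome=moduleshome)
    )
    return dest
```

The generated file is only valid for the installation it was built from, which is the non-portability mentioned in option 1.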

@xdelaruelle
Member

xdelaruelle commented Jan 13, 2023

I would go for 1, with an installation option added to the ./configure script that locates the site-packages directory in which to install the Python package. This way, there is no need to change PYTHONPATH in the user's environment. Maybe there is already a naming convention for such a ./configure option?
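
For reference, a given interpreter can report its own site-packages directory through the standard sysconfig module, which is one way a build script could compute a default for such an option. The sketch below shows only that lookup; the function name default_site_packages is made up for this example, and how the value would be wired into ./configure is left open. Automake's AM_PATH_PYTHON macro exposes a comparable value as pythondir, which may be a naming precedent.

```python
import sysconfig


def default_site_packages():
    """Return the pure-Python site-packages directory of the running interpreter."""
    return sysconfig.get_path("purelib")


if __name__ == "__main__":
    # Prints a different directory for each interpreter that runs it
    # (system Python, conda environment, module-provided Python, ...).
    print(default_site_packages())
```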

@multimeric
Author

So you think it should be installed into the system's Python, i.e. into /usr/lib/python3.X/site-packages? What I'm worried about is that if you then load a custom Python, either via modules or via conda, this package probably won't be available. For example, when I run module load python/3; python3 -c 'import sys; print(sys.path)', that directory doesn't appear.

@xdelaruelle
Member

So it could be two installation options:

  • one to give an installation location for this package
  • another to indicate that PYTHONPATH should be appended (or prepended) with this installation location during module's autoinit process
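
The second option boils down to a small piece of path arithmetic. The sketch below shows that logic in Python purely for illustration (the real autoinit code would live in Modules itself, and add_to_python_path is a hypothetical name): it adds the installation directory to an existing PYTHONPATH value exactly once, at either end.

```python
import os


def add_to_python_path(current, new_dir, prepend=True):
    """Return a PYTHONPATH value that contains new_dir exactly once."""
    # Drop empty entries and any earlier occurrence of new_dir.
    entries = [p for p in (current or "").split(os.pathsep) if p and p != new_dir]
    if prepend:
        entries.insert(0, new_dir)
    else:
        entries.append(new_dir)
    return os.pathsep.join(entries)
```

Removing an earlier occurrence first keeps the value stable when initialisation runs more than once, for example in nested shells.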

@pgierz

pgierz commented Apr 3, 2023

I'd like to add here: the Python integration (at least for me) indeed feels "unPythonic", and it seems not to work (at least in a unit-test scenario). I have not looked into the details yet, but this, for example, breaks:

$ cat test_module_command.py
def test_module_command():
    module_python_init = "/usr/share/Modules/init/python.py"
    try:
        exec(open(module_python_init).read())
        result = module("list")
    except NameError:
        assert False, "module function not correctly defined!"
    try:
        assert result
    except AssertionError:
        assert False, "module list did not work"
$ pytest test_module_command.py
===================================================== test session starts =====================================================
platform linux -- Python 3.8.16, pytest-7.2.2, pluggy-1.0.0
rootdir: /albedo/work/user/pgierz/SciComp/User-Support/fabagh001/OpenCV_Issue
collected 1 item

test_module_command.py F                                                                                                [100%]

========================================================== FAILURES ===========================================================
_____________________________________________________ test_module_command _____________________________________________________

    def test_module_command():
        module_python_init = "/usr/share/Modules/init/python.py"
        try:
            exec(open(module_python_init).read())
>           result = module("list")
E           NameError: name 'module' is not defined

test_module_command.py:5: NameError

During handling of the above exception, another exception occurred:

    def test_module_command():
        module_python_init = "/usr/share/Modules/init/python.py"
        try:
            exec(open(module_python_init).read())
            result = module("list")
        except NameError:
>           assert False, "module function not correctly defined!"
E           AssertionError: module function not correctly defined!
E           assert False

test_module_command.py:7: AssertionError
=================================================== short test summary info ===================================================
FAILED test_module_command.py::test_module_command - AssertionError: module function not correctly defined!
====================================================== 1 failed in 0.06s ======================================================

If there is ever a push towards making the scripting-language integration feel more intuitive, I'd happily offer to write unit tests (for Python at least).

@pgierz

pgierz commented Apr 3, 2023

And interestingly, this works fine:

$ python
Python 3.8.16 | packaged by conda-forge | (default, Feb  1 2023, 16:01:55)
[GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> module_python_init = "/usr/share/Modules/init/python.py"
>>> exec(open(module_python_init).read())
>>> result = module("list")
Currently Loaded Modulefiles:
 1) git/2.35.2   2) conda/22.9.0-2
>>> result is True
True

@xdelaruelle
Member

@pgierz When calling exec() from inside a function, you need to pass globals() so that the module function is defined in the global scope.

def test_module_command():
    module_python_init = "/usr/share/Modules/init/python.py"
    try:
        exec(open(module_python_init).read(), globals())
        result = module("list")
    except NameError:
        assert False, "module function not correctly defined!"
    try:
        assert result
    except AssertionError:
        assert False, "module list did not work"

More explanation available here for instance: https://stackoverflow.com/questions/24733831/using-a-function-defined-in-an-execed-string-in-python-3
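
The behaviour can be reproduced without Modules installed. The following self-contained sketch uses made-up function names (local_only, shared, helper) in place of the real init script. It shows the failing call, the globals() fix from above, and, as an alternative not discussed in this thread, an explicit namespace dictionary that avoids touching the global scope at all.

```python
def exec_without_globals():
    # The def lands in a throwaway locals mapping, while the call below is
    # compiled as a global lookup, so the name is never found.
    exec("def local_only():\n    return 'ok'\n")
    try:
        return local_only()
    except NameError as err:
        return f"NameError: {err}"


def exec_with_globals():
    # Passing globals() makes exec() define the function at module level.
    exec("def shared():\n    return 'ok'\n", globals())
    return shared()


def exec_with_namespace():
    # Alternative: keep the definitions in a dict and look them up explicitly.
    namespace = {}
    exec("def helper():\n    return 'ok'\n", namespace)
    return namespace["helper"]()


if __name__ == "__main__":
    print(exec_without_globals())
    print(exec_with_globals())
    print(exec_with_namespace())
```

The interactive session shown earlier works because, at the top level of the interpreter, local and global scope are the same mapping.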

xdelaruelle added a commit that referenced this issue Apr 4, 2023
Update Python initialization example to specify that the init script
should be exec-ed in the global scope. This is important when the init
script is executed from a function.

This change helps to clarify a question asked on #483.

@pgierz

pgierz commented Apr 4, 2023

Thank you @xdelaruelle! Another little trick to keep written in my book.

3 participants