Issues with running sanity check for OpenMPI #3541

Open
asp8200 opened this issue Dec 26, 2024 · 0 comments
EasyBuild couldn't run the sanity check for OpenMPI-5.0.3-GCC-13.3.0.eb

I ran EasyBuild 4.9.4 (framework: 4.9.4, easyblocks: 4.9.4) on Rocky Linux v9.4 with Python v3.9.18.

I did manage to run eb OpenMPI-5.0.3-GCC-13.3.0.eb --robot --skip-sanity-check. Afterwards I ran eb OpenMPI-5.0.3-GCC-13.3.0.eb --robot --sanity-check-only, which gave me the following error message:

== sanity checking...
ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/easybuild/main.py", line 137, in build_and_install_software
    (ec_res['success'], app_log, err) = build_and_install_one(ec, init_env)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 4276, in build_and_install_one
    result = app.run_all_steps(run_test_cases=run_test_cases)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 4155, in run_all_steps
    self.run_step(step_name, step_methods)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 3990, in run_step
    step_method(self)()
  File "/usr/local/lib/python3.9/site-packages/easybuild/easyblocks/o/openmpi.py", line 222, in sanity_check_step
    ranks = min(8, self.cfg['parallel'])
TypeError: '<' not supported between instances of 'NoneType' and 'int'

The problem seems to be that self.cfg['parallel'] on line 222 of openmpi.py evaluates to None. I tried adding --parallel to the eb command, that is, eb OpenMPI-5.0.3-GCC-13.3.0.eb --sanity-check-only --parallel=10, but that didn't help.
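The first failure can be reproduced outside EasyBuild. This is a minimal sketch, assuming only that the value read from self.cfg['parallel'] is None at that point; the variable `parallel` below is a stand-in for it, not EasyBuild code:

```python
# Stand-in for self.cfg['parallel'] when it is unset (assumption for illustration).
parallel = None
error = None

try:
    # Same expression as on line 222 of openmpi.py, with the stand-in value.
    ranks = min(8, parallel)
except TypeError as err:
    # Python 3 refuses to order None against an int, so min() raises.
    error = err
    print(f"TypeError: {err}")
```

The exception comes from the comparison inside min(), so any code path that leaves the value unset would presumably hit the same error.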

Hence I changed line 222 in openmpi.py to

ranks = 8 if self.cfg['parallel'] is None else min(8, self.cfg['parallel'])

That got the sanity check running a bit further:

== sanity checking...
ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/easybuild/main.py", line 137, in build_and_install_software
    (ec_res['success'], app_log, err) = build_and_install_one(ec, init_env)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 4276, in build_and_install_one
    result = app.run_all_steps(run_test_cases=run_test_cases)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 4155, in run_all_steps
    self.run_step(step_name, step_methods)
  File "/usr/local/lib/python3.9/site-packages/easybuild/framework/easyblock.py", line 3990, in run_step
    step_method(self)()
  File "/usr/local/lib/python3.9/site-packages/easybuild/easyblocks/o/openmpi.py", line 234, in sanity_check_step
    src_path = os.path.join(self.cfg['start_dir'], srcdir, src)
  File "/usr/lib64/python3.9/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

I hacked my way around that by changing the code around line 234 in openmpi.py to

            src_path_exists = False
            if self.cfg['start_dir'] is not None:
                src_path = os.path.join(self.cfg['start_dir'], srcdir, src)
                src_path_exists = os.path.exists(src_path)
            if src_path_exists:

and then OpenMPI-5.0.3-GCC-13.3.0.eb passed the sanity check.
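Both workarounds boil down to guarding against None before using the value. A standalone sketch of that pattern follows; the helper names `pick_ranks` and `source_exists` are hypothetical and do not exist in EasyBuild, and the file names in the usage lines are placeholders:

```python
import os


def pick_ranks(parallel, cap=8):
    """Return the rank count, falling back to `cap` when `parallel` is None."""
    return cap if parallel is None else min(cap, parallel)


def source_exists(start_dir, *parts):
    """Return True only if `start_dir` is set and the joined path exists."""
    if start_dir is None:
        # Nothing to join against, so treat the source file as missing.
        return False
    return os.path.exists(os.path.join(start_dir, *parts))


# Placeholder inputs for illustration only.
print(pick_ranks(None))
print(pick_ranks(4))
print(source_exists(None, "examples", "hello_c.c"))
```

Whether falling back to 8 ranks and skipping the source lookup is the right behaviour for --sanity-check-only is a design question for the maintainers; the sketch only shows how the two TypeErrors can be avoided.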

Details also described on EasyBuild Slack.

Thanks to @sassy-crick for helping out with the debugging.

I'm very new to EasyBuild, but I'd be happy to try to make a PR with the changes listed above, if you agree that there is an issue with openmpi.py.
