Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a warning about Slurm-related envar #2077

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 40 additions & 0 deletions src/docs/show-help-files/help-prte-runtime.txt
Original file line number Diff line number Diff line change
Expand Up @@ -144,3 +144,43 @@ other nodes using the "prte_remote_tmpdir_base" param.
This is only a warning advisory and your job will continue. You can
disable this warning in the future by setting the
"prte_silence_shared_fs" MCA param to "1".

[prte:slurm:envar]
PRTE detected the presence of an MCA parameter in the environment that
assigns custom command line arguments to the `srun` command used to
start PRTE's daemons on remote nodes:

Paramater value: %s

This warning is provided to alert you (the user) that this parameter
value will be ignored.

Background: Starting with Slurm version 23.11, a command line argument
(`--external-launcher`) was added to `srun` to indicate that the
command was being initiated from within a third-party launcher (e.g.,
`prte` or `prterun`). This allows Slurm to essentially freely modify
the `srun` command line while retaining a backward compatibility
capability when explicitly told to use it. Notably, Slurm
did this by automatically setting the PRTE_MCA_plm_slurm_args environment
variable to pass in its own command line arguments.

Unfortunately, this had the side effect of overriding most user- or
system-level settings. In addition, arguments passed on the
PRTE command line overrode any Slurm setting of the environment
variable, but with potentially undesirable side effects if newer
versions of `srun` misinterpret or fail to understand the user-specified
arguments.

PRRTE now directly determines the `srun` version and sets its `srun` cmd
line arguments accordingly. These arguments are set in addition to any
provided by the user, and therefore the user no longer needs to concern
themselves with changes induced by Slurm.

If the setting of the MCA parameter was intentional, or if the
parameter value looks acceptable to you, then please set the MCA parameter
either in the default MCA param file (either the system or user-level file),
or on the cmd line:

Cmd line: --prtemca plm_slurm_args %s
Default MCA param file: plm_slurm_args = %s

Comment on lines +149 to +186
Copy link
Contributor

@jsquyres jsquyres Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per our discussion on this PR, how about this for the message? (this was typed directly in github; please word wrap as appropriate)

PRTE detected the following MCA parameter in the environment:

  Parameter name:  %s
  Paramater value: %s

Unfortunately, SLURM will silently overwrite this value with its own value.  

To ensure that your MCA parameter value is used (and not the value that SLURM will silently set), 
you have two options:

1. Set this MCA parameter on the command line (vs. setting it in the environment).
2. Set this MCA parameter in a config file (vs. setting it in the environment).

Please see the following URL for a more complete description of the issue:

  ...some URL under https://docs.openpmix.org/en/latest/...

Note that this is slightly different %s expansion than the current message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of issues here. First, the value we can see is the final envar value - i.e., if Slurm is going to overwrite it, they already have done so. We have no idea if the user attempted to set an envar for that param, nor what that value might have been.

Second, we still need to address the problem this PR creates - it will block any execution under Slurm 23.11 and beyond. I do think that (if we are going to do this) we need some flag the user can set to indicate that we should ignore the envar value (i.e., unsetenv it) and continue executing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will block any execution under Slurm 23.11 and beyond.

Oh, gotcha -- no, I don't think we want that. I thought we talked about adding (abstraction-breaking, but probably still necessary) code in the tools codes directly -- before prte_init() and friends...?

That's why I was thinking we could show the value and have good confidence that it's the user's value -- not the SLURM-set value.

Am I thinking incorrectly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm afraid you're off - Slurm sets their envar before invoking mpirun (they just throw it into every allocation, just in case it is needed), so we only see the end result of what they do. If the user provided an envar, Slurm will have already overwritten it before invoking mpirun - so we can't know about it.

I guess that's only true for batch runs, though, as otherwise the user would have the chance to overwrite it (probably without realizing they did). In that case, our srun launch would fail due to the changed cmd line. Very awkward situation. Probably should just focus on the non-interactive use-case as that is the one where the problem would appear (i.e., envars getting overwritten).

If we detect an envar and error out, then there is no way the user can execute mpirun under Slurm 23.11 or above because Slurm will always have set their new envar. I guess we could detect and parse to see if other things beyond their addition are present - IIRC, they did release a fix to try and avoid overwriting the user's settings (not sure about that, but maybe). But that starts making this rather complicated.

So it seems to me we have a few non-ideal choices:

  1. just silently unset the envar
  2. warn and error out, but provide another envar that disables that behavior so people can execute. However, you can pretty much guarantee Slurm will just start setting that envar to alleviate user annoyance, so I'm not sure it will actually accomplish anything
  3. check the envar to see if it includes the Slurm value and edit it out since we are dealing with the new cmd line internally. If the only thing we find is the Slurm value, then just ignore it. Otherwise, just silently let the rest of the value continue being processed as we will add the required new cmd line option before we launch.
  4. or...??

Option 3 is beginning to sound like the best to me, but I'm open 🤷‍♂️

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slurm sets their envar before invoking mpirun (they just throw it into every allocation, just in case it is needed)

Ah, I guess I was thinking something different. Are you saying that this will happen:

shell$ salloc -N 1
shell$ echo $PRTE_MCA_plm_slurm_args
...will show the SLURM-set value here...

(and similar if I srun or sbatch a script that executes echo $PRTE_MCA_plm_slurm_args)

And if the above is correct, is the script below enough to defeat the SLURM silent env var writing?

shell$ cat myscript.sh
export PRTE_MCA_plm_slurm_args=foo
mpirun a.out
shell$ sbatch -N 2 ./myscript.sh

Or does SLURM set it in every allocation and also set it after mpirun and before prte (i.e., in the remote environment before it fork/exec's prte)? I.e., in all possible cases, SLURM sets this env variable, and it's just 100% not possible for the user to overwrite it.

If we detect an envar and error out, then there is no way the user can execute mpirun under Slurm 23.11 or above because Slurm will always have set their new envar.

Yeah, that's clearly bad -- we don't want that.

Per your specific options:

  1. Just silently unset: we probably don't want to compound bad behavior with more bad behavior.
  2. warn and error: I agree that this is not what we want.
  3. check the env var: this becomes an arms race, doesn't it? I.e., would a future release of slurm change the env var value so that our code doesn't find the magic value and therefore not think it's the SLURM-set value?

Hmm. I guess I need to think about this more...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying that this will happen:

shell$ salloc -N 1
shell$ echo $PRTE_MCA_plm_slurm_args
...will show the SLURM-set value here...

Pretty sure this is true

Or does SLURM set it in every allocation and also set it after mpirun and before prte (i.e., in the remote

Pretty sure this is not true, but I have no way of testing it.

I.e., in all possible cases, SLURM sets this env variable, and it's just 100% not possible for the user to overwrite it.

I don't believe this is true, but I honestly am not sure. There fix was aimed at mpirun users, but IIUC the envar was set at time-of-allocation. I don't see any way to set it between the mpirun and fork/exec of prterun as Slurm isn't involved there. So it would seem like users could override it if they chose, so long as they realized they need to include the new cmd line option.

The issue, in my mind, is that the envar is overriding what users put in their system and user-level default param files because the envar gets processed after those files. It isn't a question of someone setting an envar that doesn't work - it's a problem of putting values into default param files and unexplainingly finding them ignored. The user didn't put anything in the environment - their set values just "disappear" somehow.

Not sure what to do with it. Still a puzzlement on right path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...but IIUC the envar was set at time-of-allocation. I don't see any way to set it between the mpirun and fork/exec of prterun as Slurm isn't involved there.

Gotcha.

But even at the mpirun level (which we do have control of in Open MPI, obviously), how can we tell the difference between the user setting this env var and SLURM itself setting this env var?

It isn't a question of someone setting an envar that doesn't work - it's a problem of putting values into default param files and unexplainingly finding them ignored.

Oh, so I totally misunderstood this, then. Ok.

There is an MCA API that finds the source of an MCA param value -- but IIRC, it just checks for the value that will be used (which, if the value is in both the param file and the environment, it'll report that the value is in the environment).

Sigh. What a pickle. I'm honestly not sure what the right answer is here. The only way I can think to do this is to have a horrible abstraction break of putting a check for these specific variables when reading param files, and if we're in the SLURM job and the corresponding getenv() returns non-NULL, emit the warning (this is arguably quite a bit worse abstraction break than checking for these env vars in main() or the like).

The goal is simple: warn users that their value is silently being ignored. How to detect that is the problem. ☹️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the problem is that it is impossible to tell who set the envar.

Only thing I can think of to try is to check for the envar and then parse the value if it is found. We know what Slurm is setting - so if we see that value, we could silently ignore it. If we see any other value, then we probably can safely assume that the user provided it, and let the MCA system handle it as usual.

Would take a little checking to ensure we correctly identify the Slurm value. I can give that a try.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afraid I cannot find a new enough version of Slurm I can put in my container - and as reported elsewhere, I'm having zero success getting Slurm to run in the containers. I give up. Let's just silently ignore it?

43 changes: 25 additions & 18 deletions src/runtime/prte_init.c
Original file line number Diff line number Diff line change
Expand Up @@ -172,24 +172,6 @@ int prte_init_minimum(void)
return PRTE_ERR_SILENT;
}

/* Protect against the envar version of the Slurm
* custom args MCA param. This is an unfortunate
* hack that hopefully will eventually go away.
* See both of the following for detailed
* explanations and discussion:
*
* https://github.com/openpmix/prrte/issues/1974
* https://github.com/open-mpi/ompi/issues/12471
*
* Orgs/users wanting to add custom args to the
* internal "srun" command used to spawn the
* PRRTE daemons must do so via the default MCA
* param files (system or user), or via the
* prterun (or its proxy) cmd line
*/
unsetenv("PRTE_MCA_plm_slurm_args");
unsetenv("OMPI_MCA_plm_slurm_args");

/* carry across the toolname */
pmix_tool_basename = prte_tool_basename;

Expand Down Expand Up @@ -239,6 +221,31 @@ int prte_init_minimum(void)
return ret;
}

/* Protect against the envar version of the Slurm
* custom args MCA param. This is an unfortunate
* hack that hopefully will eventually go away.
* See both of the following for detailed
* explanations and discussion:
*
* https://github.com/openpmix/prrte/issues/1974
* https://github.com/open-mpi/ompi/issues/12471
*
* Orgs/users wanting to add custom args to the
* internal "srun" command used to spawn the
* PRRTE daemons must do so via the default MCA
* param files (system or user), or via the
* prterun (or its proxy) cmd line
*/
if (NULL != (evar = getenv("PRTE_MCA_plm_slurm_args"))) {
pmix_show_help("help-prte-runtime.txt", "prte:slurm:envar", true,
evar, evar, evar);
return PRTE_ERR_SILENT;
} else if (NULL != (evar = getenv("OMPI_MCA_plm_slurm_args"))) {
pmix_show_help("help-prte-runtime.txt", "prte:slurm:envar", true,
evar, evar, evar);
return PRTE_ERR_SILENT;
}

/* pre-load any default mca param files */
prte_preload_default_mca_params();

Expand Down
Loading