This repository has been archived by the owner on Nov 27, 2024. It is now read-only.
It's my understanding that the default for the node_location_configuration is True, i.e. all ranks will invoke the compiler inside pyop2.compilation.Compiler.get_so(). However, when this setting is False, only rank 0 will invoke the compiler, and the assumption is that all other ranks are able to access the outputs from this. (As an aside, this doesn't seem to capture the scenario of running on only one rank of a physical node when running across nodes with MPI?)
For context, we're running Firedrake with a fairly large number of ranks, and there's a significant disk impact when all these ranks simultaneously invoke the compiler for version sniffing. As far as I can tell, there are two places this happens:
1. Within pyop2.compilation.sniff_compiler() itself, as called from pyop2.compilation.load(). In this case, the non-compiling ranks can probably just use the AnonymousCompiler or Compiler base class: they just need to wait for the MPI barrier to signal the shared library is available, and load it. I think in this case, the defaults are such that Compiler.sniff_compiler_version() will simply run into an exception because no compiler is defined.
2. If, e.g., pyop2.compilation.set_default_compiler(pyop2.compilation.LinuxGnuCompiler) is called to avoid the above call to sniff_compiler(), the compiler is still invoked on all ranks: Compiler.__init__() calls Compiler.sniff_compiler_version(), which will now have a valid executable.
As far as I can tell, Compiler.sniff_compiler_version() is only required for cflags patching for certain (very old) compiler versions. In order not to incur the wrath of our HPC sysadmins, I've resorted to the following preamble to patch out all the sniffing behaviour:
This makes sense, though I think from a style perspective that non-compiling ranks should not be creating Compiler instances at all. @JDBetteridge is the expert on this corner of PyOP2 so it would be good to get his feedback once he's back from holiday.
This should be a straightforward fix, thanks for pointing it out. The version sniffing is used for the repr and is incredibly useful for debugging.
You are right that the Compiler shouldn't need to be instantiated on non-compiling ranks; get_so being a method of the compiler is a legacy feature, and it is only invoked in one place anyway. I will work on a quick fix that broadcasts the version over the compilation comm rather than making a subprocess call on all ranks.
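To make the broadcast idea concrete, here is a rough sketch of querying the compiler on rank 0 only and sharing the result with every other rank. This is not the eventual PyOP2 fix: `dump_version`, `sniff_version_once` and `SerialComm` are names invented for this example. The only assumption about the communicator is that it exposes `rank` and `bcast(obj, root=0)`, which mpi4py communicators do.

```python
import subprocess


def dump_version(cc):
    """Ask the compiler for its version; return None if it cannot be run."""
    try:
        result = subprocess.run(
            [cc, "-dumpversion"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def sniff_version_once(comm, cc="cc", sniff=dump_version):
    """Run the version query on rank 0 and broadcast it to all ranks."""
    # Only rank 0 spawns a process and touches the filesystem.
    version = sniff(cc) if comm.rank == 0 else None
    # Collective call: every rank participates and receives rank 0's value.
    return comm.bcast(version, root=0)


class SerialComm:
    """Single-process stand-in with the two attributes used above."""

    rank = 0

    def bcast(self, obj, root=0):
        return obj


version = sniff_version_once(SerialComm())
```

With mpi4py installed, passing `MPI.COMM_WORLD` (or whichever communicator is used for compilation) in place of `SerialComm()` should behave the same way, with one subprocess call in total instead of one per rank.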
However, I think the core issue is that, in my opinion, the compiler simply doesn't need to be run on non-compiling ranks.