OMPITesting
Howard Pritchard edited this page May 6, 2022 · 6 revisions
Note: this wiki page refers to the legacy Perl MTT client, but parts of it remain relevant to the Python-based MTT client.
The MTT is currently under active development, but it is also being used for testing the Open MPI project. Here are the steps that Open MPI core organizations can follow to use MTT for OMPI testing:
- Apply for an MTT username/password. This is used to submit testing results back to the central database; no results will be accepted without it. Usernames are given out on an organizational basis; for example, typical usernames are "hlrs", "iu", "cisco", "sun", etc. Send a mail to the MTT users mailing list requesting an account.
- Apply for access to the GitHub open-mpi/ompi-tests repository. Send your request to the MTT users mailing list. NOTE: This GitHub repository is only open to Open MPI core developer organizations.
- Requirements for running MTT:
- The MTT client is written almost entirely in Perl. We have not determined the oldest Perl version that can run MTT. Let us know if you have problems with older versions.
- MTT currently only supports direct connections to the internet. Specifically, the machines that you run MTT on must be able to download information from the web and/or perform Git operations (depending on how you configure MTT) and be able to push HTTPS information back to a secure web site. In the future, MTT will support scenarios with limited or no internet connectivity (via relaying), but that support does not yet exist.
- A lot of disk space. Depending on how you configure MTT, you could have many different compilations and installations of MPI, each of which will drive builds and runs of a variety of test suites.
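The requirements above can be checked before a first run with a few lines of shell. This is only a sketch: the function names `have_cmd`, `free_kb`, and `preflight` are made up for illustration and are not part of MTT, and the minimum free space is whatever you judge enough for your configuration. It does not test internet connectivity.

```shell
#!/bin/sh
# Hypothetical pre-flight helpers; none of these names come from MTT itself.

# have_cmd NAME: succeed if NAME is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# free_kb DIR: print the free space, in 1 KiB blocks, of the filesystem holding DIR.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# preflight SCRATCH_DIR MIN_FREE_KB: report problems on stderr and
# return non-zero if any were found.
preflight() {
    status=0
    have_cmd perl || { echo "perl not found on PATH" >&2; status=1; }
    if [ "$(free_kb "$1")" -lt "$2" ]; then
        echo "less than $2 KiB free under $1" >&2
        status=1
    fi
    return "$status"
}
```

For example, `preflight /my/scratch/space 52428800` would complain if Perl is missing or if less than roughly 50 GiB is free under the scratch tree; the threshold is an arbitrary illustration.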
- Get a checkout of the current MTT release. Since MTT is still under development, it does not yet have traditional source or binary distributions. Instead, get a git clone of http://github.com/open-mpi/mtt at the v3.0.0 tag. (Note that the Subversion code at http://svn.open-mpi.org/svn/mtt/trunk/ is no longer maintained.)
- Read the MTT overview to understand the general flow of how MTT works. It is important to understand this before trying to configure MTT.
- Copy the file samples/ompi-core-template.ini to your own private copy (source:/branches/ompi-core-testers/samples/ompi-core-template.ini). Edit this private copy in your favorite editor.
- Read the comments / documentation in this file.
- See this wiki page for a complete listing of all available INI file fields.
- See this wiki page for a complete listing of all funclets.
- You will need to examine, at a minimum, the places where it says "OMPI Core:" in the documentation to see if those values are suitable for you.
- You will definitely need to change the values of "mttdatabase_username" and "mttdatabase_password" to be the username/password that you received in step 1.
- If you are not running in a scheduled environment, fill in either the hostfile, hostlist, or max_np values in the global defaults section. See the comments in the file for explanations of each. The MTT client currently understands SLURM and PBS-like environments (e.g., Torque, PBS Pro).
- We strongly encourage everyone to test at least the v1.3 nightly snapshot tarballs.
- If you have more compute cycles available, testing the trunk nightly snapshot tarballs is also strongly encouraged.
- NOTE: tarballs are only generated as changes occur on that branch
- Pick which compilers you want to test with and set up an "MPI Install" section for each.
- Edit the configure arguments for OMPI as suitable for your environment (e.g., set the compiler that you want, configure flags for high-speed interconnects, resource managers, support libraries, etc.).
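Since results are rejected without valid credentials, it can be worth checking your private INI copy for the two database fields before running. The sketch below is a plain `grep`, not a real INI parser: it ignores sections and knows nothing about funclets, and it will accept any non-empty value, including a leftover placeholder. The function names `ini_key_is_set` and `check_ini` are hypothetical.

```shell
#!/bin/sh
# ini_key_is_set FILE KEY: succeed if FILE has a line "KEY = <non-empty value>".
ini_key_is_set() {
    grep -Eq "^[[:space:]]*$2[[:space:]]*=[[:space:]]*[^[:space:]]" "$1"
}

# check_ini FILE: report database credentials that are missing or empty.
check_ini() {
    status=0
    for key in mttdatabase_username mttdatabase_password; do
        if ! ini_key_is_set "$1" "$key"; then
            echo "$1: $key is missing or empty" >&2
            status=1
        fi
    done
    return "$status"
}
```

Running `check_ini /my/ini/file.ini` before the MTT client catches the most common oversight, but it does not validate the credentials against the database.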
- Set up MTT to run at some frequency. The following are recommended guidelines:
- Run via cron. Nightly would be great, but not everyone has resources to do that.
- Run with a dedicated user for MTT. Because of known bugs in Open MPI, the templated INI file has a "cleanup" step after each test run that kills all ORTE daemons and deletes all session directories belonging to the user who is running the MTT tests.
- Use the "--verbose" flag to the MTT client to get some output to verify that MTT is working properly. As we get more confidence in MTT, you can remove this flag and it should run silently (good for cron runs). Verbose output is a good indicator of which versions have been tested, which tests pass/fail, etc. If you run via cron and use --verbose to stdout, cron may automatically e-mail you this output (depending on your local setup).
- The "--print-time" flag can be used to ask MTT how long each phase took (as well as how long the entire run took). This can be useful for tweaking your INI file since all of us only have so many hours a day to dedicate resources to testing (i.e., if you're not careful, MTT can be configured to run for days!).
- The "--scratch" option should be used to identify the absolute pathname to a root of a tree where MTT can do all of its builds. This should be the root of a large disk area (perhaps even a local disk for speed). You must use a different scratch tree for every instance of MTT that you run. For example, if you're running MTT on two different architectures, you must use a different scratch tree for each.
- The "--file" option should be used to identify the absolute pathname to your INI file.
- Here's a sample script to run MTT:
:
# Go to the right directory
cd /path/to/my/mtt/checkout
# Get the latest release version
git pull --rebase
# Run MTT
./client/mtt --scratch /my/scratch/space --file /my/ini/file.ini --verbose --print-time
- Results from MTT runs can be viewed at https://www.open-mpi.org/mtt/. You will need your username/password to view the results.
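One way to honor the rule that every MTT instance needs its own scratch tree is to derive the path from the platform. The sketch below assumes one instance per operating system and CPU architecture; two instances on the same platform would still collide and need a further distinguishing component. `scratch_for` and `run_mtt` are hypothetical names, and the paths are placeholders as in the sample script.

```shell
#!/bin/sh
# scratch_for BASE: print a scratch path under BASE that differs per
# operating system and CPU architecture (built from uname -s and uname -m).
scratch_for() {
    printf '%s/%s-%s\n' "$1" "$(uname -s)" "$(uname -m)"
}

# run_mtt CHECKOUT BASE INI: run the client from CHECKOUT with the flags
# described above. The body runs in a subshell so the caller's working
# directory is left unchanged.
run_mtt() (
    scratch=$(scratch_for "$2")
    mkdir -p "$scratch" || exit 1
    cd "$1" || exit 1
    ./client/mtt --scratch "$scratch" --file "$3" --verbose --print-time
)
```

A cron job could then call `run_mtt /path/to/my/mtt/checkout /my/scratch/space /my/ini/file.ini` on each machine without hard-coding a per-machine scratch path.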
From Absoft, who is running MTT successfully on OS X platforms:
This one just worked after I installed the necessary packages which are default installs on OS X. In case anyone else wants to set this up, here are the steps I did on both the PowerPC and Intel OS X Leopard systems:
- Downloaded and installed the Darwinports 1.5 package from http://darwinports.com/
- Downloaded and installed the p5-crypt-ssleay package using the darwinports installer:
$ sudo port install p5-crypt-ssleay
- Downloaded and installed the coreutils package using the Darwinports installer:
$ sudo port install coreutils
- Make a sym link to "md5sum":
$ cd /opt/local/bin
$ sudo ln -s gmd5sum md5sum
- Ensure that /usr/sbin is in the path when you run MTT (e.g., via cron, where it is not by default) because MTT uses the system profiler in /usr/sbin to figure out the OS name, version, and architecture. Failure to add /usr/sbin to the path will result in "unknown_darwin_hardware_please_send_us_a_patch" in the MTT output for that platform.
This installs an SSL-enabled version of wget (used for submitting results back to the OMPI testing database) and md5sum / sha1sum (used for verifying nightly OMPI tarball downloads).
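The outcome of these OS X steps can be verified from the same environment that will run MTT (for instance, from inside the cron job). The following is a sketch under the assumption that `wget` and `md5sum` are the commands that matter; `path_contains` and `check_osx_env` are illustrative names, not MTT functions.

```shell
#!/bin/sh
# path_contains DIR: succeed if DIR is one of the entries in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# check_osx_env: report the problems described above on stderr and
# return non-zero if any were found.
check_osx_env() {
    status=0
    path_contains /usr/sbin || { echo "/usr/sbin is not in PATH" >&2; status=1; }
    for cmd in wget md5sum; do
        command -v "$cmd" >/dev/null 2>&1 || { echo "$cmd not found on PATH" >&2; status=1; }
    done
    return "$status"
}
```

Calling `check_osx_env` at the top of the cron script gives an early, readable failure instead of the "unknown_darwin_hardware_please_send_us_a_patch" string showing up later in the results.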
See this page.
- MTT does not yet clean up after itself. The scratch tree is left unmodified at the end of a run (the intent is to allow humans to go examine the tree, particularly after a failure). Future functionality will have MTT selectively clean up after itself (even failures will time out and eventually be deleted) such that MTT should never fill up a disk. However, since this is not yet implemented, you will need some sort of disk-cleaning mechanism (perhaps even "rm -rf /your/scratch/space" periodically).
- MTT does not yet support "disconnected" scenarios (i.e., where nodes are not directly connected to the internet and cannot download/upload data).
- Since MTT is still in development, it is possible that we will periodically be clearing and resetting the results database.
- The MTT client currently understands SLURM and PBS-like environments (e.g., Torque, PBS Pro). If you need support for other resource managers, let us know.
- Perl "syntax error" messages from the client have been reported with perl v4.036. It's best to use a more recent perl.
- The MTT client does not properly handle "&" characters in funclet arguments when "&" is not the beginning of another funclet name.
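For the disk-cleaning mechanism mentioned in the first item above, a gentler alternative to removing the whole scratch tree is to delete only top-level entries that have not been modified for some number of days. This is a sketch, not an MTT feature: `clean_scratch` is a hypothetical helper, it relies on the `-mindepth` and `-maxdepth` options found in GNU and BSD `find`, and a directory's modification time only reflects changes to its immediate entries, so the age test is approximate.

```shell
#!/bin/sh
# clean_scratch DIR DAYS: delete top-level entries of DIR whose modification
# time is more than DAYS days old. Refuses an empty path or "/".
clean_scratch() {
    case "$1" in
        ""|/) echo "refusing to clean '$1'" >&2; return 1 ;;
    esac
    [ -d "$1" ] || { echo "$1 is not a directory" >&2; return 1; }
    find "$1" -mindepth 1 -maxdepth 1 -mtime +"$2" -exec rm -rf {} +
}
```

Run it only when no MTT instance is using that scratch tree, for example `clean_scratch /my/scratch/space 7` just before the nightly run starts.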