Developing large, parallel, scalable applications is arguably the most demanding effort that end-users of HPC systems like Summit face. However, once an application is ready for production runs, a strong understanding of and familiarity with the user environment can be just as critical for a team to be productive.
The user environment includes interfaces to the batch scheduler and the parallel job launcher, the structure of the available file systems, and any user-configurable parts of the system. This challenge will rely on interaction with Summit's batch scheduler, IBM Spectrum LSF.
As you may have already seen in Basic_Workflow, LSF provides the fundamental mechanisms to submit batch jobs, monitor their status after they've been enqueued, and control them if needed.
We won't be submitting any new jobs here, but rather looking at others that have already been run and gathering information about them. To do this, we'll primarily use the `bhist` command. (See the `bhist` manual page by running `man bhist` for a full list of command options.)
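As a starting point, here is a minimal sketch of how a time-window query might be put together. The helper `bhist_interval` is hypothetical (it is not part of LSF), and the example dates are arbitrary. It assumes `bhist -C` takes an interval of the form `yyyy/mm/dd/HH:MM,yyyy/mm/dd/HH:MM` with no spaces; confirm this against `man bhist` on your system.

```shell
# Hypothetical helper (not part of LSF): turn two "YYYY-MM-DD HH:MM"
# timestamps into the "start,end" interval string assumed for bhist -C.
bhist_interval() {
    # Join the two timestamps with a comma, then replace every
    # hyphen and space with a slash.
    printf '%s,%s\n' "$1" "$2" | sed 's|[- ]|/|g'
}

# Example use on a system with LSF (left commented out so the snippet
# also runs where bhist is not installed):
#   -C      only jobs that completed or exited in the interval
#   -u all  jobs from all users, not just your own
#   -n 0    search all event log files, not just the most recent one
# bhist -C "$(bhist_interval '2022-05-01 00:00' '2022-05-07 23:59')" -u all -n 0
```

Whether `-n 0` is needed depends on how far back the interval reaches and how often the site rotates its LSF event logs, so treat it as something to verify rather than a requirement.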
- How many jobs were completed on Ascent between 00:00 (midnight) on June 1, 2022 and 23:59 on June 15, 2022?
- How many unique users did the jobs from question 1 belong to?
- Of the jobs found in question 1, what's the job ID of the longest running job?
- How long was it pending (pre-execution), and how long did it run (actual execution time)?
- When was it submitted?
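
Counting jobs and users by hand gets tedious, so the short `bhist` table can be piped through a few small filters. The functions below are a hypothetical sketch, not an official recipe. They assume the default short output has the columns `JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL`, with times in seconds, and that job names contain no spaces (a name with spaces would shift the columns). Check the header that `bhist` prints on your system before trusting the results.

```shell
# Sketch: summarize the short bhist table read from stdin.
# Assumed columns (verify against the header bhist prints):
#   JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
# Caveat: a job name containing spaces would shift the columns.

# Keep only data rows, i.e. lines whose first field starts with a digit.
job_rows() {
    awk '$1 ~ /^[0-9]/'
}

# Number of jobs in the table.
count_jobs() {
    job_rows | wc -l | tr -d ' '
}

# Number of distinct user names (column 2).
count_users() {
    job_rows | awk '{ print $2 }' | sort -u | wc -l | tr -d ' '
}

# Print "JOBID PEND RUN" for the row with the largest RUN value (column 6).
longest_job() {
    job_rows | awk 'NR == 1 || $6 + 0 > max {
                        max = $6 + 0
                        line = $1 " " $4 " " $6
                    }
                    END { print line }'
}
```

On a system with LSF, the table could be saved once (for example `bhist ... > jobs.txt`) and then passed to each function, as in `count_users < jobs.txt`. For the submission time of a single job, the long listing from `bhist -l` followed by the job ID is the more direct source.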