Timing Measurements
The pilot sends a timing string to the server during the final job update, in the following condensed format:
pilotTiming = time_getjob | time_stagein | time_payload | time_stageout | time_initial_setup | time_payload_setup
where
- time_getjob: time for getJob curl operation to finish.
- time_stagein: time for entire stage-in to complete, including replica lookup. Note: the pilot cannot measure the time for direct i/o as this operation is handled by the transform.
- time_payload: time for payload execution. Note: this includes any pre- or post-processing.
- time_stageout: time for stage-out to complete, including log transfer.
- time_initial_setup: the initial setup time is the time measured from pilot startup to the get job operation. During this time the pilot downloads queue data, checks the proxy lifetime, etc.
- time_payload_setup: the time measured from before to after the payload setup. If a time/date string in the format '%H:%M:%S %Y/%m/%d' is present at the beginning of payload.stdout, the pilot uses it at the end of the payload to improve the setup time measurement.
(As of June 2022, time_total_setup is still being reported; it equals time_initial_setup + time_payload_setup.)
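As an illustration of the format above, the following minimal sketch splits a pilotTiming string into its six named fields. The function name `parse_pilot_timing`, the tolerance for whitespace around the `|` separator, and the assumption that each value is an integer number of seconds are hypothetical; the pilot and server code may handle the string differently.

```python
from typing import Dict

# Field order as given in the pilotTiming format description.
FIELDS = (
    "time_getjob",
    "time_stagein",
    "time_payload",
    "time_stageout",
    "time_initial_setup",
    "time_payload_setup",
)


def parse_pilot_timing(timing: str) -> Dict[str, int]:
    """Split a '|'-separated timing string into a dict keyed by field name.

    Assumption: every value is an integer number of seconds.
    """
    parts = [part.strip() for part in timing.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, (int(part) for part in parts)))


def total_setup_time(timing: Dict[str, int]) -> int:
    """time_total_setup = time_initial_setup + time_payload_setup."""
    return timing["time_initial_setup"] + timing["time_payload_setup"]


if __name__ == "__main__":
    example = parse_pilot_timing("2|35|3600|40|12|8")  # made-up values
    print(example["time_payload"], total_setup_time(example))
```

Keeping the field order in a single tuple means the parser only needs one change if the format is ever extended.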
From pilot version 3.9.2, in the case of remote i/o verification, the time to complete the verification is subtracted from the setup time, and added to the stage-in time. Prior to this version, it was included in the setup time.
The pilot reports CPU timing information on every server update. While the payload is running, the measurements (system+user time for all child processes) are taken approximately once a minute using /proc/pid/stat; a final measurement is taken immediately after the payload has finished using os.times().
Given an initial measurement t0 = os.times(), the user and system times are calculated as follows:
- t1 = os.times()
- user_time = t1[2] - t0[2]
- system_time = t1[3] - t0[3]
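The calculation above can be sketched in Python as follows. Indices 2 and 3 of the `os.times()` result are the user and system CPU times accumulated by child processes that have terminated and been waited for. The helper names and the trivial child command are hypothetical and only serve the illustration; this is not the pilot's actual code.

```python
import os
import subprocess
import sys
from typing import Sequence, Tuple


def child_cpu_times(t0: Sequence[float], t1: Sequence[float]) -> Tuple[float, float]:
    """Return (user_time, system_time) used by child processes between t0 and t1."""
    user_time = t1[2] - t0[2]    # children_user
    system_time = t1[3] - t0[3]  # children_system
    return user_time, system_time


def measure_child_cpu(cmd: Sequence[str]) -> Tuple[float, float]:
    """Run cmd as a child process and return the CPU time it consumed."""
    t0 = os.times()
    # subprocess.run waits for the child, so its CPU time is included in t1.
    subprocess.run(cmd, check=True)
    t1 = os.times()
    return child_cpu_times(t0, t1)


if __name__ == "__main__":
    # Hypothetical stand-in for a payload: a short CPU-bound Python command.
    user, system = measure_child_cpu(
        [sys.executable, "-c", "sum(i * i for i in range(10**6))"]
    )
    print(f"user={user:.2f}s system={system:.2f}s")
```

Because the children fields only account for processes that have already been reaped, this approach suits the final measurement after the payload has finished rather than sampling while it is still running.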
The instant CPU timing calculation extracts the system+user time from /proc/pid/stat for a given pid, converting clock ticks to seconds with os.sysconf(os.sysconf_names['SC_CLK_TCK']), and loops over the stat files of all child processes.
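A minimal, Linux-only sketch of this kind of measurement is shown below. It reads the utime and stime fields (fields 14 and 15 of /proc/pid/stat) and converts them from clock ticks to seconds. The function names are hypothetical, and how the pilot discovers the child pids is not covered here: the caller is assumed to supply the list of pids.

```python
import os
from typing import Iterable


def cpu_seconds_from_stat(stat_line: str, clk_tck: int) -> float:
    """Return user+system CPU seconds from the contents of a /proc/<pid>/stat file."""
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split only the part after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12]; both are in clock ticks.
    utime_ticks = int(rest[11])
    stime_ticks = int(rest[12])
    return (utime_ticks + stime_ticks) / clk_tck


def total_cpu_seconds(pids: Iterable[int]) -> float:
    """Sum user+system CPU seconds over the given pids (e.g. all child processes)."""
    clk_tck = os.sysconf(os.sysconf_names["SC_CLK_TCK"])
    total = 0.0
    for pid in pids:
        try:
            with open(f"/proc/{pid}/stat") as stat_file:
                total += cpu_seconds_from_stat(stat_file.read(), clk_tck)
        except OSError:
            # The process may have exited between discovery and reading.
            continue
    return total


if __name__ == "__main__":
    print(f"{total_cpu_seconds([os.getpid()]):.2f} s")
```

Separating the parsing from the file access keeps the tick-to-second conversion testable without a live process.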