Machine Queue

This project contains the scripts to allow remote access to the suite of machines maintained by the Trustworthy Systems group at UNSW, Sydney for testing seL4 and systems built on seL4.

Close collaborators and seL4 Foundation members can request access, and use these scripts to deploy their payloads onto our CI infrastructure; they are also used by GitHub's CI Actions for seL4.

Set Up

Once you have a Trustworthy Systems account, clone this repository and create a symlink to mq.sh in a directory on your $PATH.

For example, you can do:

git clone git@github.com:seL4/machine_queue.git
H=$(pwd)/machine_queue
cd ~/bin
ln -s "$H/mq.sh" mq

The mq scripts assume a standard POSIX system, that /bin/sh is POSIX compliant, and that you can use ssh to reach tftp.keg.cse.unsw.edu.au without a password. To set up the latter, add to your ~/.ssh/config:

Host tftp.keg.cse.unsw.edu.au
    ProxyJump login.trustworthy.systems
    ControlMaster auto
    ControlPersist 300
    ControlPath /home/%u/.ssh/controlmaster/%h-%p-%r.sock
    ServerAliveInterval 60
    TCPKeepAlive yes

or clauses with similar effect. The script makes many ssh connections in a row; without connection multiplexing this can get painfully slow.
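OpenSSH does not create the directory named in ControlPath, so it has to exist before the first multiplexed connection. A minimal sketch, assuming the ControlPath shown above and that your home directory is /home/<user>; the function name setup_control_dir is made up for this example:

```shell
# Create the directory that holds the ControlMaster sockets.
# ssh does not create it, and it should be private to your user.
setup_control_dir() {
    dir=${1:-"$HOME/.ssh/controlmaster"}
    mkdir -p "$dir" && chmod 700 "$dir"
}
```

Run setup_control_dir once. After that, and once your key is accepted on both hosts, `ssh tftp.keg.cse.unsw.edu.au true` should return without a password prompt.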

Usage

mq has many subcommands:

  • run --- run a payload on a system
  • systems --- list available systems
  • sem --- interact directly with locks
  • system-tsv --- list available systems as tab-separated values
  • pool-tsv --- list available pools of systems as tab-separated values

mq run

mq.sh run -r|-c text [-l logfile] -s system [-w retry-time] [-t retry-count] [-n] [-a] [-d timeout] [-e text] [-k key] [-L] -f file1 [-f file2] .. [-f filen]

Acquires a lock for the machine called system (or a machine in the pool called system), and, once locked, runs the specified job.

Output from the machine is collected and passed back to the user both on stdout and into an optional logfile.

Jobs can be cancelled at any time with ^C, which will notify the server (if the job is running) and remove the job from the queue.

Returns 0 on success, and nonzero if something went wrong.

Options:

  • -r Reserves the device. Will not reboot or run an image
  • -n No lock changes. Checks that you have the lock, and then runs an image. Will not unlock afterwards.
  • -a Keep the machine alive after the completion or error text is detected. The console becomes read-write after the text has been found.
  • -c TEXT Image is run until the specified completion text is found.
  • -e TEXT Image is run until the specified error text is found.
  • -d TIME Timeout (in seconds) to wait for the completion text (default -1, which means no timeout)
  • -k KEY Key for obtaining the lock
  • -l FILE Optional location to write all the console output to
  • -L The image is a Linux image, not seL4
  • -s TEXT Specifies which machine this job is for
  • -f FILE [+] Files to use as the job image. Most systems need a single image file; x86 currently expects two, the kernel and the root task.
  • -w TIME Number of seconds to wait between each attempt to acquire the lock (default 8)
  • -t RETRIES Number of retries to perform for acquiring the lock (default -1, which means infinity)
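As an illustration of how these options combine, here is a minimal sketch of a wrapper around mq run. It uses only the flags listed above; the function name run_image, the marker strings, the 600 second timeout and the log file name are placeholders chosen for this example, not values the scripts require.

```shell
# Boot an image and wait for a completion string.
# The marker strings, timeout and log file name are placeholders.
run_image() {
    system=$1    # machine or pool name, as listed by `mq systems`
    image=$2     # image file to boot
    # -c/-e: stop when the completion or error text appears on the console
    # -d:    wait at most 600 seconds for the completion text
    # -l:    also write the console output to console.log
    mq run -c 'All is well' -e 'FAILED' -d 600 \
        -l console.log -s "$system" -f "$image"
}
```

Because mq run returns 0 on success, the wrapper can be used directly in a condition, for example `run_image mysystem build/image.elf || echo "mq run failed" >&2` (mysystem and the image path are hypothetical).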

mq sem dumpall|-signal|-wait|-cancel|-info|-mr-info [-f] [-w retry-time] [-t retry-count] [-k LOCK_KEY] [-T timeout]

Manually manipulate locks for machines. The lock for each system can be acquired or released.

You can forcibly release a lock that you do not currently own by using the -f flag.

Options:

  • -info SYSTEM Display lock information for the specified SYSTEM
  • -mr-info SYSTEM Display lock information for the specified SYSTEM in machine-readable format
  • -signal SYSTEM Release the lock for the specified SYSTEM
  • -wait SYSTEM Acquire the lock for the specified SYSTEM
  • -cancel SYSTEM Cancel -wait processes on the server that are waiting for specified SYSTEM and key
  • -w TIME Number of seconds to wait between each attempt to acquire the lock (default 8)
  • -t RETRIES Number of retries to perform for acquiring the lock (default -1, which means infinity)
  • -f Forcefully releases a lock even if you are not the owner
  • -k LOCK_KEY Set a key inside the lock
  • -T timeout Allow lock to be reclaimed after timeout seconds
  • dumpall Prints all currently locked systems
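The acquire and release operations can be paired around a command so the lock is released even when the command fails. This is a sketch only: the helper name with_lock is made up, and it assumes that -k may follow the SYSTEM argument and that the same key is accepted by -wait and -signal.

```shell
# Hold the lock for SYSTEM under KEY while running a command, then release it.
with_lock() {
    system=$1
    key=$2
    shift 2
    # Give up if the lock cannot be acquired.
    mq sem -wait "$system" -k "$key" || return 1
    # Run the caller's command and remember its exit status.
    "$@"
    rc=$?
    # Release the lock whether or not the command succeeded.
    mq sem -signal "$system" -k "$key"
    return "$rc"
}
```

A possible use, with a hypothetical system name and key, is `with_lock mysystem mykey mq run -n -k mykey -c 'All is well' -s mysystem -f image.elf`, where -n makes mq run check for the existing lock instead of taking and releasing its own.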

mq systems [help|simple]

Prints the list of available systems. With simple, prints only their names.
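The name list is convenient for looping over every system. The sketch below assumes that mq systems simple prints nothing but names separated by whitespace (spaces or newlines); the function name lock_report is made up for this example.

```shell
# Print lock information for every system that mq knows about.
lock_report() {
    # Unquoted on purpose: split the name list on any whitespace.
    for s in $(mq systems simple); do
        printf '== %s ==\n' "$s"
        mq sem -info "$s"
    done
}
```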

Two other commands are intended mainly for use in scripts: mq system-tsv and mq pool-tsv take no arguments and write all the systems and all the pools, respectively, to stdout.