Jakub K edited this page Nov 11, 2024 · 10 revisions

High-level flow of the full system:

The design is intentionally loosely coupled. The NIPA services run on netdev infrastructure. Other services (in grey) run on both netdev infrastructure and remote sites.

Communication between services happens via JSON files fetched over HTTPS. The purple interface boxes correspond to the JSON formats consumed and produced by executors (the services actually running tests). Since we want to give people the ability to run their own tests in whatever environment suits them best, the implementation of the executor is entirely up to whoever hosts it. These interfaces will not change in a backward-incompatible way.

Branch constructor

Branch constructor creates git branches and instructs the rest of the pipeline to test them. It maintains a manifest file (branches.json) and a set of files describing particular branches (branches-info.json). The pipeline only consumes the manifest; the info files are consumed only by the Status updater to map branches back to patchwork patches.

The manifest (branches.json) is an array of objects with the following fields:

| Field | Format | Description |
|---|---|---|
| branch | String | Name of the branch tested. Unique ID of the branch. |
| url | String | Git URL for fetching the branch. |
| date | String | ISO formatted date (UTC). When the branch was created. |
| base | String | Hash of the base commit, on top of which the changes to be tested are applied. |
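For illustration, a hypothetical branches.json with a single entry could look like this (all values are made up):

```json
[
  {
    "branch": "net-next-2024-11-11--00-00",
    "url": "https://example.org/linux.git",
    "date": "2024-11-11T00:00:00Z",
    "base": "1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b"
  }
]
```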

The detailed file (branches-info.json) is an object keyed by branch name; each entry is an object with the following fields:

| Field | Format | Description |
|---|---|---|
| series | Array of Number | IDs of patchwork series applied in the branch. |
| prs | Array of Number | IDs of patchwork pull requests applied in the branch. |
| extras | Array of String | Local patches the branch constructor applied (usually workarounds and fixes). |
| base-pulls | String | Extra branches pulled in to construct the tree (usually the net tree in case of net-next testing). |
| new-changes | Boolean | Whether the branch differs from the previous branch. |

Branches are deleted a month after creation.
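Putting the fields together, a hypothetical branches-info.json entry might look like this (all IDs and patch names are made up):

```json
{
  "net-next-2024-11-11--00-00": {
    "series": [123456],
    "prs": [],
    "extras": ["0001-selftests-example-workaround.patch"],
    "base-pulls": "net",
    "new-changes": true
  }
}
```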


Executors

Executors periodically check the branch manifest and kick off tests when new branches appear. It's also possible to hook executors up to GitHub Actions. There should be no filtering of whether tests should run for a given branch (e.g. based on whether the code exercised by the tests has changed); running everything simplifies analysis and reasoning about reliability of the infra (if the tests are not reported - it's a bug).

The simplest executor runs a configured script / binary and reports the result based on its exit code - exec.py
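The exit-code-to-result mapping can be sketched as below. This is a hedged illustration, not the actual exec.py; the executor and test names ("example-exec", "example-branch", "example-test") are made up:

```python
# Minimal sketch of an exec-style executor: run a configured command
# and map its exit code to a NIPA-style result object.
import datetime
import json
import subprocess
import sys


def run_test(cmd, test_name="example-test"):
    start = datetime.datetime.now(datetime.timezone.utc)
    proc = subprocess.run(cmd, capture_output=True)
    end = datetime.datetime.now(datetime.timezone.utc)
    # Exit code 0 -> pass, anything else -> fail
    return {
        "executor": "example-exec",   # hypothetical name
        "branch": "example-branch",   # hypothetical name
        "start": start.isoformat(),
        "end": end.isoformat(),
        "results": [{
            "test": test_name,
            "group": "local",
            "result": "pass" if proc.returncode == 0 else "fail",
        }],
    }


# A trivially passing command: the current Python interpreter doing nothing.
print(json.dumps(run_test([sys.executable, "-c", "pass"]), indent=2))
```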

GitHub executor "forwards" the testing to a GitHub CI - gh.py. It expects a branch which already has an open pull request. The executor pushes to this branch and watches the CI results for the already open pull request. It converts GitHub results into the NIPA format. This executor is built specifically for the BPF GitHub repo.

Kernel selftest executor uses virtme-ng to execute kernel selftests inside a virtual machine - vmksft-p.py. It's our main workhorse for all software testing.

Important: keep in mind that the NIPA executors are just samples; the only real requirement is that an executor produces correct results JSON.

remotes

Static / hand-edited file listing all remote sites. Its primary purpose is to inform the Collector where to fetch results from. The Collector uses it to poll the appropriate sources of results; it also times out the search if results do not arrive within 8 hours.
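The 8-hour timeout can be sketched as a simple timestamp comparison. This is an assumption-laden illustration of the logic described above, not the actual Collector code; the helper name is made up:

```python
# Hedged sketch of the Collector's per-branch result timeout.
import datetime

POLL_TIMEOUT = datetime.timedelta(hours=8)  # per the description above


def branch_timed_out(branch_date, now=None):
    """True once the Collector should stop waiting for this branch's results."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return now - branch_date > POLL_TIMEOUT


created = datetime.datetime(2024, 11, 11, tzinfo=datetime.timezone.utc)
print(branch_timed_out(created, created + datetime.timedelta(hours=9)))  # True
```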

Results

Results are the central component of the design. A basic remote site should host a manifest file (results.json) listing all runs and URLs for the full results from those runs. The manifest is an array of objects with the following fields:

| Field | Format | Description |
|---|---|---|
| executor | String | Name of the setup running the test; should include enough information to identify the entity (vendor) running the test. Used for grouping and to exclude unreliable executors from the final verdict. |
| branch | String | Name of the branch tested. Used to identify the code which was tested. |
| url | String | URL of the results for this run. null if tests have not finished. |

When a remote site ingests a new branch it should create entries for all of its executors in the results manifest, leaving the URL as null. The Collector will wait until all URLs are populated before declaring the remote done with a branch. Note that the Remote DB only lists sites, not executors, so the Collector does not know about all executors upfront. This allows sites to create new executors without updating the Remote DB.
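For illustration, a hypothetical results.json manifest with one finished and one in-flight run (executor name and URL are made up):

```json
[
  {
    "executor": "example-vm",
    "branch": "net-next-2024-11-10--00-00",
    "url": "https://example.org/results/net-next-2024-11-10--00-00.json"
  },
  {
    "executor": "example-vm",
    "branch": "net-next-2024-11-11--00-00",
    "url": null
  }
]
```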

Each run of tests on an executor generates a separate results file (linked in the manifest as URL).

Results file contains:

| Field | Format | Description |
|---|---|---|
| executor | String | See doc in manifest. |
| branch | String | See doc in manifest. |
| start | String | ISO formatted date (UTC). When execution started. |
| end | String | ISO formatted date (UTC). When execution finished. |
| link | String | (Optional) URL to detailed outputs for the entire executor run (including executor setup and infrastructure logs). |
| results | Array of Object | Tests and their results; see below. |

Each entry in results has the following fields:

| Field | Format | Description |
|---|---|---|
| test | String | Name of the test. |
| group | String | Group of tests, used to collapse outputs in the UI, and potentially to selectively run tests based on what changed in the tree. |
| result | String | pass / fail / skip. |
| link | String | URL to where the detailed outputs can be viewed. Could be the same for all tests if the executor generates a single combined output for all tests. Splitting up results per test is preferred. |
| time | Number | (Optional) Test runtime in seconds (may include a fractional part). |
| retry | String | (Optional) Result of the retry run if the first run failed - pass / fail / skip. |
| crashes | Array of String | (Optional) Fingerprints of crashes which happened during the test. |
| results | Array of Object | (Optional) Sub-tests (subcases) and their results; see below. Most fields are optional and inherited from the main test. |

Each subcase entry in the nested results has the following fields:

| Field | Format | Description |
|---|---|---|
| test | String | Subcase name. |
| result | String | pass / fail / skip. |
| time | Number | (Optional) Subcase runtime (in sec). |
| retry | String | (Optional) pass / fail / skip. |
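Tying the fields together, a hypothetical results file with one test and two subcases might look like this (all names, URLs and timings are made up):

```json
{
  "executor": "example-vm",
  "branch": "net-next-2024-11-11--00-00",
  "start": "2024-11-11T01:00:00Z",
  "end": "2024-11-11T02:30:00Z",
  "link": "https://example.org/logs/run-42/",
  "results": [
    {
      "test": "ping",
      "group": "selftests-net",
      "result": "fail",
      "retry": "pass",
      "time": 12.5,
      "link": "https://example.org/logs/run-42/ping/",
      "results": [
        {"test": "ipv4", "result": "pass"},
        {"test": "ipv6", "result": "fail", "retry": "pass"}
      ]
    }
  ]
}
```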

The contents of the manifest and result files should be rotated monthly.


Collector

Collector fetches results from remote sites periodically and populates a local database.

Status updater

Periodically reads the results collected by the Collector and reports results for individual patches to patchwork. The Status updater performs the reverse mapping from a branch to the list of patches it contained. One patch can be in multiple branches (in fact it's present in all branches until discarded from PW or merged). The status in patchwork will be the "best" result over all branches.
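The "best result over all branches" reduction can be sketched as below. The ordering pass > skip > fail is an assumption about what "best" means here, not something the actual Status updater is guaranteed to use:

```python
# Hedged sketch of picking the most favorable result a patch got
# across all the branches it appeared in.
RANK = {"fail": 0, "skip": 1, "pass": 2}  # assumed ordering


def best_result(results):
    """Return the most favorable result from a list of per-branch results."""
    return max(results, key=RANK.__getitem__)


print(best_result(["fail", "skip", "pass"]))  # prints "pass"
```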

filters

Static / hand edited file listing flaky tests which should be ignored.

Hacking

Running executors locally

If you need to modify or add executors, it is easy to test them locally. First, you need a config file, following the template usually put in a comment at the top of each executor, something like this:

```ini
[executor]
name=kunit
group=all
test=kunit-alltests
init=force
[remote]
branches=https://netdev.bots.linux.dev/static/nipa/branches.json
[local]
tree_path=/PATH/TO/linux
base_path=/PATH/TO/testing/kunit
results_path=results
json_path=jsons
[www]
url=https://something
```

Then, make sure that the repo under tree_path (in the [local] section) already has a branch with the same name as the newest branch in the testing repo (from branches in the [remote] section). That way the executor won't need to fetch into your Linux repo; it should do the build in place.

Finally, the executor can be started, e.g. /PATH/TO/contest/remote/kunit.py