Jakub K edited this page Nov 11, 2024 · 10 revisions

High-level flow of the full system:

The design is intentionally loosely coupled. The NIPA services run on netdev infrastructure. Other services (in grey) run on both netdev infrastructure and remote sites.

Communication between services happens via JSON files fetched over HTTPS. The purple interface boxes correspond to the JSON formats consumed and produced by executors (the services actually running tests). Since we want to give people the ability to run their own tests in whatever environment suits them best, the implementation of the executor is entirely up to whoever hosts it. These interfaces will not change in a backward-incompatible way.

Branch constructor

Branch constructor creates git branches and instructs the rest of the pipeline to test them. It maintains a manifest file (branches.json) and a set of files describing particular branches (branches-info.json). The pipeline only consumes the manifest; the info files are consumed only by the Status updater to map branches back to patchwork patches.

The manifest (branches.json) is an array of objects with the following fields:

| Field | Format | Description |
|---|---|---|
| branch | String | Name of the branch tested. Unique ID of the branch. |
| url | String | Git URL for fetching the branch. |
| date | String | ISO formatted date (UTC). When the branch was created. |
| base | String | Hash of the base commit, on top of which the changes to be tested are applied. |
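For illustration, a hypothetical branches.json with a single entry could look like this (all values are made up):

```json
[
  {
    "branch": "net-next-2024-11-11--00-00",
    "url": "https://example.org/linux.git",
    "date": "2024-11-11T00:00:00Z",
    "base": "1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b"
  }
]
```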

The detailed file (branches-info.json) is an object keyed by branch name; each entry is an object with the following fields:

| Field | Format | Description |
|---|---|---|
| series | Array of Number | IDs of patchwork series applied in the branch. |
| prs | Array of Number | IDs of patchwork pull requests applied in the branch. |
| extras | Array of String | Local patches the branch constructor applied (usually workarounds and fixes). |
| base-pulls | String | Extra branches pulled in to construct the tree (usually the net tree in case of net-next testing). |
| new-changes | Boolean | Whether the branch differs from the previous branch. |

Branches are deleted a month after creation.
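Putting the fields together, a hypothetical branches-info.json entry might look like this (all IDs and patch names are made up):

```json
{
  "net-next-2024-11-11--00-00": {
    "series": [123456],
    "prs": [],
    "extras": ["0001-selftests-example-workaround.patch"],
    "base-pulls": "net",
    "new-changes": true
  }
}
```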


Executors

Executors periodically check the branch manifest and kick off tests when new branches appear. It's also possible to hook executors up to GitHub Actions. There should be no filtering of whether tests should run for a given branch (e.g. based on whether the code exercised by the tests has changed); running everything simplifies analysis and reasoning about reliability of the infra (if the tests are not reported - it's a bug).

The simplest executor runs a configured script / binary and reports the result based on its exit code - exec.py
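The exit-code-to-result mapping can be sketched as below. This is a hedged illustration, not the actual exec.py; the executor and test names ("example-exec", "example-branch", "example-test") are made up:

```python
# Minimal sketch of an exec-style executor: run a configured command
# and map its exit code to a NIPA-style result object.
import datetime
import json
import subprocess
import sys


def run_test(cmd, test_name="example-test"):
    start = datetime.datetime.now(datetime.timezone.utc)
    proc = subprocess.run(cmd, capture_output=True)
    end = datetime.datetime.now(datetime.timezone.utc)
    # Exit code 0 -> pass, anything else -> fail
    return {
        "executor": "example-exec",   # hypothetical name
        "branch": "example-branch",   # hypothetical name
        "start": start.isoformat(),
        "end": end.isoformat(),
        "results": [{
            "test": test_name,
            "group": "local",
            "result": "pass" if proc.returncode == 0 else "fail",
        }],
    }


# A trivially passing command: the current Python interpreter doing nothing.
print(json.dumps(run_test([sys.executable, "-c", "pass"]), indent=2))
```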

GitHub executor "forwards" the testing to a GitHub CI - gh.py. It expects a branch which already has an open pull request. The executor pushes to this branch and watches the CI results for the already open pull request. It converts GitHub results into the NIPA format. This executor is built specifically for the BPF GitHub repo.

Kernel selftest executor uses virtme-ng to execute kernel selftests inside a virtual machine - vmksft-p.py. It's our main workhorse for all software testing.

Important: keep in mind that the NIPA executors are just samples; the only real requirement is that an executor produces correct results JSON.

remotes

Static / hand-edited file listing all remote sites. Its primary purpose is to inform the Collector where to fetch results from. The Collector uses it to poll the appropriate sources of results; it also times out the search if results do not arrive within 8 hours.
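The 8-hour timeout can be sketched as a simple timestamp comparison. This is an assumption-laden illustration of the logic described above, not the actual Collector code; the helper name is made up:

```python
# Hedged sketch of the Collector's per-branch result timeout.
import datetime

POLL_TIMEOUT = datetime.timedelta(hours=8)  # per the description above


def branch_timed_out(branch_date, now=None):
    """True once the Collector should stop waiting for this branch's results."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return now - branch_date > POLL_TIMEOUT


created = datetime.datetime(2024, 11, 11, tzinfo=datetime.timezone.utc)
print(branch_timed_out(created, created + datetime.timedelta(hours=9)))  # True
```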

Results

Results are the central component of the design. A basic remote site should host a manifest file (results.json) listing all runs and URLs for the full results from those runs. The manifest is an array of objects with the following fields:

| Field | Format | Description |
|---|---|---|
| executor | String | Name of the setup running the test; should include enough information to identify the entity (vendor) running the test. Used for grouping and to exclude unreliable executors from the final verdict. |
| branch | String | Name of the branch tested. Used to identify the code which was tested. |
| url | String | URL of the results for this run. null if tests have not finished. |

When a remote site ingests a new branch it should create entries for all of its executors in the results manifest, leaving the URL as null. The Collector will wait until all URLs are populated before declaring the remote done with a branch. Note that the Remote DB only lists sites, not executors, so the Collector does not know about all executors upfront. This allows sites to create new executors without updating the Remote DB.
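For illustration, a hypothetical results.json manifest with one finished and one in-flight run (executor name and URL are made up):

```json
[
  {
    "executor": "example-vm",
    "branch": "net-next-2024-11-10--00-00",
    "url": "https://example.org/results/net-next-2024-11-10--00-00.json"
  },
  {
    "executor": "example-vm",
    "branch": "net-next-2024-11-11--00-00",
    "url": null
  }
]
```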

Each run of tests on an executor generates a separate results file (linked in the manifest as URL).

Results file contains:

| Field | Format | Description |
|---|---|---|
| executor | String | See doc in manifest. |
| branch | String | See doc in manifest. |
| start | String | ISO formatted date (UTC). When execution started. |
| end | String | ISO formatted date (UTC). When execution finished. |
| link | String | (Optional) URL to detailed outputs for the entire executor run (including executor setup and infrastructure logs). |
| results | Array of Object | Tests and their results; see below. |

Each entry in results has the following fields:

| Field | Format | Description |
|---|---|---|
| test | String | Name of the test. |
| group | String | Group of tests, used to collapse outputs in the UI, and potentially to selectively run tests based on what changed in the tree. |
| result | String | pass / fail / skip. |
| link | String | URL to where the detailed outputs can be viewed. Could be the same for all tests if the executor generates a single combined output for all tests. Splitting up results per test is preferred. |
| time | Number | (Optional) Test runtime in seconds (may include a fractional part). |
| retry | String | (Optional) Result of the retry run if the first run failed - pass / fail / skip. |
| crashes | Array of String | (Optional) Fingerprints of crashes which happened during the test. |
| results | Array of Object | (Optional) Sub-tests (subcases) and their results; see below. Most fields are optional and inherited from the main test. |

Each subcase entry in the nested results has the following fields:

| Field | Format | Description |
|---|---|---|
| test | String | Subcase name. |
| result | String | pass / fail / skip. |
| time | Number | (Optional) Subcase runtime (in sec). |
| retry | String | (Optional) pass / fail / skip. |
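Tying the fields together, a hypothetical results file with one test and two subcases might look like this (all names, URLs and timings are made up):

```json
{
  "executor": "example-vm",
  "branch": "net-next-2024-11-11--00-00",
  "start": "2024-11-11T01:00:00Z",
  "end": "2024-11-11T02:30:00Z",
  "link": "https://example.org/logs/run-42/",
  "results": [
    {
      "test": "ping",
      "group": "selftests-net",
      "result": "fail",
      "retry": "pass",
      "time": 12.5,
      "link": "https://example.org/logs/run-42/ping/",
      "results": [
        {"test": "ipv4", "result": "pass"},
        {"test": "ipv6", "result": "fail", "retry": "pass"}
      ]
    }
  ]
}
```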

The contents of the manifest and result files should be rotated monthly.


Collector

Collector fetches results from remote sites periodically and populates a local database.

Status updater

Periodically reads the results collected by the Collector and reports results for individual patches to patchwork. The Status updater performs the reverse mapping from a branch to the list of patches it contained. One patch can be in multiple branches (in fact it's present in all branches until discarded from PW or merged). The status in patchwork will be the "best" result over all branches.
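The "best result over all branches" reduction can be sketched as below. The ordering pass > skip > fail is an assumption about what "best" means here, not something the actual Status updater is guaranteed to use:

```python
# Hedged sketch of picking the most favorable result a patch got
# across all the branches it appeared in.
RANK = {"fail": 0, "skip": 1, "pass": 2}  # assumed ordering


def best_result(results):
    """Return the most favorable result from a list of per-branch results."""
    return max(results, key=RANK.__getitem__)


print(best_result(["fail", "skip", "pass"]))  # prints "pass"
```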

filters

Static / hand edited file listing flaky tests which should be ignored.

Hacking

Running executors locally

If you need to modify or add executors, it is easy to test them locally. First, you need a config file, following the template usually put in a comment at the top of each executor, something like this:

```ini
[executor]
name=kunit
group=all
test=kunit-alltests
init=force
[remote]
branches=https://netdev.bots.linux.dev/static/nipa/branches.json
[local]
tree_path=/PATH/TO/linux
base_path=/PATH/TO/testing/kunit
results_path=results
json_path=jsons
[www]
url=https://something
```

Then, make sure that the repo under tree_path (in the [local] section) already has a branch with the same name as the newest branch in the testing repo (from branches in the [remote] section). That way the executor won't need to fetch into your Linux repo; it should do the build in place.

Finally, the executor can be started, e.g. /PATH/TO/contest/remote/kunit.py