The tests are built on top of the Beats Test Framework, where you can find a detailed description on how to run the test suite.
To run the unit tests, you can use `make test` or simply `go test ./...`. The unit tests do not require any external services.
The APM Server "system tests" run the APM Server in various scenarios, with the Elastic Stack running inside Docker containers.
To run the system tests locally, you can run `go test` inside the `systemtest` directory.
Some tests make use of the concept of snapshot or approvals testing. If running tests leads to changed snapshots, you can use the `approvals` tool to update the snapshots.
The intended workflow is:
- Run `make update` to create the `approvals` binary that supports reviewing changes.
- Run `make test`, which will create a `*.received.json` file for every newly created or changed snapshot.
- Run `make check-approvals` to review and interactively accept the changes.
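The workflow above can be sketched as a small shell helper. `list_received_snapshots` and `approvals_workflow` are hypothetical names, not part of the repository. The sketch also assumes that `make test` may exit non-zero when snapshots have changed, so a failure there does not stop the review step.

```shell
# list_received_snapshots [DIR]
# Prints every *.received.json file under DIR (default: current directory),
# i.e. the snapshots that are still waiting for review.
list_received_snapshots() {
  find "${1:-.}" -type f -name '*.received.json' | sort
}

# approvals_workflow
# Runs the three documented steps in order from the repository root.
approvals_workflow() {
  # Step 1: build the approvals binary.
  make update || return 1
  # Step 2: run the tests. Assumption: a failure here may only mean that
  # snapshots changed, so we report it and carry on.
  make test || echo "make test failed; continuing to snapshot review" >&2
  # Show which snapshots are pending before the interactive review starts.
  list_received_snapshots .
  # Step 3: review and interactively accept the changes.
  make check-approvals
}
```

Calling `list_received_snapshots systemtest` on its own is a quick way to check whether any snapshots are still pending under that directory.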
To run simple benchmark tests, run:
make bench
A good way to present your results is by using `benchcmp`.
With your changes in the current working tree, do:
$ go install golang.org/x/tools/cmd/benchcmp@latest
$ make bench > new.txt
$ git checkout main
$ make bench > old.txt
$ benchcmp old.txt new.txt
The macro benchmarking focuses on measuring the APM Server's performance (throughput) and how changes in the codebase impact that performance.
Our legacy benchmarking leverages Hey APM to run daily benchmarks that aim to measure the APM Server's overall throughput across a variety of cases. All of these cases are generated with the APM Go agent at the same time the benchmark is executed, which limits the complexity and variety of the data generated for the benchmark scenarios. The results of these benchmarks are then indexed into Elasticsearch. Weekly reports that compare the current results against the last week, month and 3 months are posted in Slack.
The new benchmarking framework using `apmbench` uses pre-recorded APM Agent events for the benchmarks. This allows us to generate richer events, which can be used to assess the Server's throughput.
The APM Integration testing is used to generate the events, and `intake-receiver` will capture the events that are sent to the intake API.
`apmbench` will also generate additional metrics compared to Hey APM, allowing for a better understanding of where any potential bottlenecks may be, and how APM Server consumes the available resources.
TODO(marclop): convert the dot diagrams from dot to mermaid so they can be read in Markdown documents
The applications that are used to generate the stored traces may not always come from `apm-integration-testing`. Instead, we may want to write specific applications that generate a specific type of event, rather than re-use the existing opbeans applications.
The events are currently committed in the `apm-server` repository (`apm-server/systemtest/benchtest/events`). This may change in the near future, and instead, we'll download the stored traces on-demand and upload/update them periodically.
# Navigate to your local copy of 'elastic/apm-integration-testing'.
$ SLEEP=180 STACK_VERSION=8.1.2 RPM=5000; ./scripts/compose.py start $STACK_VERSION --opbeans-go-loadgen-rpm ${RPM} --opbeans-python-loadgen-rpm ${RPM} --opbeans-node-loadgen-rpm $((${RPM} * 2)) --opbeans-ruby-loadgen-rpm ${RPM} --with-opbeans-go --no-apm-server-self-instrument --with-opbeans-python --with-opbeans-ruby --with-opbeans-node --apm-server-record --loadgen-no-ws && sleep $SLEEP && make copy-events; docker-compose down
...
# Copy the generated traces to the location where `apmbench` expects them to be (`apm-server/systemtest/benchtest/events`).
# Assuming that the `apm-server` repository has been checked out at the same level as `apm-integration-testing`.
$ cp -r events ../apm-server/systemtest/benchtest/events
`apmbench` is located in `systemtest/cmd/apmbench` and can target any APM Server with `apm-server.expvar.enabled` set to `true` to be able to calculate basic throughput measurements, but `apm-server.pprof.enabled` should also be set to `true` if any of the `-blockprofile`, `-cpuprofile`, `-memprofile` or `-mutexprofile` flags are set.
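As a sketch of the requirement above, the helper below prints the configuration overrides an APM Server would need before `apmbench` targets it. It assumes the server is started from the command line and accepts `-E key=value` overrides. `apm_server_overrides` is a hypothetical helper name, and only the two setting names come from this document.

```shell
# apm_server_overrides [PROFILING]
# Prints the -E overrides needed by apmbench. expvar is always enabled;
# pprof is only added when PROFILING is "true", that is, when one of the
# -blockprofile, -cpuprofile, -memprofile or -mutexprofile flags will be used.
apm_server_overrides() {
  overrides="-E apm-server.expvar.enabled=true"
  if [ "${1:-false}" = "true" ]; then
    overrides="$overrides -E apm-server.pprof.enabled=true"
  fi
  echo "$overrides"
}
```

For instance, `./apm-server $(apm_server_overrides true)` would start a locally built binary (the path is an assumption) with both settings enabled.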
The default behavior of `apmbench` is to send the captured events to the target APM Server as fast as possible with the configured number of `-agents`. The `-agents` flag determines how many concurrent goroutines will be used to send the events to the APM Server in parallel. The `-max-rate` flag can be used to specify the rate of events, as `eps` or `epm`, to send to the APM Server instead of the default behaviour. To benchmark the APM Server in a setup similar to what we'd see in production, the number of agents should be high (`>500`).
By default, `apmbench` will warm up the APM Server by sending N events to the APM Server before any of the benchmark scenarios are run. That N can be configured via `-warmup-events` and defaults to a conservative number.
The default `-benchtime` is `1s`, which isn't a great default for our purposes. If you're benchmarking changes to the APM Server, set the duration to at least `30s` to get some quick feedback. Our periodic benchmarks should aim to run for longer so that any long-queue effects can be detected.
The rest of the flags configure `apmbench` so it can target an APM Server. These can be configured via the flags themselves or via their `ELASTIC_APM_<UPPERCASE FLAG NAME>` alternative. For example, to configure the server URL, set `ELASTIC_APM_SERVER_URL` to the full URL of the APM Server you'd like to benchmark.
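The settings described above can be combined in a small wrapper. The following is only a sketch: `run_apmbench` and `DRY_RUN` are hypothetical names, and it assumes `apmbench` can be started with `go run .` from `systemtest/cmd/apmbench`. Only `-agents`, `-benchtime` and `ELASTIC_APM_SERVER_URL` are taken from this document.

```shell
# run_apmbench [AGENTS] [BENCHTIME]
# Hypothetical wrapper: runs apmbench with a production-like agent count
# (>500) and a benchtime long enough for quick feedback (>=30s).
# Set DRY_RUN=1 to print the command instead of executing it.
run_apmbench() {
  agents="${1:-512}"
  benchtime="${2:-30s}"
  # Replace the positional parameters with the command to run.
  set -- go run . -agents "$agents" -benchtime "$benchtime"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
    return 0
  fi
  # Assumption: the repository root is the current directory.
  # ELASTIC_APM_SERVER_URL is read from the caller's environment.
  (cd systemtest/cmd/apmbench && "$@")
}
```

For example, `ELASTIC_APM_SERVER_URL=http://localhost:8200 run_apmbench 600 1m` would target a server at a placeholder local address with 600 agents and `-benchtime 1m`.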
Often, we need to manually test the integration between different features, for example during PR testing or pre-release testing.
Our `docker-compose.yml` contains the basic components that make up the Elastic Stack for the APM Server.
APM Server publishes a set of metrics that are consumed either by Metricbeat or sent by the APM Server to an Elasticsearch cluster. Some of these metrics are used to power the Stack Monitoring UI. The stack monitoring setup is non-trivial and has been automated in `testing/stack-monitoring.sh`. The script will launch the necessary stack components and modify the necessary files. Once it has finished, you'll be able to test or ensure that Stack Monitoring is working as expected.
Note that the `testing/stack-monitoring.sh` script relies on `systemtest/cmd/runapm`, and will use a locally built version of APM Server (see more information below).
APM Server can be run in either standalone mode or managed by the Elastic Agent. To facilitate manual testing of APM Server in managed mode, it is possible to inject a locally built `apm-server` binary via `systemtest/cmd/runapm`.
It requires having the apm-server docker-compose project running and creates the required fleet policies, and exposes
the APM Server port using a random binding that is printed to the standard output after the container has started.
$ cd systemtest/cmd/runapm
$ go run main.go -h
Usage of /var/folders/35/r4w8sbqj2md1sg944kpnzyth0000gn/T/go-build3644709196/b001/exe/main:
-arch string
The architecture to use for the APM Server and Docker Image (default runtime.GOARCH)
-d If true, runapm will exit after the agent container has been started
-f Force agent policy creation, deleting existing policy if found
-keep
If true, agent policy and agent will not be destroyed on exit
-name string
Docker container name to use, defaults to random
-namespace string
Agent policy namespace (default "default")
-policy string
Agent policy name (default "runapm")
-reinstall
Reinstall APM integration package (default true)
-var value
Define a package var (k=v), with values being YAML-encoded; can be specified more than once
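Based on the flags listed in the help output above, a detached run could be wrapped as follows. This is a sketch under assumptions: `start_runapm` and `DRY_RUN` are hypothetical names, and combining `-d` with `-keep` is assumed to leave the agent container and policy in place after `runapm` exits.

```shell
# start_runapm NAME [POLICY]
# Hypothetical wrapper: starts runapm detached (-d), keeps the agent policy
# and agent on exit (-keep), and uses a fixed container NAME instead of a
# random one. POLICY defaults to "runapm", matching the documented default.
# Set DRY_RUN=1 to print the command instead of executing it.
start_runapm() {
  if [ -z "${1:-}" ]; then
    echo "usage: start_runapm NAME [POLICY]" >&2
    return 2
  fi
  name="$1"
  policy="${2:-runapm}"
  set -- go run main.go -d -keep -name "$name" -policy "$policy"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
    return 0
  fi
  # Assumption: the repository root is the current directory and the
  # apm-server docker-compose project is already running.
  (cd systemtest/cmd/runapm && "$@")
}
```

A fixed container name makes it easier to find the container afterwards, for example when looking up the randomly bound APM Server port.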
It's possible to run `runapm` (as pictured above) and re-use the image that `runapm` builds in `docker-compose` files, or run it in ECE / ESS. However, it's also possible to only build a docker image, without requiring the docker-compose project containers to be up and running, with `systemtest/cmd/buildapm`.
`buildapm` reads the `docker-compose.yml` at the root of the repository and uses that information to build an Elastic Agent docker image with an APM Server bundled that contains any local changes you might have made.
By default, the `amd64` architecture (or platform in Docker lingo) will be used. This may not be ideal if you run a machine with a different architecture than `amd64`, but you can specify the `-arch` flag.
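One way to choose the `-arch` value is to derive it from the host machine type. The helper below is a minimal sketch: `go_arch_for` is a hypothetical name, and it only covers the two architectures that appear in this document.

```shell
# go_arch_for MACHINE
# Maps a machine type, as printed by `uname -m`, to the value expected by
# the -arch flag. Unknown machine types are rejected with a non-zero status.
go_arch_for() {
  case "$1" in
    x86_64|amd64) echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)
      echo "unsupported machine type: $1" >&2
      return 1
      ;;
  esac
}
```

From `systemtest/cmd/buildapm`, this could be used as `go run main.go -arch "$(go_arch_for "$(uname -m)")"`.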
Additionally, if `-cloud` is set, the Elastic Agent cloud image will be used as the base image, so changes can be packaged and tested in ESS / ECE (see our internal documentation on these for how to use them).
$ cd systemtest/cmd/buildapm
$ go run main.go -arch arm64
2022/05/05 17:50:18 Building elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (arm64) from docker.elastic.co/beats/elastic-agent:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/05 17:50:18 Building apm-server...
2022/05/05 17:50:18 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/05 17:50:25 Built elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (arm64)
$ go run main.go -arch amd64
2022/05/05 17:50:35 Building elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64) from docker.elastic.co/beats/elastic-agent:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/05 17:50:35 Building apm-server...
2022/05/05 17:50:43 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/05 17:50:49 Built elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64)
$ go run main.go -cloud
2022/05/19 11:08:04 Building image elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64) from docker.elastic.co/cloud-release/elastic-agent-cloud:8.3.0-e4aa1f83-SNAPSHOT...
2022/05/19 11:08:04 Building apm-server...
2022/05/19 11:08:04 Built /Users/marclop/repos/elastic/apm-server/build/apm-server-linux
2022/05/19 11:09:07 Built image elastic-agent-systemtest:8.3.0-e4aa1f83-SNAPSHOT (amd64)