Danail Branekov edited this page Jun 10, 2021 · 4 revisions

Tests in our CI run against live garden bosh deployments with different ops files (flavours) enabled. We also have full CF deployments. Here is a list of these deployments:

Eden Environments

Garden deployments in the gating group in CI are all deployed on a director named eden. In order to target the Eden director and list the deployed environments do the following:

  1. navigate to garden-ci/directors/eden
  2. execute direnv allow to init the environment (you need to be logged on to LastPass)
  3. bosh deployments

Here is a short description of each one:

Name Description
baobab This is a garden deployment with default properties that we run the periodic performance tests suite (GPATS) against. It is redeployed daily. Tests are run once a day and results are posted in the garden-ci channel in slack.
clean-garden This is another deployment with default properties. We run the acceptance tests (GATS) against it on each commit in garden-runc-release
ci-boshlite This is a lite deployment of the bosh director. We run GATS against the garden server in that deployment to make sure it works.
ci-boshlite-latest-grr Same as above, but we are using the latest release candidate built by our pipeline
concourse This is concourse itself. It is deployed on Eden.
containerd-garden deployed with CONTAINERD_ENABLED=true. See Garden Modes
cputhrottle-garden deployed with experimental_cpu_throttling enabled. See CPU Entitlements (TODO: link)
jackalberry-garden a clean garden deployment used to run the garden-integration-tests/performance suite
nerdful-garden deployed with CONTAINERD_ENABLED=true and CONTAINERD_FOR_PROCESSES_ENABLED=true. See Garden Modes
performance-garden a clean garden deployment used to run the garden-performance-acceptance-tests
rootless-garden a garden deployment with experimental_rootless_mode enabled
treehouse-garden a windows garden deployment

If you want to create a new garden deployment on eden, the easiest way is to get the clean-garden manifest, edit it until it fits your needs and deploy it under a different name.

cd "$HOME/workspace/garden-ci/directors/eden"
direnv allow # put director connection details in the env
bosh -d clean-garden manifest > new-garden-env.yml
# edit manifest
bosh -d new-garden-env deploy new-garden-env.yml
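
One detail to watch while editing: bosh is expected to reject a deploy when the name inside the manifest differs from the one passed with -d, so the top-level name key has to change as well. A minimal sketch of that single edit, assuming the manifest has one top-level name key; the helper name rename_deployment is hypothetical and not part of garden-ci:

```shell
# Hypothetical helper: rewrite the top-level `name:` key of a manifest read
# from stdin. Indented or list-item `name:` keys (instance groups, jobs, ...)
# do not start at column 0 and are left untouched.
rename_deployment() {
  sed "s/^name:.*/name: $1/"
}
```

It could be used as bosh -d clean-garden manifest | rename_deployment new-garden-env > new-garden-env.yml, followed by whatever other manifest edits the new environment needs.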

CF Deployments

Given that Garden is the container engine of Cloud Foundry, at some point it is natural to want to spin up a full CF deployment for testing. We tend to be quite conservative about that, because CF is big and its acceptance test suite (the CATS) is quite flaky and outside the expertise and control of the Garden team. Still, we have a script to do just that: run lite-me-up.sh create <env-name>. When it runs to completion, the script produces a directory called <env-name>. Commit that directory and push it to GitHub, so that you can destroy the environment when you no longer need it by running lite-me-up.sh destroy <env-name>.

As the name suggests, the lite-me-up script deploys CF on a bosh lite director. Each deployment has its own director, and you can find some that we have been keeping around in the directors dir, right next to the Eden director. These are deployed and periodically recreated in the non-gating section of the main pipeline, and some tests are then run against them. Here is a short introduction to each one:

Name Description
mel-b This is meant to be a cf deployment with a "spicy" garden config. It has experimental features like cpu throttling, containerd for processes and direct IO in grootfs turned on, the idea being that those are not widely deployed in production, hence not widely tested.
sleepygary This one is used for benchmarking app creation time on a standard cf deployment.
croptopmorty This one has OCI mode turned on and is used for running the same app creation benchmarks in order to show that OCI mode is (hopefully) more efficient.

Wavefront

Wavefront is a dashboard frontend which we use to monitor the health and performance of our test deployments. Its graphs are useful to detect abnormal behaviour, the performance impact of a change, etc.

Emitting Metrics Data to Wavefront

Wavefront dashboards are highly customisable but passive: it is up to their users to make sure they are fed with data. Our test deployments are configured to emit three "flavours" of data: system health (load, disk usage, etc.), Garden server data (such as the number of goroutines) and performance data. As these flavours are quite different, each one is emitted in a different way.

System Health

System health data is collected from the output of various commands run on the host. As system health is an ongoing concern, we use a cron job on every deployment VM to periodically (every hour) collect the health data and send it to Wavefront.

The cron job is defined in the deployment manifest itself (feel free to have a look at e.g. containerd-garden). It is created by the os-conf release's pre-start-script bosh job, which writes the cron job file /etc/cron.hourly/wavefront-metrics. On every run, the job creates a metrics file and posts it to the /report endpoint of the Wavefront REST API (so-called direct ingestion), thus emitting multiple values in a single shot.
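
The shape of such a cron script can be sketched as follows. This is an illustration under assumptions, not the script from our manifests: the metric names, the function names and the WAVEFRONT_URL and WAVEFRONT_TOKEN variables are all hypothetical, and collect_health assumes a Linux host with /proc/loadavg. Each line follows the Wavefront data format (metric, value, timestamp, source).

```shell
# Format one point in the Wavefront data format:
#   <metric> <value> <timestamp> source=<source>
wavefront_point() {
  printf '%s %s %s source=%s\n' "$1" "$2" "$3" "$4"
}

# Print a couple of health values, one point per line.
# The metric names are illustrative, not the ones used in the real manifests.
collect_health() {
  now="$(date +%s)"
  host="$(hostname)"
  load="$(cut -d' ' -f1 /proc/loadavg)"
  disk="$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')"
  wavefront_point garden.health.load1 "$load" "$now" "$host"
  wavefront_point garden.health.disk_used_pct "$disk" "$now" "$host"
}

# Post a metrics file in a single request (direct ingestion).
# WAVEFRONT_URL and WAVEFRONT_TOKEN are placeholders that must be provided.
report_health() {
  curl -sS -X POST \
    -H "Authorization: Bearer ${WAVEFRONT_TOKEN}" \
    --data-binary @"$1" \
    "${WAVEFRONT_URL}/report"
}
```

A run would then amount to collect_health > /tmp/metrics followed by report_health /tmp/metrics. Collecting into a file first is what allows all values to go out in one request.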

Garden Server Data

The Garden server has a special debug endpoint /debug/vars which provides Garden related data such as number of goroutines in the server process. VMs emit that data to Wavefront via the metrics-adapter bosh job, defined in the vantablackbox release.

The metrics-adapter occasionally collects the Garden server data and emits it to the wavefront-proxy that runs within the wavefront-proxy bosh job.
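
To get a feel for the data flow, the same collect-and-forward loop can be approximated in shell. This is only a sketch and not how the metrics-adapter itself is implemented: the key name numGoRoutines, the metric name, the function names and the addresses passed in are assumptions. The one outside fact relied upon is that a Wavefront proxy accepts points in the Wavefront data format on a TCP port (2878 by default).

```shell
# Pull one numeric value out of expvar-style JSON read from stdin.
# Good enough for flat "key": number pairs; not a general JSON parser.
expvar_number() {
  tr -d '\n' | sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*\([0-9.][0-9.]*\).*/\1/p"
}

# Hypothetical forwarder: read the goroutine count from the Garden debug
# endpoint and send it to a Wavefront proxy as a single point.
#   $1: host:port of the Garden debug server (assumption, check your manifest)
#   $2: proxy host, $3: proxy port (2878 is the proxy default)
emit_goroutines() {
  value="$(curl -sS "http://$1/debug/vars" | expvar_number numGoRoutines)"
  printf 'garden.numGoRoutines %s %s source=%s\n' \
    "$value" "$(date +%s)" "$(hostname)" | nc -w 1 "$2" "$3"
}
```

Running curl against /debug/vars by hand and piping the output through expvar_number is also a quick way to check what the server currently reports.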

Performance data

Emitting performance data is built into the performance tests. They send the performance data to Wavefront via the wavefront proxy (see above).

Container creation time is also sent by the garden performance acceptance tests, via a special Ginkgo reporter.

The Dashboards

Deployment System Health

The Deployment System Health dashboard provides an overview of all test deployments. The dashboard can display the data for either a single deployment or all the deployments.

Garden CI Monitor

The Garden CI Monitor dashboard is configured to display aggregated metrics that are important to the Garden project as a whole. For example, it displays the container creation time metric from the performance tests. In the past we have seen this metric increase significantly after a change was pushed, which made us aware that the change had a significant performance impact.

The dashboard also displays important CI vitals such as concourse DB usage. In order to ensure smooth concourse DB migration during concourse version bumps it is advisable that the DB disk usage is below 50%.
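
The same 50% rule can be checked by hand before a version bump. A minimal sketch, assuming the DB lives on the VM's persistent disk; the function names are hypothetical and the path in the usage note is an assumption:

```shell
# Succeeds when the used percentage ($1) is strictly below the threshold ($2).
below_threshold() {
  [ "$1" -lt "$2" ]
}

# Used percentage (without the % sign) of the filesystem holding path $1.
disk_used_pct() {
  df -P "$1" | awk 'NR==2 { sub("%", "", $5); print $5 }'
}
```

For example, on the DB VM: below_threshold "$(disk_used_pct /var/vcap/store)" 50 || echo "free up DB disk space before bumping concourse".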
