buildkite: opentelemetry+elastic agent overhead benchmark on a weekly basis #3371

Merged
merged 34 commits into from
Oct 27, 2023
Changes from 31 commits
d029974
buildkite: run opentelemetry benchmark
v1v Oct 18, 2023
fc095ed
chore: for testing purposes
v1v Oct 18, 2023
218b863
no retry
v1v Oct 18, 2023
e2828df
Update .buildkite/scripts/opentelemetry-benchmark.sh
v1v Oct 18, 2023
38b42a4
improve: use gh and clone main with shallow cloning
v1v Oct 18, 2023
acc74fe
fixes: To have GitHub CLI store credentials instead, first clear the …
v1v Oct 18, 2023
b8f8b39
avoid load pipeline using pre-command
v1v Oct 18, 2023
1efccaf
avoid login
v1v Oct 18, 2023
eff8b4e
run gh within the apm-agent-java context
v1v Oct 19, 2023
e76c73b
buildkite: notify in the channel
v1v Oct 19, 2023
703824f
common steps
v1v Oct 19, 2023
fe3d9a5
chore: more log groups to help with printing output
v1v Oct 19, 2023
e811189
buildkite: use annotations to visualise report
v1v Oct 19, 2023
e3d2757
buildkite: archive report
v1v Oct 19, 2023
9d5ccc7
use markdown format and add entry for reporting output
v1v Oct 19, 2023
60733ec
buildkite: prepare ES credentials
v1v Oct 19, 2023
88fda88
add apm-server mock and tear down
v1v Oct 26, 2023
5c35ae1
chore: change logs
v1v Oct 26, 2023
04041aa
Merge remote-tracking branch 'upstream/main' into feature/support-ben…
v1v Oct 26, 2023
7ea0636
support generate ES docs
v1v Oct 26, 2023
3152142
fix: env variable and use the relative path to the jar file
v1v Oct 26, 2023
e846d15
reduce log levels for the maven thing
v1v Oct 26, 2023
d0c4fd6
chore: log
v1v Oct 26, 2023
00d6303
chore
v1v Oct 26, 2023
5c0e651
feat: change index name
v1v Oct 26, 2023
313354c
enable send benchmark
v1v Oct 26, 2023
8496297
consume github action artifact
v1v Oct 26, 2023
f8a3cc8
Apply suggestions from code review
v1v Oct 26, 2023
8c39208
feat: change index name
v1v Oct 26, 2023
4ae91f2
Merge branch 'feature/support-benchmark-otel-in-buildkite' of https:/…
v1v Oct 26, 2023
b637b8a
Merge branch 'feature/support-benchmark-otel-in-buildkite' of https:/…
v1v Oct 26, 2023
ac57ca0
fix location of the file
v1v Oct 26, 2023
7d02e6d
revert
v1v Oct 26, 2023
ac40f08
chore: archive test results to help with debugging if any test failures
v1v Oct 26, 2023
9 changes: 9 additions & 0 deletions .buildkite/README.md
@@ -19,3 +19,12 @@ This is the Buildkite pipeline for the APM Agent java in charge of the snaposhot

To view the pipeline and its configuration, click [here](https://buildkite.com/elastic/apm-agent-java-snapshot) or
go to the definition in the `elastic/ci` repository.

## opentelemetry-benchmark pipeline

This is the Buildkite pipeline for the OpenTelemetry benchmark.

### Pipeline Configuration

To view the pipeline and its configuration, click [here](https://buildkite.com/elastic/apm-agent-java-opentelemetry-benchmark) or
go to the definition in `opentelemetry-benchmark.yml`.
85 changes: 21 additions & 64 deletions .buildkite/hooks/pre-command
@@ -8,67 +8,24 @@

set -eo pipefail

-echo "--- Prepare vault context :vault:"
-VAULT_ROLE_ID_SECRET=$(vault read -field=role-id secret/ci/elastic-apm-agent-java/internal-ci-approle)
-export VAULT_ROLE_ID_SECRET
-
-VAULT_SECRET_ID_SECRET=$(vault read -field=secret-id secret/ci/elastic-apm-agent-java/internal-ci-approle)
-export VAULT_SECRET_ID_SECRET
-
-VAULT_ADDR=$(vault read -field=vault-url secret/ci/elastic-apm-agent-java/internal-ci-approle)
-export VAULT_ADDR
-
-# Delete the vault specific accessing the ci vault
-PREVIOUS_VAULT_TOKEN=$VAULT_TOKEN
-export PREVIOUS_VAULT_TOKEN
-unset VAULT_TOKEN
-
-echo "--- Prepare a secure temp :closed_lock_with_key:"
-# Prepare a secure temp folder not shared between other jobs to store the key ring
-export TMP_WORKSPACE=/tmp/secured
-export KEY_FILE=$TMP_WORKSPACE"/private.key"
-
-# Secure home for our keyring
-export GNUPGHOME=$TMP_WORKSPACE"/keyring"
-mkdir -p $GNUPGHOME
-chmod -R 700 $TMP_WORKSPACE
-
-echo "--- Prepare keys context :key:"
-VAULT_TOKEN=$(vault write -field=token auth/approle/login role_id="$VAULT_ROLE_ID_SECRET" secret_id="$VAULT_SECRET_ID_SECRET")
-export VAULT_TOKEN
-
-# Nexus credentials
-SERVER_USERNAME=$(vault read -field username secret/release/nexus)
-export SERVER_USERNAME
-SERVER_PASSWORD=$(vault read -field password secret/release/nexus)
-export SERVER_PASSWORD
-
-# Signing keys
-vault read -field=key secret/release/signing >$KEY_FILE
-KEYPASS_SECRET=$(vault read -field=passphrase secret/release/signing)
-export KEYPASS_SECRET
-export KEY_ID_SECRET=D88E42B4
-
-# Import the key into the keyring
-echo "$KEYPASS_SECRET" | gpg --batch --import "$KEY_FILE"
-
-echo "--- Configure git context :git:"
-# Configure the committer since the maven release requires to push changes to GitHub
-# This will help with the SLSA requirements.
-git config --global user.email "[email protected]"
-git config --global user.name "apmmachine"
-
-echo "--- Install JDK17 :java:"
-# JDK version is defined in two different locations, here and .github/workflows/maven-goal/action.yml
-JAVA_URL=https://jvm-catalog.elastic.co/jdk
-JAVA_HOME=$(pwd)/.openjdk17
-JAVA_PKG="$JAVA_URL/latest_openjdk_17_linux.tar.gz"
-curl -L --output /tmp/jdk.tar.gz "$JAVA_PKG"
-mkdir -p "$JAVA_HOME"
-tar --extract --file /tmp/jdk.tar.gz --directory "$JAVA_HOME" --strip-components 1
-
-export JAVA_HOME
-PATH=$JAVA_HOME/bin:$PATH
-export PATH
-
-java -version || true
+# Upload should not do much with the pre-command.
+if [[ "$BUILDKITE_COMMAND" =~ .*"upload".* ]]; then
+  echo "Skipped pre-command when running the Upload pipeline"
+  exit 0
+fi
+
+## TODO: change name for the opentelemetry-benchmark
+if [ "$BUILDKITE_PIPELINE_SLUG" == "apm-agent-java-load-testing" ]; then
+  source .buildkite/hooks/prepare-benchmark.sh
+fi
+
+if [ "$BUILDKITE_PIPELINE_SLUG" == "apm-agent-java-snapshot" ]; then
+  source .buildkite/hooks/prepare-release.sh
+fi
+
+if [ "$BUILDKITE_PIPELINE_SLUG" == "apm-agent-java-release" ]; then
+  source .buildkite/hooks/prepare-release.sh
+fi
+
+# Run always
+source .buildkite/hooks/prepare-common.sh
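
The new `pre-command` selects its preparation scripts from `BUILDKITE_PIPELINE_SLUG`. As a minimal sketch of that selection logic (not part of this PR), the three `if` blocks could be expressed as one `case` statement. The helper name `hooks_for_slug` is hypothetical; the slugs and script names are the ones used in the hook above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): prints, one per line, the hook scripts
# that the pre-command would source for a given pipeline slug.
hooks_for_slug() {
  local slug="$1"
  case "$slug" in
    apm-agent-java-load-testing)
      echo "prepare-benchmark.sh"
      ;;
    apm-agent-java-snapshot | apm-agent-java-release)
      echo "prepare-release.sh"
      ;;
  esac
  # prepare-common.sh runs for every pipeline, after the slug-specific script.
  echo "prepare-common.sh"
}

hooks_for_slug "apm-agent-java-snapshot"
```

Grouping the snapshot and release slugs in one `case` arm keeps the shared `prepare-release.sh` mapping in a single place; an unknown slug falls through to the common script only.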
25 changes: 25 additions & 0 deletions .buildkite/hooks/prepare-benchmark.sh
@@ -0,0 +1,25 @@
#!/usr/bin/env bash
set -euo pipefail

echo "--- Prepare elasticsearch secrets :vault:"
ES_URL_SECRET=$(vault read -field=es_url secret/ci/elastic-apm-agent-java/opentelemetry-benchmark)
ES_USER_SECRET=$(vault read -field=es_user secret/ci/elastic-apm-agent-java/opentelemetry-benchmark)
ES_PASS_SECRET=$(vault read -field=es_pass secret/ci/elastic-apm-agent-java/opentelemetry-benchmark)
export ES_URL_SECRET ES_USER_SECRET ES_PASS_SECRET

echo "--- Prepare github secrets :vault:"
GITHUB_SECRET=$(vault kv get -field token "kv/ci-shared/observability-ci/github-apmmachine")
GH_TOKEN=$GITHUB_SECRET
export GITHUB_SECRET GH_TOKEN
GITHUB_USERNAME=apmmachine
export GITHUB_USERNAME

echo "--- Install gh :github:"
GH_URL=https://github.com/cli/cli/releases/download/v2.37.0/gh_2.37.0_linux_amd64.tar.gz
GH_HOME=$(pwd)/.gh
curl -L --output /tmp/gh.tar.gz "$GH_URL"
mkdir -p "$GH_HOME"
tar --extract --file /tmp/gh.tar.gz --directory "$GH_HOME" --strip-components 1

PATH=$GH_HOME/bin:$PATH
export PATH
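
`prepare-benchmark.sh` repeats the `gh` version twice inside `GH_URL`. A small sketch, assuming one wanted to pin the version in a single variable, could build the URL from its parts. The function name `gh_release_url` and the `arch` parameter are hypothetical; the URL pattern is copied from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): builds the GitHub CLI release URL from a
# version and an architecture, following the pattern used for GH_URL above.
gh_release_url() {
  local version="$1"
  local arch="${2:-amd64}" # assumption: defaults to the arch used in the script
  echo "https://github.com/cli/cli/releases/download/v${version}/gh_${version}_linux_${arch}.tar.gz"
}

# Prints the same URL that the script hard-codes for v2.37.0.
gh_release_url "2.37.0"
```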
17 changes: 17 additions & 0 deletions .buildkite/hooks/prepare-common.sh
@@ -0,0 +1,17 @@
#!/usr/bin/env bash
set -euo pipefail

echo "--- Install JDK17 :java:"
# JDK version is defined in two different locations, here and .github/workflows/maven-goal/action.yml
JAVA_URL=https://jvm-catalog.elastic.co/jdk
JAVA_HOME=$(pwd)/.openjdk17
JAVA_PKG="$JAVA_URL/latest_openjdk_17_linux.tar.gz"
curl -L --output /tmp/jdk.tar.gz "$JAVA_PKG"
mkdir -p "$JAVA_HOME"
tar --extract --file /tmp/jdk.tar.gz --directory "$JAVA_HOME" --strip-components 1

export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH

java -version || true
52 changes: 52 additions & 0 deletions .buildkite/hooks/prepare-release.sh
@@ -0,0 +1,52 @@
#!/usr/bin/env bash
set -euo pipefail

echo "--- Prepare vault context :vault:"
VAULT_ROLE_ID_SECRET=$(vault read -field=role-id secret/ci/elastic-apm-agent-java/internal-ci-approle)
export VAULT_ROLE_ID_SECRET

VAULT_SECRET_ID_SECRET=$(vault read -field=secret-id secret/ci/elastic-apm-agent-java/internal-ci-approle)
export VAULT_SECRET_ID_SECRET

VAULT_ADDR=$(vault read -field=vault-url secret/ci/elastic-apm-agent-java/internal-ci-approle)
export VAULT_ADDR

# Delete the vault specific accessing the ci vault
PREVIOUS_VAULT_TOKEN=$VAULT_TOKEN
export PREVIOUS_VAULT_TOKEN
unset VAULT_TOKEN

echo "--- Prepare a secure temp :closed_lock_with_key:"
# Prepare a secure temp folder not shared between other jobs to store the key ring
export TMP_WORKSPACE=/tmp/secured
export KEY_FILE=$TMP_WORKSPACE"/private.key"

# Secure home for our keyring
export GNUPGHOME=$TMP_WORKSPACE"/keyring"
mkdir -p $GNUPGHOME
chmod -R 700 $TMP_WORKSPACE

echo "--- Prepare keys context :key:"
VAULT_TOKEN=$(vault write -field=token auth/approle/login role_id="$VAULT_ROLE_ID_SECRET" secret_id="$VAULT_SECRET_ID_SECRET")
export VAULT_TOKEN

# Nexus credentials
SERVER_USERNAME=$(vault read -field username secret/release/nexus)
export SERVER_USERNAME
SERVER_PASSWORD=$(vault read -field password secret/release/nexus)
export SERVER_PASSWORD

# Signing keys
vault read -field=key secret/release/signing >$KEY_FILE
KEYPASS_SECRET=$(vault read -field=passphrase secret/release/signing)
export KEYPASS_SECRET
export KEY_ID_SECRET=D88E42B4

# Import the key into the keyring
echo "$KEYPASS_SECRET" | gpg --batch --import "$KEY_FILE"

echo "--- Configure git context :git:"
# Configure the committer since the maven release requires to push changes to GitHub
# This will help with the SLSA requirements.
git config --global user.email "[email protected]"
git config --global user.name "apmmachine"
10 changes: 7 additions & 3 deletions .buildkite/load-testing.yml
@@ -1,4 +1,8 @@
-# @reakaleek: This is a place holder to create the pipeline in Buildkite. I will work on it, in a follow-up.
 steps:
-  - label: ":wave: Greetings" # Label (with rich emojis https://ela.st/bk-emoji).
-    command: "echo 'My first pipeline!'" # Command to run (evaluated by Bash).
+  - label: "Run the microbenchmark"
+    commands: .buildkite/scripts/opentelemetry-benchmark.sh
+    agents:
+      queue: observability-microbenchmarks
+    artifact_paths:
+      - "**/build/reports/tests/test/classes/io.opentelemetry.OverheadTests.html"
+      - "output.json"
12 changes: 12 additions & 0 deletions .buildkite/opentelemetry-benchmark.yml
@@ -0,0 +1,12 @@
steps:
  - label: "Run the opentelemetry-benchmark"
    commands: .buildkite/scripts/opentelemetry-benchmark.sh
    agents:
      queue: observability-microbenchmarks
    artifact_paths:
      - "**/build/reports/tests/test/classes/io.opentelemetry.OverheadTests.html"
      - "output.json"

notify:
  - slack: "#apm-agent-java"
    if: 'build.state != "passed"'
84 changes: 84 additions & 0 deletions .buildkite/scripts/opentelemetry-benchmark.sh
@@ -0,0 +1,84 @@
#!/usr/bin/env bash
set -eo pipefail

CONTAINER_NAME=mock-apm-server
JSON_FILE="$(pwd)/output.json"

function cleanup {
  echo "--- Tear down the environment"
  MOCK_APM_SERVER=$(docker ps | grep $CONTAINER_NAME | awk '{print $1}')
  docker stop $MOCK_APM_SERVER
  docker rm $MOCK_APM_SERVER
}

trap cleanup EXIT

echo "--- Download the latest elastic-apm-agent artifact"
# run earlier so gh can use the current github repository.
run_id=$(gh run list --branch main --status success --workflow main.yml -L 1 --json databaseId --jq '.[].databaseId')
echo "downloading the latest artifact 'elastic-apm-agent' (using the workflow run '$run_id')"
gh run download "$run_id" -n elastic-apm-agent
ELASTIC_SNAPSHOT_JAR=$(ls -1 elastic-apm-agent-*.jar)
ELASTIC_SNAPSHOT_JAR_FILE="$(pwd)/$ELASTIC_SNAPSHOT_JAR"
echo "$ELASTIC_SNAPSHOT_JAR_FILE has been downloaded."
gh run download "$run_id" -n apm-agent-benchmarks
BENCHMARKS_JAR=$(ls -1 benchmarks*.jar)
BENCHMARKS_JAR_FILE="$(pwd)/$BENCHMARKS_JAR"
echo "$BENCHMARKS_JAR_FILE has been downloaded."

echo "--- Start APM Server mock"
git clone https://github.com/elastic/apm-mutating-webhook.git
pushd apm-mutating-webhook/test/mock
docker build -t $CONTAINER_NAME .
docker run -dp 127.0.0.1:8027:8027 $CONTAINER_NAME
popd

echo "--- Build opentelemetry-java-instrumentation"
git clone https://github.com/open-telemetry/opentelemetry-java-instrumentation.git --depth 1 --branch main
pushd opentelemetry-java-instrumentation/
./gradlew assemble

echo "--- Customise the elastic opentelemetry java instrumentation"
pushd benchmark-overhead
cp "$ELASTIC_SNAPSHOT_JAR_FILE" .
ELASTIC_SNAPSHOT_ENTRY="new Agent(\\\"elastic-snapshot\\\",\\\"latest available snapshot version from elastic main\\\",\\\"file://$PWD/$ELASTIC_SNAPSHOT_JAR\\\", java.util.List.of(\\\"-Delastic.apm.server_url=http://host.docker.internal:8027/\\\"))"
ELASTIC_LATEST_VERSION=$(curl -s https://repo1.maven.org/maven2/co/elastic/apm/elastic-apm-agent/ | perl -ne 's/<.*?>//g; if(s/^([\d\.]+).*$/$1/){print}' | sort -V | tail -1)
ELASTIC_LATEST_ENTRY="new Agent(\\\"elastic-latest\\\",\\\"latest available released version from elastic main\\\",\\\"https://repo1.maven.org/maven2/co/elastic/apm/elastic-apm-agent/$ELASTIC_LATEST_VERSION/elastic-apm-agent-$ELASTIC_LATEST_VERSION.jar\\\", java.util.List.of(\\\"-Delastic.apm.server_url=http://host.docker.internal:8027/\\\"))"
ELASTIC_LATEST_ENTRY2="new Agent(\\\"elastic-async\\\",\\\"latest available released version from elastic main\\\",\\\"https://repo1.maven.org/maven2/co/elastic/apm/elastic-apm-agent/$ELASTIC_LATEST_VERSION/elastic-apm-agent-$ELASTIC_LATEST_VERSION.jar\\\", java.util.List.of(\\\"-Delastic.apm.delay_agent_premain_ms=15000\\\",\\\"-Delastic.apm.server_url=http://host.docker.internal:8027/\\\"))"
NEW_LINE=" .withAgents(Agent.NONE, Agent.LATEST_RELEASE, Agent.LATEST_SNAPSHOT, $ELASTIC_LATEST_ENTRY, $ELASTIC_LATEST_ENTRY2, $ELASTIC_SNAPSHOT_ENTRY)"
echo $NEW_LINE
perl -i -ne "if (/withAgents/) {print \"$NEW_LINE\n\"}else{print}" src/test/java/io/opentelemetry/config/Configs.java

echo "--- Run tests of benchmark-overhead"
./gradlew test

echo "--- Report in Buildkite"

REPORT_FILE=$(pwd)/build/reports/tests/test/classes/io.opentelemetry.OverheadTests.html
perl -ne '/Standard output/ && $on++; /\<\/pre\>/ && ($on=0);$on && s/\<.*\>//;$on && !/^\s*$/ && print' $REPORT_FILE | tee report.txt

# Buildkite annotation
if [ -n "$BUILDKITE" ]; then
  REPORT=$(cat report.txt)
  cat << EOF | buildkite-agent annotate --style "info" --context report
### OverheadTests Report

\`\`\`
${REPORT}
\`\`\`

EOF
fi

echo "--- Generate ES docs"
JSON_FILE="$(pwd)/output.json"
java -cp $BENCHMARKS_JAR_FILE \
  co.elastic.apm.agent.benchmark.ProcessOtelBenchmarkResults \
  "$REPORT_FILE" "$JSON_FILE" "$ELASTIC_LATEST_VERSION" ./opentelemetry-javaagent.jar

echo "--- Send Report"
curl -X POST \
  --user "${ES_USER_SECRET}:${ES_PASS_SECRET}" \
  "${ES_URL_SECRET}/_bulk?pretty" \
  -H "Content-Type: application/x-ndjson" \
  --data-binary @"$JSON_FILE"
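
The `cleanup` trap in the script above calls `docker stop` and `docker rm` with whatever `docker ps | grep` returns, so it prints an error when the script fails before the mock server was started. A more defensive variant could look like the sketch below. It is an illustration, not the PR's code: it assumes the container can be found by its image name through the `ancestor` filter of `docker ps`, and it uses `docker rm --force` instead of a separate stop.

```shell
#!/usr/bin/env bash
CONTAINER_NAME=mock-apm-server

# Sketch (not in the PR): remove every container created from the mock image,
# and do nothing when no such container exists.
cleanup() {
  echo "--- Tear down the environment"
  local ids
  # --all also lists stopped containers; --quiet prints only the container IDs.
  ids=$(docker ps --all --quiet --filter "ancestor=${CONTAINER_NAME}" || true)
  if [ -z "$ids" ]; then
    echo "No ${CONTAINER_NAME} container found, nothing to tear down"
    return 0
  fi
  # Word splitting is intentional here: one argument per container ID.
  # shellcheck disable=SC2086
  docker rm --force $ids
}
```

It would be registered the same way as in the script, with `trap cleanup EXIT`.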
@@ -122,7 +122,7 @@ private void writeBulkFile(String resultFilePath) throws IOException {
        final File file = new File(resultFilePath);
        final FileWriter fileWriter = new FileWriter(file);
        for (JsonNode benchmark : bechmarkResultJson) {
-            fileWriter.append("{ \"index\" : { \"_index\" : \"microbenchmarks\" } }\n");
+            fileWriter.append("{ \"index\" : { \"_index\" : \"otel-microbenchmarks\" } }\n");
Member Author

This gives this kind of overhead benchmark its own dedicated index. If the index name changes, the Elasticsearch role must be updated as well so that it can write to the new index.

            fileWriter.append(objectMapper.writer().writeValueAsString(benchmark));
            fileWriter.append("\n");
        }
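
`writeBulkFile` produces the newline-delimited format that the Elasticsearch `_bulk` endpoint expects: an action line naming the target index, followed by the document itself, each terminated by a newline. The shell sketch below reproduces that layout for illustration only. The helper name `to_bulk` and the sample documents are hypothetical; the index name is the one introduced by this change.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): reads one JSON document per line from
# stdin and emits a bulk body that indexes each document into the given index.
to_bulk() {
  local index="$1" doc
  while IFS= read -r doc; do
    # Skip empty lines so the action/document pairing stays intact.
    [ -z "$doc" ] && continue
    printf '{ "index" : { "_index" : "%s" } }\n' "$index"
    printf '%s\n' "$doc"
  done
}

# Made-up sample documents, only to show the resulting layout.
printf '%s\n' '{"benchmark":"a","score":1}' '{"benchmark":"b","score":2}' \
  | to_bulk otel-microbenchmarks
```

A file in this layout is what the `curl ... --data-binary @"$JSON_FILE"` step in the benchmark script sends with the `application/x-ndjson` content type.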
37 changes: 37 additions & 0 deletions catalog-info.yaml
@@ -91,3 +91,40 @@ spec:
          access_level: MANAGE_BUILD_AND_READ
        everyone:
          access_level: READ_ONLY

---
# yaml-language-server: $schema=https://gist.githubusercontent.com/elasticmachine/988b80dae436cafea07d9a4a460a011d/raw/rre.schema.json
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: buildkite-pipeline-apm-agent-java-opentelemetry-benchmark
  description: Buildkite Opentelemetry Benchmark for apm-agent-java
  links:
    - title: Pipeline
      url: https://buildkite.com/elastic/apm-agent-java-opentelemetry-benchmark
spec:
  type: buildkite-pipeline
  owner: group:apm-agent-java
  system: buildkite
  implementation:
    apiVersion: buildkite.elastic.dev/v1
    kind: Pipeline
    metadata:
      name: apm-agent-java-opentelemetry-benchmark
    spec:
      repository: elastic/apm-agent-java
      pipeline_file: ".buildkite/opentelemetry-benchmark.yml"
      default_branch: main
      provider_settings:
        publish_commit_status: false
      teams:
        apm-agent-java:
          access_level: MANAGE_BUILD_AND_READ
        observablt-robots:
          access_level: MANAGE_BUILD_AND_READ
        everyone:
          access_level: READ_ONLY
      schedules:
        Weekly Benchmark on main Branch:
          cronline: "@weekly"
          message: "Run the quick benchmark weekly."