Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc: add service-level benchmarks #1006

Merged
merged 5 commits into from
Aug 9, 2023
Merged

misc: add service-level benchmarks #1006

merged 5 commits into from
Aug 9, 2023

Conversation

ianbotsf
Copy link
Contributor

Issue #

Closes #968

Description of changes

Add a new module :tests:benchmarks:service-benchmarks to run service-level benchmarks. See that module's README.md for more details.

There's a lot in the PR but here are some highlights:

  • ServiceBenchmark.kt defines the benchmark structure and BenchmarkHarness.kt's main method is the application entry point. It may be easiest to start your review from those two places.
  • I've added a way to bootstrap service dependencies for a given module (i.e., ./gradlew :tests:benchmarks:service-benchmarks:bootstrapAll). Also add a way to gracefully skip building a project when its service dependencies haven't been built. I think this could be a template for how we build higher-level libraries (e.g., S3 Transfer Manager) in the same repo as the SDK itself. If we like it, I think the build logic could be abstracted to be more reusable in future modules.
  • BenchmarkTelemetryProvider defines the metrics to capture, currently just SDK overhead. We should think about whether there are other metrics we'd want to capture as a baseline or in eventual dashboards.
  • MetricAggregator defines the statistics to compute on captured metrics, currently count, minimum, average, median/p50, p90, p99, and maximum. At the number of iterations we run, I think p90 is the most interesting.
  • Our overhead seems consistently highest on Cloudwatch and consistently lowest on DynamoDB. I haven't dug too deeply into why yet.

Companion PR: smithy-lang/smithy-kotlin#908

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Member

@lauzadis lauzadis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the logic around bootstrapAll and agree it'd be great to use for things like S3 Transfer Manager!

| | Overhead (ms) | n | min | avg | med | p90 | p99 | max |
| :--- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| **S3** | | | | | | | | |
| —HeadObject | | 1618 | 0.340 | 0.605 | 0.417 | 0.638 | 4.864 | 14.672 |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: em dash not required, it's easy to read the table without them

}

val requiredServices = setOf(
// Top 7 services called by Kotlin SDK customers as of 7/25/2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be helpful to note this is in descending order

"pinpoint",

// Services required as prerequisites for setup
"iam", // Create roles for SNS::AssumeRole
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: STS::AssumeRole

Copy link
Collaborator

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple minor comments/suggestions/cleanup but nothing major. Overall this looks great!

private const val NAME_FIELD = "name"
private const val COUNT_FIELD = "n"

private typealias Results = Map<String, Map<String, Map<String, MetricSummary>>>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Might help to explain the format of this what is this a key-value mapping of?

override fun createUpDownCounter(name: String, units: String?, description: String?) =
NoOpUpDownCounter

override fun createAsyncUpDownCounter(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: In future I could see filling in the rest of these to be able to benchmark connection usage, request concurrency, etc. I don't think we need to do anything right now though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I actually tested out a few more metrics while working on this but decided just "overhead" was the clearest and simplest place to start. There's room to grow for sure.

)

@ExperimentalApi
class BenchmarkTelemetryProvider(private val metricAggregator: MetricAggregator) : TelemetryProvider {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: I'm thinking we need a TelemetryProviderBuilder that allows you to easily override just the providers you want 🤔 . Not needed in this PR though of course.

* `./gradlew build`
This builds the whole SDK.
* `./gradlew :tests:benchmarks:service-benchmarks:run`
This runs the benchmark suite and prints the results to the console formatted as a Markdown table.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: Markdown is nice but I'm wondering if we want to also support JSON at some point. It will be easier to do delta comparisons. IIRC java v2 stores their baseline in a json file in the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we absolutely want to support other formats at some point. I can imagine an additional parameter to the benchmark target that sets the output type: markdown, JSON, maybe even directly to CloudWatch via environment credentials.

|----------------|------------------|-----------|
| EC2 m5.4xlarge | Amazon Linux 2 | 7/28/2023 |

### Results
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: Eventually probably want to benchmark by HTTP client as well.

import aws.smithy.kotlin.runtime.client.SdkClient
import kotlin.time.Duration

interface ServiceBenchmark<C : SdkClient> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix: Docs on all of these types/methods would be helpful

override suspend fun setup(client: S3Client) {
client.putObject {
bucket = bucketName
key = KEY
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Any concern about key/prefix throttling? I know in our canary we use a wide range of keys to avoid throttling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No concern. Because we're running single-threaded benchmarks, our transaction rate is solely dependent on the end-to-end operation time. Given that each roundtrip takes at least 1ms, we're nowhere close to the throttling limit of 3500 TPS per prefix.

iamRoleArn = resp.role!!.arn!!
}

// It takes a while for newly-created roles to fully propagate to STS. In the meantime, trying to assume the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: Seems like a waitUntilRoleExists waiter should exit 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

waitUntilRoleExists does exist but it doesn't do what we want. The role is created and usable in IAM almost immediately but that's only the first part. The real long wait is the propagation to STS, which appears to be slow and may be using multi-node replication since it can be transiently available for a short time. 🤦‍♂️

tasks.register("bootstrapAll") {
val bootstrapArg = requiredServices.joinToString(",") { "+$it" }
val bootstrapProj = project(":codegen:sdk")
bootstrapProj.ext.set("aws.services", bootstrapArg)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: This is ok for now but this inter-project task dependency is probably not great. This is fine for now though as I have no better suggestion at the moment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the shortcomings you see? What kinds of problems may we encounter if we continue using this pattern here and elsewhere?

@sonarcloud
Copy link

sonarcloud bot commented Aug 8, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 9 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@ianbotsf ianbotsf merged commit 3a20e42 into main Aug 9, 2023
11 of 13 checks passed
@ianbotsf ianbotsf deleted the service-benchmarks branch August 9, 2023 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Benchmark Baseline
3 participants