Introduce profile-guided optimization to APM Server #13859

Closed
1pkg opened this issue Aug 9, 2024 · 1 comment

1pkg commented Aug 9, 2024

Context

The Go compiler has supported profile-guided optimization (PGO) since Go 1.20, which makes it possible to build further-optimized Go binaries from CPU profiles. Depending on the workload, the optimizations applied, and the profile quality, one can expect performance gains of roughly 2-14% according to go.dev, which is quite meaningful. Furthermore, for CPU-bound services in general, the results should land toward the upper end of that range.

Requested Changes

We should capture this performance improvement in APM Server.

We should consider how we will collect the profiles. Initially, we could leverage the existing benchmarks workflow to collect CPU profiles, which would then be included in future PGO builds.

We should consider how we will store the profiles and include them in the builds. The simplest and most convenient approach is to commit them directly to the source repository for easy distribution and repeatable builds.

Additional Links

go.dev/doc/pgo
go.dev/blog/pgo

@1pkg 1pkg added this to the 8.16 milestone Aug 9, 2024
@1pkg 1pkg self-assigned this Aug 9, 2024

1pkg commented Sep 11, 2024

The existing benchmarks in APM Server are tightly coupled to the Elasticsearch stack, so the benchmark results over-index on ES performance. This can lead to a wide spread between individual benchmark runs, up to 20%, which makes the results hard to measure and reason about. This is a particular problem for PGO, since PGO only adds an incremental performance gain. In my synthetic local test, benchmarking an isolated instance of APM Server, the result was an average 5% increase in throughput. A gain of that size is really hard to observe when the existing benchmarks bury it under ES performance variance.

The solution to this problem would be to add a second, separate benchmark workflow that targets an HTTP API stub instead of a real ES instance. This way we should be able to sufficiently isolate APM Server performance from the underlying ES performance.
