Update README

codetalkio · Jan 4, 2024 · 629e0ac
1 parent 673ba78
commit 629e0ac
Showing 1 changed file with 59 additions and 72 deletions.
diff --git a/README.md b/README.md
@@ -4,7 +4,7 @@
 
 In this repo we do a comparison of Cold/Warm Start times of Federated GraphQL solutions (Cosmo, Mesh, Apollo Gateway, Apollo Router) and provide a minimal/scrappy build of Apollo Router built for Lambda via Amazon Linux 2, as well as a version of Cosmo built in Go that you can utilize.
 
-- ⚡️ **TL;DR** I'd recommend the `lambda-cosmo-custom` alternative which is a lot less hacky and much more performant (300-500ms Cold Starts). See [the README for details on how to use](./lambda-cosmo-custom).
+- ⚡️ **TL;DR** I'd recommend the `lambda-cosmo-custom` alternative (available as `bootstrap-cosmo-arm` in the Releases), which is a lot less hacky and much more performant (300-500ms Cold Starts). See [the README for details on how to use](./lambda-cosmo-custom).
 
 - ⚡️ **TL;DR 2**: Of the Apollo Router variants, `lambda-directly-optimized` beats all other variants and is on par with the alternatives for Cold and Warm Starts (use the `bootstrap-directly-optimized-graviton-arm-size` binary).
 
@@ -20,7 +20,6 @@ Overview:
 - [Comparison: Federation via GraphQL Mesh](#comparison-federation-via-graphql-mesh)
 - [Comparison: Federation via Cosmo Router](#comparison-federation-via-cosmo-router)
 
-
 ## Motivation
 
 Serverless is great to get started with a low-cost Cloud setup that'll scale you from zero to profitable without having to worry about infrastructure overhead. That said, there are not many great Federated GraphQL solutions that work out of the box for Serverless. Router from Apollo and Cosmo from Wundergraph are both tailored to long-running processes, e.g. in a k8s cluster. Mesh and Apollo Gateway are both JavaScript programs which incur a massive penalty in Cold Start times and are thus not a great solution.
@@ -40,6 +39,7 @@ This repository contains five examples:
 - `lambda-cosmo-custom`: Spins up a Cosmo server using the [Cosmo Router](https://github.com/wundergraph/cosmo/tree/main/router) and proxies Lambda Events to the HTTP server locally, similar to `lambda-with-server`.
 
 We do some additional tricks to reduce the size of the Apollo variants in the `bootstrap-directly-optimized-graviton-arm-size` binary, which has an impact on Cold Starts:
+
 - We [remove location details](https://github.com/johnthagen/min-sized-rust#remove-location-details), [panic string formatting](https://github.com/johnthagen/min-sized-rust#remove-panic-string-formatting-with-panic_immediate_abort), and [abort on panic](https://github.com/johnthagen/min-sized-rust#abort-on-panic)
 - We [rebuild and optimize libstd](https://github.com/johnthagen/min-sized-rust#optimize-libstd-with-build-std) with build-std, which combined with the above brings us from ~71MB down to ~49MB.
 - ~~We use [upx](https://github.com/upx/upx) to reduce the size of the binaries.~~ Unfortunately, the overhead of decompressing the binary significantly increases Cold Start times, e.g. `lambda-directly-optimized` goes up from 0.8s to 2.5s, despite a binary reduction from 73.71MB to 18MB.
@@ -48,21 +48,21 @@ Check out the code and `Dockerfile` for each. There's really not a lot going on,
 
 ## Measurements
 
-| Measurement (ms) | `GraphQL Mesh` (512 MB) | `GraphQL Mesh` (1024 MB) | `GraphQL Mesh` (2048 MB) | `lambda-directly-optimized` (512 MB) | `lambda-directly-optimized` (1024 MB) | `lambda-directly-optimized` (2048 MB) | `Cosmo` (512 MB) | `Cosmo` (1024 MB) | `Cosmo` (2048 MB) | `Apollo Gateway` (512 MB) | `Apollo Gateway` (1024 MB) | `Apollo Gateway` (2048 MB) |
-|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|-------------|
-| Average warm start response time | 10.2 ms | 10 ms | 10.3 ms | 6.8 ms | 6.2 ms | 6.8 ms | 10.7 ms | 10 ms | 9.8 ms | 8.8 ms | 8.9 ms | 9.8 ms |
-| Average cold start response time | 615.9 ms | 609.8 ms | 565.2 ms | 703.3 ms | 681.8 ms | 678 ms | 442.9 ms | 464.7 ms | 427.7 ms | 1037.7 ms | 871.2 ms | 851 ms |
-| Fastest warm response time | 6.9 ms | 7.9 ms | 8 ms | 5 ms | 5 ms | 6 ms | 6.9 ms | 7.9 ms | 7.9 ms | 6.9 ms | 6.9 ms | 6.9 ms |
-| Slowest warm response time | 38.9 ms | 38.9 ms | 38.9 ms | 11 ms | 11 ms | 9 ms | 19 ms | 11.9 ms | 10.9 ms | 12 ms | 12 ms | 10.9 ms |
-| Fastest cold response time  | 495.9 ms | 495.9 ms | 495.9 ms | 625 ms | 625 ms | 625 ms | 328 ms | 328 ms | 328 ms | 797 ms | 797 ms | 797 ms |
-| Slowest cold response time | 877 ms | 786.9 ms | 786.9 ms | 2724 ms | 804 ms | 724.9 ms | 581 ms | 531 ms | 505 ms | 1170 ms | 1039.9 ms | 898 ms |
+| Measurement (ms)                 | `GraphQL Mesh` (512 MB) | `GraphQL Mesh` (1024 MB) | `GraphQL Mesh` (2048 MB) | `lambda-directly-optimized` (512 MB) | `lambda-directly-optimized` (1024 MB) | `lambda-directly-optimized` (2048 MB) | `Cosmo` (512 MB) | `Cosmo` (1024 MB) | `Cosmo` (2048 MB) | `Apollo Gateway` (512 MB) | `Apollo Gateway` (1024 MB) | `Apollo Gateway` (2048 MB) |
+| -------------------------------- | ----------------------- | ------------------------ | ------------------------ | ------------------------------------ | ------------------------------------- | ------------------------------------- | ---------------- | ----------------- | ----------------- | ------------------------- | -------------------------- | -------------------------- |
+| Average warm start response time | 10.2 ms                 | 10 ms                    | 10.3 ms                  | 6.8 ms                               | 6.2 ms                                | 6.8 ms                                | 10.7 ms          | 10 ms             | 9.8 ms            | 8.8 ms                    | 8.9 ms                     | 9.8 ms                     |
+| Average cold start response time | 615.9 ms                | 609.8 ms                 | 565.2 ms                 | 703.3 ms                             | 681.8 ms                              | 678 ms                                | 442.9 ms         | 464.7 ms          | 427.7 ms          | 1037.7 ms                 | 871.2 ms                   | 851 ms                     |
+| Fastest warm response time       | 6.9 ms                  | 7.9 ms                   | 8 ms                     | 5 ms                                 | 5 ms                                  | 6 ms                                  | 6.9 ms           | 7.9 ms            | 7.9 ms            | 6.9 ms                    | 6.9 ms                     | 6.9 ms                     |
+| Slowest warm response time       | 38.9 ms                 | 38.9 ms                  | 38.9 ms                  | 11 ms                                | 11 ms                                 | 9 ms                                  | 19 ms            | 11.9 ms           | 10.9 ms           | 12 ms                     | 12 ms                      | 10.9 ms                    |
+| Fastest cold response time       | 495.9 ms                | 495.9 ms                 | 495.9 ms                 | 625 ms                               | 625 ms                                | 625 ms                                | 328 ms           | 328 ms            | 328 ms            | 797 ms                    | 797 ms                     | 797 ms                     |
+| Slowest cold response time       | 877 ms                  | 786.9 ms                 | 786.9 ms                 | 2724 ms                              | 804 ms                                | 724.9 ms                              | 581 ms           | 531 ms            | 505 ms            | 1170 ms                   | 1039.9 ms                  | 898 ms                     |
 
 Of the Apollo variants specifically:
 
-| Approach | Advantage     | Performance |
-|----------| ------------- |-------------|
-| `lambda-with-server` | · Full router functionality (almost) | · Cold Start: ~1.58s <br/>· Warm Start: ~49ms |
-| `lambda-directly` | · No need to wait for a server to start first (lower overhead) | · Cold Start: ~1.32s <br/>· Warm Start: ~314ms |
+| Approach                    | Advantage                                                                                                               | Performance                                                                                                                                  |
+| --------------------------- | ----------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
+| `lambda-with-server`        | · Full router functionality (almost)                                                                                    | · Cold Start: ~1.58s <br/>· Warm Start: ~49ms                                                                                                |
+| `lambda-directly`           | · No need to wait for a server to start first (lower overhead)                                                          | · Cold Start: ~1.32s <br/>· Warm Start: ~314ms                                                                                               |
 | `lambda-directly-optimized` | · No need to wait for a server to start first (lower overhead)<br/>· Built for ARM<br/>· Optimized for the Graviton CPU | Optimized for size<br/>· Cold Start: ~0.7s <br/>· Warm Start: ~20ms<br/>Optimized for speed<br/>· Cold Start: ~0.9s <br/>· Warm Start: ~20ms |
 
 # How to use
@@ -119,15 +119,14 @@ A good 450ms of this is spent just waiting for the Router to spin up:
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
-|-------------|-------------|-------------|-------------|-------------|-------------|
-| Average warm start response time | 8.3 ms | 8.7 ms | 7.6 ms | 7.6 ms | 8 ms |
+| Measurement (ms)                 | 128 MB    | 256 MB    | 512 MB    | 1024 MB   | 2048 MB  |
+| -------------------------------- | --------- | --------- | --------- | --------- | -------- |
+| Average warm start response time | 8.3 ms    | 8.7 ms    | 7.6 ms    | 7.6 ms    | 8 ms     |
 | Average cold start response time | 2870.9 ms | 2570.4 ms | 2174.1 ms | 1012.8 ms | 943.4 ms |
-| Fastest warm response time | 6 ms | 6 ms | 6 ms | 6.9 ms | 6.9 ms |
-| Slowest warm response time | 16.9 ms | 16.9 ms | 16.9 ms | 16.9 ms | 16.9 ms |
-| Fastest cold response time  | 837 ms | 837 ms | 837 ms | 837 ms | 837 ms |
-| Slowest cold response time | 3861.9 ms | 3861.9 ms | 2612.9 ms | 1625 ms | 1139 ms |
-
+| Fastest warm response time       | 6 ms      | 6 ms      | 6 ms      | 6.9 ms    | 6.9 ms   |
+| Slowest warm response time       | 16.9 ms   | 16.9 ms   | 16.9 ms   | 16.9 ms   | 16.9 ms  |
+| Fastest cold response time       | 837 ms    | 837 ms    | 837 ms    | 837 ms    | 837 ms   |
+| Slowest cold response time       | 3861.9 ms | 3861.9 ms | 2612.9 ms | 1625 ms   | 1139 ms  |
 
 `lambda-directly`
 
@@ -143,14 +142,14 @@ A few samples of `lambda-directly-optimized` (optimized for speed) Cold Starts:
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
-|-------------|-------------|-------------|-------------|-------------|-------------|
-| Average warm start response time | 9.7 ms | 5.4 ms | 5.6 ms | 6.1 ms | 5.8 ms |
-| Average cold start response time | 858 ms | 837.6 ms | 775.5 ms | 768.3 ms | 753.2 ms |
-| Fastest warm response time | 4.9 ms | 4.9 ms | 4.9 ms | 4.9 ms | 4.9 ms |
-| Slowest warm response time | 23 ms | 8 ms | 7 ms | 7 ms | 7 ms |
-| Fastest cold response time  | 719 ms | 719 ms | 719 ms | 719 ms | 719 ms |
-| Slowest cold response time | 1075 ms | 981.9 ms | 981.9 ms | 981.9 ms | 868 ms |
+| Measurement (ms)                 | 128 MB  | 256 MB   | 512 MB   | 1024 MB  | 2048 MB  |
+| -------------------------------- | ------- | -------- | -------- | -------- | -------- |
+| Average warm start response time | 9.7 ms  | 5.4 ms   | 5.6 ms   | 6.1 ms   | 5.8 ms   |
+| Average cold start response time | 858 ms  | 837.6 ms | 775.5 ms | 768.3 ms | 753.2 ms |
+| Fastest warm response time       | 4.9 ms  | 4.9 ms   | 4.9 ms   | 4.9 ms   | 4.9 ms   |
+| Slowest warm response time       | 23 ms   | 8 ms     | 7 ms     | 7 ms     | 7 ms     |
+| Fastest cold response time       | 719 ms  | 719 ms   | 719 ms   | 719 ms   | 719 ms   |
+| Slowest cold response time       | 1075 ms | 981.9 ms | 981.9 ms | 981.9 ms | 868 ms   |
 
 `lambda-directly-optimized` (optimized for size)
 
@@ -162,15 +161,14 @@ A few samples of `lambda-directly-optimized` (optimized for size) Cold Starts:
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | 128 MB | 256 MB | 512 MB | 1024 MB | 2048 MB |
-|-------------|-------------|-------------|-------------|-------------|-------------|
-| Average warm start response time | 5.2 ms | 5.6 ms | 5.2 ms | 5.6 ms | 5.5 ms |
+| Measurement (ms)                 | 128 MB   | 256 MB   | 512 MB   | 1024 MB  | 2048 MB  |
+| -------------------------------- | -------- | -------- | -------- | -------- | -------- |
+| Average warm start response time | 5.2 ms   | 5.6 ms   | 5.2 ms   | 5.6 ms   | 5.5 ms   |
 | Average cold start response time | 735.8 ms | 735.6 ms | 698.1 ms | 698.8 ms | 688.1 ms |
-| Fastest warm response time | 4 ms | 4 ms | 4.9 ms | 4.9 ms | 4.9 ms |
-| Slowest warm response time | 72.9 ms | 20.9 ms | 9.9 ms | 8 ms | 8 ms |
-| Fastest cold response time  | 617 ms | 617 ms | 617 ms | 617 ms | 617 ms |
-| Slowest cold response time | 985 ms | 985 ms | 894.9 ms | 894.9 ms | 762 ms |
-
+| Fastest warm response time       | 4 ms     | 4 ms     | 4.9 ms   | 4.9 ms   | 4.9 ms   |
+| Slowest warm response time       | 72.9 ms  | 20.9 ms  | 9.9 ms   | 8 ms     | 8 ms     |
+| Fastest cold response time       | 617 ms   | 617 ms   | 617 ms   | 617 ms   | 617 ms   |
+| Slowest cold response time       | 985 ms   | 985 ms   | 894.9 ms | 894.9 ms | 762 ms   |
 
 # Comparison: Federation via Apollo Router (Warm Start)
 
@@ -182,18 +180,14 @@ Both of these examples talk to 1 warm subgraph implemented in Rust, to simulate
 
 <img width="1637" alt="Direct Router Warm (Products query)  Screenshot 2023-10-31 at 20 50 48" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/fd0ee045-1b1e-4417-a077-316ddbe8f35c">
 
-
 `lambda-directly`
 
 <img width="1637" alt="Lambda Router Warm (Products query) Screenshot 2023-10-31 at 20 41 43" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/b9b8e456-d32e-4c0a-b1e4-72cb7c9cbe9c">
 
-
 `lambda-directly-optimized` (optimized for size)
 
 <img width="1485" alt="Warm start (talking to Products) Screenshot 2023-11-11 at 23 48 46" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/9bfbe078-2c27-4739-a29a-6b3a20ff09ff">
 
-
-
 # Comparison: Rust Subgraph in AWS Lambda
 
 For comparison so that you know how far we _could_ go, here's a subgraph in Rust implemented using [async-graphql](https://github.com/async-graphql/async-graphql) and wrapped up in [cargo-lambda](https://www.cargo-lambda.info/).
@@ -202,12 +196,10 @@ Cold Start (201ms):
 
 <img width="1411" alt="Screenshot 2023-10-21 at 12 13 18" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/e561ce43-ffd3-4ef3-bb95-8d5619035f37">
 
-
 Warm Start (8ms):
 
 <img width="1411" alt="Screenshot 2023-10-21 at 12 14 41" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/f71beb7f-b210-46ec-80ea-fc0de86f9581">
 
-
 # Comparison: Federation via Apollo Gateway
 
 To have something to compare the Apollo Router PoC more directly against, here's one alternative using [Apollo Gateway](https://www.apollographql.com/docs/apollo-server/using-federation/apollo-gateway-setup).
@@ -216,22 +208,20 @@ Cold start (1.23ms):
 
 <img width="1350" alt="Cold start ms-gateway Screenshot 2023-10-22 at 21 45 34" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/d8958d82-529a-4b63-98c9-db90b06f0fe2">
 
-
 Warm start (120ms):
 
 <img width="1412" alt="Warm start subgraph times Screenshot 2023-10-22 at 16 13 26" src="https://github.com/codetalkio/apollo-router-lambda/assets/1189998/577d8e5b-afc6-4d2f-b22c-7b61f94a473d">
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | 512 MB | 1024 MB | 2048 MB |
-|-------------|-------------|-------------|-------------|
-| Average warm start response time | 8.8 ms | 8.9 ms | 9.8 ms |
-| Average cold start response time | 1037.7 ms | 871.2 ms | 851 ms |
-| Fastest warm response time | 6.9 ms | 6.9 ms | 6.9 ms |
-| Slowest warm response time | 12 ms | 12 ms | 10.9 ms |
-| Fastest cold response time  | 797 ms | 797 ms | 797 ms |
-| Slowest cold response time | 1170 ms | 1039.9 ms | 898 ms |
-
+| Measurement (ms)                 | 512 MB    | 1024 MB   | 2048 MB |
+| -------------------------------- | --------- | --------- | ------- |
+| Average warm start response time | 8.8 ms    | 8.9 ms    | 9.8 ms  |
+| Average cold start response time | 1037.7 ms | 871.2 ms  | 851 ms  |
+| Fastest warm response time       | 6.9 ms    | 6.9 ms    | 6.9 ms  |
+| Slowest warm response time       | 12 ms     | 12 ms     | 10.9 ms |
+| Fastest cold response time       | 797 ms    | 797 ms    | 797 ms  |
+| Slowest cold response time       | 1170 ms   | 1039.9 ms | 898 ms  |
 
 # Comparison: Federation via GraphQL Mesh
 
@@ -243,15 +233,14 @@ Cold start (956ms):
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | 512 MB | 1024 MB | 2048 MB |
-|-------------|-------------|-------------|-------------|
-| Average warm start response time | 10.2 ms | 10 ms | 10.3 ms |
+| Measurement (ms)                 | 512 MB   | 1024 MB  | 2048 MB  |
+| -------------------------------- | -------- | -------- | -------- |
+| Average warm start response time | 10.2 ms  | 10 ms    | 10.3 ms  |
 | Average cold start response time | 615.9 ms | 609.8 ms | 565.2 ms |
-| Fastest warm response time | 6.9 ms | 7.9 ms | 8 ms |
-| Slowest warm response time | 38.9 ms | 38.9 ms | 38.9 ms |
-| Fastest cold response time  | 495.9 ms | 495.9 ms | 495.9 ms |
-| Slowest cold response time | 877 ms | 786.9 ms | 786.9 ms |
-
+| Fastest warm response time       | 6.9 ms   | 7.9 ms   | 8 ms     |
+| Slowest warm response time       | 38.9 ms  | 38.9 ms  | 38.9 ms  |
+| Fastest cold response time       | 495.9 ms | 495.9 ms | 495.9 ms |
+| Slowest cold response time       | 877 ms   | 786.9 ms | 786.9 ms |
 
 # Comparison: Federation via Cosmo Router
 
@@ -263,13 +252,11 @@ Cold start (339ms):
 
 Breakdown of only the router (making no queries to subgraphs):
 
-| Measurement (ms) | ms-cosmo (512 MB) | ms-cosmo (1024 MB) | ms-cosmo (2048 MB) |
-|-------------|-------------|-------------|-------------|
-| Average warm start response time | 10.7 ms | 10 ms | 9.8 ms |
-| Average cold start response time | 442.9 ms | 464.7 ms | 427.7 ms |
-| Fastest warm response time | 6.9 ms | 7.9 ms | 7.9 ms |
-| Slowest warm response time | 19 ms | 11.9 ms | 10.9 ms |
-| Fastest cold response time  | 328 ms | 328 ms | 328 ms |
-| Slowest cold response time | 581 ms | 531 ms | 505 ms |
-
-
+| Measurement (ms)                 | ms-cosmo (512 MB) | ms-cosmo (1024 MB) | ms-cosmo (2048 MB) |
+| -------------------------------- | ----------------- | ------------------ | ------------------ |
+| Average warm start response time | 10.7 ms           | 10 ms              | 9.8 ms             |
+| Average cold start response time | 442.9 ms          | 464.7 ms           | 427.7 ms           |
+| Fastest warm response time       | 6.9 ms            | 7.9 ms             | 7.9 ms             |
+| Slowest warm response time       | 19 ms             | 11.9 ms            | 10.9 ms            |
+| Fastest cold response time       | 328 ms            | 328 ms             | 328 ms             |
+| Slowest cold response time       | 581 ms            | 531 ms             | 505 ms             |