Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve watch latency benchmark #17562

Merged
merged 1 commit into from
Mar 24, 2024

Conversation

serathius
Copy link
Member

  • Support prevKV
  • Support multiple watchers per stream
  • Allow continious puts without waiting for event

cc @ahrtr @jmhbnz @chaochn47

@serathius serathius force-pushed the benchmark-put-latency branch from ea05b11 to da4db52 Compare March 10, 2024 10:43
@serathius
Copy link
Member Author

Implemented for #16839

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial look I think this is looking good, thanks @serathius. Small question about defaults below.

For one scenario below is quick comparison at 10,000 between new and old showing that the existing most basic functionality seems to be behaving ok in case it sames time for other reviewers.

main currently

 ~  D  e  t  benchmark  ?6  benchmark-put-laten..  ./benchmark watch-latency --put-total 10000

Summary:
  Total:	99.0015 secs.
  Slowest:	0.0003 secs.
  Fastest:	0.0000 secs.
  Average:	0.0000 secs.
  Stddev:	0.0000 secs.
  Requests/sec:	101.0086

Response time histogram:
  0.0000 [1]	|
  0.0000 [6595]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0001 [2384]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0001 [796]	|∎∎∎∎
  0.0001 [183]	|∎
  0.0001 [33]	|
  0.0002 [2]	|
  0.0002 [2]	|
  0.0002 [2]	|
  0.0003 [0]	|
  0.0003 [2]	|

Latency distribution:
  10% in 0.0000 secs.
  25% in 0.0000 secs.
  50% in 0.0000 secs.
  75% in 0.0000 secs.
  90% in 0.0001 secs.
  95% in 0.0001 secs.
  99% in 0.0001 secs.
  99.9% in 0.0001 secs.

This pull request

Put summary:

Summary:
  Total:	99.0020 secs.
  Slowest:	0.0100 secs.
  Fastest:	0.0010 secs.
  Average:	0.0015 secs.
  Stddev:	0.0006 secs.
  Requests/sec:	101.0080

Response time histogram:
  0.0010 [1]	|
  0.0019 [9255]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0028 [130]	|
  0.0037 [465]	|∎∎
  0.0046 [59]	|
  0.0055 [75]	|
  0.0064 [4]	|
  0.0073 [4]	|
  0.0082 [2]	|
  0.0091 [2]	|
  0.0100 [3]	|

Latency distribution:
  10% in 0.0012 secs.
  25% in 0.0012 secs.
  50% in 0.0013 secs.
  75% in 0.0014 secs.
  90% in 0.0015 secs.
  95% in 0.0032 secs.
  99% in 0.0045 secs.
  99.9% in 0.0066 secs.

Watch events summary:

Summary:
  Total:	99.0812 secs.
  Slowest:	0.0011 secs.
  Fastest:	0.0000 secs.
  Average:	0.0002 secs.
  Stddev:	0.0001 secs.
  Requests/sec:	10092.7286

Response time histogram:
  0.0000 [1862]	|
  0.0001 [95833]	|∎∎∎∎
  0.0002 [792736]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0003 [98306]	|∎∎∎∎
  0.0005 [3380]	|
  0.0006 [1745]	|
  0.0007 [3836]	|
  0.0008 [1788]	|
  0.0009 [307]	|
  0.0010 [161]	|
  0.0011 [46]	|

Latency distribution:
  10% in 0.0001 secs.
  25% in 0.0001 secs.
  50% in 0.0002 secs.
  75% in 0.0002 secs.
  90% in 0.0002 secs.
  95% in 0.0003 secs.
  99% in 0.0004 secs.
  99.9% in 0.0007 secs.

watchLatencyCmd.Flags().IntVar(&watchLWatchPerStream, "watch-per-stream", 10, "Total watchers per stream")
watchLatencyCmd.Flags().BoolVar(&watchLPrevKV, "prevkv", false, "PrevKV enabled on watch requests")

watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be trying to align the defaults across the board at 10,000?

 james  ~  D  e  t  benchmark  ?6  benchmark-put-..                                                                                  19:30:38 
 ➜ grep -Ri "1000" | grep -i intvar
cmd/put.go:     putCmd.Flags().IntVar(&putTotal, "total", 10000, "Total number of put requests")
cmd/stm.go:     stmCmd.Flags().IntVar(&stmTotal, "total", 10000, "Total number of completed STM transactions")
cmd/txn_put.go: txnPutCmd.Flags().IntVar(&txnPutTotal, "total", 10000, "Total number of txn requests")
cmd/watch.go:   watchCmd.Flags().IntVar(&watchPutTotal, "put-total", 1000, "Number of put requests")
cmd/range.go:   rangeCmd.Flags().IntVar(&rangeTotal, "total", 10000, "Total number of range requests")
cmd/watch_get.go:       watchGetCmd.Flags().IntVar(&watchGetTotalWatchers, "watchers", 10000, "Total number of watchers")
cmd/lease.go:   leaseKeepaliveCmd.Flags().IntVar(&leaseKeepaliveTotal, "total", 10000, "Total number of lease keepalive requests")
cmd/watch_latency.go:   watchLatencyCmd.Flags().IntVar(&watchLPutTotal, "put-total", 1000, "Total number of put requests")
cmd/mvcc.go:    mvccCmd.PersistentFlags().IntVar(&batchLimit, "batch-limit", 10000, "A limit of batched transaction")
cmd/txn_mixed.go:       mixedTxnCmd.Flags().IntVar(&mixedTxnTotal, "total", 10000, "Total number of txn requests")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark code is not very organized. I planned to make a direct improvement that we require for testing prevKV, and leave cleanups for later.

tools/benchmark/cmd/watch_latency.go Outdated Show resolved Hide resolved
tools/benchmark/cmd/watch_latency.go Outdated Show resolved Hide resolved
func setupWatchChannels(key string) []clientv3.WatchChan {
clients := mustCreateClients(totalClients, totalConns)

streams := make([]clientv3.Watcher, watchLStreams)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that it doesn't really create watchLStreams gRPC streams. Instead it just create totalConns gRPC streams?

Copy link
Member Author

@serathius serathius Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by streams here I mean watch streams aka Watch RPC. I based the code on tools/benchmark/cmd/watch.go where I discovered that you can separate watch requests into different stream by creating a new watcher. This reuses same grpc connection as you pass it the same etcd client, however creates a separate stream as those are stored on watcher. Seems like a better way to separate streams approach I proposed in Kubernetes via grpc context metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered that you can separate watch requests into different stream by creating a new watcher.

Yes, it's true.

Each gRPC connection may be reused by multiple gRPC streams, and each gRPC stream can be reused by multiple etcd watchers.

wch := wch
i := i
eventTimes[i] = make([]time.Time, watchLPutTotal)
wg.Add(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is the wg defined?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He he, didn't notice that, it's a global wg for all benchmarks. Would prefer to leave it as it is to at least be consistent with other benchmarks.

* Support prevKV
* Support multiple watchers per stream
* Allow continious puts without waiting for event

Signed-off-by: Marek Siarkowicz <[email protected]>
@serathius serathius force-pushed the benchmark-put-latency branch from da4db52 to 7a84cbb Compare March 11, 2024 19:30
@serathius
Copy link
Member Author

/retest

@serathius
Copy link
Member Author

ping @ahrtr

Comment on lines +121 to +126
start := putTimes[j]
end := eventTimes[i][j]
if end.Before(start) {
start = end
}
watchReport.Results() <- report.Result{Start: start, End: end}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is too fragile. The startTime is recorded in the main goroutine when putting the key, while the endTime is recorded in the watch gouroutines, and you correlate the startTime with the endTime using an index.

Suggest to use WithPrefix when watching, and put a different key each time when putting, e.g. "foo1", "foo2", etc. And we use the exact key to correlate the startTime with the endTime.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fragile? Puts are done sequentially and etcd guarantees that responses are sequential to writes. Not sure I would call it fragile or at least up to the etcd correctness, whoever I don't think we should care about this in benchmarks.

Copy link
Member

@ahrtr ahrtr Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "fragile" isn't about the etcd's correctness, it's about the readability, it's about the code itself. There may be network jitter, the put may fail. Currently it looks OK because the tool just exits directly when put fails. I think we shouldn't exit in such case, we are running benchmark test, why do we care about some occasional failures due to env problems (e.g. network jitter)? Some time later, other contributors may even update the implementation not exiting in such case.

Also depending on index (correlate the times by an index) reduces the readability, also increase review burden. We need super carefulness to ensure there is no +1/-1 problem.

Anyway we should make the implementation more robustness & less error prone by itself.

if _, err := putClient.Put(context.TODO(), key, value); err != nil {
fmt.Fprintf(os.Stderr, "Failed to Put for watch latency benchmark: %v\n", err)
os.Exit(1)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't insist on this for now, because it works for now as mentioned above "Currently it looks OK because the tool just exits directly when put fails.".

Please let me know if you want to keep it as it's or plan to change it based on the proposed above.

Copy link
Member Author

@serathius serathius Mar 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't plan to change it for now. I think it could be improved, however I'm not as concerned with making benchmark code super readable, as it's not often changed nor critical for etcd functionality (at least until we start to really depend in automatic performance regression detection by @jmhbnz). I think we could just merge it to make sure the improvements are available for everyone to use.

@ahrtr do you see any blockers?

func setupWatchChannels(key string) []clientv3.WatchChan {
clients := mustCreateClients(totalClients, totalConns)

streams := make([]clientv3.Watcher, watchLStreams)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discovered that you can separate watch requests into different stream by creating a new watcher.

Yes, it's true.

Each gRPC connection may be reused by multiple gRPC streams, and each gRPC stream can be reused by multiple etcd watchers.

@serathius serathius merged commit b643f1b into etcd-io:main Mar 24, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants