Row-level TTL PR 1: new API #370
Conversation
7d5a4db to 8d72148 (compare)
@@ -56,7 +56,9 @@ public final class ResponsiveStores {
   * @return a supplier for a key-value store with the given options
   *         that uses Responsive's storage for its backend
   */
-  public static KeyValueBytesStoreSupplier keyValueStore(final ResponsiveKeyValueParams params) {
+  public static ResponsiveKeyValueBytesStoreSupplier keyValueStore(
@agavra FYI: all the changes in ResponsiveStores are technically unrelated -- just holdovers from an alternative API approach I abandoned -- but I think it's the right thing to do anyway, so I kept them in. Hope you don't mind.
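The diff above narrows the declared return type of `keyValueStore` from the generic `KeyValueBytesStoreSupplier` to the concrete `ResponsiveKeyValueBytesStoreSupplier`. A minimal sketch of why that covariant narrowing is useful, using hypothetical, simplified stand-ins for the real types (only the two type names and the method name come from the diff; the fields and the `withTtl` method are illustrative assumptions):

```java
// Simplified stand-in for the Kafka Streams interface (hypothetical shape).
interface KeyValueBytesStoreSupplier {
    String name();
}

// Assumed concrete Responsive supplier; the withTtl method is illustrative only.
class ResponsiveKeyValueBytesStoreSupplier implements KeyValueBytesStoreSupplier {
    private final String name;

    ResponsiveKeyValueBytesStoreSupplier(final String name) {
        this.name = name;
    }

    @Override
    public String name() {
        return name;
    }

    // Responsive-specific configuration the broader interface cannot expose
    ResponsiveKeyValueBytesStoreSupplier withTtl(final long ttlMs) {
        return this; // illustrative no-op
    }
}

public class CovariantReturnDemo {
    // Returning the concrete type (as in the diff) lets callers chain
    // Responsive-specific methods without casting.
    static ResponsiveKeyValueBytesStoreSupplier keyValueStore(final String name) {
        return new ResponsiveKeyValueBytesStoreSupplier(name);
    }

    public static void main(String[] args) {
        final ResponsiveKeyValueBytesStoreSupplier supplier =
            keyValueStore("orders").withTtl(60_000L);
        System.out.println(supplier.name());
    }
}
```

With the old `KeyValueBytesStoreSupplier` return type, any Responsive-specific call on the result would have required a cast.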
  return new TtlDuration(Duration.ZERO, Ttl.INFINITE);
}

// TODO(sophie): store ttl as long to avoid Duration conversions on the hot path
@agavra I'm actually a bit concerned about this. Even if we store it as millis, we'll still have a Duration on the hot path, since that's the return type. What if we just bite the bullet and have the user return a ttl in terms of millis rather than a Duration?
I know it's not as nice API-wise, but to be honest I think most Streams users are pretty used to reasoning about time in terms of milliseconds. So I'm personally leaning towards replacing Duration with long millis entirely. Thoughts?
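The two API shapes under discussion can be sketched side by side. These functional interfaces are simplified hypothetical stand-ins, not `TtlProvider`'s actual signature from the PR:

```java
import java.time.Duration;

public class TtlApiShapes {

    // Current approach: allocates a Duration per returned key
    @FunctionalInterface
    interface DurationTtl<K, V> {
        Duration ttlOf(K key, V value);
    }

    // Proposed alternative: primitive long millis, no allocation
    @FunctionalInterface
    interface MillisTtl<K, V> {
        long ttlMillisOf(K key, V value);
    }

    public static void main(String[] args) {
        final DurationTtl<String, String> byDuration = (k, v) -> Duration.ofMinutes(5);
        final MillisTtl<String, String> byMillis = (k, v) -> 5 * 60 * 1000L;

        // Both express the same 5-minute ttl
        System.out.println(byDuration.ttlOf("k", "v").toMillis());
        System.out.println(byMillis.ttlMillisOf("k", "v"));
    }
}
```

The `long` variant avoids the per-key allocation at the cost of losing `Duration`'s self-documenting units.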
Do we have evidence that this is actually significant on a per-retrieved-doc basis?
A few bitwise CPU operations seem dwarfed by the potential network call plus the re-serialization, in my opinion. Even if it hits the in-memory cache, we're talking about loads from main memory into L1 cache.
The cost of creating a Duration is ~15ns (thanks ChatGPT for the estimation, looks legit to me, and half of that cost is just object allocation) and the cost of calling toMillis is ~1ns (assuming there's no nanosecond component), so we're talking ~16ns of per-key-returned overhead for using Duration on a 3.2 GHz CPU.
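A quick arithmetic check of the estimate above, taking the comment's own rough figures (~15ns to create a `Duration`, ~1ns for `toMillis`) as given. The 1M keys/sec throughput figure is an illustrative assumption, not from the PR:

```java
public class DurationOverheadEstimate {
    public static void main(String[] args) {
        // Figures quoted in the comment above (rough estimates, not measured here):
        final double createDurationNs = 15.0;
        final double toMillisNs = 1.0;
        final double perKeyNs = createDurationNs + toMillisNs; // ~16ns per key

        // On a 3.2 GHz core, 1ns is ~3.2 cycles
        final double cycles = perKeyNs * 3.2;
        System.out.printf(java.util.Locale.ROOT, "~%.0f cycles per key%n", cycles);

        // Even at an assumed 1M keys/sec, the overhead is a small slice of one core:
        // 16ns * 1e6 per second = 16ms of CPU time per second
        final double fractionOfCore = perKeyNs * 1_000_000 / 1e9;
        System.out.printf(java.util.Locale.ROOT,
            "%.1f%% of one core at 1M keys/sec%n", fractionOfCore * 100);
    }
}
```

About 50 cycles per key, i.e. under 2% of a core even at a million keys per second, which supports the "probably not worth worrying about" conclusion below.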
> Do we have evidence that this is actually significant on a per-retrieved doc basis?
No hard/recent evidence -- I suppose I'm over-leveraging on some older benchmarks I only half-remember from back at Confluent. It's probably not worth worrying about right now
LGTM! Just nits inline
Resolved review threads (outdated):
- kafka-client/src/main/java/dev/responsive/kafka/api/stores/ResponsiveKeyValueParams.java
- kafka-client/src/main/java/dev/responsive/kafka/api/stores/ResponsiveKeyValueParams.java
- kafka-client/src/main/java/dev/responsive/kafka/api/stores/TtlProvider.java
First PR for row-level ttl. Includes only the API changes and supporting code.
TODO (in order)
Next PR: #371