-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-914666 Adds MAX_CLIENT_LAG configuration option #586
Conversation
We want to expose a knob that gives users the ability to control when data is ingested. This has a material difference on the size of blobs generated and can result in fewer smaller-sized blobs which in turn affects query performance. The trade-off is higher ingest latencies. We have decided to expose this in the form of an optional `MAX_CLIENT_LAG` option that accepts inputs as the following: - `number second` (ex: `1 second`) - `number seconds` (ex: `2 seconds`) - `number minute` (ex: `1 minute`) - `number minutes` (ex: `2 minutes`) By default we use 1 second as the maximum client lag which is the current behavior of the SDK. Note that this dictates when a flush is triggered to cloud storage. Depending on your connection to cloud storage and cloud storage tail latencies a blob persist may take longer than expected. Therefore, it is helpful to think of this parameter as a target, rather than an absolute number. @test Adds tests to `ParameterProviderTest`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, i had couple questions only, thank you for the quick change!
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
src/test/java/net/snowflake/ingest/streaming/internal/ParameterProviderTest.java
Show resolved
Hide resolved
What's the max lag that can be set ? |
We probably should have a minimum and maximum. 10 minutes seems reasonable as an upper limit, what about for a lower limit? |
@sfc-gh-xhuang added minimum of 1 second, max of 10 minutes. PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thank you
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
src/main/java/net/snowflake/ingest/utils/ParameterProvider.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
We want to expose a knob that gives users the ability to control when data is ingested. This has a material difference on the size of blobs generated and can result in fewer smaller-sized blobs which in turn affects query performance. The trade-off is higher ingest latencies. We have decided to expose this in the form of an optional
MAX_CLIENT_LAG
option that accepts inputs as the following:number second
(ex:1 second
)number seconds
(ex:2 seconds
)number minute
(ex:1 minute
)number minutes
(ex:2 minutes
)By default we use 1 second as the maximum client lag which is the current behavior of the SDK.
Note that this dictates when a flush is triggered to cloud storage. Depending on your connection to cloud storage and cloud storage tail latencies a blob persist may take longer than expected. Therefore, it is helpful to think of this parameter as a target, rather than an absolute number.
@test Adds tests to
ParameterProviderTest