
SNOW-914666 Adds MAX_CLIENT_LAG configuration option #586

Merged: sfc-gh-tjones merged 5 commits into master from SNOW-914666 on Sep 14, 2023

Conversation

sfc-gh-tjones
Contributor

We want to expose a knob that gives users the ability to control when data is ingested. This materially affects the size of the generated blobs and can result in fewer small blobs, which in turn affects query performance. The trade-off is higher ingest latencies. We have decided to expose this as an optional `MAX_CLIENT_LAG` option that accepts values in the following formats:

- `number second` (ex: `1 second`)
- `number seconds` (ex: `2 seconds`)
- `number minute` (ex: `1 minute`)
- `number minutes` (ex: `2 minutes`)

By default, we use 1 second as the maximum client lag, which matches the current behavior of the SDK.

Note that this dictates when a flush to cloud storage is triggered. Depending on your connection to cloud storage and its tail latencies, persisting a blob may take longer than expected. It is therefore helpful to think of this parameter as a target rather than an absolute number.

@test Adds tests to `ParameterProviderTest`
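
For illustration, here is a minimal sketch of how a user might supply this option when constructing a streaming ingest client. The connection settings are placeholders, and the property key `max_client_lag` plus the assumption that it can be passed through the client `Properties` are based on the snowflake-ingest-java SDK conventions, not spelled out in this PR.

```java
import java.util.Properties;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClient;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClientFactory;

public class MaxClientLagExample {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    // Placeholder connection settings; replace with real account details.
    props.put("user", "MY_USER");
    props.put("url", "https://myaccount.snowflakecomputing.com:443");
    props.put("private_key", "<private key>");

    // Flush roughly every 5 seconds instead of the 1-second default.
    // Accepted formats: "<n> second(s)" or "<n> minute(s)".
    props.put("max_client_lag", "5 seconds");

    try (SnowflakeStreamingIngestClient client =
        SnowflakeStreamingIngestClientFactory.builder("MY_CLIENT")
            .setProperties(props)
            .build()) {
      // Open channels and insert rows as usual; blobs are flushed to cloud
      // storage roughly every 5 seconds (as a target, not a hard bound).
    }
  }
}
```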

@sfc-gh-tjones sfc-gh-tjones requested review from sfc-gh-tzhang and a team as code owners September 13, 2023 20:45
Contributor

@sfc-gh-asen sfc-gh-asen left a comment


Looks good, I only had a couple of questions. Thank you for the quick change!

@sfc-gh-xhuang
Contributor

What's the max lag that can be set?
Perhaps we should enforce an upper bound, 10 minutes?

@sfc-gh-tjones
Contributor Author

> What's the max lag that can be set? Perhaps we should enforce an upper bound, 10 minutes?

We probably should have both a minimum and a maximum. 10 minutes seems reasonable as an upper limit; what about a lower limit?

@sfc-gh-tjones
Contributor Author

> What's the max lag that can be set? Perhaps we should enforce an upper bound, 10 minutes?

@sfc-gh-xhuang Added a minimum of 1 second and a maximum of 10 minutes. PTAL
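
For illustration only, a hypothetical sketch of what clamping to those bounds might look like; this is not the validation code actually added in this PR, and the method and constant names are invented for the example.

```java
// Hypothetical sketch of enforcing the discussed bounds (1 second to 10 minutes).
// Not the actual implementation from this PR.
static long validateMaxClientLagMs(long lagMs) {
  final long MIN_LAG_MS = 1_000L;            // 1 second
  final long MAX_LAG_MS = 10L * 60 * 1_000L; // 10 minutes
  if (lagMs < MIN_LAG_MS || lagMs > MAX_LAG_MS) {
    throw new IllegalArgumentException(
        "MAX_CLIENT_LAG must be between 1 second and 10 minutes, got " + lagMs + " ms");
  }
  return lagMs;
}
```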

Contributor

@sfc-gh-asen sfc-gh-asen left a comment


Looks great! Thank you

Collaborator

@sfc-gh-japatel sfc-gh-japatel left a comment


lgtm!

@sfc-gh-tjones sfc-gh-tjones merged commit f57f92b into master Sep 14, 2023
9 checks passed
@sfc-gh-tjones sfc-gh-tjones deleted the SNOW-914666 branch September 14, 2023 02:54