SNOW-1512047 Introduce independent per-table flushes when interleaving is disabled #788
Conversation
Force-pushed from `f3ef51c` to `be25b1e`.
```java
        && !tablesToFlush.isEmpty()) {
      tablesToFlush.addAll(
          this.channelCache.entrySet().stream().map(Map.Entry::getKey).collect(Collectors.toSet()));
    }
```
Why do we need to do this, if the previous code block already picked up the minimal set of tables needing a flush?
+1. Even if interleaving is enabled, I'd prefer to keep the above logic for flushing and wait until `MaxClientLag` elapses for each channel.
I aimed to maintain the original interleaving behavior, where all channels are flushed if any channel needs it. With independent flushing intervals, we might miss the chance to combine multiple chunks into the same BDEC. A potential workaround is to discretize timestamps and reduce jitter on `lastFlushTime` in interleaving mode, which can increase the chances of combining multiple chunks into the same blob. What do you think?
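As an aside, the discretization idea could be sketched like this. This is a minimal illustration only; `discretize`, `BUCKET_MS`, and the bucket width are hypothetical names and values, not SDK code:

```java
// Hypothetical sketch: round flush timestamps down to a fixed bucket so that
// tables whose flushes land in the same bucket can share one blob.
public class FlushBucketSketch {
    // Assumed bucket width in milliseconds; not an actual SDK parameter.
    static final long BUCKET_MS = 1000;

    // Round a raw timestamp down to the start of its bucket.
    static long discretize(long timestampMs) {
        return (timestampMs / BUCKET_MS) * BUCKET_MS;
    }

    public static void main(String[] args) {
        // Flushes 300 ms apart land in the same bucket and could share one blob.
        System.out.println(discretize(10_200) == discretize(10_500)); // true
        // A flush in the next second lands in a different bucket.
        System.out.println(discretize(10_200) == discretize(11_100)); // false
    }
}
```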
```diff
@@ -33,6 +40,12 @@ void addChannel(SnowflakeStreamingIngestChannelInternal<T> channel) {
     this.cache.computeIfAbsent(
         channel.getFullyQualifiedTableName(), v -> new ConcurrentHashMap<>());

+    // Update the last flush time for the table, add jitter to avoid all channels flush at the same
+    // time when the blobs are not interleaved
+    this.lastFlushTime.putIfAbsent(
```
Not sure I understand what this helps with. If someone calls addChannel and doesn't add any data for a minute, the first row they add will trigger a flush, since we'll mistakenly think it's been a long time since the last flush.
Yes. Should we change the logic to the following?

1. Don't edit `lastFlushTime` when creating a channel.
2. When calling `putRow` or `putRows`, if `lastFlushTime` is null, set it to the current time.
3. Whenever a table is flushed, set `lastFlushTime` to null and go back to step 2.
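A minimal sketch of that proposed lifecycle, using hypothetical class and method names rather than the SDK's actual code:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the proposed lazy lastFlushTime handling: the
// timestamp is only set on the first putRow after a flush, so an idle
// channel never looks "overdue".
public class LazyFlushTimeSketch {
    private final ConcurrentHashMap<String, Long> lastFlushTime = new ConcurrentHashMap<>();

    // Step 2: on putRow/putRows, start the clock only if it is not already running.
    void onPutRow(String table, long nowMs) {
        lastFlushTime.putIfAbsent(table, nowMs);
    }

    // Step 3: on flush, clear the entry so the next putRow restarts the clock.
    void onFlush(String table) {
        lastFlushTime.remove(table);
    }

    // A table is due only if rows arrived since the last flush and the lag has elapsed.
    boolean isDue(String table, long nowMs, long maxClientLagMs) {
        Long start = lastFlushTime.get(table);
        return start != null && nowMs - start >= maxClientLagMs;
    }

    public static void main(String[] args) {
        LazyFlushTimeSketch s = new LazyFlushTimeSketch();
        // An idle table with no rows is never due, no matter how long it waits.
        System.out.println(s.isDue("t1", 60_000, 1_000)); // false
        s.onPutRow("t1", 60_000);
        System.out.println(s.isDue("t1", 61_000, 1_000)); // true
        s.onFlush("t1");
        System.out.println(s.isDue("t1", 120_000, 1_000)); // false until new rows arrive
    }
}
```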
As discussed, please track this with a JIRA so we don't forget about it.
Jira created.
LGTM, had small comments on top of Hitesh's comments
LGTM, please get Alkin's or Toby's signoff before merging!
```java
      return;
    logFlushTask(isForce, tablesToFlush, flushStartTime);
    distributeFlushTasks(tablesToFlush);
    long flushEndTime = System.currentTimeMillis();
```
The name is confusing, this is the end time of the previous flush?
Renamed to `prevFlushEndTime`.
```java
    if (this.owningClient.getParameterProvider().getMaxChunksInBlobAndRegistrationRequest() != 1
        && !tablesToFlush.isEmpty()) {
```
Can we do the check before populating `tablesToFlush`?
Discussed offline. We preserve the client-level `lastFlushTime` and `isNeedFlush` to avoid checking table-level flush info when interleaving is enabled, which might otherwise change performance. We also preserve the old logging format when interleaving is enabled to avoid logging too much information.
cc: @sfc-gh-hmadan
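The resulting branch could look roughly like this. The names below are illustrative, not the SDK's actual fields:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the split described above: a cheap client-level check when
// interleaving is enabled, per-table checks when each blob holds one chunk.
public class FlushSelectionSketch {
    static Set<String> tablesToFlush(boolean interleavingEnabled,
                                     boolean clientNeedFlush,
                                     Map<String, Boolean> tableNeedFlush) {
        Set<String> result = new HashSet<>();
        if (interleavingEnabled) {
            // Interleaved mode: one client-level flag; flush everything together.
            if (clientNeedFlush) {
                result.addAll(tableNeedFlush.keySet());
            }
        } else {
            // Non-interleaved mode: consult per-table state so each table
            // flushes on its own schedule.
            for (Map.Entry<String, Boolean> e : tableNeedFlush.entrySet()) {
                if (e.getValue()) {
                    result.add(e.getKey());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Boolean> state = Map.of("t1", true, "t2", false);
        System.out.println(tablesToFlush(true, true, state));  // both tables
        System.out.println(tablesToFlush(false, true, state)); // only t1
    }
}
```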
Left some comments, otherwise LGTM
```diff
@@ -276,7 +276,7 @@ public void testDropChannel() throws Exception {
   @Test
   public void testParameterOverrides() throws Exception {
     Map<String, Object> parameterMap = new HashMap<>();
-    parameterMap.put(ParameterProvider.MAX_CLIENT_LAG, "3 sec");
```
Why was this changed?
The `ParameterProvider` does not support "sec" (ref). The old code somehow ignored the exception thrown in the thread.
Thanks for solving this issue :D #570
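For illustration, the override with an accepted unit spelling might look like the following. This assumes `"seconds"` is an accepted spelling (per this discussion); check `ParameterProvider` for the authoritative list of units:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the parameter override in question; the unit spelling matters
// because the lag parser rejects abbreviations like "sec".
public class MaxClientLagOverride {
    public static void main(String[] args) {
        Map<String, Object> parameterMap = new HashMap<>();
        parameterMap.put("MAX_CLIENT_LAG", "3 seconds"); // accepted unit spelling (assumed)
        // parameterMap.put("MAX_CLIENT_LAG", "3 sec");  // rejected: "sec" is not a known unit
        System.out.println(parameterMap.get("MAX_CLIENT_LAG"));
    }
}
```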
The current SDK behavior flushes all channels simultaneously when any buffer reaches its limit, potentially causing unnecessary small-file flushes when interleaving is disabled (`MAX_CHUNKS_IN_BLOB_AND_REGISTRATION_REQUEST = 1`) and ingestion throughput is uneven across tables. The WIP streaming-to-Iceberg-table feature sets `MAX_CHUNKS_IN_BLOB_AND_REGISTRATION_REQUEST = 1`. This PR introduces per-table flushing to avoid the above issue.

JIRA