-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SNOW-1061848: DirectTopicPartitionChannel - copy of the old implementation with removed buffer #852
SNOW-1061848: DirectTopicPartitionChannel - copy of the old implementation with removed buffer #852
Conversation
2615388
to
a4ce755
Compare
* @return InsertValidationResponse a response that wraps around InsertValidationResponse | ||
*/ | ||
private InsertValidationResponse insertRowsWithFallback(Map<String, Object> transformedRecord, long offset) { | ||
Fallback<Object> reopenChannelFallbackExecutorForInsertRows = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i know this code was here initially, but it does obscure the view a lot. what do you think - could we perhaps extract the failsafe handling outside, to some other class, make it sort of an aspect applied on top of the channel, so from channel perspective- we simply insert rows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this code is functional, its not readable or easily debuggable. Probably out of scope for this pr, but yes i support either splitting the fallback logic into a different class or using a different implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think it does the job, however, the amount of state managed by the channel is huge.
shall we consider refactoring it into smaller pieces?
i think for scenarios like that, building something like a flux stream could significantly reduce the complexity.. but that's something we could consider later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job!
Apply ./format.sh
, please.
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
*/ | ||
private InsertRowsResponse insertRowsWithFallback(StreamingBuffer buffer) { | ||
private InsertValidationResponse insertRowsWithFallback(Map<String, Object> transformedRecord, long offset) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: insertRowWithFallback.
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks for the change, i like how you moved the schematization logic. left comments about logging and other small changes
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
return; | ||
} | ||
// Accept the incoming record only if we don't have a valid offset token at server side, or the | ||
// incoming record offset is 1 + the processed offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace 1+ with greater in the comment here to match behavior
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
* @return InsertValidationResponse a response that wraps around InsertValidationResponse | ||
*/ | ||
private InsertValidationResponse insertRowsWithFallback(Map<String, Object> transformedRecord, long offset) { | ||
Fallback<Object> reopenChannelFallbackExecutorForInsertRows = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this code is functional, its not readable or easily debuggable. Probably out of scope for this pr, but yes i support either splitting the fallback logic into a different class or using a different implementation
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
} | ||
|
||
@Parameterized.Parameters(name = "{0}") | ||
public static Collection<Object[]> input() { | ||
return Arrays.asList(new Object[][] {{true}, {false}}); | ||
return Arrays.asList(new Object[][] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets name the variables similar to this test for better readability in intellij
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
src/main/java/com/snowflake/kafka/connector/internal/streaming/DirectTopicPartitionChannel.java
Outdated
Show resolved
Hide resolved
* being invoked) | ||
* @return Approximate long size of record in bytes. 0 if record is broken | ||
*/ | ||
protected long getApproxSizeOfRecordInBytes(SinkRecord kafkaSinkRecord) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh nice were removing this! we actually make this approximation twice per record, so hopefully this will speed up ingestion to not make this pass through
cc2b6f9
to
0ae0d55
Compare
…ation with removed buffer (snowflakedb#852)
Overview
SNOW-1061848
This is my second attempt to remove the buffering from a quite tightly convoluted piece of code in the
BufferredTopicPartitionChannel
class.To reduce the extent of changes, and make it easily comparable to the old implementation, the PR is prepared in 3 steps (1 step is 1 commit).
The current implementation is validated with:
TopicPartitionChannelTest
- all greenTopicPartitionChannelIT
-however it still requires manual change in the code to follow a single buffer path.This change is safe to merge as we still do not allow for using a single buffer. As the next PRs I'd like to gradually make other tests passing.
Pre-review checklist
snowflake.ingestion.method
.Yes
- Added end to end and Unit Tests.No
- Suggest why it is not param protected