[FLINK-25509][connector/kafka] Add RecordEvaluator to dynamically stop source based on de-serialized records #76
Conversation
@ruanhang1993 Can you rebase?
Force-pushed from 030d0ad to 79573fe
@ruanhang1993 If you want to add support for this feature, I think you need to propose dropping support for Flink 1.18 in this version.
Thanks, @MartijnVisser. This PR relies on the API changes in FLIP-208, which were released in 1.18.
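For reference, the FLIP-208 interface this change builds on looks roughly like the sketch below. The package and the exact declaration are an assumption on my part; the authoritative definition is in flink-connector-base as of Flink 1.18.

    // Paraphrased sketch of FLIP-208's RecordEvaluator (assumed to live in
    // org.apache.flink.connector.base.source.reader); check the Flink 1.18 API
    // for the authoritative definition.
    public interface RecordEvaluator<T> extends java.io.Serializable {
        // Returns true when the given de-serialized record marks the end of the
        // stream, after which the source stops reading the split that produced it.
        boolean isEndOfStream(T record);
    }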
@ruanhang1993 I think that means that in this PR, you'll need to remove support for 1.17. Since this feature requires 1.18, it would mean that
Hi, @MartijnVisser. You are right.
I don't think we need 3.1 to support both Flink 1.17 and Flink 1.18. We can use 3.0 for supporting 1.17, but we need to continue supporting patch fixes on it.
     * @param eofRecordEvaluator a {@link RecordEvaluator recordEvaluator}
     * @return this KafkaSourceBuilder.
     */
    public KafkaSourceBuilder<OUT> setEofRecordEvaluator(RecordEvaluator<OUT> eofRecordEvaluator) {
When this condition is met, do we expect the Flink job to go into FINISHED state? If so, I think we need to only support this in bounded mode, as unbounded mode can't reach FINISHED state. This is because the KafkaSource and DynamicKafkaSource only send "notifyNoMoreSplits" for the bounded case.
Thanks for this comment, @mas-chen .
Yes, we should make the source FINISHED. I will fix this part later.
Hi, @mas-chen.
I read some parts about the stoppingOffsetsInitializer. Actually the eofRecordEvaluator does the same job as the stoppingOffsetsInitializer: we want to use both the eofRecordEvaluator and the stoppingOffsetsInitializer to decide when to stop reading, and to support using them together in one source.
But I found the method KafkaSourceBuilder<OUT> setUnbounded(OffsetsInitializer stoppingOffsetsInitializer) in the builder, which supports setting a stoppingOffsetsInitializer on an unbounded source. I want to keep the same behavior as the stoppingOffsetsInitializer and not add this limitation for the eofRecordEvaluator. What do you think?
Looking forward to your opinion.
I see, then we should be fine. The job won't terminate (and I guess this is desired in some situations; I just learned this), but the source will terminate eventually.
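To make the intended usage concrete, here is a minimal sketch that combines the existing setUnbounded stopping offsets with the setEofRecordEvaluator method proposed in this PR. The broker address, topic, group id, and the "END" sentinel are illustrative only, and the builder method itself is still a proposal.

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

    // Sketch: an unbounded KafkaSource that stops reading each split either at the
    // stopping offsets known at startup or once a sentinel record is de-serialized,
    // whichever comes first. The job itself keeps running; only the source finishes.
    KafkaSource<String> source =
            KafkaSource.<String>builder()
                    .setBootstrapServers("broker:9092")                 // illustrative
                    .setTopics("input-topic")                           // illustrative
                    .setGroupId("demo-group")                           // illustrative
                    .setStartingOffsets(OffsetsInitializer.earliest())
                    // Existing mechanism: stay unbounded but stop at these offsets.
                    .setUnbounded(OffsetsInitializer.latest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    // Proposed mechanism: stop once a record satisfies the evaluator.
                    .setEofRecordEvaluator(record -> "END".equals(record))
                    .build();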
Indeed, we don't need it. We just need to decide which branch/version we'll need to submit backports to. I'm inclined to do the following:
@mas-chen @ruanhang1993 I've brought this discussion (and my proposal) back to the Dev mailing list, see https://lists.apache.org/thread/0hp782dv6kjcv3ypthhfhyp7kglg83d0
@ruanhang1993 Do you want to update your PR, and drop support for testing against 1.17 and 1.18 as a separate commit? That should unblock this PR.
…er records in kafka connector
Force-pushed from b47e7d6 to c7faaca
Add RecordEvaluator to dynamically stop source based on de-serialized records.