Search before asking
I searched in the issues and found nothing similar.
Motivation
Optimize the logic of org.apache.paimon.flink.sink.cdc.UpdatedDataFieldsProcessFunctionBase#extractSchemaChanges: check first whether updatedDataFields is empty, so that the latest schema information is not fetched on every invocation.
Solution
Check first whether updatedDataFields is empty, so that the latest schema information is not fetched on every invocation.
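A minimal sketch of the proposed early-exit check, assuming the method shape described above; the surrounding class and the compareAgainstLatest helper are placeholders for illustration, not the actual Paimon implementation:

```java
// Sketch only: illustrates checking the cheap condition (empty input) before
// paying the cost of reading the latest schema.
import java.util.Collections;
import java.util.List;

import org.apache.paimon.schema.SchemaChange;
import org.apache.paimon.schema.SchemaManager;
import org.apache.paimon.types.DataField;

public abstract class ExtractSchemaChangesSketch {

    protected List<SchemaChange> extractSchemaChanges(
            SchemaManager schemaManager, List<DataField> updatedDataFields) {
        // Return early when there is nothing to compare, so the latest schema
        // (which may require a file-system read) is never touched for empty input.
        if (updatedDataFields == null || updatedDataFields.isEmpty()) {
            return Collections.emptyList();
        }
        // Only now load the latest schema and compute the actual schema changes.
        return compareAgainstLatest(schemaManager, updatedDataFields);
    }

    // Placeholder for the existing comparison logic in the real class.
    protected abstract List<SchemaChange> compareAgainstLatest(
            SchemaManager schemaManager, List<DataField> updatedDataFields);
}
```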
Anything else?
No response
Are you willing to submit a PR?
I'm willing to submit a PR!
Background
Here, each parallel subtask maintains its own set of field information. If different subtasks process the same field one after another, the schema-change check is triggered repeatedly, and the latest schema information is fetched frequently. This reduces the overall throughput of the job and can lead to follow-up problems such as checkpoint failures.
Solution
Maintain a cache of the latest schema information to avoid reading it from the file system on every check.
For example, suppose the Paimon table has 1500 fields, the parallelism of the write operator is 500, and the job is restarted. In the extreme case this triggers 1500 × 500 = 750,000 reads of the latest schema information. If each read takes 30 ms, the total time is 750,000 × 30 ms = 22,500 s ≈ 6.25 h, which severely limits the throughput of the job.
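A minimal sketch of the caching idea described above, assuming a simple per-operator TTL cache; the class below and its refresh policy are illustrative assumptions, not Paimon's actual implementation, while SchemaManager#latest() and TableSchema#logicalRowType() are existing Paimon APIs:

```java
// Sketch only: cache the latest row type so repeated schema-change checks
// do not hit the file system every time.
import java.util.Optional;

import org.apache.paimon.schema.SchemaManager;
import org.apache.paimon.schema.TableSchema;
import org.apache.paimon.types.RowType;

public class LatestSchemaCache {

    private final SchemaManager schemaManager;
    private final long ttlMillis;      // how long a cached schema stays valid
    private RowType cachedRowType;     // last schema read from storage
    private long cachedAtMillis;       // when it was read

    public LatestSchemaCache(SchemaManager schemaManager, long ttlMillis) {
        this.schemaManager = schemaManager;
        this.ttlMillis = ttlMillis;
    }

    /** Returns the cached latest row type, re-reading from storage only after the TTL expires. */
    public RowType latestRowType() {
        long now = System.currentTimeMillis();
        if (cachedRowType == null || now - cachedAtMillis > ttlMillis) {
            Optional<TableSchema> latest = schemaManager.latest();
            if (latest.isPresent()) {
                cachedRowType = latest.get().logicalRowType();
                cachedAtMillis = now;
            }
        }
        return cachedRowType;
    }

    /** Drops the cached schema, e.g. after this operator itself applies a schema change. */
    public void invalidate() {
        cachedRowType = null;
    }
}
```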