-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] improve sql digest for massive compound predicates #53207
[Enhancement] improve sql digest for massive compound predicates #53207
Conversation
exprs.add((SlotRef) x); | ||
} | ||
}); | ||
return "$massive_compounds[" + exprs.stream().map(SlotRef::toSqlImpl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SlotRefs are query-specific so unstable, suggest to use original SlotRefs(slotref references to table column) and convert original slotrefs into table columns then sort these columns lexicographically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea.
if (pair.first + pair.second >= MASSIVE_COMPOUND_LIMIT) { | ||
// Only record de-duplicated slots if there are too many compounds | ||
Set<SlotRef> exprs = Sets.newHashSet(); | ||
traverse(node, x -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use node.collectAllSlotRefs()
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I should use it. but it seems I should improve the performance of that method first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it constructs a lot of intermediate list,, i'm afraid the performance can be terrible
@Override | ||
public String visitCompoundPredicate(CompoundPredicate node, Void context) { | ||
Pair<Integer, Integer> pair = countCompound(node); | ||
if (pair.first + pair.second >= MASSIVE_COMPOUND_LIMIT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about Expr#flattenPredicate;
14527f8
to
7dbd265
Compare
Signed-off-by: Murphy <[email protected]>
Signed-off-by: Murphy <[email protected]>
7dbd265
to
8693c26
Compare
Quality Gate passedIssues Measures |
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[FE Incremental Coverage Report]✅ pass : 53 / 55 (96.36%) file detail
|
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
@Mergifyio backport branch-3.4 |
@Mergifyio backport branch-3.3 |
@Mergifyio backport branch-3.2 |
✅ Backports have been created
|
✅ Backports have been created
|
✅ Backports have been created
|
) Signed-off-by: Murphy <[email protected]> (cherry picked from commit c411ac5)
) Signed-off-by: Murphy <[email protected]> (cherry picked from commit c411ac5) # Conflicts: # fe/fe-core/src/main/java/com/starrocks/analysis/Expr.java # fe/fe-core/src/main/java/com/starrocks/sql/optimizer/transformer/RelationTransformer.java
) Signed-off-by: Murphy <[email protected]> (cherry picked from commit c411ac5) # Conflicts: # fe/fe-core/src/main/java/com/starrocks/analysis/Expr.java # fe/fe-core/src/main/java/com/starrocks/sql/analyzer/AnalyzerUtils.java # fe/fe-core/src/main/java/com/starrocks/sql/common/SqlDigestBuilder.java # fe/fe-core/src/main/java/com/starrocks/sql/optimizer/transformer/RelationTransformer.java # fe/fe-core/src/test/java/com/starrocks/sql/common/SqlDigestBuilderTest.java
Why I'm doing:
What I'm doing:
In cases where an OR predicate is dynamically constructed with a fixed column, the SQL digest varies due to a differing number of predicates. To address this, we consolidate extensive compound predicates into a compact format, ensuring consistent SQL digests.
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: