-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add RightMark Join #13252
base: main
Are you sure you want to change the base?
feat: add RightMark Join #13252
Conversation
@eejbyfeldt Just leaving it as a draft for now, if you have any pointers feel free to add on. |
@@ -136,6 +136,9 @@ fn swap_join_type(join_type: JoinType) -> JoinType { | |||
JoinType::LeftMark => { | |||
unreachable!("LeftMark join type does not support swapping") | |||
} | |||
JoinType::RightMark => { | |||
unreachable!("RightMark join type does not support swapping") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can support it now, I think this should be relatively easy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, supporting swap is partly why we are adding RightMark. So would be good to have that fixed in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some minor comments, otherwise it looks good so far to me.
let probe_indices = (0..prune_length) | ||
.map(R::Native::from_usize) | ||
.collect::<PrimitiveArray<R>>(); | ||
let build_indices = (0..prune_length) | ||
.map(|idx| { | ||
// For mark join we output a dummy index 0 to indicate the row had a match | ||
if visited_rows.contains(&(idx + deleted_offset)) { | ||
Some(L::Native::from_usize(0).unwrap()) | ||
} else { | ||
None | ||
} | ||
}) | ||
.collect(); | ||
(build_indices, probe_indices) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very similar to what we do or the LeftMark
could we move this code into a function and call it for both?
let left_field = once(( | ||
Field::new("mark", arrow_schema::DataType::Boolean, false), | ||
ColumnIndex { | ||
index: 0, // 'mark' is not associated with either side | ||
side: JoinSide::None, | ||
}, | ||
)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could also be moved to a function that is used for both LeftMark
and RightMark
.
@@ -3911,6 +3913,7 @@ impl<'de> serde::Deserialize<'de> for JoinType { | |||
"RIGHTSEMI" => Ok(JoinType::Rightsemi), | |||
"RIGHTANTI" => Ok(JoinType::Rightanti), | |||
"LEFTMARK" => Ok(JoinType::Leftmark), | |||
"RIGHTMARK" => Ok(JoinTYpe::Rightmark), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is causing CI failures. I belive this file should be generated by running: https://github.com/apache/datafusion/blob/main/datafusion/proto-common/regen.sh
"RIGHTMARK" => Ok(JoinTYpe::Rightmark), | |
"RIGHTMARK" => Ok(JoinType::Rightmark), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this included in the docs?
@@ -136,6 +136,9 @@ fn swap_join_type(join_type: JoinType) -> JoinType { | |||
JoinType::LeftMark => { | |||
unreachable!("LeftMark join type does not support swapping") | |||
} | |||
JoinType::RightMark => { | |||
unreachable!("RightMark join type does not support swapping") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, supporting swap is partly why we are adding RightMark. So would be good to have that fixed in this PR.
Co-authored-by: Emil Ejbyfeldt <[email protected]>
Co-authored-by: Emil Ejbyfeldt <[email protected]>
@eejbyfeldt I implemented the swapping, would be nice to see if I did that correctly. I made a change to
Instead of this:
|
Which issue does this PR close?
Closes #13138 .
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?