Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Side aliases in JOIN command should be optional #857

Closed
LantaoJin opened this issue Oct 31, 2024 · 0 comments · Fixed by #862
Closed

[BUG] Side aliases in JOIN command should be optional #857

LantaoJin opened this issue Oct 31, 2024 · 0 comments · Fixed by #862
Assignees
Labels
0.6 bug Something isn't working Lang:PPL Pipe Processing Language support

Comments

@LantaoJin
Copy link
Member

LantaoJin commented Oct 31, 2024

What is the bug?
Currently, the syntax of JOIN command requires side aliases to be present as following

leftAlias

  • Syntax: left = <leftAlias>
  • Required
  • Description: The subquery alias to use with the left join side, to avoid ambiguous naming.

rightAlias

  • Syntax: right = <rightAlias>
  • Required
  • Description: The subquery alias to use with the right join side, to avoid ambiguous naming.

But these required side aliases could cause a bug which hide the attribute references in table.
Imaging a case as following: Assume table1, table2, and table3 all contain a column id.

select
  *
from
  table1 t1,
  table2 t2,
  table3 t3
where
 t1.id = t2.id
 and t1.id = t3.id

The logical plan looks

'Project [*]
+- 'Filter (('t1.id = 't2.id) AND ('t1.id = 't3.id))
   +- 'Join Inner
      :- 'Join Inner
      :  :- 'SubqueryAlias t1
      :  :  +- 'UnresolvedRelation [table1], [], false
      :  +- 'SubqueryAlias t2
      :     +- 'UnresolvedRelation [table2], [], false
      +- 'SubqueryAlias t3
         +- 'UnresolvedRelation [table3], [], false

To rewrite above SQL query to PPL query, we will get

source = table1
| join left = t1 right = t2 ON t1.id = t2.id table2
| join left = l1 right = t3 ON t1.id = t3.id table3 // <------ issue here! 

The PPL query throws an exception with message:

t1.id cannot be resolved, Did you mean one of the following? [l1.id, l1.id, t3.id].

It because the new left alias l1 overrides the table alias t1 and t2.
Its logical plan looks

'Project [*]
+- 'Filter (('t1.id = 't2.id) AND ('t1.id = 't3.id)). <------ t1.id cannot be resolved
   +- 'Join Inner
      :- 'SubqueryAlias l1  <------ issue root cause
      :  +- 'Join Inner
      :    :- 'SubqueryAlias t1
      :    :  +- 'UnresolvedRelation [table1], [], false
      :    +- 'SubqueryAlias t2
      :       +- 'UnresolvedRelation [table2], [], false
      +- 'SubqueryAlias t3
         +- 'UnresolvedRelation [table3], [], false
@LantaoJin LantaoJin added bug Something isn't working untriaged and removed untriaged labels Oct 31, 2024
@LantaoJin LantaoJin self-assigned this Oct 31, 2024
@YANG-DB YANG-DB added Lang:PPL Pipe Processing Language support 0.6 labels Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.6 bug Something isn't working Lang:PPL Pipe Processing Language support
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants