cost-model: group-by column ref property + cost model design? #202

Open
skyzh opened this issue Oct 29, 2024 · 2 comments

Comments

@skyzh
Member

skyzh commented Oct 29, 2024

Currently, the logical property derived for a group-by aggregation looks like this:

select v1 from t1 group by v1;

Agg group=v1 <- schema=[v1], column_ref=[v1]
  Scan t1

But group by can change the distribution of the column (each v1 value appears exactly once in the aggregation output), so probably we should set it to derived, or find a way to represent it. If a later join refers to this column, we should treat it differently.
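To make the distribution change concrete, here is a minimal Python sketch. The table contents and the helper names `group_by` and `join_size` are made up for illustration and are not taken from the codebase; the point is only that a join on v1 sees very different input frequencies before and after the aggregation.

```python
from collections import Counter


def group_by(values):
    """Output column of `select v from t group by v`: one row per distinct value."""
    return sorted(set(values))


def join_size(left, right):
    """Exact row count of an equi-join between two single-column inputs."""
    left_freq, right_freq = Counter(left), Counter(right)
    # Each left value matches every right row that carries the same value.
    return sum(n * right_freq[v] for v, n in left_freq.items())


# Hypothetical contents of t1.v1 and of a second table's join column.
t1_v1 = [1, 1, 1, 2, 3, 3]
t2_v1 = [1, 3, 3, 4]

agg_v1 = group_by(t1_v1)  # every value now appears exactly once

rows_from_scan = join_size(t1_v1, t2_v1)  # join directly on the base column
rows_from_agg = join_size(agg_v1, t2_v1)  # join on the group-by output
```

With this data the join on the raw column produces 7 rows and the join on the aggregated column produces 3, even though both inputs would carry the same `column_ref=[v1]` today.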

@jurplel
Member

jurplel commented Oct 30, 2024

so probably we should set it to derived

What do you mean? It looks like it's just storing the group-by column. I'm not sure I follow where the distribution of a column is stored here.

@skyzh
Member Author

skyzh commented Oct 30, 2024

I think it's probably better to store it as Distinct(v1) in the column ref logical property, so that the cost model can take that information into account.
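A rough sketch of what that could look like, in Python for brevity. All names here (`BaseColumn`, `Distinct`, `derive_agg_column_refs`, `max_frequency`) and the statistics dictionary are hypothetical and not the project's actual types. The sketch also assumes that only a single-column group-by makes that column unique on its own, so multi-column keys are left unwrapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseColumn:
    """Column passed through unchanged from a base table."""
    table: str
    column: str


@dataclass(frozen=True)
class Distinct:
    """Column whose values are unique because it is the sole group-by key."""
    inner: BaseColumn


def derive_agg_column_refs(group_by_refs):
    """Column refs for an Agg node's group-by output columns."""
    if len(group_by_refs) == 1:
        # A single group-by key is unique in the output.
        return [Distinct(group_by_refs[0])]
    # With several keys only the combination is unique, so stay conservative.
    return list(group_by_refs)


def max_frequency(ref, base_max_freq):
    """Upper bound on how often one value appears in the column."""
    if isinstance(ref, Distinct):
        return 1
    return base_max_freq[(ref.table, ref.column)]


scan_refs = [BaseColumn("t1", "v1")]
agg_refs = derive_agg_column_refs(scan_refs)

# Hypothetical statistic: the most common t1.v1 value appears 3 times.
stats = {("t1", "v1"): 3}
scan_bound = max_frequency(scan_refs[0], stats)
agg_bound = max_frequency(agg_refs[0], stats)
```

A cost model that sees `Distinct` can bound the per-value frequency at 1 when estimating a later join, instead of reusing the base table's statistics for v1.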
