Add support for ordered-set aggregation functions (`WITHIN GROUP`) #576

shane-circuithub · 2023-10-05T15:54:08Z

This commit adds support for ordered-set aggregation functions such as `mode()`, which use the `WITHIN GROUP (ORDER BY _)` syntax. However, it doesn't expose any public-facing API for this, because I don't yet know what that API should be for Opaleye. There are a lot of subtle restrictions on the kinds of arguments you can pass to ordered-set aggregation functions (e.g., `percentile_disc()` can take either a constant expression or one of the columns from the query we're aggregating over, but only if it's part of the `GROUP BY` clause) that would be hard to capture in the types that Opaleye currently uses.

What this commit does is add just enough internals for me to be able to experiment in Rel8 with a limited API for ordered-set aggregation functions, one that might be subject to change in the future. But I don't want to add something half-baked to Opaleye's public-facing API just yet.
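For readers unfamiliar with the syntax, here is a minimal sketch of the SQL shape being targeted. It is not Opaleye's internal representation: `Expr`, `OrderedSetAgg`, `renderAgg` and the `price` column are hypothetical names invented for illustration. It only shows how the direct arguments and the `WITHIN GROUP (ORDER BY _)` clause fit together.

```haskell
import Data.List (intercalate)

-- Hypothetical mini-AST for illustration; these are NOT Opaleye's types.
data Expr
  = Column String   -- a column reference, rendered with double quotes
  | Literal String  -- an already-rendered constant such as "0.5"
  deriving (Eq, Show)

data Direction = Asc | Desc
  deriving (Eq, Show)

-- An ordered-set aggregate has direct arguments (possibly none) and an
-- ordering that is written inside WITHIN GROUP rather than in the
-- argument list.
data OrderedSetAgg = OrderedSetAgg
  { aggName    :: String
  , directArgs :: [Expr]
  , orderBy    :: [(Expr, Direction)]
  }

renderExpr :: Expr -> String
renderExpr (Column c)  = "\"" ++ c ++ "\""
renderExpr (Literal l) = l

renderOrder :: (Expr, Direction) -> String
renderOrder (e, Asc)  = renderExpr e ++ " ASC"
renderOrder (e, Desc) = renderExpr e ++ " DESC"

-- Render name(direct args) WITHIN GROUP (ORDER BY ordering).
renderAgg :: OrderedSetAgg -> String
renderAgg (OrderedSetAgg name args order) =
  name ++ "(" ++ intercalate ", " (map renderExpr args) ++ ")"
    ++ " WITHIN GROUP (ORDER BY "
    ++ intercalate ", " (map renderOrder order)
    ++ ")"

-- mode() takes no direct arguments.
modeOfPrice :: String
modeOfPrice = renderAgg (OrderedSetAgg "mode" [] [(Column "price", Asc)])

-- percentile_disc() takes the fraction as a direct argument.
medianOfPrice :: String
medianOfPrice =
  renderAgg (OrderedSetAgg "percentile_disc" [Literal "0.5"] [(Column "price", Asc)])
```

Loaded in GHCi, `modeOfPrice` evaluates to `mode() WITHIN GROUP (ORDER BY "price" ASC)`. The sketch deliberately ignores the argument restrictions described above, which are the hard part of designing a typed API.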



shane-circuithub · 2023-10-05T16:11:46Z

This is based on the refactoring in #575, but no longer depends on the fixes proposed in #577 or #578.

tomjaguarpaw · 2023-10-05T18:58:52Z

Yup, all good. Likewise split up a bit and merged separately.

Due to the splitting tool all the commits get attributed to me. Sorry about that. I try to credit you on all the commits, e.g. 1367e70. If you feel strongly about this please let me know and I'll see what I can do to fix the tool.

shane-circuithub · 2023-10-06T00:28:15Z

No worries at all, I don't care about the attribution, just happy to have this merged so I can play around with it! Thanks for being so responsive in the last week with all the PRs.

tomjaguarpaw · 2023-10-06T06:43:48Z

You're welcome. Thanks for the great PRs!

It's redundant because we only need to rebind PrimExprs that came from a lateral subquery. As explained at [1], rather than carefully analysing which PrimExprs came from a lateral subquery we can just rebind everything.

A previous commit [2] changed things so that everything mentioned in aggregator order expressions is rebound. Instead we probably should have done what this commit does, that is, added an Unpackspec constraint to aggregate.

Fixes #587

This is a simpler approach to resolving the issues discussed at

* #585
* #578

This still suffers from the problem described at #578 (comment), i.e. if we find a way of duplicating field names, such as `O.asc (\x -> snd x O..++ snd x)`, then we can still create crashing queries. The benefit of this commit is that there is a way of generating non-crashing queries!

[1] https://github.com/tomjaguarpaw/haskell-opaleye/blob/52a4063188dd617ff91050dc0f2e27fc0570633c/src/Opaleye/Internal/Aggregate.hs#L111-L114

[2] d848317, part of #576

shane-circuithub added 4 commits October 5, 2023 17:08

Refactor aggregates to internally take a *list* of PrimExprs

41a69ad

Make OrderExpr an alias for a polymorphic Traversable OrderExpr'

83e0577

Switch aggrOrder to use OrderExpr' a instead of OrderExpr

11e3693

This means that expressions contained in `ORDER BY` clauses of aggregation functions are now also renamed by `extractAggregateFields`.
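To illustrate why a polymorphic, `Traversable` order expression makes this renaming easy, here is a rough sketch. It is not the actual `extractAggregateFields` code: the `OrderExpr'` shape, `Expr`, `rename` and the `orderN` naming scheme below are all assumptions made for the example. The idea is that one generic traversal can swap every contained expression for a fresh name and collect the bindings.

```haskell
{-# LANGUAGE DeriveTraversable #-}

import Data.Functor.Compose (Compose (..))
import Data.Traversable (mapAccumL)

data Direction = Asc | Desc
  deriving (Eq, Show)

-- Hypothetical stand-in for a polymorphic order expression: the payload
-- type is a parameter, so Functor/Foldable/Traversable can be derived.
data OrderExpr' a = OrderExpr' Direction a
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- Hypothetical stand-in for a primitive SQL expression.
newtype Expr = Expr String
  deriving (Eq, Show)

-- The monomorphic version becomes a plain alias.
type OrderExpr = OrderExpr' Expr

-- Replace every expression in any Traversable container with a fresh
-- name, returning the (name, expression) bindings in traversal order.
rename :: Traversable t => t Expr -> ([(String, Expr)], t String)
rename t = (reverse bindings, renamed)
  where
    ((_, bindings), renamed) = mapAccumL step (0 :: Int, []) t
    step (n, acc) e =
      let name = "order" ++ show n
      in ((n + 1, (name, e) : acc), name)

-- A list of order expressions is itself Traversable via Compose, so the
-- same function covers a whole ORDER BY clause.
renameOrders :: [OrderExpr] -> ([(String, Expr)], [OrderExpr' String])
renameOrders orders = getCompose <$> rename (Compose orders)

example :: ([(String, Expr)], [OrderExpr' String])
example = renameOrders [OrderExpr' Asc (Expr "a + b"), OrderExpr' Desc (Expr "c")]
```

Here `example` pairs the bindings `order0` and `order1` with the original expressions and leaves only the names inside the order expressions, with the sort directions untouched.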

shane-circuithub force-pushed the within-group branch from 1eea642 to 7f8b018 · October 5, 2023 16:09

tomjaguarpaw closed this Oct 5, 2023

shane-circuithub mentioned this pull request Jan 9, 2024

Rewrite only references in aggregation, not all values #585

Closed
