-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Relational Algebra example. #395
base: safe-names-dev
Are you sure you want to change the base?
Conversation
Yeah, the current syntax doesn't support record flattening. I think it's actually a more fundamental issue than that, too, because the concatenation of two records can't be unified with a concrete record type in a unique way (since the fields could come from either (I think @dougalm and I had a conversation a few months ago where we were wondering if there were any practical use cases for record concatenation. Looks like this may be one!) My not-fully-thought-through hypothesis is that record concatenation and record equality are both special cases of making it possible to iterate over the fields of arbitrary records in user-space, which could be accomplished by somehow transforming records into a generic representation. But I'm still not sure how much work that would be; it's something I'm considering playing around with in the future but I don't know when I'll get around to it. Similar to the manual renaming isomorphism, it should be possible right now to manually write out a concatenation for two tables by providing an isomorphism that lists all of the fields from |
Thanks for your thoughts, @danieldjohnson !
I suspected as much, or you would have included an example of it in
Yes, although we should keep in mind that the main use case here,
I also am getting the sense that moving things to a user-manipulable representation is going to be necessary for usability. However, I've been surprised by how much you've managed to be able to do with the current representation, so I'm not sure. Part of the motivation for this example was for me to see how far I could push this isomorphism framework, and in this case I was able to push it a little further than I thought, but it looks like I'm hitting some real limitations.
Right, I found that example in |
roughly following [Wikipedia](https://en.wikipedia.org/wiki/Relational_algebra). | ||
|
||
'Right now there are two major differences from the standard relational algebra: | ||
1. We're not using sets of records, but rather tables. I'm not sure which is better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine. SQL also doesn't have set semantics, because every table entry has a distinct primary key (which corresponds to table's index set in Dex!).
|
||
def cartesian_product (r:n=>a) (s:m=>b) : (n & m)=>(a & b) = | ||
-- Follows the set theory defintion, not the relational | ||
-- algebra definition, which would flatten a & b. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But... it does flatten to a & b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote that poorly, I said it "would flatten a & b
", i.e. the relational algebra definition would flatten a & b
to (a_1, a_2, ... b_1, b_2...)
, but this implementation doesn't do that.
(left_iso : Iso ({&} & a1) (b & c1)) | ||
(right_iso: Iso ({&} & a2) (b & c2)) | ||
(left_tab:n=>a1) (right_tab:m=>a2) : List (b & c1 & c2) = | ||
concat for (i, j). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just write filter : (n=>(Maybe a)) -> List a
. Or maybe call it concatMaybes
to match catMaybes
from Haskell.
In principle Dex's record system could use tables of records to do the sorts of data-munging usually done by packages such as pandas.
In order to match the standard relational algebra, there are a couple of things that I don't think can be expressed in the current type system:
def combine_records (a:{&...l}) (b:{&...r}): ({& ...l ...r}) = todo
but I don't think the current syntax supports this. Any thoughts, @danieldjohnson ?
Another missing feature (as discussed with @apaszke and @dougalm is the ability to create custom index sets. Ideally, instead of tracking primary keys, one would be able to re-index a table of records by one of its columns. Something like:
but I don't know how to write
elementsOfColumn
to create a new index set.