Additions and changes to initial part of Group-by section #115

mars0i · 2023-10-08T21:44:49Z

Added paragraph explaining that most operations apply to sub-datasets, and gave example to illustrate the idea.

Moved up two lines that relate to this point.

Added explanation of effect of grouping parameters on selection of rows for sub-datasets. I couldn't figure out how to describe the map from group names to indexes in the same way, so I made this item last, and gave it a different explanation.

Fixed one minor typo in first line of Group-by section (added "s" to "pack").

Fixed minor typos in line about grouped? meta tag.

mars0i · 2023-10-08T21:50:40Z

The commit message contains the same text twice. Sorry about that. I'm sure there's a way to fix it, but I couldn't figure it out, and it didn't seem worth a lot of time.

genmeblog · 2023-10-08T22:10:08Z

Cool! Thanks. I'll read it tomorrow (midnight here) and let you know if it's ok.

mars0i · 2023-10-09T00:29:54Z

Great. Also, I was thinking of suggesting additions to the group-by docstring, analogous to what I did for TMD's group-by (techascent/tech.ml.dataset#375), but the docstring for TC's group-by would be different, because its behavior is more complex. I thought I'd wait and see what you thought of the index.Rmd changes. I can submit a separate PR for the docstring if that seems worthwhile.

genmeblog · 2023-10-09T07:33:59Z

docs/index.Rmd

+In the case of the first three of these methods, each sub-dataset contains all and only rows from the original data set that share the same grouping value:
+
+* the value of the row in a specified single column
+* the sequence of resulting values for a sequence of column names


In this case the result is a map, not a sequence.

    (tc/group-by DS [:V1])
    ;; => _unnamed [2 3]:
    ;; | :name   | :group-id | :data                 |
    ;; |---------|----------:|-----------------------|
    ;; | {:V1 1} |         0 | Group: {:V1 1} [5 4]: |
    ;; | {:V1 2} |         1 | Group: {:V1 2} [4 4]: |

Ah, I didn't realize that. I'll fix it. (I mixed up that case with an example in which I used a function that returned a sequence, which is of course different.)

genmeblog · 2023-10-09T07:47:18Z

docs/index.Rmd

+* the sequence of resulting values for a sequence of column names
+* the value returned by the function taking row as map
+
+In the case of the map from group names to sequences of indexes, each sub-dataset will contain all and only rows with the indexes in one value sequence of indexes.


This sounds a little bit incomprehensible to me:

[...] with the indexes in one value sequence of indexes

Maybe something like this?

[...] with the indexes listed in the sequence for a given group name (a key).

I see your point. :-) I like your revision and will make the change.

(I had trouble figuring out a way to spell out grouping by maps to values in a way that was brief but didn't make my eyes glaze over unless I kept in mind what I was talking about. I don't want to do that to a reader. I think the current version is OK.)

genmeblog

Great!
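
To make the four grouping inputs discussed in this thread concrete, here is a minimal plain-Clojure sketch of the semantics. It is only a model under stated assumptions, not tablecloth's implementation: the dataset is represented as a vector of row maps, and `rows` and `group-rows` are hypothetical names introduced for illustration.

```clojure
;; Toy "dataset": a vector of row maps. This representation is an
;; assumption for illustration only; real datasets are column-based.
(def rows
  [{:V1 1 :V2 "a"}
   {:V1 2 :V2 "b"}
   {:V1 1 :V2 "c"}
   {:V1 2 :V2 "a"}
   {:V1 1 :V2 "b"}])

(defn group-rows
  "Hypothetical helper modelling the grouping inputs described above.
  Returns a map from group name to a vector of rows."
  [rows grouping]
  (cond
    ;; Map from group names to sequences of indexes: each group contains
    ;; the rows whose indexes are listed for that group name (a key).
    (map? grouping)
    (into {}
          (map (fn [[group-name idxs]]
                 [group-name (mapv #(nth rows %) idxs)]))
          grouping)

    ;; Sequence of column names: the group name is a map of those
    ;; columns to their values, not a sequence of values.
    (sequential? grouping)
    (group-by #(select-keys % grouping) rows)

    ;; Single column name: the group name is the row's value in it.
    (keyword? grouping)
    (group-by #(get % grouping) rows)

    ;; Function taking a row as a map: the group name is its result.
    (fn? grouping)
    (group-by grouping rows)))

(count (get (group-rows rows :V1) 1))
;; => 3

(get (group-rows rows {:first-and-last [0 4]}) :first-and-last)
;; => [{:V1 1, :V2 "a"} {:V1 1, :V2 "b"}]
```

In the first three cases the rows are partitioned by a shared grouping value. In the map case the caller picks the rows directly, so a row may appear in several groups or in none.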

genmeblog · 2023-10-09T17:07:40Z

Do you want to work more on this topic? Or should I merge and deploy?

mars0i · 2023-10-09T17:53:13Z

Thanks @genmeblog. I'm fine with you merging it. (I saw some typos in a keyword name in another part of the doc, but they're completely unrelated, so I think I should submit a different PR for that.)

mars0i · 2023-10-09T18:57:28Z

I decided it was silly to leave the typos unfixed. There were two instances of :ungroup in the text that should be :ungroup?. That's now corrected. It concerns aggregate, but applied to grouped datasets, so it's related to the other parts of this PR. I hope you don't mind--I can roll it back if you want.

genmeblog · 2023-10-09T21:21:39Z

Great, thanks! I don't mind, of course. We can do doc updates in batches. It doesn't matter if they are related to one topic or many. The most important thing is to keep making it better and better.

genmeblog · 2023-10-09T21:22:50Z

docs/index.Rmd

@@ -1680,7 +1704,7 @@ Aggregator is a function or sequence or map of functions which accept dataset as

 Where map is given as an input or result, keys are treated as column names.

-Grouped dataset is ungrouped after aggreation. This can be turned off by setting `:ungroup` to false. In case you want to pass additional ungrouping parameters add them to the options.
+Grouped dataset is ungrouped after aggreation. This can be turned off by setting `:ungroup?` to false. In case you want to pass additional ungrouping parameters add them to the options.


I've found one more typo. aggreation -> aggregation.

Might as well bundle that into this PR, so I made the change.

(I didn't read this text carefully! I tried to experiment with ungrouping with aggregate, and it didn't work with :ungroup; then I noticed :ungroup? in the code example.)

Probably there are more such unfortunate things in the text... :/

Yeah, can't be helped.

(Do you need me to squash the commits at this point?)

genmeblog · 2023-10-10T08:17:43Z

No need to squash.
I will be merging today and releasing a new version (there are some other changes waiting).

genmeblog requested changes Oct 9, 2023

genmeblog approved these changes Oct 9, 2023

Fixed two typos: :ungrouped s/b :ungrouped?

cf789db

genmeblog reviewed Oct 9, 2023

Fixed typo "aggreation" s/b "aggregation".

38b2675

genmeblog approved these changes Oct 10, 2023

genmeblog merged commit 515da95 into scicloj:master Oct 10, 2023
1 check passed
