diff --git a/eeps/eep-0073.md b/eeps/eep-0073.md new file mode 100644 index 0000000..d57646b --- /dev/null +++ b/eeps/eep-0073.md @@ -0,0 +1,212 @@ + Author: Isabell Huang + Status: Draft + Type: Standard Track + Created: 21-Sep-2024 + Erlang-Version: 28 + Post-History: + Replaces: 19 +**** +EEP 73: Zip generator +---- + +Abstract +======== + +This EEP proposes the addition of zip generators with a syntax of `&&` for +comprehensions in Erlang. The idea and syntax of zip generators (comprehension +multigenerators) was first brought up by EEP-19. Even if the syntax and +usages of zip generators proposed by this EEP is mostly the same with EEP-19, +the comprehension language of Erlang has undergone many changes since EEP-19 +was accepted. With an implementation that is compatible with all existing +comprehensions, this EEP defines the behavior of zip generators with more +clarification on the compiler's part. + +Rationale +========= + +List comprehension is a way to create a new list from existing list(s). Lists +are traversed in a dependent (nested) way. In the following example, the +resulting list has length 4 when the two input lists both have length 2. + + [{X,Y} || X <- [1,2], Y <- [3,4]] = [{1,3}, {1,4}, {2,3}, {2,4}]. + +In contrast, parallel list comprehension (also known as zip comprehension) +evaluates qualifiers (a generalization of lists) in parallel. Qualifiers are +first "zipped" together, and then evaluated. Many functional languages +([Haskell][2], [Racket][3], etc.) and non-functional languages (Python etc.) support +this variation. Suppose the two lists in the example above are evaluated as +a zip comprehension, the result would be `[{1,3}, {2,4}]`. + +Zip comprehensions allow the user to conveniently iterate over several lists +at once. Without it, the standard way to accomplish the same task in Erlang +is to use `lists:zip` to zip two lists into two-tuples, or to use `lists:zip3` +to zip three lists into three-tuples. The list module does not provide a +function to zip more than three lists. Functions like `lists:zip` always +create intermediate data structures when compiled. The compiler does not +perform deforestation to eliminate the unwanted tuples. + +Zip generators is a generalization of zip comprehensions. Every set of zipped +generators is treated as one generator. Instead of constraining a comprehension +to be either zipped or non-zipped, any generator can be either a zip generator +(containing at least two generators zipped together), or a non-zip generator +(all existing generators are non-zip generator). Therefore, zip generators +can be mixed freely with all existing generators and filters. Zip comprehension +then becomes a special case of comprehension where only zip generators are +used. + +In summary, zip generator removes the user's need to call the zip function +and allows for any number of lists to be zipped at once. It can be used in +list, binary, and map comprehensions, and mixed freely with all existing +generators and filters. Internally, the compiler does not create any +intermediate data structure, therefore also removing the need of deforestation. + +Specification +======================== + +Currently, Erlang supports three kinds of comprehensions, list comprehension, +binary comprehension, and map comprehension. Their names refer to the result +of the comprehension. List comprehension produces a list; binary comprehension +produces a binary, etc. + + [Expression || Qualifier(s)] %% List Comprehension + <> %% Binary Comprehension + #{Expression || Qualifier(s)} %% Map Comprehension + + +Qualifiers can have the following kind: filter, list generator, bitstring +generator, and map generator. Except for filters, the other three kinds of +qualifiers are generators. Their names refer to the type on the right hand +side of `<-` or `<=`. Generators have the following form: + + Pattern <- List %% List Generator + Pattern <= Bitstring %% Bitstring Generator + Pattern_1 := Pattern_2 <- Map %% Map Generator + +All qualifiers and filters can be freely used and mixed in all 3 kinds of +comprehensions. The following example shows a list comprehension with a +list generator and a bitstring generator. + + [{X,Y} || X <- [1,2,3], <> <= <<4,5,6>>]. + +This EEP proposes the addition of zip generators. A zip generator is two or +more generators connected by `&&`. Zip generators is constructed to connect +any number of the 3 kinds of generators above. Zip generators can be used +in list, binary, or map comprehensions in the same way. + +For example, if the two generators in the above example is combined together +as a zip generator, the comprehension would look like: + + [{X,Y} || X <- [1,2,3] && <> <= <<4,5,6>>]. + +For every zip generator of the form +`G1 && ... && Gn`, it is evaluated to have the same result as `zip/n` where + + zip([H1|T1], ..., [Hn|Tn]) -> + [{H1,...,Hn} | zip(T1, ..., Tn)]; + zip([], ..., []) -> + []. + +Therefore, the above comprehension evaluates to `[{1,4}, {2,5}, {3,6}]`, which +is the same as if using `lists:zip/2`. + +Zip generator can also be used when a comprehension contains other non-zip +generators and/or filters. The `&&` symbol has a higher precedence than `,`. + +The following example evaluates to `[{b,4}, {c,6}]`. The element `{a,2}` is +filtered out from the resulting list. + + [{X, Y} || X <- [a, b, c] && <> <= <<2, 4, 6>>, Y =/= 2]. + +Comparing to using helper functions, there is one advantage of using a zip +generator: The Erlang compiler does not generate any tuple when a zip +generator is translated into core Erlang. The generated code reflects the +programmer's intent, which is to collect one element from every list at a +time without creating a list of tuples. + +Error Behaviors +================ + +One would expect that when errors happen, a zip generator behaves the same +as `lists:zip/2`, `lists:zip3/3`, and also the `zip/n` function above when +more than 3 lists are zipped together. The design and implementation of +zip generators aim to achieve that both for compiled code and for comprehensions +evaluated in Erlang shell. + +Generators of Different Lengths +-------------- + +`lists:zip/2` and `lists:zip3/3` will fail if the given lists are not of the +same length, where `zip/n` will also crash. Therefore, a zip generator raises a +`bad generators` error when it discovers that the given generators are of +different lengths. + +When a zip generator crashes because the containing generators are of +different lengths, the error message is a tuple, where the first element +is the atom `bad_generators`, and the second element is a tuple that contains +the remaining data from all generators. + +For example, this comprehension will crash at runtime. + + [{X,Y} || X <- [1,2,3] && Y <- [1,2,3,4]]. + +The resulting error tuple is `{bad_generators,{[],[4]}}`. This is because +when the comprehension crashes, the first list in the zip generator has +only the empty list `[]` left, while the second list in the zip generator +has `[4]` left. + +On the compiler's side, it is rather difficult to return the original zip +generator in the error message, or to point out which generator is of +different length comparing to others. The proposed error message aims to +gives the most helpful information without imposing extra burden on the +compiler or runtime. + +Non-generator in a Zip Generator +----------------- + +As the idea of zipping only makes sense for generators, a zip generator cannot +contain filters or any expression that is not a generator. Whenever it is +possible to catch such an error at compile-time, this error is caught by +the Erlang linter. + +For example, the zip generator in the following comprehension contains a +non-generator. + + zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0]. + +When the function is compiled, the linter points out that only generators are +allowed in a zip generator, together with the position of the non-generator. + + t.erl:6:55: only generators are allowed in a zip generator. + % 6| zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0]. + % | ^ + + +Backwards Compatibility +======================== + +The operator `&&` is not used in Erlang. No existing code is affected by +this addition. + +Reference Implementation +======================== + +[Git Branch][1] contains the implementation for zip generators. + +[1]: https://github.com/lucioleKi/otp/tree/isabell/compiler/zip-comprehension/OTP-19184 +[2]: https://downloads.haskell.org/~ghc/5.00/docs/set/parallel-list-comprehensions.html +[3]: https://docs.racket-lang.org/reference/for.html + +Copyright +========= + +This document is placed in the public domain or under the CC0-1.0-Universal +license, whichever is more permissive. + +[EmacsVar]: <> "Local Variables:" +[EmacsVar]: <> "mode: indented-text" +[EmacsVar]: <> "indent-tabs-mode: nil" +[EmacsVar]: <> "sentence-end-double-space: t" +[EmacsVar]: <> "fill-column: 70" +[EmacsVar]: <> "coding: utf-8" +[EmacsVar]: <> "End:" +[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "