Skip to content

Commit

Permalink
add draft for eep-73
Browse files Browse the repository at this point in the history
  • Loading branch information
lucioleKi committed Oct 1, 2024
1 parent 44c891a commit 8d40d10
Showing 1 changed file with 212 additions and 0 deletions.
212 changes: 212 additions & 0 deletions eeps/eep-0073.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,212 @@
Author: Isabell Huang <isabell(at)erlang(dot)org>
Status: Draft
Type: Standard Track
Created: 21-Sep-2024
Erlang-Version: 28
Post-History:
Replaces: 19
****
EEP 73: Zip generator
----

Abstract
========

This EEP proposes the addition of zip generators with a syntax of `&&` for

Check failure on line 15 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:15:75 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md
comprehensions in Erlang. The idea and syntax of zip generators (comprehension
multigenerators) was first brought up by EEP-19. Even if the syntax and
usages of zip generators proposed by this EEP is mostly the same with EEP-19,
the comprehension language of Erlang has undergone many changes since EEP-19
was accepted. With an implementation that is compatible with all existing
comprehensions, this EEP defines the behavior of zip generators with more
clarification on the compiler's part.

Rationale
=========

List comprehension is a way to create a new list from existing list(s). Lists
are traversed in a dependent (nested) way. In the following example, the
resulting list has length 4 when the two input lists both have length 2.

[{X,Y} || X <- [1,2], Y <- [3,4]] = [{1,3}, {1,4}, {2,3}, {2,4}].

In contrast, parallel list comprehension (also known as zip comprehension)
evaluates qualifiers (a generalization of lists) in parallel. Qualifiers are
first "zipped" together, and then evaluated. Many functional languages
([Haskell][2], [Racket][3], etc.) and non-functional languages (Python etc.) support
this variation. Suppose the two lists in the example above are evaluated as
a zip comprehension, the result would be `[{1,3}, {2,4}]`.

Zip comprehensions allow the user to conveniently iterate over several lists
at once. Without it, the standard way to accomplish the same task in Erlang
is to use `lists:zip` to zip two lists into two-tuples, or to use `lists:zip3`
to zip three lists into three-tuples. The list module does not provide a
function to zip more than three lists. Functions like `lists:zip` always
create intermediate data structures when compiled. The compiler does not
perform deforestation to eliminate the unwanted tuples.

Zip generators is a generalization of zip comprehensions. Every set of zipped
generators is treated as one generator. Instead of constraining a comprehension
to be either zipped or non-zipped, any generator can be either a zip generator
(containing at least two generators zipped together), or a non-zip generator
(all existing generators are non-zip generator). Therefore, zip generators
can be mixed freely with all existing generators and filters. Zip comprehension
then becomes a special case of comprehension where only zip generators are
used.

In summary, zip generator removes the user's need to call the zip function
and allows for any number of lists to be zipped at once. It can be used in
list, binary, and map comprehensions, and mixed freely with all existing
generators and filters. Internally, the compiler does not create any
intermediate data structure, therefore also removing the need of deforestation.

Specification
========================

Currently, Erlang supports three kinds of comprehensions, list comprehension,
binary comprehension, and map comprehension. Their names refer to the result
of the comprehension. List comprehension produces a list; binary comprehension
produces a binary, etc.

[Expression || Qualifier(s)] %% List Comprehension
<<Expression || Qualifier(s)>> %% Binary Comprehension
#{Expression || Qualifier(s)} %% Map Comprehension


Check failure on line 75 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Multiple consecutive blank lines

eeps/eep-0073.md:75 MD012/no-multiple-blanks Multiple consecutive blank lines [Expected: 1; Actual: 2] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md012.md
Qualifiers can have the following kind: filter, list generator, bitstring
generator, and map generator. Except for filters, the other three kinds of
qualifiers are generators. Their names refer to the type on the right hand
side of `<-` or `<=`. Generators have the following form:

Check failure on line 79 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:79:58 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

Pattern <- List %% List Generator
Pattern <= Bitstring %% Bitstring Generator
Pattern_1 := Pattern_2 <- Map %% Map Generator

All qualifiers and filters can be freely used and mixed in all 3 kinds of
comprehensions. The following example shows a list comprehension with a
list generator and a bitstring generator.

[{X,Y} || X <- [1,2,3], <<Y>> <= <<4,5,6>>].

This EEP proposes the addition of zip generators. A zip generator is two or
more generators connected by `&&`. Zip generators is constructed to connect
any number of the 3 kinds of generators above. Zip generators can be used
in list, binary, or map comprehensions in the same way.

For example, if the two generators in the above example is combined together
as a zip generator, the comprehension would look like:

[{X,Y} || X <- [1,2,3] && <<Y>> <= <<4,5,6>>].

For every zip generator of the form
`G1 && ... && Gn`, it is evaluated to have the same result as `zip/n` where

zip([H1|T1], ..., [Hn|Tn]) ->
[{H1,...,Hn} | zip(T1, ..., Tn)];
zip([], ..., []) ->
[].

Therefore, the above comprehension evaluates to `[{1,4}, {2,5}, {3,6}]`, which
is the same as if using `lists:zip/2`.

Zip generator can also be used when a comprehension contains other non-zip
generators and/or filters. The `&&` symbol has a higher precedence than `,`.

The following example evaluates to `[{b,4}, {c,6}]`. The element `{a,2}` is
filtered out from the resulting list.

[{X, Y} || X <- [a, b, c] && <<Y>> <= <<2, 4, 6>>, Y =/= 2].

Comparing to using helper functions, there is one advantage of using a zip
generator: The Erlang compiler does not generate any tuple when a zip
generator is translated into core Erlang. The generated code reflects the
programmer's intent, which is to collect one element from every list at a
time without creating a list of tuples.

Error Behaviors
================

One would expect that when errors happen, a zip generator behaves the same
as `lists:zip/2`, `lists:zip3/3`, and also the `zip/n` function above when
more than 3 lists are zipped together. The design and implementation of
zip generators aim to achieve that both for compiled code and for comprehensions
evaluated in Erlang shell.

Generators of Different Lengths
--------------

`lists:zip/2` and `lists:zip3/3` will fail if the given lists are not of the
same length, where `zip/n` will also crash. Therefore, a zip generator raises a
`bad generators` error when it discovers that the given generators are of
different lengths.

Check failure on line 141 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:141:19 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

When a zip generator crashes because the containing generators are of
different lengths, the error message is a tuple, where the first element
is the atom `bad_generators`, and the second element is a tuple that contains
the remaining data from all generators.

Check failure on line 146 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:146:40 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

For example, this comprehension will crash at runtime.

Check failure on line 148 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:148:55 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

[{X,Y} || X <- [1,2,3] && Y <- [1,2,3,4]].

The resulting error tuple is `{bad_generators,{[],[4]}}`. This is because
when the comprehension crashes, the first list in the zip generator has
only the empty list `[]` left, while the second list in the zip generator
has `[4]` left.

On the compiler's side, it is rather difficult to return the original zip
generator in the error message, or to point out which generator is of
different length comparing to others. The proposed error message aims to
gives the most helpful information without imposing extra burden on the
compiler or runtime.

Check failure on line 161 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:161:21 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

Non-generator in a Zip Generator
-----------------

As the idea of zipping only makes sense for generators, a zip generator cannot
contain filters or any expression that is not a generator. Whenever it is
possible to catch such an error at compile-time, this error is caught by
the Erlang linter.

For example, the zip generator in the following comprehension contains a
non-generator.

zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0].

When the function is compiled, the linter points out that only generators are
allowed in a zip generator, together with the position of the non-generator.

t.erl:6:55: only generators are allowed in a zip generator.
% 6| zip() -> [{X,Y} || X <- [1,2,3] && Y <- [1,2,3] && X > 0].
% | ^


Check failure on line 183 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Multiple consecutive blank lines

eeps/eep-0073.md:183 MD012/no-multiple-blanks Multiple consecutive blank lines [Expected: 1; Actual: 2] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md012.md
Backwards Compatibility
========================

The operator `&&` is not used in Erlang. No existing code is affected by
this addition.

Reference Implementation
========================

[Git Branch][1] contains the implementation for zip generators.

Check failure on line 193 in eeps/eep-0073.md

View workflow job for this annotation

GitHub Actions / markdownlint

Trailing spaces

eeps/eep-0073.md:193:64 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1] https://github.com/DavidAnson/markdownlint/blob/v0.32.1/doc/md009.md

[1]: https://github.com/lucioleKi/otp/tree/isabell/compiler/zip-comprehension/OTP-19184
[2]: https://downloads.haskell.org/~ghc/5.00/docs/set/parallel-list-comprehensions.html
[3]: https://docs.racket-lang.org/reference/for.html

Copyright
=========

This document is placed in the public domain or under the CC0-1.0-Universal
license, whichever is more permissive.

[EmacsVar]: <> "Local Variables:"
[EmacsVar]: <> "mode: indented-text"
[EmacsVar]: <> "indent-tabs-mode: nil"
[EmacsVar]: <> "sentence-end-double-space: t"
[EmacsVar]: <> "fill-column: 70"
[EmacsVar]: <> "coding: utf-8"
[EmacsVar]: <> "End:"
[VimVar]: <> " vim: set fileencoding=utf-8 expandtab shiftwidth=4 softtabstop=4: "

0 comments on commit 8d40d10

Please sign in to comment.