Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flattening list of lists of dicts leads to surprising results #3

Open
tuturto opened this issue Oct 19, 2017 · 11 comments
Open

Flattening list of lists of dicts leads to surprising results #3

tuturto opened this issue Oct 19, 2017 · 11 comments

Comments

@tuturto
Copy link

tuturto commented Oct 19, 2017

Instead of

=> (flatten [[{:one 1 :two 2}] [{:three 3 :four 4}]])
['\ufdd0:one', '\ufdd0:two', '\ufdd0:three', '\ufdd0:four']

I was expecting

=> (flatten [[{:one 1 :two 2}] [{:three 3 :four 4}]])
[{:one 1 :two 2} {:three 3 :four 4}]
@Kodiologist
Copy link
Member

Currently, flatten uses coll? to decide whether to flatten something. This example constitutes a good argument that coll? isn't the right criterion. But it's not clear what we should use instead. Checking for the existence of __getitem__ is perhaps the most obvious choice, but this wouldn't flatten sets and frozensets, which we probably do want to flatten.

@Kodiologist
Copy link
Member

Kodiologist commented Oct 19, 2017

By the way, if you just want to concatenate one level of iterables in x, without recursive flattening, you can use (reduce + x) or (sum x []).

@Kodiologist
Copy link
Member

To clarify, @tuturto, did you expect [{:one 1 :two 2} {:three 3 :four 4}] because you forgot that flatten was recursive, or because you thought it would leave dictionaries in particular untouched?

@tuturto
Copy link
Author

tuturto commented Oct 19, 2017

The latter one, I expected dictionaries be left as they were. And even if dictionaries were flattened, we shouldn't be discarding half of the information (keeping only keys and discarding values) while doing so.

@gilch
Copy link
Member

gilch commented Oct 19, 2017

Don't use sum like that.

This function is intended specifically for use with numeric values and may
reject non-numeric types.

It's a mere implementation detail that this works at all. A type checker has every right to flag that use, and other Python implementations (even later CPython implementations) might not support it.

@Kodiologist
Copy link
Member

Yes, it's allegedly an implementation detail, but it's an obvious desideratum. I doubt it will ever be removed.

@gilch
Copy link
Member

gilch commented Oct 19, 2017

In Python, dicts are iterables. Their iterator returns the keys. In a for loop, this is more useful, since you can still use the keys to get the values via lookup. If you want the pairs, you have to use .items.

Clojure arguably handles this more consistently.

user=> (seq {1 2, 3 4})
([1 2] [3 4])

The default iterator is like Python's .items.

So how does Clojure flatten maps?

user=> (flatten {1 2, 3 4})
()
user=> (doc flatten)
-------------------------
clojure.core/flatten
([x])
  Takes any nested combination of sequential things (lists, vectors,
  etc.) and returns their contents as a single, flat sequence.
  (flatten nil) returns an empty sequence.
nil
user=> (flatten (seq {1 2, 3 4}))
(1 2 3 4)

It doesn't.

Flatten should support third-party iterables too. How can you treat them consistently if you only make dicts a special case? This is how Python is. How do you want this to work?

@gilch
Copy link
Member

gilch commented Oct 19, 2017

It looks like Clojure keeps maps in one piece when they're not the top level.

user=> (flatten [10 20 [{1 2, 5 6} {3 4}]])
(10 20 {1 2, 5 6} {3 4})

@Kodiologist
Copy link
Member

How do you want this to work?

Right; that's the hard part about choosing the right criterion, I think.

Perhaps, using the types of collections.abc, we could flatten all Iterables that aren't Mappings, ByteStrings, or strings.

@gilch
Copy link
Member

gilch commented Oct 19, 2017

Clojure also keeps sets in one piece.

user=> (flatten [10 #{1 2 3}])
(10 #{1 3 2})

I don't think "iterable" is the right criterion. The appropriate abc is probably Sequence, which is iterable with a specific ordering.

@gilch
Copy link
Member

gilch commented Oct 19, 2017

Except strings or bytestrings.

Is an OrderedDict a sequence?

>>> isinstance(OrderedDict(), abc.Sequence)
False

Python says no.

@Kodiologist Kodiologist transferred this issue from hylang/hy Aug 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants