fix: always check '.length' for 'unknown_length' #3332

Merged (4 commits) on Jan 10, 2025

Conversation

jpivarski
Member

@pfackeldey, is this correct? Here are the matches, post-fix.

% fgrep -r '.length == ' src/
src/awkward/contents/bitmaskedarray.py:        if self._mask.length is not unknown_length and self._mask.length == 0:
src/awkward/contents/bitmaskedarray.py:        if self._mask.length is not unknown_length and self._mask.length == 0:
src/awkward/contents/bitmaskedarray.py:            if self._mask.length is not unknown_length and self._mask.length == excess_length:
src/awkward/contents/listarray.py:        if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/listarray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/listarray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/listarray.py:        if self._starts.length is not unknown_length and self._starts.length == 0:
src/awkward/contents/listarray.py:        if self._starts.length is not unknown_length and self._starts.length == 0:
src/awkward/contents/unmaskedarray.py:            if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/unmaskedarray.py:        if self._content.length is not unknown_length and self._content.length == 0:
src/awkward/contents/unmaskedarray.py:        if self._content.length is not unknown_length and self._content.length == 0:
src/awkward/contents/numpyarray.py:        if self.length is not unknown_length and self.length == 0:
src/awkward/contents/numpyarray.py:                return out.content.length is not unknown_length and out.content.length == self.length
src/awkward/contents/numpyarray.py:                return out.length is not unknown_length and out.length == self.length
src/awkward/contents/content.py:            if out.length is not unknown_length and out.length == 0:
src/awkward/contents/content.py:            if out.length is not unknown_length and out.length == 0:
src/awkward/contents/content.py:                or self.length == other.length
src/awkward/contents/indexedarray.py:        if self._index.length is not unknown_length and self._index.length == 0:
src/awkward/contents/indexedarray.py:        if self._index.length is not unknown_length and self._index.length == 0:
src/awkward/contents/indexedarray.py:            if self._content.length is not unknown_length and self._content.length == 0:
src/awkward/contents/indexedarray.py:                # every masked value is self._content[0], unless self._content.length == 0.
src/awkward/contents/indexedarray.py:        if self._content.length is not unknown_length and self._content.length == 0:
src/awkward/contents/indexedarray.py:            # every masked value is self._content[0], unless self._content.length == 0.
src/awkward/contents/indexedarray.py:        if not nplike.known_data or self._index.length == 0:
src/awkward/contents/unionarray.py:        if simplified.length is not unknown_length and simplified.length == 0:
src/awkward/contents/unionarray.py:        if self.length is not unknown_length and self.length == 0:
src/awkward/contents/unionarray.py:        if simplified.length is not unknown_length and simplified.length == 0:
src/awkward/contents/listoffsetarray.py:            and offsets.length == 0
src/awkward/contents/listoffsetarray.py:        if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/listoffsetarray.py:        if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/listoffsetarray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/listoffsetarray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/listoffsetarray.py:            if inneroffsets.length is not unknown_length and inneroffsets.length == 0:
src/awkward/contents/listoffsetarray.py:                self._offsets.length is not unknown_length and self._offsets.length == 1
src/awkward/contents/listoffsetarray.py:                return out2.length == self.length
src/awkward/contents/listoffsetarray.py:            if self.starts.length is not unknown_length and self.starts.length == 0:
src/awkward/contents/indexedoptionarray.py:        if self._content.length is not unknown_length and self._content.length == 0:
src/awkward/contents/indexedoptionarray.py:            if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/indexedoptionarray.py:        if self._index.length is not unknown_length and self._index.length == 0:
src/awkward/contents/indexedoptionarray.py:        if not nplike.known_data or self._index.length == 0:
src/awkward/contents/bytemaskedarray.py:            if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/bytemaskedarray.py:        if self._mask.length is not unknown_length and self._mask.length == 0:
src/awkward/contents/bytemaskedarray.py:        if self._mask.length is not unknown_length and self._mask.length == 0:
src/awkward/contents/emptyarray.py:        if not carry.nplike.known_data or carry.length == 0:
src/awkward/contents/emptyarray.py:            if not head.nplike.known_data or head.length == 0:
src/awkward/contents/regulararray.py:        if offsets.length is not unknown_length and offsets.length == 0:
src/awkward/contents/regulararray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/regulararray.py:                advanced.length is not unknown_length and advanced.length == 0
src/awkward/contents/regulararray.py:                        or trimmed.length == self._size * outcontent.length
src/awkward/contents/recordarray.py:            or out.length == self._length
src/awkward/contents/recordarray.py:            (x if x.length == length else x[:length])._to_arrow(
src/awkward/_broadcasting.py:        if x.length is not unknown_length and x.length == 0:
src/awkward/_broadcasting.py:        if x.length is not unknown_length and x.length == 0:
src/awkward/record.py:        if self._array.length is not unknown_length and self._array.length == 1:
src/awkward/operations/ak_enforce_type.py:        if layout_regular.length is not unknown_length and layout_regular.length == 0:
src/awkward/operations/ak_firsts.py:        if layout.length is not unknown_length and layout.length == 0:
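
Some of the matches above, such as the content.py line `or self.length == other.length`, have no `is not unknown_length` guard on the same line. A rough way to list them is sketched here in Python. The `unguarded` helper and its regular expression are hypothetical and not part of the repository, and a guard that sits on the preceding line of a multi-line condition would still be flagged, so each hit needs a manual look.

```python
import re

# Hypothetical audit helper (not part of Awkward Array): flags `.length == `
# comparisons that have no `is not unknown_length` guard on the same line.
COMPARISON = re.compile(r"\.length == ")
GUARD = "is not unknown_length"


def unguarded(lines):
    """Return grep-style lines that compare `.length` without a same-line guard."""
    return [
        line
        for line in lines
        if COMPARISON.search(line)
        and GUARD not in line
        # Skip matches that are only source comments ("path:    # ...").
        and not line.split(":", 1)[-1].lstrip().startswith("#")
    ]


# Three lines copied from the fgrep output above.
sample = [
    "src/awkward/contents/listarray.py:        if offsets.length is not unknown_length and offsets.length == 0:",
    "src/awkward/contents/content.py:                or self.length == other.length",
    "src/awkward/contents/indexedarray.py:            # every masked value is self._content[0], unless self._content.length == 0.",
]

flagged = unguarded(sample)
for line in flagged:
    print(line)
```

In practice the input would be the full fgrep output, one match per list element.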

@pfackeldey
Collaborator

pfackeldey commented Dec 6, 2024

These changes look reasonable to me 👍
I'm just confused about why the JAX tests are failing on {ubuntu,macOS} with Python 3.{10,11,12}. This seems to be unrelated to this PR, though...

@jpivarski
Member Author

It's hard to diagnose because of the JAX PyTrees control flow, but at least one of the errors was caused by this change:

if offsets.length is not unknown_length and offsets.length == 0:

(from ak.unflatten). The pattern-match solution apparently doesn't work and we'll have to think carefully about each case. I'll make this a draft.
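
To see why adding the guard mechanically can change which branch runs, here is a minimal Python sketch. It assumes a sentinel whose `==` comparison with an integer is permissive; the `_Unknown` class below is a hypothetical stand-in written for illustration, and whether Awkward's real `unknown_length` compares this way is not established in this thread.

```python
class _Unknown:
    """Hypothetical stand-in for an unknown-length sentinel (not Awkward's class)."""

    def __eq__(self, other):
        # Assumption for this sketch: an unknown length "might equal" any int.
        return other is self or isinstance(other, int)

    __hash__ = object.__hash__


unknown_length = _Unknown()


def unguarded_is_empty(length):
    # Original style of check: defers entirely to the sentinel's __eq__.
    return length == 0


def guarded_is_empty(length):
    # Post-fix style of check: an unknown length never counts as empty.
    return length is not unknown_length and length == 0


for length in (0, 3, unknown_length):
    print(type(length).__name__, unguarded_is_empty(length), guarded_is_empty(length))
```

Under that assumption the two checks agree for known lengths and disagree only for the sentinel, so any code path that previously took the "empty" branch with an unknown length now takes the other one.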

But this may be an indication that we need to rethink the JAX backend, too. (I'm still worried about that...)

@jpivarski jpivarski marked this pull request as draft December 6, 2024 18:38
@pfackeldey
Collaborator

I tend to agree. Despite being a JAX fan, I think it is more important that Awkward Array works flawlessly with the typetracer backend.

@agoose77
Collaborator

agoose77 commented Dec 6, 2024

Yeah, unfortunately this kind of logic is not easy to pattern match. When I was reworking this handling, I started to make an effort to standardise unknown length handling, but I definitely think there's a long tail on this. Over time, though, we'll get them all :)
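
One possible shape for standardised handling, sketched in Python under stated assumptions: a single helper that makes every call site say what should happen when the length is unknown. The names `UNKNOWN` and `length_equals` are hypothetical and do not exist in Awkward Array.

```python
from typing import Any

# Hypothetical sentinel, named for illustration only.
UNKNOWN = object()


def length_equals(length: Any, value: int, *, if_unknown: bool) -> bool:
    """Compare a possibly unknown length to `value`.

    `if_unknown` is the result to use when the length is not known, so each
    call site has to state which branch it wants in that case.
    """
    if length is UNKNOWN:
        return if_unknown
    return length == value


# A call site that must not treat an unknown length as empty:
take_empty_shortcut = length_equals(UNKNOWN, 0, if_unknown=False)

# A call site that prefers the conservative branch when unsure:
assume_might_be_empty = length_equals(UNKNOWN, 0, if_unknown=True)
```

The keyword-only argument is the design point: it turns the per-case decision into something a reviewer can read at the call site instead of a property of the sentinel's comparison operator.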

Collaborator

@pfackeldey pfackeldey left a comment

These changes look reasonable to me. Some of these additional checks would probably never be hit in practice, but better safe than sorry: we have hit exactly these kinds of problems several times in the past. 👍

Collaborator

@ianna ianna left a comment

Looks good to me! Thanks!

@ianna ianna merged commit 32c1171 into main Jan 10, 2025
39 checks passed
@ianna ianna deleted the jpivarski/always-check-unknown_length branch January 10, 2025 14:24
4 participants