-
Notifications
You must be signed in to change notification settings - Fork 432
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-465: Clarify backward-compatibility rules on LIST type #466
Open
wgtmac
wants to merge
10
commits into
apache:master
Choose a base branch
from
wgtmac:old-list-structure
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+49
−9
Open
Changes from all commits
Commits
Show all changes
10 commits
Select commit
Hold shift + click to select a range
492860c
GH-465: Clarify backward-compatibility rules on LIST type
wgtmac 71a40e2
fix rule 2
wgtmac 767f64e
add more clarification
wgtmac fae84cb
address comment
wgtmac 6c6dcf7
fix typo
wgtmac e9e5da4
add new rule
wgtmac 335b9a4
address feedback
wgtmac 93e8a83
Update LogicalTypes.md
wgtmac dbb6550
address comments
wgtmac 3f7238c
relocate 1-level structure example
wgtmac File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -609,6 +609,17 @@ that is neither contained by a `LIST`- or `MAP`-annotated group nor annotated | |
by `LIST` or `MAP` should be interpreted as a required list of required | ||
elements where the element type is the type of the field. | ||
|
||
``` | ||
// List<Integer> (non-null list, non-null elements) | ||
repeated int32 num; | ||
|
||
// List<Tuple<Integer, String>> (non-null list, non-null elements) | ||
repeated group my_list { | ||
required int32 num; | ||
optional binary str (STRING); | ||
} | ||
``` | ||
|
||
Implementations should use either `LIST` and `MAP` annotations _or_ unannotated | ||
repeated fields, but not both. When using the annotations, no unannotated | ||
repeated types are allowed. | ||
|
@@ -670,6 +681,8 @@ optional group array_of_arrays (LIST) { | |
|
||
#### Backward-compatibility rules | ||
|
||
##### 3-level structure with different field names | ||
|
||
It is required that the repeated group of elements is named `list` and that | ||
its element field is named `element`. However, these names may not be used in | ||
existing data and should not be enforced as errors when reading. For example, | ||
|
@@ -684,49 +697,76 @@ optional group my_list (LIST) { | |
} | ||
``` | ||
|
||
Some existing data does not include the inner element layer. For | ||
backward-compatibility, the type of elements in `LIST`-annotated structures | ||
should always be determined by the following rules: | ||
##### 2-level structure | ||
|
||
Some existing data does not include the inner element layer, meaning that `LIST` | ||
annotates a 2-level structure. In contrast to 3-level structure, the repetition | ||
of 2-level structure can be `optional`, `required`, or `repeated`. | ||
|
||
``` | ||
<list-repetition> group <name> (LIST) { | ||
repeated <element-type> <element-name>; | ||
} | ||
``` | ||
|
||
For backward-compatibility, the type of elements in `LIST`-annotated 2-level | ||
structures should always be determined by the following rules: | ||
|
||
1. If the repeated field is not a group, then its type is the element type and | ||
elements are required. | ||
wgtmac marked this conversation as resolved.
Show resolved
Hide resolved
|
||
2. If the repeated field is a group with multiple fields, then its type is the | ||
element type and elements are required. | ||
3. If the repeated field is a group with one field and is named either `array` | ||
3. If the repeated field is a group with a `repeated` field, then the repeated | ||
field is the element type because the type cannot be a 3-level list. | ||
4. If the repeated field is a group with one field and is named either `array` | ||
or uses the `LIST`-annotated group's name with `_tuple` appended then the | ||
repeated type is the element type and elements are required. | ||
4. Otherwise, the repeated field's type is the element type with the repeated | ||
5. Otherwise, the repeated field's type is the element type with the repeated | ||
field's repetition. | ||
|
||
Examples that can be interpreted using these rules: | ||
|
||
``` | ||
// List<Integer> (nullable list, non-null elements) | ||
// Rule 1: List<Integer> (nullable list, non-null elements) | ||
optional group my_list (LIST) { | ||
repeated int32 element; | ||
} | ||
|
||
// List<Tuple<String, Integer>> (nullable list, non-null elements) | ||
// Rule 2: List<Tuple<String, Integer>> (nullable list, non-null elements) | ||
optional group my_list (LIST) { | ||
repeated group element { | ||
required binary str (STRING); | ||
required int32 num; | ||
}; | ||
} | ||
|
||
// List<OneTuple<String>> (nullable list, non-null elements) | ||
// Rule 3: List<List<Integer>> (nullable outer list, non-null elements) | ||
optional group my_list (LIST) { | ||
repeated group array (LIST) { | ||
repeated int32 array; | ||
}; | ||
} | ||
|
||
// Rule 4: List<OneTuple<String>> (nullable list, non-null elements) | ||
optional group my_list (LIST) { | ||
repeated group array { | ||
required binary str (STRING); | ||
}; | ||
} | ||
|
||
// List<OneTuple<String>> (nullable list, non-null elements) | ||
// Rule 4: List<OneTuple<String>> (nullable list, non-null elements) | ||
optional group my_list (LIST) { | ||
repeated group my_list_tuple { | ||
required binary str (STRING); | ||
}; | ||
} | ||
|
||
wgtmac marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// Rule 5: List<OneTuple<List<Integer>>> (nullable outer list, non-null elements) | ||
optional group my_list (LIST) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe this is the case where annotated 2-level list is mixed with unannotated repeated field (1-level list). Is this allowed? If not, I can remove the example. Then, is there any concrete example for the last rule (a.k.a. rule 5 above)? @gszadovszky @rdblue |
||
repeated group foo { | ||
repeated int32 bar; | ||
}; | ||
} | ||
``` | ||
|
||
### Maps | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this confusing. Why does it need to change?
The problem isn't when there is a 2-level structure like the one in the new example. The problem is when we need to decide whether a structure is a 2-level or 3-level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just happen to see that this section is all about 2-level list so I have added a title to make it explicit.
The change aims to exhaust all possible forms of list structures and we can just apply these rules in order to deduce a list type:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decide to keep the subtitles as they are pretty clear. It is the exact order to apply when deducing a list type. Please let me know if you have different opinions and I'm happy to change. @rdblue @gszadovszky @emkornfield