DynamoDB: Support list with items of different types. #28
Comments
That's actually illegal in CrateDB, but OK in DynamoDB. We should probably detect it and handle it.
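A minimal sketch of what the detection step could look like, assuming the list elements have already been deserialized into plain Python values. The function name `is_heterogeneous` is hypothetical and not taken from the project's code.

```python
from typing import Any


def is_heterogeneous(items: list[Any]) -> bool:
    """Return True if the list holds elements of more than one Python type."""
    # Collect the distinct element types; more than one means a mixed list.
    return len({type(item) for item in items}) > 1


print(is_heterogeneous([{"a": 1}, 2, "Three"]))
print(is_heterogeneous([1, 2, 3]))
```

Note that this sketch treats `int` and `float` as different types, while DynamoDB has a single number type, so a real implementation would probably want to normalize numbers first.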
Yeah, CrateDB needs to specify the inner type, like |
Good idea. Storing it 1:1 is probably not possible, because CrateDB can't handle it, so we need a trick or workaround. Do you have any suggestions? Maybe one of you has already applied a trick in a similar situation, or has another idea? /cc @hlcianfagna, @wierdvanderhaar, @hammerhead, @proddata, @matriv
How about we detect it, map it to a

    l = [
        {'a': 1}, 2, "Three"
    ]
    r = {}
    for el in l:
        if (_type := type(el).__name__) not in r:
            r[_type] = []
        r[_type].append(el)
    print(r)

    {
        'dict': [{'a': 1}],
        'int': [2],
        'str': ['Three']
    }
You can indirectly store it

    cr> CREATE TABLE t01 (o OBJECT(IGNORED));
    cr> INSERT INTO t01 (o) VALUES ($${"oo":[{"a": 1}, 2, "Three"]}$$);
    cr> SELECT * FROM t01;
    +--------------------------------+
    | o                              |
    +--------------------------------+
    | {"oo": [{"a": 1}, 2, "Three"]} |
    +--------------------------------+
    cr> SELECT o['oo'][1] FROM t01;
    +------------+
    | o['oo'][1] |
    +------------+
    | {"a": 1}   |
    +------------+
    cr> SELECT o['oo'][2] FROM t01;
    +------------+
    | o['oo'][2] |
    +------------+
    | 2          |
    +------------+
I think the |
I think the option is to store what is possible in a typed field and what is not in a separate column.

    SELECT json_structure('[{"x":1,"y":1},{"x":2,"y":[1,2,3]}]') a;
    ┌──────────────────────────────┐
    │              a               │
    │             json             │
    ├──────────────────────────────┤
    │ [{"x":"UBIGINT","y":"JSON"}] │
    └──────────────────────────────┘
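The per-field inference shown in the query output above can be approximated in Python. This is only a sketch: `field_types` is a hypothetical helper, and the label `"mixed"` stands in for the generic `JSON` fallback in that output, marking fields whose values do not share one type across all rows.

```python
from typing import Any


def field_types(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Map each key to its Python type name, or "mixed" if rows disagree."""
    seen: dict[str, set[str]] = {}
    for row in rows:
        for key, value in row.items():
            # Remember every type name observed for this key.
            seen.setdefault(key, set()).add(type(value).__name__)
    return {
        key: next(iter(names)) if len(names) == 1 else "mixed"
        for key, names in seen.items()
    }


rows = [{"x": 1, "y": 1}, {"x": 2, "y": [1, 2, 3]}]
print(field_types(rows))
```

Fields reported as `"mixed"` are the candidates for the separate, untyped column.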
Thank you for your excellent suggestions 💯. I followed @proddata's advice:
Please share your thoughts on the general idea, the naming, or anything else that comes to mind. Thanks!
The patch referenced above, GH-39, will be included in the upcoming release to improve the situation. The code is now also validated by integration tests against CrateDB, after GH-42 was added. To recap, the gist of the implementation is to use two distinct columns to store typed vs. untyped data. For the current column names, see commons-codec/src/commons_codec/transform/dynamodb.py, lines 75 to 81 in 3a31a06.
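To illustrate the two-column idea, here is a minimal sketch, not the actual commons-codec implementation. It assumes a record is a flat Python dict and only inspects top-level attributes; the names `split_record`, `typed`, and `untyped` are hypothetical.

```python
from typing import Any


def split_record(record: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a record into a typed part and an untyped part.

    Top-level attributes holding a list with mixed element types go into
    the untyped part; everything else stays in the typed part.
    """
    typed: dict[str, Any] = {}
    untyped: dict[str, Any] = {}
    for key, value in record.items():
        is_mixed_list = (
            isinstance(value, list)
            and len({type(item) for item in value}) > 1
        )
        if is_mixed_list:
            untyped[key] = value
        else:
            typed[key] = value
    return typed, untyped


record = {"id": 1, "tags": ["a", "b"], "mixed": [{"a": 1}, 2, "Three"]}
typed, untyped = split_record(record)
print(typed)
print(untyped)
```

The two resulting dicts could then be written to two separate columns, with the untyped one using a column type that does not enforce an inner type, such as `OBJECT(IGNORED)`.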
The improvement has been included in release v0.0.14.
Example: