-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DynamoDB: Fix serializing OBJECT and ARRAY representations to CrateDB #34
Conversation
89f000e
to
17e03d1
Compare
# TODO: ARRAY types do not support JSON syntax yet. | ||
# Let's relay them 1:1, which works for primitive inner types, | ||
# but complex ones need special treatment, like representing | ||
# OBJECTs in CrateDB-native syntax. | ||
# FIXME: Find a way to use a custom JSONEncoder for that, in order to | ||
# fully support also nested elements of those. | ||
# https://github.com/crate/commons-codec/issues/33 | ||
elif isinstance(key_value, list): | ||
if key_value: | ||
if isinstance(key_value[0], dict): | ||
items = [] | ||
for item in key_value: | ||
items.append("{" + ", ".join(f"{key}='{value}'" for key, value in item.items()) + "}") | ||
key_value = "[" + ",".join(items) + "]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handling it specially, [...] close to Python's
dict(foo='bar')
syntax notation.
This does not apply any special treatments to the OBJECT keys yet, and I did not perform any further orientation flights yet. For example, what happens if the key is a reserved keyword, or contains special characters like dashes -
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depending on your spirits, we may want to defer further special handling to a subsequent iteration, and first check if the use case at hand already gets satisfied well.
if isinstance(key_value, dict): | ||
key_value = repr(json.dumps(key_value)) | ||
elif isinstance(key_value, dict): | ||
# TODO: Does it also need escaping of inner TEXT values, like the above? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single quotes must always be escaped as they are used for any text literals at CrateDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. So that serialization will need further improvements to treat OBJECT values the same like above.
if isinstance(key_value[0], dict): | ||
items = [] | ||
for item in key_value: | ||
items.append("{" + ", ".join(f"{key}='{value}'" for key, value in item.items()) + "}") | ||
key_value = "[" + ",".join(items) + "]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is about the other cases where the array element isn't a dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They absolutely need to be taken care of, right. Currently, as far as I can see through knowing about data payloads at hand, it might not be of current concerns yet.
Looks like the special |
This is absolutely right on the spot, and seems to work well. Thank you so much! data['list_of_objects'] = '[{"foo": "bar"}, {"baz": "qux"}]'::OBJECT[] cr> UPDATE "foo" SET data['tags'] = ['foo', 'bar'], data['empty_map'] = '{}'::OBJECT, data['empty_list'] = [], data['string_set'] = ['location_1'], data['number_set'] = [0.34, 1.0, 2.0, 3.0],
data['binary_set'] = ['U3Vubnk='], data['somemap'] = '{"test": 1.0, "test2": 2.0}'::OBJECT, data['list_of_objects'] = '[{"foo": "bar"}, {"baz": "qux"}]'::OBJECT[] WHERE data['device'] =
'foo';
UPDATE OK, 1 row affected (0.039 sec) cr> select * from foo;
|
I've amended the patch correspondingly. |
Btw. any reason not to use parameter binding instead of serializing values into the SQL? This would avoid the need of any special escaping I think and would be much safer.. |
Absolutely not. It is just my fault that I didn't use it from the very beginning. A subsequent iteration certainly will bring it to the table, because it is absolutely the right choice not needing to care about those details manually. Thanks! |
About
Another attempt to get relaying container type values right, this time for both OBJECTs and ARRAYs, at least a basic variant thereof, in order to unblock people needing it.
Details
The problem with ARRAYs is that they can't be relayed using JSON syntax, supported by a corresponding CAST. Instead, we need to use the native notation of CrateDB, which works without much ado on primitive inner types.
When serializing ARRAYs including OBJECT inner types, we need to actively convert to CrateDB's native syntax, which is close to Python's
dict(foo='bar')
syntax notation.References