-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide special treatment for human-written text #261
Comments
I would like to add to point 3 that it is not necessarily a mistake or assumption. In our case we use the schema as source of truth and generate DataTransferObjects from the schema, in order to comply with the schema. Maybe |
Hey @kasper-agg, Yes I agree that using JSON Schemas as source of truth is a great idea and this issue prevents engineers from doing that. I also agree that it would be a right thing to do to over-provision Redshift, but problem here is that we need to come up with way to "mark" these fields as it is relatively narrow subset of cases. |
Regardless of the solution, I must agree it sounds like a challenge. Even more to fix it in clean way... |
I'm quite keen to add custom extensions into JSON Schema (example #170, but new properties as well - it is fully-compatible with standard) and I think this is the way to go here. |
I have a weird feeling we may end up having to drive a Draft 5 (?) of the JSON Schema spec, given there don't seem to be many other vendors taking a major interest in it... |
That would be lovely! I've seen efforts on Draft 5 specification, but almost no implementations out there. |
In JSON Schema specification
maxLength
keyword specifies the amount of characters in field, not bytes. However, in Redshift (as in most databases, I believe) inVARCHAR
we specify amount of bytes, which may introduce mismatch in text-fields, usually written by humans, not computers.This is generally not a problem as analytical data typically contains ASCII-text (written by computers), where amount of bytes precisely match amount of characters.
But at the same time, I can imagine an issue:
maxLength: 32
is correct constrainThis could be done as part of #170 (
format: "unicode"
, which specify that string has absolutely no structure) or similar custom JSON-schema extension.The text was updated successfully, but these errors were encountered: