-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix MySQL parsing of GRANT, REVOKE, and CREATE VIEW #1538
base: main
Are you sure you want to change the base?
Conversation
This is needed for parsing MySQL-style `'user'@'host'` grantee syntax. As far as I can tell, no dialect allows quotes or backticks as part of an identifier (regardless of whether it starts with `@`) without other specific syntax (e.g. nested in another quote style and thus not starting with `@`), so this shouldn't adversely affect non-MySQL dialects.
Introduces a new `Grantee` enum to differentiate bare identifiers (every other database: `root`) from user/host pairs (MySQL: `'root'@'%'`). While we're here, make the CASCADE/RESTRICT syntax for REVOKE optional since Postgres doesn't require it and MySQL doesn't allow it. Add support for MySQL wildcard object syntax: `GRANT ALL ON *.* ...`
Would appreciate feedback on the naming the MySQL-specific params types (would a generic/non-branded name be better?) and the API change for wildcard support (didn't want to change all 107 |
Some('\'') => Ok(Some(Token::AtSign)), | ||
Some('\"') => Ok(Some(Token::AtSign)), | ||
Some('`') => Ok(Some(Token::AtSign)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the changes feel unexpected, took a quick look it seems rather a bug that mysql dialect is currently configured to accept @
as an identifier start? their doc seems to suggest otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought so at first too, but variables in MySQL start with @
and I don't think there's a way to differentiate them during tokenization. We would want to tokenize @foo
as a single Token::Word
, but @'foo'
(in my understanding) as [Token::AtSign, Token::SingleQuotedString]
. In this check during tokenization, we aren't differentiating between "schema names etc." and "variable names", just "identifiers". Maybe there is something more correct to do here, like checking for self.is_identifier_continuation(sch)
and ruling out '
there. But my approach of just adding cases for quotes is based on the fact that as far as I can tell, no dialect allows quote marks within an identifier unless it's in an outer quotation, which doesn't get tokenized by this path.
I think an approach like self.is_identifier_continuation(sch)
would effectively be the same but could allow customization per dialect, but it has another pitfall: say in addition to quote marks, a dialect doesn't allow _
in an identifier. Then should we fail to tokenize @_
with an error, or tokenize them separately as [@, _]
? If the former, we would then need another special case for quote marks, and the latter seems wrong to me.
One other example in support of tokenizing like [Token::AtSign, Token::SingleQuotedString]
, just in case it's not clear why I think that's right: in Postgres, @
is absolute value, and it will implicitly cast the operand to a numeric type, so that select @'1';
(no space after @
) treats '1'
as a string and casts it to double.
All that in mind, I still think my current approach is the simplest correct I can think of; let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see that makes sense to me and the current approach sounds reasonable! Can we add a comment around mentioning the example use case? I think that would be useful since the logic/tradeoff wouldnt be obvious otherwise, Also can we extend the tokenizer test case to cover the other two tokens supported by this path "
and '`'?
@@ -8353,20 +8409,40 @@ impl<'a> Parser<'a> { | |||
} | |||
} | |||
|
|||
/// Parse a possibly qualified, possibly quoted identifier, optionally allowing for wildcards, | |||
/// e.g. *, `foo`.*, or "foo"."bar" | |||
pub fn parse_object_name_with_wildcards( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method sounds like what parse_wildcard_expr
does, minus the return type. Can we reuse that functionality instead? e.g. if we move that logic out to its own method so that parse_wildcard_expr
and other functionality like grant etc can call it directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still having a think about the best way to do this; posting the other updates for further feedback.
Though as far as I know they are entirely MySQL specific.
…mething unexpected
Some('\'') => Ok(Some(Token::AtSign)), | ||
Some('\"') => Ok(Some(Token::AtSign)), | ||
Some('`') => Ok(Some(Token::AtSign)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see that makes sense to me and the current approach sounds reasonable! Can we add a comment around mentioning the example use case? I think that would be useful since the logic/tradeoff wouldnt be obvious otherwise, Also can we extend the tokenizer test case to cover the other two tokens supported by this path "
and '`'?
let parts = [ | ||
self.algorithm | ||
.as_ref() | ||
.map(|algorithm| format!("ALGORITHM = {algorithm}")), | ||
self.definer | ||
.as_ref() | ||
.map(|definer| format!("DEFINER = {definer}")), | ||
self.security | ||
.as_ref() | ||
.map(|security| format!("SQL SECURITY {security}")), | ||
] | ||
.into_iter() | ||
.flatten() | ||
.collect::<Vec<_>>(); | ||
display_separated(&parts, " ").fmt(f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we write these explicitly as e.g?
if let Some(algorithm) = self.algorithm { write!() }
I think that potentially makes it easier to spot the final output and extend in the future.
Also ideally we can use an exhaustive check with
let CreateviewParams { algorithm, definer, security } = self
so that the user is automatically guided to this function whenever the representation changes
'user'@'host'
syntax.*.*
.CASCADE
/RESTRICT
in revoke statements optional since MySQL doesn't accept it.user@host
syntax), and security context.