Fix MySQL parsing of GRANT, REVOKE, and CREATE VIEW #1538

Open

mvzink wants to merge 12 commits into main
Conversation

mvzink (Contributor) commented Nov 20, 2024

  • Support MySQL-style 'user'@'host' grantee syntax.
  • Support wildcards on grant objects: *.*.
  • Make CASCADE/RESTRICT in REVOKE statements optional, since MySQL doesn't accept it.
  • Support optional MySQL parameters for CREATE VIEW: algorithm, definer (requires the user@host syntax), and security context.

The tokenizer change is needed for parsing MySQL-style `'user'@'host'` grantee syntax.

As far as I can tell, no dialect allows quotes or backticks as part of
an identifier (regardless of whether it starts with `@`) without other
specific syntax (e.g. nested in another quote style and thus not
starting with `@`), so this shouldn't adversely affect non-MySQL
dialects.
Introduces a new `Grantee` enum to differentiate bare identifiers (every
other database: `root`) from user/host pairs (MySQL: `'root'@'%'`).

While we're here, make the CASCADE/RESTRICT syntax for REVOKE optional
since Postgres doesn't require it and MySQL doesn't allow it.

Add support for MySQL wildcard object syntax: `GRANT ALL ON *.* ...`
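
For illustration, the kinds of statements this should make parseable can be round-tripped through the crate's public API. This is a sketch; the example statements below are illustrative and may not match the PR's test cases exactly:

use sqlparser::dialect::MySqlDialect;
use sqlparser::parser::Parser;

fn main() {
    let dialect = MySqlDialect {};
    let examples = [
        // MySQL-style grantee plus the *.* wildcard grant object.
        "GRANT SELECT, INSERT ON *.* TO 'jeffrey'@'localhost'",
        // REVOKE with no trailing CASCADE/RESTRICT, which MySQL doesn't accept.
        "REVOKE SELECT ON db.* FROM 'jeffrey'@'localhost'",
        // CREATE VIEW with the optional MySQL parameters (algorithm, definer, security).
        "CREATE ALGORITHM = MERGE DEFINER = 'admin'@'%' SQL SECURITY DEFINER VIEW v AS SELECT 1",
    ];
    for sql in examples {
        let statements = Parser::parse_sql(&dialect, sql).expect("should parse");
        // Round-trip back to SQL text via Display.
        println!("{}", statements[0]);
    }
}
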
mvzink (Contributor, Author) commented Nov 20, 2024

Would appreciate feedback on the naming of the MySQL-specific params types (would a generic/non-branded name be better?) and on the API change for wildcard support (I didn't want to change all 107 parse_object_name call sites for an option that is so niche, but happy to be convinced otherwise).

Comment on lines +1205 to +1207
Some('\'') => Ok(Some(Token::AtSign)),
Some('\"') => Ok(Some(Token::AtSign)),
Some('`') => Ok(Some(Token::AtSign)),
Contributor

The changes feel unexpected. Taking a quick look, it seems it's rather a bug that the MySQL dialect is currently configured to accept @ as an identifier start? Their docs seem to suggest otherwise.

Contributor Author

I thought so at first too, but variables in MySQL start with @ and I don't think there's a way to differentiate them during tokenization. We would want to tokenize @foo as a single Token::Word, but @'foo' (in my understanding) as [Token::AtSign, Token::SingleQuotedString]. In this check during tokenization, we aren't differentiating between "schema names etc." and "variable names", just "identifiers". Maybe there is something more correct to do here, like checking for self.is_identifier_continuation(sch) and ruling out ' there. But my approach of just adding cases for quotes is based on the fact that as far as I can tell, no dialect allows quote marks within an identifier unless it's in an outer quotation, which doesn't get tokenized by this path.

I think an approach like self.is_identifier_continuation(sch) would effectively be the same but could allow customization per dialect, but it has another pitfall: say in addition to quote marks, a dialect doesn't allow _ in an identifier. Then should we fail to tokenize @_ with an error, or tokenize them separately as [@, _]? If the former, we would then need another special case for quote marks, and the latter seems wrong to me.

One other example in support of tokenizing like [Token::AtSign, Token::SingleQuotedString], just in case it's not clear why I think that's right: in Postgres, @ is absolute value, and it will implicitly cast the operand to a numeric type, so that select @'1'; (no space after @) treats '1' as a string and casts it to double.

With all that in mind, I still think my current approach is the simplest correct one I can think of; let me know what you think!
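
To make the tokenization described above concrete, here is a minimal sketch using the crate's public tokenizer API (the assertions reflect the intended behavior with this change applied; they are not the PR's actual test cases):

use sqlparser::dialect::MySqlDialect;
use sqlparser::tokenizer::{Token, Tokenizer};

fn main() {
    let dialect = MySqlDialect {};

    // A quote right after '@' ends the token, so the quoted part becomes its
    // own string token: [AtSign, SingleQuotedString("foo")].
    let tokens = Tokenizer::new(&dialect, "@'foo'").tokenize().unwrap();
    assert!(matches!(tokens[0], Token::AtSign));
    assert!(matches!(tokens[1], Token::SingleQuotedString(_)));

    // A bare '@foo' (a MySQL user variable) still tokenizes as a single word.
    let tokens = Tokenizer::new(&dialect, "@foo").tokenize().unwrap();
    assert!(matches!(tokens[0], Token::Word(_)));
}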

Contributor

Ah I see, that makes sense to me, and the current approach sounds reasonable! Can we add a comment mentioning the example use case? I think that would be useful since the logic/tradeoff wouldn't be obvious otherwise. Also, can we extend the tokenizer test case to cover the other two quote characters supported by this path, " and '`'?

@@ -8353,20 +8409,40 @@ impl<'a> Parser<'a> {
}
}

/// Parse a possibly qualified, possibly quoted identifier, optionally allowing for wildcards,
/// e.g. *, `foo`.*, or "foo"."bar"
pub fn parse_object_name_with_wildcards(
Contributor

The method sounds like what parse_wildcard_expr does, minus the return type. Can we reuse that functionality instead, e.g. by moving that logic out into its own method so that parse_wildcard_expr and other callers like the GRANT parsing can call it directly?

Contributor Author

Still having a think about the best way to do this; posting the other updates for further feedback.

Comment on lines +7483 to +7497
let parts = [
self.algorithm
.as_ref()
.map(|algorithm| format!("ALGORITHM = {algorithm}")),
self.definer
.as_ref()
.map(|definer| format!("DEFINER = {definer}")),
self.security
.as_ref()
.map(|security| format!("SQL SECURITY {security}")),
]
.into_iter()
.flatten()
.collect::<Vec<_>>();
display_separated(&parts, " ").fmt(f)
Contributor

Can we write these out explicitly, e.g.

if let Some(algorithm) = self.algorithm { write!() }

I think that potentially makes it easier to spot the final output and to extend it in the future.
Also, ideally we can use an exhaustive check with

let CreateviewParams { algorithm, definer, security } = self

so that the user is automatically guided to this function whenever the representation changes
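
For illustration, a sketch of that explicit style (the struct and field names here are stand-ins based on the snippet above, with String fields instead of the PR's actual enums):

use std::fmt;

// Hypothetical stand-in for the PR's params struct.
struct CreateViewParams {
    algorithm: Option<String>,
    definer: Option<String>,
    security: Option<String>,
}

impl fmt::Display for CreateViewParams {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Exhaustive destructuring: adding a field to the struct stops this
        // impl from compiling until the new field is printed or ignored here.
        let CreateViewParams { algorithm, definer, security } = self;
        let mut first = true;
        // Emit a single space between parts, but not before the first one.
        let sep = |f: &mut fmt::Formatter<'_>, first: &mut bool| -> fmt::Result {
            if !*first {
                write!(f, " ")?;
            }
            *first = false;
            Ok(())
        };
        if let Some(algorithm) = algorithm {
            sep(f, &mut first)?;
            write!(f, "ALGORITHM = {algorithm}")?;
        }
        if let Some(definer) = definer {
            sep(f, &mut first)?;
            write!(f, "DEFINER = {definer}")?;
        }
        if let Some(security) = security {
            sep(f, &mut first)?;
            write!(f, "SQL SECURITY {security}")?;
        }
        Ok(())
    }
}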
