Skip to content

Commit

Permalink
[SPARK-49119][SQL] Fix the inconsistency of syntax show columns bet…
Browse files Browse the repository at this point in the history
…ween v1 and v2

### What changes were proposed in this pull request?
The pr aims to
- fix the `inconsistency` of syntax `show columns` between `v1` and `v2`.
- assign a name `SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE` to the error condition `_LEGACY_ERROR_TEMP_1057`.
- unify v1 and v2 `SHOW COLUMNS ...` tests.
- move some UT related to `SHOW COLUMNS` from `DDLSuite` to `command/ShowColumnsSuiteBase` or `v1/ShowColumnsSuiteBase`.
- move some UT related to `SHOW COLUMNS` from `DDLParserSuite` and `ErrorParserSuite` to `ShowColumnsParserSuite`.

### Why are the changes needed?
In `AstBuilder`, we have `a comment` that explains as follows:
https://github.com/apache/spark/blob/2a752105091ef95f994526b15bae2159657c8ed0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala#L5054-L5055
However, in our v2 of the syntax `show columns` implementation, we `did not` perform the above checks, as shown below:

```
  withNamespaceAndTable("ns", "tbl") { t =>
      sql(s"CREATE TABLE $t (col1 int, col2 string) $defaultUsing")
      sql(s"SHOW COLUMNS IN $t IN ns1")
    }
```

- Before (inconsistent, v1 will fail, but v2 will success)
  v1:
  ```
  [SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE] SHOW COLUMNS with conflicting namespace: `ns1` != `ns`.
  ```

  v2:
  ```
  Execute successfully.
  ```

#### so, we should fix it.

- After (consistent, v1 & v2 all will fail)
  v1:
  ```
  [SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE] SHOW COLUMNS with conflicting namespace: `ns1` != `ns`.
  ```

  v2:
  ```
  [SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE] SHOW COLUMNS with conflicting namespace: `ns1` != `ns`.
  ```

### Does this PR introduce _any_ user-facing change?
Yes, for v2 tables, in syntax `SHOW COLUMNS {FROM | IN} {tableName} {FROM | IN} {namespace}`, if the namespace (`second parameter`) is different from the namespace of the table(`first parameter`), the command will succeed without any awareness before this PR, after this PR, it will report an error.

### How was this patch tested?
Add new UT.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #47628 from panbingkun/SPARK-49119.

Lead-authored-by: panbingkun <[email protected]>
Co-authored-by: Kent Yao <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
  • Loading branch information
2 people authored and LuciferYang committed Aug 30, 2024
1 parent 0602020 commit 53c1f31
Show file tree
Hide file tree
Showing 16 changed files with 297 additions and 90 deletions.
11 changes: 6 additions & 5 deletions common/utils/src/main/resources/error/error-conditions.json
Original file line number Diff line number Diff line change
Expand Up @@ -3866,6 +3866,12 @@
],
"sqlState" : "42K08"
},
"SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE" : {
"message" : [
"SHOW COLUMNS with conflicting namespaces: <namespaceA> != <namespaceB>."
],
"sqlState" : "42K05"
},
"SORT_BY_WITHOUT_BUCKETING" : {
"message" : [
"sortBy must be used together with bucketBy."
Expand Down Expand Up @@ -5685,11 +5691,6 @@
"ADD COLUMN with v1 tables cannot specify NOT NULL."
]
},
"_LEGACY_ERROR_TEMP_1057" : {
"message" : [
"SHOW COLUMNS with conflicting databases: '<dbA>' != '<dbB>'."
]
},
"_LEGACY_ERROR_TEMP_1058" : {
"message" : [
"Cannot create table with both USING <provider> and <serDeInfo>."
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1045,13 +1045,14 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
)
}

def showColumnsWithConflictDatabasesError(
db: Seq[String], v1TableName: TableIdentifier): Throwable = {
def showColumnsWithConflictNamespacesError(
namespaceA: Seq[String],
namespaceB: Seq[String]): Throwable = {
new AnalysisException(
errorClass = "_LEGACY_ERROR_TEMP_1057",
errorClass = "SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE",
messageParameters = Map(
"dbA" -> db.head,
"dbB" -> v1TableName.database.get))
"namespaceA" -> toSQLId(namespaceA),
"namespaceB" -> toSQLId(namespaceB)))
}

def cannotCreateTableWithBothProviderAndSerdeError(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2406,29 +2406,6 @@ class DDLParserSuite extends AnalysisTest {
RefreshTable(UnresolvedTableOrView(Seq("a", "b", "c"), "REFRESH TABLE", true)))
}

test("show columns") {
val sql1 = "SHOW COLUMNS FROM t1"
val sql2 = "SHOW COLUMNS IN db1.t1"
val sql3 = "SHOW COLUMNS FROM t1 IN db1"
val sql4 = "SHOW COLUMNS FROM db1.t1 IN db1"

val parsed1 = parsePlan(sql1)
val expected1 = ShowColumns(UnresolvedTableOrView(Seq("t1"), "SHOW COLUMNS", true), None)
val parsed2 = parsePlan(sql2)
val expected2 = ShowColumns(UnresolvedTableOrView(Seq("db1", "t1"), "SHOW COLUMNS", true), None)
val parsed3 = parsePlan(sql3)
val expected3 =
ShowColumns(UnresolvedTableOrView(Seq("db1", "t1"), "SHOW COLUMNS", true), Some(Seq("db1")))
val parsed4 = parsePlan(sql4)
val expected4 =
ShowColumns(UnresolvedTableOrView(Seq("db1", "t1"), "SHOW COLUMNS", true), Some(Seq("db1")))

comparePlans(parsed1, expected1)
comparePlans(parsed2, expected2)
comparePlans(parsed3, expected3)
comparePlans(parsed4, expected4)
}

test("alter view: add partition (not supported)") {
val sql =
"""ALTER VIEW a.b.c ADD IF NOT EXISTS PARTITION
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -141,10 +141,6 @@ class ErrorParserSuite extends AnalysisTest {
exception = parseException("SHOW TABLE EXTENDED IN hyphen-db LIKE \"str\""),
errorClass = "INVALID_IDENTIFIER",
parameters = Map("ident" -> "hyphen-db"))
checkError(
exception = parseException("SHOW COLUMNS IN t FROM test-db"),
errorClass = "INVALID_IDENTIFIER",
parameters = Map("ident" -> "test-db"))
checkError(
exception = parseException("DESC SCHEMA EXTENDED test-db"),
errorClass = "INVALID_IDENTIFIER",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -330,7 +330,8 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager)
val resolver = conf.resolver
val db = ns match {
case Some(db) if v1TableName.database.exists(!resolver(_, db.head)) =>
throw QueryCompilationErrors.showColumnsWithConflictDatabasesError(db, v1TableName)
throw QueryCompilationErrors.showColumnsWithConflictNamespacesError(
Seq(db.head), Seq(v1TableName.database.get))
case _ => ns.map(_.head)
}
ShowColumnsCommand(db, v1TableName, output)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ import org.apache.spark.sql.execution.{FilterExec, InSubqueryExec, LeafExecNode,
import org.apache.spark.sql.execution.command.CommandUtils
import org.apache.spark.sql.execution.datasources.{DataSourceStrategy, LogicalRelation, PushableColumnAndNestedColumn}
import org.apache.spark.sql.execution.streaming.continuous.{WriteToContinuousDataSource, WriteToContinuousDataSourceExec}
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.internal.StaticSQLConf.WAREHOUSE_PATH
import org.apache.spark.sql.sources.{BaseRelation, TableScan}
import org.apache.spark.storage.StorageLevel
Expand Down Expand Up @@ -477,7 +478,17 @@ class DataSourceV2Strategy(session: SparkSession) extends Strategy with Predicat
Seq(part).asResolvedPartitionSpecs.head,
recacheTable(r)) :: Nil

case ShowColumns(resolvedTable: ResolvedTable, _, output) =>
case ShowColumns(resolvedTable: ResolvedTable, ns, output) =>
ns match {
case Some(namespace) =>
val tableNamespace = resolvedTable.identifier.namespace()
if (namespace.length != tableNamespace.length ||
!namespace.zip(tableNamespace).forall(SQLConf.get.resolver.tupled)) {
throw QueryCompilationErrors.showColumnsWithConflictNamespacesError(
namespace, tableNamespace.toSeq)
}
case _ =>
}
ShowColumnsExec(output, resolvedTable) :: Nil

case r @ ShowPartitions(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,8 @@ import org.apache.spark.sql.execution.LeafExecNode
* Physical plan node for show columns from table.
*/
case class ShowColumnsExec(
output: Seq[Attribute],
resolvedTable: ResolvedTable) extends V2CommandExec with LeafExecNode {
output: Seq[Attribute],
resolvedTable: ResolvedTable) extends V2CommandExec with LeafExecNode {
override protected def run(): Seq[InternalRow] = {
resolvedTable.table.columns().map(f => toCatalystRow(f.name())).toSeq
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -94,10 +94,11 @@ SHOW COLUMNS IN showdb.showcolumn1 FROM baddb
-- !query analysis
org.apache.spark.sql.AnalysisException
{
"errorClass" : "_LEGACY_ERROR_TEMP_1057",
"errorClass" : "SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE",
"sqlState" : "42K05",
"messageParameters" : {
"dbA" : "baddb",
"dbB" : "showdb"
"namespaceA" : "`baddb`",
"namespaceB" : "`showdb`"
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -123,10 +123,11 @@ struct<>
-- !query output
org.apache.spark.sql.AnalysisException
{
"errorClass" : "_LEGACY_ERROR_TEMP_1057",
"errorClass" : "SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE",
"sqlState" : "42K05",
"messageParameters" : {
"dbA" : "baddb",
"dbB" : "showdb"
"namespaceA" : "`baddb`",
"namespaceB" : "`showdb`"
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2390,16 +2390,6 @@ class DataSourceV2SQLSuiteV1Filter
sql(s"UNCACHE TABLE IF EXISTS $t")
}

test("SHOW COLUMNS") {
val t = "testcat.ns1.ns2.tbl"
withTable(t) {
spark.sql(s"CREATE TABLE $t (id bigint, data string) USING foo")
checkAnswer(sql(s"SHOW COLUMNS FROM $t IN testcat.ns1.ns2"), Seq(Row("id"), Row("data")))
checkAnswer(sql(s"SHOW COLUMNS in $t"), Seq(Row("id"), Row("data")))
checkAnswer(sql(s"SHOW COLUMNS FROM $t"), Seq(Row("id"), Row("data")))
}
}

test("ALTER TABLE ... SET [SERDE|SERDEPROPERTIES]") {
val t = "testcat.ns1.ns2.tbl"
withTable(t) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1376,39 +1376,6 @@ abstract class DDLSuite extends QueryTest with DDLSuiteBase {
}
}

test("show columns - negative test") {
// When case sensitivity is true, the user supplied database name in table identifier
// should match the supplied database name in case sensitive way.
withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
withTempDatabase { db =>
val tabName = s"$db.showcolumn"
withTable(tabName) {
sql(s"CREATE TABLE $tabName(col1 int, col2 string) USING parquet ")
checkError(
exception = intercept[AnalysisException] {
sql(s"SHOW COLUMNS IN $db.showcolumn FROM ${db.toUpperCase(Locale.ROOT)}")
},
errorClass = "_LEGACY_ERROR_TEMP_1057",
parameters = Map("dbA" -> db.toUpperCase(Locale.ROOT), "dbB" -> db)
)
}
}
}
}

test("show columns - invalid db name") {
withTable("tbl") {
sql("CREATE TABLE tbl(col1 int, col2 string) USING parquet ")
checkError(
exception = intercept[AnalysisException] {
sql("SHOW COLUMNS IN tbl FROM a.b.c")
},
errorClass = "REQUIRES_SINGLE_PART_NAMESPACE",
parameters = Map("sessionCatalog" -> "spark_catalog", "namespace" -> "`a`.`b`.`c`")
)
}
}

test("SPARK-18009 calling toLocalIterator on commands") {
import scala.jdk.CollectionConverters._
val df = sql("show databases")
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.execution.command

import org.apache.spark.sql.catalyst.analysis.{AnalysisTest, UnresolvedTableOrView}
import org.apache.spark.sql.catalyst.parser.CatalystSqlParser.parsePlan
import org.apache.spark.sql.catalyst.plans.logical.ShowColumns

class ShowColumnsParserSuite extends AnalysisTest {

test("show columns") {
comparePlans(
parsePlan("SHOW COLUMNS IN a.b.c"),
ShowColumns(
UnresolvedTableOrView(Seq("a", "b", "c"), "SHOW COLUMNS", allowTempView = true),
None))
comparePlans(
parsePlan("SHOW COLUMNS FROM a.b.c"),
ShowColumns(
UnresolvedTableOrView(Seq("a", "b", "c"), "SHOW COLUMNS", allowTempView = true),
None))
comparePlans(
parsePlan("SHOW COLUMNS IN a.b.c FROM a.b"),
ShowColumns(UnresolvedTableOrView(Seq("a", "b", "c"), "SHOW COLUMNS", allowTempView = true),
Some(Seq("a", "b"))))
comparePlans(
parsePlan("SHOW COLUMNS FROM a.b.c IN a.b"),
ShowColumns(UnresolvedTableOrView(Seq("a", "b", "c"), "SHOW COLUMNS", allowTempView = true),
Some(Seq("a", "b"))))
}

test("illegal characters in unquoted identifier") {
checkError(
exception = parseException(parsePlan)("SHOW COLUMNS IN t FROM test-db"),
errorClass = "INVALID_IDENTIFIER",
sqlState = "42602",
parameters = Map("ident" -> "test-db")
)
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one or more
* contributor license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright ownership.
* The ASF licenses this file to You under the Apache License, Version 2.0
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package org.apache.spark.sql.execution.command

import java.util.Locale

import org.apache.spark.sql.{AnalysisException, QueryTest, Row}
import org.apache.spark.sql.internal.SQLConf

/**
* This base suite contains unified tests for the `SHOW COLUMNS ...` command that
* check V1 and V2 table catalogs. The tests that cannot run for all supported catalogs are
* located in more specific test suites:
*
* - V2 table catalog tests:
* `org.apache.spark.sql.execution.command.v2.ShowColumnsSuite`
* - V1 table catalog tests:
* `org.apache.spark.sql.execution.command.v1.ShowColumnsSuiteBase`
* - V1 In-Memory catalog:
* `org.apache.spark.sql.execution.command.v1.ShowColumnsSuite`
* - V1 Hive External catalog:
* `org.apache.spark.sql.hive.execution.command.ShowColumnsSuite`
*/
trait ShowColumnsSuiteBase extends QueryTest with DDLCommandTestUtils {
override val command = "SHOW COLUMNS ..."

test("basic test") {
withNamespaceAndTable("ns", "tbl") { t =>
sql(s"CREATE TABLE $t(col1 int, col2 string) $defaultUsing")
val expected = Seq(Row("col1"), Row("col2"))
checkAnswer(sql(s"SHOW COLUMNS FROM $t IN ns"), expected)
checkAnswer(sql(s"SHOW COLUMNS IN $t FROM ns"), expected)
checkAnswer(sql(s"SHOW COLUMNS IN $t"), expected)
}
}

test("negative test - the table does not exist") {
withNamespaceAndTable("ns", "tbl") { t =>
sql(s"CREATE TABLE $t(col1 int, col2 string) $defaultUsing")

checkError(
exception = intercept[AnalysisException] {
sql(s"SHOW COLUMNS IN tbl IN ns1")
},
errorClass = "TABLE_OR_VIEW_NOT_FOUND",
parameters = Map("relationName" -> "`ns1`.`tbl`"),
context = ExpectedContext(fragment = "tbl", start = 16, stop = 18)
)
}
}

test("the namespace of the table conflicts with the specified namespace") {
withNamespaceAndTable("ns", "tbl") { t =>
sql(s"CREATE TABLE $t(col1 int, col2 string) $defaultUsing")

val sqlText1 = s"SHOW COLUMNS IN $t IN ns1"
val sqlText2 = s"SHOW COLUMNS IN $t FROM ${"ns".toUpperCase(Locale.ROOT)}"

checkError(
exception = intercept[AnalysisException] {
sql(sqlText1)
},
errorClass = "SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE",
parameters = Map(
"namespaceA" -> s"`ns1`",
"namespaceB" -> s"`ns`"
)
)
// When case sensitivity is true, the user supplied namespace name in table identifier
// should match the supplied namespace name in case-sensitive way.
withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
checkError(
exception = intercept[AnalysisException] {
sql(sqlText2)
},
errorClass = "SHOW_COLUMNS_WITH_CONFLICT_NAMESPACE",
parameters = Map(
"namespaceA" -> s"`${"ns".toUpperCase(Locale.ROOT)}`",
"namespaceB" -> "`ns`"
)
)
}
}
}
}
Loading

0 comments on commit 53c1f31

Please sign in to comment.