What is the bug?
When running queries in the REPL, the error message returned is generic and lacks specifics about the root cause. For example, a failed query produces only the following error message:
Fail to run query. Cause: Exception thrown in awaitResult:
This message does not provide sufficient information to diagnose the actual issue, making troubleshooting difficult.
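The generic prefix matches the message Spark's ThreadUtils.awaitResult attaches when it wraps a failure, so reporting only the wrapper's getMessage drops the detail carried by the cause chain. A minimal, self-contained sketch of the effect (plain RuntimeExceptions stand in for the actual Spark exception types):

// Sketch: a wrapper exception's message hides the root cause when only
// getMessage is reported. Plain RuntimeExceptions stand in for Spark's types.
object WrapperMessageDemo extends App {
  val root    = new RuntimeException("No FileSystem for scheme \"s3\"")
  val wrapper = new RuntimeException("Exception thrown in awaitResult: ", root)

  println(wrapper.getMessage)          // generic prefix only
  println(wrapper.getCause.getMessage) // the actionable detail
}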
How can one reproduce the bug?
The bug can be reproduced using the following steps in the REPL IT environment:
class FlintREPLITSuite extends SparkFunSuite with OpenSearchSuite with JobTest {
  ...
  test("error") {
    try {
      createSession(jobRunId, "")
      threadLocalFuture.set(startREPL())
      val createStatement =
        s"""
           | CREATE TABLE $testTable
           | (
           |   name STRING,
           |   age INT
           | )
           | USING CSV
           | OPTIONS (
           |   header 'false',
           |   delimiter '\\t'
           | )
           | LOCATION 's3://test-123';
           |""".stripMargin
      submitQuery(s"${makeJsonCompliant(createStatement)}", "99")
      val selectQueryValidation: REPLResult => Boolean = result => {
        // noinspection ScalaStyle
        println(result)
        true
      }
      pollForResultAndAssert(selectQueryValidation, "99")
    } catch {
      case e: Exception =>
        logError("Unexpected exception", e)
        assert(false, "Unexpected exception")
    } finally {
      waitREPLStop(threadLocalFuture.get())
      threadLocalFuture.remove()
    }
  }
}
Running the test logs the failed statement with only the generic message:
24/07/23 15:01:16 INFO FlintREPL: command complete: FlintStatement(
state=failed,
query= CREATE TABLE my_glue1.default.flint_sql_test ( name STRING, age INT ) USING CSV
OPTIONS ( header 'false', delimiter ' ' ) LOCATION 's3://test-123'; ,
statementId=e0cafbd7-6e28-4f6e-8076-78026e28d6c6,
queryId=99,
submitTime=1721772068015,
error=Some({"Message":"Fail to run query. Cause: Exception thrown in awaitResult: "}))
What is the expected behavior?
The system should provide a detailed error message that includes the root cause of the failure. Instead of a generic message, users should receive information that helps in diagnosing the underlying problem, such as specific exceptions or stack traces.
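One way to meet that expectation is to unwrap the exception chain before formatting the message. A minimal sketch of such a helper (the name rootCause is illustrative; Apache Commons Lang's ExceptionUtils.getRootCause behaves the same way):

import scala.annotation.tailrec

// Walk the cause chain to its deepest entry, guarding against
// self-referential causes.
@tailrec
def rootCause(t: Throwable): Throwable =
  if (t.getCause == null || (t.getCause eq t)) t else rootCause(t.getCause)

Reporting rootCause(ex).getMessage would surface "No FileSystem for scheme \"s3\"" instead of the bare awaitResult prefix.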
Do you have any additional context?
Spark log:
24/07/23 15:01:16 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: s3://test-123.
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3443)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:53)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:79)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:98)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:109)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:169)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:95)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:94)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:176)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:584)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:560)
at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:94)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:81)
at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:79)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
at org.apache.spark.sql.FlintJobExecutor.executeQuery(FlintJobExecutor.scala:423)
at org.apache.spark.sql.FlintJobExecutor.executeQuery$(FlintJobExecutor.scala:412)
at org.apache.spark.sql.FlintREPL$.executeQuery(FlintREPL.scala:49)
at org.apache.spark.sql.FlintREPL$.$anonfun$executeQueryAsync$1(FlintREPL.scala:821)
at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
at scala.util.Success.$anonfun$map$1(Try.scala:255)
at scala.util.Success.map(Try.scala:213)
at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
24/07/23 15:01:16 INFO FlintREPL: command complete: FlintStatement(
state=failed, query= CREATE TABLE my_glue1.default.flint_sql_test ( name STRING, age INT ) USING CSV
OPTIONS ( header 'false', delimiter ' ' ) LOCATION 's3://test-123'; ,
statementId=e0cafbd7-6e28-4f6e-8076-78026e28d6c6, queryId=99, submitTime=1721772068015,
error=Some({"Message":"Fail to run query. Cause: Exception thrown in awaitResult: "}))
After investigation, the root cause of the issue lies in the line changed in the diff below: the code passes the original wrapper exception ex to handleQueryException instead of the unwrapped root cause r. After making that change, the error message stored in the result carries the root cause as expected:
24/07/23 15:25:30 INFO FlintREPL: command complete: FlintStatement(
state=failed, query= CREATE TABLE my_glue1.default.flint_sql_test ( name STRING, age INT )
USING CSV OPTIONS ( header 'false', delimiter ' ' ) LOCATION 's3://test-123'; ,
statementId=8ba68f33-d83b-428a-82b9-60def7a29cd2, queryId=99, submitTime=1721773522390,
error=Some({"Message":"Fail to run query. Cause: No FileSystem for scheme \"s3\""}))
Code changes:
diff --git a/spark-sql-application/src/main/scala/org/apache/spark/sql/FlintJobExecutor.scala b/spark-sql-application/src/main/scala/org/apache/spark/sql/FlintJobExecutor.scala
index f38a27ef..c7dccf64 100644
--- a/spark-sql-application/src/main/scala/org/apache/spark/sql/FlintJobExecutor.scala
+++ b/spark-sql-application/src/main/scala/org/apache/spark/sql/FlintJobExecutor.scala
@@ -500,7 +500,7 @@ trait FlintJobExecutor {
         val e = new SecurityException(ExceptionMessages.GlueAccessDeniedMessage)
         handleQueryException(e, ExceptionMessages.QueryRunErrorPrefix)
       } else {
-        handleQueryException(ex, ExceptionMessages.QueryRunErrorPrefix)
+        handleQueryException(r, ExceptionMessages.QueryRunErrorPrefix)
       }
     }
   }
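For illustration, a self-contained check of the fix's effect, assuming r is the deepest cause of ex (the rootCause helper below is hypothetical and mirrors the sketch above; the real code may derive r differently):

import scala.annotation.tailrec

@tailrec
def rootCause(t: Throwable): Throwable =
  if (t.getCause == null || (t.getCause eq t)) t else rootCause(t.getCause)

// The wrapper mirrors the shape of the awaitResult failure from the logs.
val ex = new RuntimeException(
  "Exception thrown in awaitResult: ",
  new RuntimeException("No FileSystem for scheme \"s3\""))

val r = rootCause(ex)
assert(r.getMessage == "No FileSystem for scheme \"s3\"")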