[SPARK-49413][CONNECT][SQL] Create a shared RuntimeConfig interface #47980
Conversation
Merging to master.
 import org.apache.spark.sql.connect.client.SparkConnectClient

 /**
  * Runtime configuration interface for Spark. To access this, use `SparkSession.conf`.
  *
  * @since 3.4.0
  */
-class RuntimeConfig private[sql] (client: SparkConnectClient) extends Logging {
+class ConnectRuntimeConfig private[sql] (client: SparkConnectClient)
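For readers following along, here is a minimal, self-contained sketch of the shared-interface pattern this PR introduces. The stub client, method set, and bodies below are illustrative assumptions, not the actual Spark sources:

```scala
// Sketch only: simplified stand-ins for the real classes. In Spark, the
// abstract RuntimeConfig lives in sql/api and ConnectRuntimeConfig in the
// Connect client module; this client trait is a hypothetical stub.
trait SparkConnectClient {
  def setConfig(key: String, value: String): Unit
  def getConfigOption(key: String): Option[String]
  def unsetConfig(key: String): Unit
}

// Shared runtime configuration interface; users program against this type.
abstract class RuntimeConfig {
  def set(key: String, value: String): Unit
  def get(key: String): String
  def getOption(key: String): Option[String]
  def unset(key: String): Unit
}

// Connect-side implementation, backed by RPCs through the client.
class ConnectRuntimeConfig(client: SparkConnectClient) extends RuntimeConfig {
  override def set(key: String, value: String): Unit = client.setConfig(key, value)
  override def get(key: String): String =
    getOption(key).getOrElse(throw new NoSuchElementException(key))
  override def getOption(key: String): Option[String] = client.getConfigOption(key)
  override def unset(key: String): Unit = client.unsetConfig(key)
}
```

Classic would supply a sibling subclass that delegates to the session's SQLConf instead of a client, which is what lets `SparkSession.conf` expose a single type across both backends.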
This looks like a breaking change at the API level.
- I'm wondering if this is okay, @hvanhovell?
- In addition, please fix the `@since 3.4.0` tag, because `ConnectRuntimeConfig` only exists as of 4.0.0.
This class is not part of the public API; it lives in an internal package. Regarding the tag, I am not sure: that class has existed since 3.4.0, it just got renamed.
To @hvanhovell: the previous RuntimeConfig is a documented public API.
@@ -160,6 +160,10 @@ object MimaExcludes {
    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.DataFrameWriterV2"),
    ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.WriteConfigMethods"),
+
+   // SPARK-49413: Create a shared RuntimeConfig interface.
+   ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.RuntimeConfig"),
+   ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.RuntimeConfig$"),
Ya, this does look like a real breaking change on the Spark SQL side.
No, it is not. The org.apache.spark.sql.RuntimeConfig class is still there; it was just moved. MiMa, unfortunately, cannot deal with that.
Ya, got it. So there is effectively no breaking change, because it's in the api module?
Could you add a comment here as a follow-up, please, @hvanhovell?
Sure, I will create a follow-up.
@@ -1110,13 +1103,13 @@ object SparkSession extends Logging {
   private[sql] def getOrCloneSessionWithConfigsOff(
       session: SparkSession,
       configurations: Seq[ConfigEntry[Boolean]]): SparkSession = {
-    val configsEnabled = configurations.filter(session.conf.get[Boolean])
+    val configsEnabled = configurations.filter(session.sessionState.conf.getConf[Boolean])
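To see why the eta-expanded predicate still typechecks after this switch, here is a runnable toy; the `ConfigEntry` and `SQLConf` below are simplified stand-ins for Spark's internal types, not the real ones:

```scala
// Toy stand-ins for Spark's internal ConfigEntry[T] and SQLConf.
final case class ConfigEntry[T](key: String, default: T)

class SQLConf(settings: Map[String, String]) {
  // Simplified typed lookup; this toy only handles Boolean-backed entries.
  def getConf[T](entry: ConfigEntry[T]): T = settings.get(entry.key) match {
    case Some(v) => v.toBoolean.asInstanceOf[T]
    case None    => entry.default
  }
}

object GetOrCloneDemo extends App {
  val conf = new SQLConf(Map("spark.sql.x.enabled" -> "true"))
  val configurations = Seq(
    ConfigEntry("spark.sql.x.enabled", default = false),
    ConfigEntry("spark.sql.y.enabled", default = false))
  // Same shape as the call site above: the method value getConf[Boolean]
  // is used directly as the filter predicate.
  val configsEnabled = configurations.filter(conf.getConf[Boolean])
  println(configsEnabled.map(_.key)) // List(spark.sql.x.enabled)
}
```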
Although this is not a breaking change, is this change unavoidable for some reason?
If so, developers may well be annoyed by this change.
In this PR I use the abstract class in sql/api as the main interface. I did not really want to add support for this, because it is not user facing and because it would require me to move even more stuff into sql/api.
Which developers? Spark developers?
@@ -52,7 +52,7 @@ class FileBasedDataSourceSuite extends QueryTest

   override def beforeAll(): Unit = {
     super.beforeAll()
-    spark.conf.set(SQLConf.ORC_IMPLEMENTATION, "native")
+    spark.conf.set(SQLConf.ORC_IMPLEMENTATION.key, "native")
Is this a required user-facing change?
Well, the method that takes a SQL conf entry is gone now... This was not a user-facing feature to begin with.
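To make the change concrete, here is a hypothetical miniature of the two API shapes, with signatures inferred from the diffs in this thread rather than copied from the Spark sources:

```scala
// Hypothetical miniature of the API before and after this PR.
final case class ConfigEntry[T](key: String)

object BeforeThisPR {
  trait RuntimeConfig {
    def set(key: String, value: String): Unit
    // Typed convenience overload that this PR removed:
    def set[T](entry: ConfigEntry[T], value: T): Unit
  }
}

object AfterThisPR {
  trait RuntimeConfig {
    // Only the string-keyed form survives, hence `.key` at the call site above.
    def set(key: String, value: String): Unit
  }
}
```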
Thank you for driving this direction, @hvanhovell. In general it looks like it achieves most of the goals, but the following change looks like a big user-facing (Spark app developer) glitch to me. If possible, can we make the Apache Spark 4.0.0 migration more seamless by supporting this in the same way as before?
If this is difficult, could you file a JIRA issue instead? Then, if someone needs it, they might pick it up.
@dongjoon-hyun which developers are you talking about? I have fixed all the Spark cases. Generally, library developers should not be using any of this if they don't want their stuff to break.
It is not difficult; it is just yet another thing that needs to be moved to common/utils.
…interface

What changes were proposed in this pull request?
This PR adds support for ConfigEntry to the RuntimeConfig interface. This was removed in #47980.
Why are the changes needed?
This functionality is used a lot by Spark libraries. Removing it caused friction, and adding it back does not pollute the RuntimeConfig interface.
Does this PR introduce any user-facing change?
No. This is developer API.
How was this patch tested?
I have added test cases for Connect and Classic.
Was this patch authored or co-authored using generative AI tooling?
No.
Closes #49062 from hvanhovell/SPARK-49709.
Authored-by: Herman van Hovell <[email protected]>
Signed-off-by: Herman van Hovell <[email protected]>
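With that follow-up merged, both call styles should work again for Spark library code. A hedged usage sketch follows, assuming a session `spark` and Spark's `SQLConf` entries on the classpath; per the follow-up's description these are developer API, so the entry-typed forms may only be visible from the `org.apache.spark.sql` package:

```scala
spark.conf.set(SQLConf.ORC_IMPLEMENTATION.key, "native") // string-keyed form, always available
spark.conf.set(SQLConf.ORC_IMPLEMENTATION, "native")     // entry-typed form, restored by SPARK-49709
val impl = spark.conf.get(SQLConf.ORC_IMPLEMENTATION)    // typed read that falls back to the entry's default
```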
What changes were proposed in this pull request?
This PR introduces a shared RuntimeConfig interface.
Why are the changes needed?
We are creating a shared Scala Spark SQL interface for Classic and Connect.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Existing tests.
Was this patch authored or co-authored using generative AI tooling?
No.
Closes apache#47980 from hvanhovell/SPARK-49413.
Authored-by: Herman van Hovell <[email protected]>
Signed-off-by: Herman van Hovell <[email protected]>