[SPARK-45687][CORE][SQL][ML][MLLIB][KUBERNETES][EXAMPLES][CONNECT][STRUCTURED STREAMING] Fix Passing an explicit array value to a Scala varargs method is deprecated
#43642
Conversation
cc @LuciferYang please take a look at this PR. Thanks.
Could you check again? IIRC, there should be more than 40+ files involved in this issue...
for example:
and
Thanks @LuciferYang ... checking
I will set this PR to draft first.
cc @LuciferYang, this PR is ready for review. Fixed all the warnings with the build command
Passing an explicit array value to a Scala varargs method is deprecated
Passing an explicit array value to a Scala varargs method is deprecated
@ivoson You can temporarily add "-Wconf:cat=deprecation&msg=Passing an explicit array value to a Scala varargs method is deprecated:e" to project/SparkBuild.scala to turn these warnings into errors.
core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
project/SparkBuild.scala
@@ -254,7 +254,8 @@ object SparkBuild extends PomBuild {
       // SPARK-45627 `enum`, `export` and `given` will become keywords in Scala 3,
       // so they are prohibited from being used as variable names in Scala 2.13 to
       // reduce the cost of migration in subsequent versions.
-      "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e"
+      "-Wconf:cat=deprecation&msg=it will become a keyword in Scala 3:e",
+      "-Wconf:cat=deprecation&msg=Passing an explicit array value to a Scala varargs method is deprecated:e"
In the end, we may not need to add this new compile option.
Shall we keep it to avoid other folks adding the case again?
If it's not necessary, I'll remove it.
removed.
Shall we keep it to avoid other folks adding the case again?
This will cause some memory consumption and performance difference in collection copy, so it is indeed a problem. However, I prefer to wait for a while and see if the related cases increase rapidly again. If so, we can clean them up again and make it a stricter compile check.
hmm... I want to confirm again that in Scala 3, this is still just a compilation warning, right?
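To make the warning under discussion concrete, here is a minimal standalone sketch (not Spark code — `sum`, `VarargsDemo`, and the values are made up for illustration) of the deprecated pattern and the non-copying fix the compiler message suggests:

```scala
object VarargsDemo {
  // A Scala varargs method.
  def sum(xs: Int*): Int = xs.sum

  val arr: Array[Int] = Array(1, 2, 3)

  // Deprecated since Scala 2.13 (forces a defensive copy of the array):
  //   sum(arr: _*)
  // Non-copying alternative, the pattern applied throughout this PR:
  val total: Int = sum(scala.collection.immutable.ArraySeq.unsafeWrapArray(arr): _*)
}
```

In Scala 3 the direct `sum(arr: _*)` call still compiles with a warning; `-Wconf` with the `:e` action is what escalates it to an error during this cleanup.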
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala
...alyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/UnsafeRowConverterSuite.scala
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JDBCRDD.scala
sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala
sql/core/src/test/scala/org/apache/spark/sql/ParametersSuite.scala
cc @srowen FYI
      tmpModel.transform(df)
        .withColumn(accColName, updateUDF(col(accColName), col(tmpRawPredName)))
-        .select(columns: _*)
+        .select(columns.toImmutableArraySeq: _*)
For the cases in the examples module, it is recommended to directly use toIndexedSeq or ArraySeq.unsafeWrapArray, because ArrayImplicits is private[spark].
Thanks. Do you mean the case in examples/src/main/scala/org/apache/spark/examples/graphx/Analytics.scala? Changed to use ArraySeq.unsafeWrapArray explicitly.
Looks OK, but the PR description says you'll avoid a copy with unsafeWrapArray or other methods, while the change uses toImmutableArraySeq. That's fine, but does it do the same thing?
The implementation of toImmutableArraySeq in spark/common/utils/src/main/scala/org/apache/spark/util/ArrayImplicits.scala (lines 27 to 34 in f603830) shows this is the same thing.
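To illustrate why the two spellings are equivalent, here is a rough sketch of what an ArrayImplicits-style extension looks like. This is an illustration written from the discussion above, not the exact Spark source; the names mirror the real ones but the body is an assumption:

```scala
import scala.collection.immutable.ArraySeq

object ArrayImplicitsSketch {
  implicit class SparkArrayOps[T](private val xs: Array[T]) extends AnyVal {
    // Wraps the array without copying — delegating to the same
    // ArraySeq.unsafeWrapArray call that the examples module uses directly.
    def toImmutableArraySeq: ArraySeq[T] = ArraySeq.unsafeWrapArray(xs)
  }
}
```

So `columns.toImmutableArraySeq` and `ArraySeq.unsafeWrapArray(columns)` produce the same non-copying wrapper; the implicit just reads more fluently at call sites inside Spark.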
+1, LGTM if test pass
@@ -51,7 +51,7 @@ object Analytics {
          case _ => throw new IllegalArgumentException(s"Invalid argument: $arg")
        }
      }
-    val options = mutable.Map(optionsList: _*)
+    val options = mutable.Map(immutable.ArraySeq.unsafeWrapArray(optionsList): _*)
sorry, https://github.com/apache/spark/pull/43642/files#r1383391358 was meant for this case
done.
I'm confused, why don't we use org.apache.spark.util.ArrayImplicits here?
oh, because examples do not depend on common/utils
ArrayImplicits is currently in the private[spark] scope, should we expose it in the examples code? Sorry, I'm not very clear about this rule.
oh, because examples do not depend on common/utils

common/utils is a transitive dependency of the core module, so ArrayImplicits is visible to the examples module. I previously suggested this only because I believe that private[spark] code should not be part of the examples. In this context, what is your suggestion? @cloud-fan
oh, because examples do not depend on common/utils

Just to clarify, the mllib-local module indeed does not depend on the common/utils module, so ArrayImplicits is not used in the mllib-local module.
It's fine to make examples an exception and not use ArrayImplicits.
Thanks for your confirmation.
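The Analytics.scala pattern discussed above can be sketched in isolation as follows (a minimal standalone example — the option names and values here are made up, not taken from the real Analytics arguments):

```scala
import scala.collection.{immutable, mutable}

object OptionsDemo {
  // An Array of key/value pairs, as parsed from command-line arguments.
  val optionsList: Array[(String, String)] =
    Array("numEPart" -> "4", "output" -> "/tmp/out")

  // Before (deprecated): mutable.Map(optionsList: _*)
  // After: wrap the array first, avoiding the defensive copy, and without
  // needing the private[spark] ArrayImplicits helper.
  val options: mutable.Map[String, String] =
    mutable.Map(immutable.ArraySeq.unsafeWrapArray(optionsList): _*)
}
```

Spelling out `immutable.ArraySeq.unsafeWrapArray` at the call site is slightly more verbose than the implicit, but it keeps the examples module free of private[spark] APIs, which is the exception agreed on above.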
…uite.scala Co-authored-by: YangJie <[email protected]>
…cala Co-authored-by: YangJie <[email protected]>
Passing an explicit array value to a Scala varargs method is deprecated
Passing an explicit array value to a Scala varargs method is deprecated
friendly ping @srowen Could you take another look? Thanks ~
Merged to master
Thanks @LuciferYang @srowen
What changes were proposed in this pull request?
Fix the deprecated behavior below:
Passing an explicit array value to a Scala varargs method is deprecated (since 2.13.0) and will result in a defensive copy; Use the more efficient non-copying ArraySeq.unsafeWrapArray or an explicit toIndexedSeq call
For all the use cases, we don't need to make a copy of the array; explicitly use ArraySeq.unsafeWrapArray to do the conversion.

Why are the changes needed?
Eliminate compile warnings and stop using deprecated Scala APIs.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Pass GA.
Fixed all the warnings with the build command:
mvn clean package -DskipTests -Pspark-ganglia-lgpl -Pkinesis-asl -Pdocker-integration-tests -Pyarn -Pkubernetes -Pkubernetes-integration-tests -Phive-thriftserver -Phadoop-cloud
Was this patch authored or co-authored using generative AI tooling?
No.