```scala
    println("I am in readYamlfiledata method")
    val myObj = filedata.parseYaml.convertTo[DatasetConfiguration]
    println("file name is: " + myObj.file_location)
    println("dataset name is: " + myObj.big_query_dataset)
    println("Table name is: " + myObj.big_query_tablename)
  }
}
```
@VIKCT001 I am facing the exact same error. I have a project with Scala 2.11 and Spark 2.4. After implementing the YAML parsing, everything worked flawlessly via `sbt test`, but after building a fat jar with `sbt assembly` and running `spark-submit` locally I get the method-not-found exception. After unpacking the jar, I can confirm that `org.yaml.snakeyaml.Yaml.<init>(Lorg/yaml/snakeyaml/LoaderOptions;)V` is there.
My hunch is that something bad is happening in `assemblyMergeStrategy`, as it excludes the `pom.properties` and `pom.xml` of SnakeYAML because they are under the `META-INF` folder, which is expected behavior. @VIKCT001 could you share your build.sbt?
@jcazevedo Do you have any hunches? Have you encountered such a scenario?
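For reference, the merge strategy being discussed usually looks something like the sketch below (a hypothetical `build.sbt` fragment, not taken from the original project). Discarding everything under `META-INF` is the common default, and it is what drops SnakeYAML's `pom.properties`/`pom.xml`; it does not remove the SnakeYAML classes themselves.

```scala
// build.sbt (sketch): a typical sbt-assembly merge strategy.
// Discarding META-INF excludes pom.properties / pom.xml from the
// fat jar, but leaves all .class files of SnakeYAML in place.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", _ @ _*) => MergeStrategy.discard
  case _                            => MergeStrategy.first
}
```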
The reason this issue occurs is that Apache Spark (2.4 in my case) ships SnakeYAML 1.15, which gets picked up first by the class loader when running the project with `spark-submit`, so the SnakeYAML 1.26 required by moultingyaml is ignored. (The `Yaml(LoaderOptions)` constructor does not exist in the older version, hence the `NoSuchMethodError`.)
The solution is to shade SnakeYAML in your build.sbt like this:
```scala
assemblyShadeRules in assembly := Seq(
  // fixes the problem where, when running from spark-submit,
  // an older version of SnakeYAML is picked up
  ShadeRule.rename("org.yaml.snakeyaml.**" -> "org.yaml.snakeyamlShaded@1").inAll
)
```
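To confirm which jar a class is actually loaded from at runtime (and therefore whether the shading fix took effect), a small helper like the sketch below can be dropped into the driver. `WhichJar` is a hypothetical name; the same one-liner works inline in your `main`.

```scala
// Sketch: report the jar (or directory) a class was loaded from.
// On the driver, pointing this at org.yaml.snakeyaml.Yaml shows
// whether it resolves to Spark's bundled SnakeYAML or to your own.
object WhichJar {
  def locate(cls: Class[_]): String = {
    val src = cls.getProtectionDomain.getCodeSource
    // Bootstrap / platform classes may have no code source at all.
    if (src == null || src.getLocation == null) "<bootstrap classpath>"
    else src.getLocation.toString
  }

  def main(args: Array[String]): Unit = {
    // In the real job you would use:
    //   println(locate(Class.forName("org.yaml.snakeyaml.Yaml")))
    // which prints something like .../jars/snakeyaml-1.15.jar when
    // Spark's copy wins, or your assembly jar after shading.
    println(locate(classOf[scala.Option[_]])) // the scala-library jar
  }
}
```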
I am facing the below issue while running Scala code on a Dataproc cluster. The code runs fine locally.

```
Exception in thread "main" java.lang.NoSuchMethodError: org.yaml.snakeyaml.Yaml.<init>(Lorg/yaml/snakeyaml/LoaderOptions;)V
```
```scala
object mytestmain {
  def main(args: Array[String]): Unit = {
    println("In main function")
    println("reading from gcs bucket")
    // val filecontent = new String(my_blob.getContent(), StandardCharsets.UTF_8)
  }
}
```
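While debugging outside the cluster, the commented-out bucket read can be stood in for with a plain local read like the sketch below (`LocalFileReader` and `readFileData` are hypothetical names; on Dataproc the real code would fetch the blob via the google-cloud-storage client, as in the commented line above).

```scala
import scala.io.Source

// Sketch: read a whole text file into a String, a local stand-in
// for blob.getContent() while debugging outside the cluster.
object LocalFileReader {
  def readFileData(path: String): String = {
    val src = Source.fromFile(path, "UTF-8")
    try src.mkString finally src.close()
  }
}
```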
```scala
package com.test.processing.jobs

import net.jcazevedo.moultingyaml._
import com.test.processing.conf.DatasetConfiguration

object ReadYamlConfiguration extends DefaultYamlProtocol {
  // The format body was omitted in the issue as posted; yamlFormat3 is
  // the standard moultingyaml derivation for a three-field case class.
  implicit val datasetConfFormat = yamlFormat3(DatasetConfiguration)
}
```
```scala
import net.jcazevedo.moultingyaml._
import com.test.processing.jobs.ReadYamlConfiguration._

class IngestionData {
  def printYamlfiledata(filedata: String) = {
    // (body as in the first snippet: parse filedata and print the fields)
  }
}
```
```scala
case class DatasetConfiguration(
  file_location: String,
  big_query_dataset: String,
  big_query_tablename: String
)
```
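For context, a YAML file matching this case class would look like the following (field names come from the case class; the values here are made up for illustration):

```yaml
# hypothetical example values
file_location: gs://my-bucket/input/data.csv
big_query_dataset: my_dataset
big_query_tablename: my_table
```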
It fails when I read the YAML file from the bucket, and even when I hardcode the file as input. It runs fine locally.