```scala
    println("I am in readYamlfiledata method")
    val myObj = filedata.parseYaml.convertTo[DatasetConfiguration]
    println("file name is: " + myObj.file_location)
    println("dataset name is: " + myObj.big_query_dataset)
    println("Table name is: " + myObj.big_query_tablename)
  }
}
```
@VIKCT001 I am facing the exact same error. I have a project with Scala 2.11 and Spark 2.4. After implementing the YAML parsing, everything worked flawlessly via `sbt test`, but after building a fat jar with `sbt assembly` and running `spark-submit` locally I get the method-not-found exception. After unpacking the jar, I can confirm that `org.yaml.snakeyaml.Yaml.<init>(Lorg/yaml/snakeyaml/LoaderOptions;)V` is there.
My hunch is that something bad is happening in `assemblyMergeStrategy`, as it excludes the `pom.properties` and `pom.xml` of SnakeYAML because they are under the `META-INF` folder, which is expected behavior. @VIKCT001 could you share your build.sbt?
@jcazevedo Do you have any hunches? Have you encountered such a scenario?
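For reference, the merge strategy being discussed usually looks something like the sketch below (a hypothetical `build.sbt` fragment, not taken from the original project). Discarding everything under `META-INF` is the common default, and it is what drops SnakeYAML's `pom.properties`/`pom.xml`; it does not remove the SnakeYAML classes themselves.

```scala
// build.sbt (sketch): a typical sbt-assembly merge strategy.
// Discarding META-INF excludes pom.properties / pom.xml from the
// fat jar, but leaves all .class files of SnakeYAML in place.
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", _ @ _*) => MergeStrategy.discard
  case _                            => MergeStrategy.first
}
```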
The reason this issue occurs is that Apache Spark (2.4 in my case) ships SnakeYAML 1.15, which gets picked up first by the class loader when running the project with `spark-submit`, so the SnakeYAML 1.26 required by moultingyaml is ignored. (The `Yaml(LoaderOptions)` constructor does not exist in the older version, hence the `NoSuchMethodError`.)
The solution is to shade SnakeYAML in your build.sbt like this:
```scala
assemblyShadeRules in assembly := Seq(
  // fixes the problem where, when running from spark-submit,
  // an older version of SnakeYAML is picked up
  ShadeRule.rename("org.yaml.snakeyaml.**" -> "org.yaml.snakeyamlShaded@1").inAll
)
```
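To confirm which jar a class is actually loaded from at runtime (and therefore whether the shading fix took effect), a small helper like the sketch below can be dropped into the driver. `WhichJar` is a hypothetical name; the same one-liner works inline in your `main`.

```scala
// Sketch: report the jar (or directory) a class was loaded from.
// On the driver, pointing this at org.yaml.snakeyaml.Yaml shows
// whether it resolves to Spark's bundled SnakeYAML or to your own.
object WhichJar {
  def locate(cls: Class[_]): String = {
    val src = cls.getProtectionDomain.getCodeSource
    // Bootstrap / platform classes may have no code source at all.
    if (src == null || src.getLocation == null) "<bootstrap classpath>"
    else src.getLocation.toString
  }

  def main(args: Array[String]): Unit = {
    // In the real job you would use:
    //   println(locate(Class.forName("org.yaml.snakeyaml.Yaml")))
    // which prints something like .../jars/snakeyaml-1.15.jar when
    // Spark's copy wins, or your assembly jar after shading.
    println(locate(classOf[scala.Option[_]])) // the scala-library jar
  }
}
```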
I am facing the below issue while running Scala code on a Dataproc cluster. The code runs fine locally.

```
Exception in thread "main" java.lang.NoSuchMethodError: org.yaml.snakeyaml.Yaml.<init>(Lorg/yaml/snakeyaml/LoaderOptions;)V
```
```scala
object mytestmain {
  def main(args: Array[String]): Unit = {
    println("In main function")
    println("reading from gcs bucket")
    // val filecontent = new String(my_blob.getContent(), StandardCharsets.UTF_8)
  }
}
```
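While debugging outside the cluster, the commented-out bucket read can be stood in for with a plain local read like the sketch below (`LocalFileReader` and `readFileData` are hypothetical names; on Dataproc the real code would fetch the blob via the google-cloud-storage client, as in the commented line above).

```scala
import scala.io.Source

// Sketch: read a whole text file into a String, a local stand-in
// for blob.getContent() while debugging outside the cluster.
object LocalFileReader {
  def readFileData(path: String): String = {
    val src = Source.fromFile(path, "UTF-8")
    try src.mkString finally src.close()
  }
}
```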
```scala
package com.test.processing.jobs

import net.jcazevedo.moultingyaml._
import com.test.processing.conf.DatasetConfiguration

object ReadYamlConfiguration extends DefaultYamlProtocol {
  // The format body was omitted in the issue as posted; yamlFormat3 is
  // the standard moultingyaml derivation for a three-field case class.
  implicit val datasetConfFormat = yamlFormat3(DatasetConfiguration)
}
```
```scala
import net.jcazevedo.moultingyaml._
import com.test.processing.jobs.ReadYamlConfiguration._

class IngestionData {
  def printYamlfiledata(filedata: String) = {
    // (body as in the first snippet: parse filedata and print the fields)
  }
}
```
```scala
case class DatasetConfiguration(
  file_location: String,
  big_query_dataset: String,
  big_query_tablename: String
)
```
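For context, a YAML file matching this case class would look like the following (field names come from the case class; the values here are made up for illustration):

```yaml
# hypothetical example values
file_location: gs://my-bucket/input/data.csv
big_query_dataset: my_dataset
big_query_tablename: my_table
```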
It fails when I read the YAML file from the bucket, and even when I hardcode the file as input. It runs fine locally.