Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate Map Entries get "squashed" to the last entry in Yaml src #12

Open
rbuckland opened this issue Jul 27, 2016 · 4 comments
Open

Comments

@rbuckland
Copy link

rbuckland commented Jul 27, 2016

There is unexpected behaviour when dealing with duplicate maps inside a yaml file.

foo:
  value: 1
foo:
  value: 2

will be valid, and will only have a {foo:{value:2}} map (no error).
The expected behaviour would be to have an exception // parsing error of some form.

The underlying behaviour is in snakeyaml, and it tracked and documented well here https://bitbucket.org/asomov/snakeyaml/issues/337/option-to-disallow-duplicate-keys.

This issue is needed to track the outcome of the PR to snakeyaml.

@jcazevedo
Copy link
Owner

I think the parseYaml and parseYamls methods of MoultingYAML should use the default BaseConstructor of SnakeYAML if called without arguments. I'd more inclined to change the signature of parseYaml and parseYamls to something that allows customizing the SnakeYAML BaseConstructor, i.e.:

def parseYaml(baseConstructor: BaseConstructor = new Constructor()) = ???
def parseYamls(baseConstructor: BaseConstructor = new Constructor()) = ???

@rbuckland
Copy link
Author

👍 - makes sense to change the signature

regarding which constructor (or better put - the default behaviour)

I maybe slightly green on understanding the YAML specifications but it occurs to me that "accepting" duplicate keys (in 1.1) is not good.. and in 1.2 is a MUST NOT allow duplicates.

I could be wrong, but it leads to the "default" behaviour to be

NOT_ALLOW_DUPLICATES

and error on load if they are found.
In code that I am writing for snakeyaml now ..

     * YAML 1.1 is slightly vague around duplicate entries in the YAML file.
     * The best reference is <a href="http://www.yaml.org/spec/1.1/#id862121">
     * 3.2.1.3. Nodes Comparison</a> where it hints that a duplicate map key is 
     * an error.
     * 
     * For future reference, YAML spec 1.2 is clear. The keys MUST be unique.
     * <a href="http://www.yaml.org/spec/1.2/spec.html#id2759572">1.3. Relation
     * to JSON</a>

I am about to push a PR to snakeyaml that introduces a config object to the constructor, instead of this overloaded method.

@rbuckland
Copy link
Author

rbuckland commented Jul 28, 2016

I have reworked the change in snakeyaml and then reworked the impact to my fork.
https://github.com/rbuckland/moultingyaml

  • snakeyaml now takes an optional LoadingConfig (similar to DumperOptions)

    • it added setAllowDuplicateKeys(boolean allowDuplicateKeys)
  • this loadingConfig is supplied via an implicit

      implicit class PimpedString(val string: String) extends AnyVal {
    
        def parseYaml()(implicit loadingConfig: LoadingConfig = new LoadingConfig()): YamlValue =
          convertToYamlValue(new Yaml(loadingConfig).load(string))
    
        def parseYamls()(implicit loadingConfig: LoadingConfig = new LoadingConfig()): Seq[YamlValue] =
          new Yaml(loadingConfig).loadAll(string).asScala.map(convertToYamlValue).toSeq
      }
    

    implementation is simply

      implicit val loadingConfig = new LoadingConfig()
      loadingConfig.setAllowDuplicateKeys(false)
     ... 
     ... parseYamls...
    
  • once snakeyaml PR is reviewed / approved and released, I will create the PR from my fork

@jcazevedo
Copy link
Owner

If SnakeYAML is to provide a way to configure the way YAMLs are parsed, I'd rather have the configuration be done explicitly in the parseYaml(s) methods. Also, I wouldn't expose a SnakeYAML object (i.e. LoadingConfig), but rather have a MoultingYAML interface to avoid exposing the underlying implementation (the configurable print method is implemented that way).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants