Eliminate circe-yaml dependency #10326

hubertp · 2024-06-20T15:08:09Z

Pull Request Description

This change adds our very own YAML parser on top of SnakeYAML, in place of the Circe parser that also sits on top of SnakeYAML. The advantage? In the not-so-distant future we might actually be able to get rid of circe and its related performance issues.

The logic is similar to what circe does, i.e. it analyzes the structure produced by SnakeYAML to build our own.
All configs are already parseable. We could auto-generate some of the code, but some of the logic would still have to be tweaked by hand; the current logic has a number of special cases, as I found out the hard way.
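To illustrate the kind of hand-written decoding described above, here is a minimal sketch. It is not the code from this PR: a plain tree of `Map` / `String` values stands in for the nodes a YAML library would produce, and `TreeDecoder`, `Contact`, `ContactDecoder` and the scalar-shorthand rule are all hypothetical names and behaviour chosen only to show what a "special case" can look like.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: Map / String values stand in for parsed YAML nodes.
// None of these names are taken from the PR.
interface TreeDecoder<T> {
    // Throws IllegalArgumentException when the node has an unexpected shape.
    T decode(Object node);
}

final class Contact {
    final String name;
    final Optional<String> email;

    Contact(String name, Optional<String> email) {
        this.name = name;
        this.email = email;
    }
}

final class ContactDecoder implements TreeDecoder<Contact> {
    @Override
    public Contact decode(Object node) {
        // Special case (assumed for illustration): a bare scalar is shorthand
        // for an entry that only has a name.
        if (node instanceof String) {
            return new Contact((String) node, Optional.empty());
        }
        if (node instanceof Map) {
            Map<?, ?> fields = (Map<?, ?>) node;
            Object name = fields.get("name");
            if (!(name instanceof String)) {
                throw new IllegalArgumentException("missing or non-scalar 'name' field");
            }
            // The email field is optional; anything that is not a scalar is ignored.
            Object email = fields.get("email");
            Optional<String> decodedEmail =
                email instanceof String ? Optional.of((String) email) : Optional.empty();
            return new Contact((String) name, decodedEmail);
        }
        throw new IllegalArgumentException("expected a scalar or a mapping, got: " + node);
    }
}
```

Each accepted shape gets its own branch, which is why such decoders are awkward to generate automatically once the format allows shorthands.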

Closes #9113.

Important Notes

It's a bit hard to get a definitive number for how much things improved, but here are some screenshots:
Before: [screenshot]

After: [screenshot]

There are a couple more savings in the range of 10-20 ms each, which together easily add up to about 300 ms.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the Scala, Java, TypeScript, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
Unit tests have been written where possible.

This change adds our very own YAML parser on top of SnakeYAML, in place of the Circe parser that also sits on top of SnakeYAML. The advantage? In the not-so-distant future we might actually be able to get rid of circe and its related performance issues. The logic is similar to what circe does, i.e. it analyzes the structure produced by SnakeYAML to build our own. This change is not complete, as some tests are still failing, but the most common configs are already parseable. We _could_ auto-generate some of the code, but some of the logic would still have to be tweaked by hand; the current logic has a number of special cases, as I found out the hard way.

Dropping circe as a decoder for Editions revealed some problems. Turns out the current implementation had even more special cases to deal with.

Replaced almost all `toYAML` usages with their SnakeYAML equivalents. The encoding has to use Java collections, for which built-in support exists. If we were to use Scala collections, we would have to deal with tagging, at the very least.
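As a rough illustration of the point about Java collections, the sketch below builds a tree made only of `java.util` maps and lists, which is the shape a YAML dumper such as SnakeYAML can emit as plain mappings and sequences. The class name `EditionEncoder` and the field names are illustrative assumptions, not the PR's code, and the actual dump call is left out so that the block has no external dependencies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the class and field names are illustrative only.
final class EditionEncoder {
    private EditionEncoder() {}

    /** Builds a plain java.util tree that a YAML dumper can emit without custom tags. */
    static Map<String, Object> toJavaTree(String engineVersion, List<String> repositories) {
        // LinkedHashMap keeps insertion order, so keys come out in a stable order.
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("engine-version", engineVersion);
        // Copy into an ArrayList so the result contains only standard collection types.
        root.put("repositories", new ArrayList<>(repositories));
        return root;
    }
}
```

The resulting map could then be handed to the dumper; values of other collection types would first need converting to this form.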

hubertp · 2024-07-02T08:31:32Z

In the current state of the PR, no `io.circe.yaml` classes are loaded (or used) during startup, even though the library is still on the classpath. It is still there because I'm having trouble getting rid of the encoding part, which is obviously not exercised during startup. I think we can deal with that in a follow-up PR.

Added a custom SnakeYAML Node updater to mimic the JSON -> YAML -> JSON conversion needed for updating fields. The algorithm recursively follows the key-path and inserts the desired Node. This code is deliberately not performance-oriented.
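The recursive key-path update can be sketched as follows. This is a simplified stand-in, not the updater from the PR: nested `java.util.Map` values play the role of SnakeYAML mapping nodes, `KeyPathUpdater` is a hypothetical name, and creating missing intermediate mappings is an assumption made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: nested maps play the role of YAML mapping nodes.
final class KeyPathUpdater {
    private KeyPathUpdater() {}

    /** Returns a copy of root with value stored at path; the input is not modified. */
    static Map<String, Object> update(Map<String, Object> root, List<String> path, Object value) {
        if (path.isEmpty()) {
            throw new IllegalArgumentException("key path must not be empty");
        }
        // Copy this level so the original tree stays untouched (simple, not fast).
        Map<String, Object> copy = new LinkedHashMap<>(root);
        String head = path.get(0);
        if (path.size() == 1) {
            // End of the path: insert or overwrite the value here.
            copy.put(head, value);
            return copy;
        }
        // Otherwise descend into the child mapping and rebuild it recursively.
        Map<String, Object> child = asMap(copy.get(head));
        copy.put(head, update(child, path.subList(1, path.size()), value));
        return copy;
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> asMap(Object node) {
        // Assumption: a missing or non-mapping child is replaced by an empty mapping.
        return node instanceof Map ? (Map<String, Object>) node : new LinkedHashMap<>();
    }

    public static void main(String[] args) {
        Map<String, Object> author = new LinkedHashMap<>();
        author.put("name", "Jane");
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("author", author);
        System.out.println(update(config, List.of("author", "email"), "jane@example.com"));
    }
}
```

Copying every level along the path costs extra allocations, which matches the stated preference for simplicity over speed.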

`circe-core` was marked as `provided`, but nothing ended up including it in the final jar, hence the `NoClassDefFoundError`.

Akirathan

I would prefer more explicit encoding/decoding. Ideally, our SnakeYamlEncoder and Decoder would be implemented in Java. But if this change gives us an improvement of around 300 ms in startup, that is good enough.

hubertp · 2024-07-03T21:35:30Z

> I would prefer more explicit encoding/decoding. Ideally, our SnakeYamlEncoder and Decoder would be implemented in Java. But if this change gives us an improvement of around 300 ms in startup, that is good enough.

We are using SnakeYAML, which is as close to Java as one can get. I understand the sentiment, but 99% of the time YAML parsing is still used from Scala code, and implicits eliminate a lot of useless boilerplate in this case.

JaroslavTulach

Overall it looks like a good move forward.

hubertp added the `CI: Clean build required` label Jun 20, 2024

hubertp force-pushed the wip/hubert/9113-snakeyaml branch from bd01755 to b07c1d6 June 20, 2024 15:08

hubertp added the `CI: No changelog needed` label Jun 20, 2024

hubertp added 8 commits June 21, 2024 01:09

wip: more tests passing

741150f

Fix remaining tests in ConfigSpec

787f981

Fixing YAML decoder for editions

f9f8197

nit

486f5ab

Allow for empty exports

1bf62d1

Mostly complete encodin part

95ff29f

Remove the last remaining Circe's YAML parser

0f84a07

Bug fix + further loop optimization

5aef16f

hubertp added 4 commits July 2, 2024 18:39

removal of some dependencies

4ee1be5

Remove circe-yaml

580b6bd

Merge branch 'develop' into wip/hubert/9113-snakeyaml

4ae0057

Fix compilation issues

b567153

hubertp marked this pull request as ready for review July 3, 2024 12:56

hubertp requested review from 4e6, JaroslavTulach and Akirathan as code owners July 3, 2024 12:56

hubertp changed the title ~~WIP: Eliminating circe-yaml~~ Eliminate circe-yaml Jul 3, 2024

hubertp changed the title ~~Eliminate circe-yaml~~ Eliminate circe-yaml dependency Jul 3, 2024

fix licensing

099b7d6

hubertp requested review from radeusgd, jdunkerley, GregoryTravis, AdRiley and marthasharkey as code owners July 3, 2024 13:10

Akirathan approved these changes Jul 3, 2024

hubertp added 2 commits July 3, 2024 23:28

Removing obsolete circe definitions

81aef66

fmt

57fb2b7

hubertp added 6 commits July 4, 2024 01:06

nits

5b53477

s/SnakeYamlDecoder/YamlDecoder

386ed1d

fmt

71495d2

Partial revert, PM needs JSON decoders/encoders

4c31bd8

style

923ed65

incremental compilation gone wrong

6c37c2c

AdRiley approved these changes Jul 4, 2024

jdunkerley approved these changes Jul 4, 2024

JaroslavTulach approved these changes Jul 5, 2024

hubertp merged commit c54c3b7 into develop Jul 5, 2024
41 checks passed

hubertp deleted the wip/hubert/9113-snakeyaml branch July 5, 2024 07:32

hubertp mentioned this pull request Jan 9, 2025

Backend produces messages inconsistent with protocol definition #11964

Open
