Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eliminate circe-yaml dependency #10326

Merged
merged 22 commits into from
Jul 5, 2024
Merged

Eliminate circe-yaml dependency #10326

merged 22 commits into from
Jul 5, 2024

Conversation

hubertp
Copy link
Collaborator

@hubertp hubertp commented Jun 20, 2024

Pull Request Description

This change adds our very-own YAML parser on top of SnakeYAML. Compared to Circe parser on top of SnakeYAML. The advantage? In some not-so-distant future we might actually get rid of circe and the related performance issues.

The logic is similar to what circe does i.e. analyzing SnakeYAML to build our own structure.
All configs already parseable. We could auto-generate some of the code but still some of the logic would have to be tweaked by hand; the current logic has a number of special cases, as I found out the hard way.

Closes #9113.

Important Notes

It's a bit hard to get a definite number of how things improved but here are some screenshots:
Before:
Screenshot from 2024-07-02 11-16-55
After:
Screenshot from 2024-07-02 11-16-38

There are a couple more savings in the range of 10-20ms each which all adds up easily to about 300ms.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    TypeScript,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • Unit tests have been written where possible.

@hubertp hubertp added the CI: Clean build required CI runners will be cleaned before and after this PR is built. label Jun 20, 2024
This change adds our very-own YAML parser on top of SnakeYAML. Compared
to Circe parser on top of SnakeYAML. The advantage? In some not-so-distant
future we might actually get rid of circe and the related performance
issues.

The logic is similar to what circe does i.e. analyzing SnakeYAML to
build our own structure.
This change is not complete, as there are still some tests failing, but
most common Configs are already parseable.
We _could_ auto-generate some of the code but still some of the logic
would have to be tweaked by hand; the current logic has a number of
special cases, as I found out the hard way.
@hubertp hubertp force-pushed the wip/hubert/9113-snakeyaml branch from bd01755 to b07c1d6 Compare June 20, 2024 15:08
@hubertp hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label Jun 20, 2024
hubertp added 8 commits June 21, 2024 01:09
Dropping circe as a decoder for Editions revealed some problems. Turns
out the current implementation had even more special cases to deal with.
Replaced almost all `toYAML` locations with SnakeYAML equivalent.
The encoding has to use Java collections for which there exists a
built-in support. If we were to use Scala collections we would have to
deal with tagging, at the very least.
@hubertp
Copy link
Collaborator Author

hubertp commented Jul 2, 2024

In the current state of PR there is zero presence of io.circe.yaml class loading (and usage) during startup despite it being still on the classpath. This is because I'm having trouble with getting rid of the encoding part which is obviously not present during startup. I think we can deal with it in the follow up PR.

hubertp added 4 commits July 2, 2024 18:39
Added a custom SnakeYAML Node updater to mimick the JSON -> YAML -> JSON
conversion needed for updating fields. The algorithm recursively follows
the key-path and inserts the desired Node. This is not a performance
oriented code on purpose.
`circe-core` was marked as `provided` but no one eventually included it
in the final jar, hence `NoClassFoundException`.
@hubertp hubertp marked this pull request as ready for review July 3, 2024 12:56
@hubertp hubertp changed the title WIP: Eliminating circe-yaml Eliminate circe-yaml Jul 3, 2024
@hubertp hubertp changed the title Eliminate circe-yaml Eliminate circe-yaml dependency Jul 3, 2024
Copy link
Member

@Akirathan Akirathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer more explicit encoding/decoding. The best would be if the implementation of our SnakeYamlEncoder and Decoder would be in Java. But if this change gives us around 300 ms improvement in startup, that is good enough.

build.sbt Outdated Show resolved Hide resolved
build.sbt Outdated Show resolved Hide resolved
@hubertp
Copy link
Collaborator Author

hubertp commented Jul 3, 2024

I would prefer more explicit encoding/decoding. The best would be if the implementation of our SnakeYamlEncoder and Decoder would be in Java. But if this change gives us around 300 ms improvement in startup, that is good enough.

We are using SnakeYAML which is as close to Java as one can get. I understand the sentiment but YAML parsing is still used 99% of the time in Scala code. And implicits eliminate a lot of useless boilerplate code in this case.

Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall it looks like a good move forward.

@hubertp hubertp merged commit c54c3b7 into develop Jul 5, 2024
41 checks passed
@hubertp hubertp deleted the wip/hubert/9113-snakeyaml branch July 5, 2024 07:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built. CI: No changelog needed Do not require a changelog entry for this PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use SnakeYAML for YAML parsing (instead of circe)
5 participants