add post about corvus.jsonschema #6

Merged 3 commits on Feb 11, 2024
Binary file modified .jekyll-metadata
Binary file not shown.
95 changes: 95 additions & 0 deletions _posts/2024/2024-02-11-dropping-codegen.md
@@ -0,0 +1,95 @@
---
title: "Dropping Project Support for Code Generation"
date: 2024-02-11 09:00:00 +1200
tags: [json-schema, codegen]
toc: true
pin: false
---

Some time ago, I released [my first attempt at code generation](/posts/exploring-codegen) from JSON Schemas. However, I've decided to deprecate the library in favor of _Corvus.JsonSchema_.

When I created _JsonSchema.Net.CodeGeneration_, I knew about _Corvus.JsonSchema_, but I thought it was merely an alternative validator. I didn't truly understand its approach to supporting JSON Schema in .NET.

Today we're going to take a look at this seeming competitor to see why it actually isn't one.

## What is _Corvus.JsonSchema_?

_Corvus.JsonSchema_ is a JSON Schema code generator that bakes validation logic directly into the model.

To show this, consider the following schema.

```json
{
"type": "object",
"properties": {
"foo": {
"type": "integer",
"minimum": 0
}
}
}
```

As one would expect, the library would generate a class with a single property: `int foo`. But more than mere auto-properties, it generates extra code in the setter to ensure that the model stays within the constraints expressed in the schema, even at runtime.

This means that the setter for `foo` would also contain logic similar to

```c#
if (value < 0)
    throw new ArgumentOutOfRangeException(nameof(value), "Value must be greater than or equal to 0");

_foo = value;
```

Review comment:

This is not actually true - you can set the value but then the result won't be `IsValid() == true` any more. We avoid exceptions wherever possible because they are so expensive.

But the setters are strongly typed (including taking into account e.g. `format`).

Reply from gregsdennis (Collaborator, Author), Feb 11, 2024:

Okay. I didn't realize that.

Looking at some docs, it seems that it will take any (typed) value, and then validation is performed in the `.IsValid()` method, as you mentioned.

Alright. I'll update that. (I thought the model was enforced to be valid.)

However _Corvus.JsonSchema_ has another trick up its sleeve. But before we get into that, it will help to have some understanding of how _System.Text.Json_'s `JsonElement` works.

## A quick review of `JsonElement`

Under the hood, `JsonElement` captures the portion of the parsed JSON text by using spans. This has a number of follow-on benefits:

- By avoiding substringing, there are no additional heap allocations.
- `JsonElement` can be a struct, which further avoids allocations, because it only maintains references to existing memory.
- By holding onto the original text, the value can be interpreted in different ways. For example, a number could be read as a `double`, a `decimal`, or an `int`.
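
As a small illustration of that last point, the sketch below reads one parsed number through several of `JsonElement`'s typed accessors. The sample document and variable names are made up for this example.

```c#
using System;
using System.Text.Json;

// Parse once; the JsonDocument owns the parsed data that every JsonElement refers to.
using JsonDocument doc = JsonDocument.Parse("{ \"foo\": 42 }");
JsonElement foo = doc.RootElement.GetProperty("foo");

// The same underlying value, interpreted on demand as different .NET types.
int asInt = foo.GetInt32();
double asDouble = foo.GetDouble();
decimal asDecimal = foo.GetDecimal();

// The original JSON text of the value is still available as well.
string raw = foo.GetRawText();

Console.WriteLine($"{asInt} {asDouble} {asDecimal} {raw}");
```

Each `Get*` call interprets the value when it is invoked, so the caller decides the numeric type at the point of use rather than at parse time.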

As an example, consider this string:

```json
{ "foo": 42, "bar": [ "a string", false ] }
```

Five different `JsonElement`s would be created:

- top-level object
- number value under `foo`
- array value under `bar`
- first element of `bar` array
- second element of `bar` array

But the kicker is that everything simply references the original string.

|Value|Backing span|
|:-|:-|
| top-level object | start: 0, length: 43 |
| number value under `foo` | start: 9, length: 2 |
| array value under `bar` | start: 20, length: 21 |
| first element of `bar` array | start: 22, length: 10 |
| second element of `bar` array | start: 34, length: 5 |
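
`JsonElement` does not expose these offsets publicly, so the following is only a rough way to compute them yourself, not how `JsonElement` is implemented. It walks the same text with `Utf8JsonReader` and records where each value starts and how long it is. Note that _System.Text.Json_ works on UTF-8 bytes rather than a .NET `string`. The sample is pure ASCII, so byte and character offsets coincide here. The `spans` and `openContainers` names are illustrative.

```c#
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.Json;

// The example document from above, encoded as UTF-8.
byte[] utf8 = Encoding.UTF8.GetBytes("{ \"foo\": 42, \"bar\": [ \"a string\", false ] }");

var reader = new Utf8JsonReader(utf8);
var openContainers = new Stack<long>();            // start offsets of unclosed objects/arrays
var spans = new List<(long Start, long Length)>(); // one entry per JSON value

while (reader.Read())
{
    switch (reader.TokenType)
    {
        case JsonTokenType.PropertyName:
            break; // property names are keys, not values
        case JsonTokenType.StartObject:
        case JsonTokenType.StartArray:
            openContainers.Push(reader.TokenStartIndex);
            break;
        case JsonTokenType.EndObject:
        case JsonTokenType.EndArray:
            // A container spans from its opening token through its closing token.
            long start = openContainers.Pop();
            spans.Add((start, reader.TokenStartIndex + 1 - start));
            break;
        case JsonTokenType.String:
            // ValueSpan excludes the surrounding quotes, so add them back.
            spans.Add((reader.TokenStartIndex, reader.ValueSpan.Length + 2));
            break;
        default:
            // Numbers, true, false, null: ValueSpan is the whole token.
            spans.Add((reader.TokenStartIndex, reader.ValueSpan.Length));
            break;
    }
}

foreach ((long s, long length) in spans)
    Console.WriteLine($"start: {s}, length: {length}");
```

Containers are reported when they close, so the entries appear innermost-first. The offsets can be compared against the table above.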

## Back to the validator

_Corvus.JsonSchema_ builds on this "backing data" pattern that `JsonElement` establishes. Instead of creating a backing field that is the same type that the property exposes, which is the traditional approach for backing fields, the generated code will use a `JsonElement`.

This means that a model generated by the library can usually be deserialized without any extra allocations, resulting in very high performance!
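
To make the "backing data" idea concrete, here is a hand-written sketch for the `foo` schema shown earlier. It is not the code _Corvus.JsonSchema_ generates: `FooModel`, its members, and the simplified integer check are all assumptions made for illustration. It follows the review note above in that validation is reported through an `IsValid()` method rather than thrown from a setter.

```c#
using System;
using System.Text.Json;

using JsonDocument doc = JsonDocument.Parse("{ \"foo\": -1 }");
var model = new FooModel(doc.RootElement);

Console.WriteLine(model.Foo);       // typed read straight from the backing JSON
Console.WriteLine(model.IsValid()); // reports the schema violation without throwing

// Hypothetical wrapper type: the backing field is a JsonElement rather than an int.
public readonly struct FooModel
{
    private readonly JsonElement _backing;

    public FooModel(JsonElement backing) => _backing = backing;

    // Interprets the backing JSON on demand; null when "foo" is absent or not an Int32.
    public int? Foo =>
        _backing.ValueKind == JsonValueKind.Object
        && _backing.TryGetProperty("foo", out JsonElement foo)
        && foo.ValueKind == JsonValueKind.Number
        && foo.TryGetInt32(out int value)
            ? value
            : (int?)null;

    public bool IsValid()
    {
        if (_backing.ValueKind != JsonValueKind.Object) return false;          // "type": "object"
        if (!_backing.TryGetProperty("foo", out JsonElement foo)) return true; // "foo" is not required
        return foo.ValueKind == JsonValueKind.Number
            && foo.TryGetInt32(out int value) // simplification of "type": "integer"
            && value >= 0;                    // "minimum": 0
    }
}
```

Because the struct only wraps the `JsonElement`, constructing it copies no JSON data. The trade-off is that the `JsonDocument` must stay alive for as long as the model is in use.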

> For a much better explanation of what's going on inside the package than what I can provide, I recommend you watch their [showcase video](https://www.youtube.com/watch?v=aTcD-axJBac).
{: .prompt-tip }

## Keep moving forward

Ever since I saw that video, I've lamented the fact that the generator is only available as a `dotnet` tool. I've always envisioned this functionality as a Roslyn source generator.

To that end, I've paired with [Matthew Adams](https://github.com/mwadams), one of the primary contributors to _Corvus.JsonSchema_, as co-mentor on a [project proposal](https://github.com/json-schema-org/community/issues/614) for JSON Schema's submission to Google Summer of Code. This project aims to wrap the existing library in an incremental source generator that uses JSON Schema files within a .NET project to automatically generate models at compile time.

This is a great opportunity to learn about incremental source generators in .NET and build your open source portfolio. If this sounds like a fun project, please make your interest known by commenting on the proposal issue linked above.

(Even if it's not accepted by GSoC, we're probably going to do it anyway.)