An Introduction to Corvus.JsonSchema Code Generator for .NET | endjin

Matthew Adams 18th December 2024

Open Source

Matthew Adams takes you through a high level tour of Corvus.JsonSchema, an open-source, high-performance JSON schema validation and serialization library.

The library generates C# code from JSON schema files, allowing for improved interaction and performance over traditional deserialization methods. The video demonstrates how to set up a .NET project, create JSON schema files, and utilize Corvus.JsonSchema's features like the newly added source generator, which eliminates the need for command-line tools by integrating code generation directly into the project. It also covers adding necessary NuGet packages and setting build actions for files. Practical examples include creating and validating a simple 'TestPerson' JSON schema file. The video concludes with a demonstration of generating structured data types based on schema constraints and highlights the benefits of a schema-first approach to code generation.

Transcript

Hello, I'm Matthew Adams from endjin, and today I'm going to be talking about Corvus.JsonSchema, the open source, high performance, low allocation JSON Schema validation and serialization library from endjin. If you like this kind of content, please hit like and subscribe; it really helps us get our information out to the audience that could benefit from it, and it costs you absolutely nothing. Corvus.JsonSchema, what does it do? It takes JSON Schema files and generates C# code from them, so that you can interact with JSON in much the same way you would if you deserialized JSON into handcrafted types, but with higher performance and close to zero memory allocation in normal usage.

Now, we're on v4 of the library, which is about to be released alongside .NET 9 in November of 2024. Up to v3, and we're on v3.1 in the released version at the moment, there's a command line tool that lets you generate your source code, and it's got various options to control the way that's generated. Now that's still the case in v4: there's still a command line tool, and in fact it's been enhanced with a whole bunch more options for the way you can generate your code, including naming conventions, and being able to generate multiple types all at once. But we are also, thanks to the requests of our excellent and much beloved community, creating a source generator for the v4 library. And what does that mean? That means you don't need the command line tool. What we do is embed the source generator right into the project, and it generates code in place in the project as you modify the JSON Schema files, with full IntelliSense and all the usual things you'd expect from the IDE.

So how do we go about doing that? I've set up a solution here with two project files in it. One is called WorkingWithJSONSchema, and that's just a standard .NET 8 library. We actually support .NET Standard as well, and we'll be supporting .NET 9 as soon as it ships; the preview in fact already supports .NET 9. The other project we've got in here is just a standard console application that's referencing that first project. So we're set up. So what do we need now? I guess some JSON Schema would be a good place to start if we're going to be generating code from JSON Schema. So I'm going to add a JSON schema file here, and I'll just do it with the shortcut key, Control Shift A, and let's call this one TestPerson.json.

Okay, so we've got our TestPerson.json schema file there, and I suppose we should tell it that it's a schema file by adding our $schema, and we're going to use the standard draft 2020-12 schema. You can find more information about JSON Schema on json-schema.org if you're not all that familiar. Now what should we do?

Let's add a title first, and the title can be an example person. So we're going to generate a person type. So this type, yep, we'll make it an object like it suggested there, and we will give it some properties. One thing I quite like about code generating from JSON Schema is that it's a really fast way to build up a kind of POCO object model, including constraints and validation; far faster, in my opinion, than going code first. It's one of the many reasons why I like this sort of schema first approach, once you're used to the syntax, that is. So the properties I'm going to create, let's create a first name.

Very good. It's made a good guess. It's almost like I've done this before. Let's have a first name whose type is string, and we'll give it a description. And we'll also have a last name, and the last name we're going to make exactly the same: type string and description, the person's last name, and some spurious extra bits that IntelliSense added for me. So there we are. There's the first name and the last name. What else should we do? Let's add a required field. So we'll make it so that you have to have a last name. You don't have to have a first name, but you have to have a last name. So there we go. There's a JSON schema file for a very simple person.
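
For readers following along without the video, the TestPerson.json file built up so far might look roughly like the sketch below. The exact title and description strings are assumptions reconstructed from the narration, not copied from the screen.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "An example person",
  "type": "object",
  "required": [ "lastName" ],
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name."
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    }
  }
}
```

Only lastName appears in the required array, which matches the narration: the first name may be missing, the last name may not.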

Now, what are we going to do with that? What we need to do is add a couple of NuGet packages. One's a runtime NuGet package, and that provides the Corvus.JsonSchema type support for all the JSON types. That's in a package called Corvus.Json.ExtendedTypes, and we're going to install the v4 preview version. So we'll install that. That gives us a runtime. And then, separately, we have the source generator. Now, it's likely that we'll roll this into the Corvus.Json.ExtendedTypes package when it comes out of preview. But for now, we install it separately. So let's install Corvus.Json.SourceGenerator, the source generator package.

Now, this is a development time package. We don't actually ship anything from Corvus.Json.SourceGenerator with the files that are generated, just the Corvus.Json.ExtendedTypes file. So we've added those packages, and when we build, it still builds as you'd expect. Now, at this point, what we're going to do is restart Visual Studio. Now, why do we need to restart Visual Studio, you might ask? Source generators are somewhat sensitive to being bootstrapped. And when you first apply a source generator to a project and build it inside Visual Studio, sometimes it's not in a very happy state. So I find it's best to quit Visual Studio, and reload it, once you've added a source generator.
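
As a rough sketch, the two package references in the library's csproj file could end up looking something like this. The version value is a placeholder rather than a real version number, and the PrivateAssets setting is an assumption based on how development-time packages are commonly referenced, not something shown in the video.

```xml
<ItemGroup>
  <!-- Runtime support for the generated types. Replace x.y.z with the v4 version you install. -->
  <PackageReference Include="Corvus.Json.ExtendedTypes" Version="x.y.z" />

  <!-- Development-time only: the source generator itself is not shipped with your output. -->
  <!-- PrivateAssets="all" is an assumed, conventional setting for this kind of package. -->
  <PackageReference Include="Corvus.Json.SourceGenerator" Version="x.y.z" PrivateAssets="all" />
</ItemGroup>
```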

So that's what I'm going to do now, and you won't see any of that through the power of editing. So here we are, back in Visual Studio, as if nothing had ever happened. So the first thing we need to do is to understand that source generators can only work with files that are actually in the solution. You can't make external references in your JSON schema. You can't reference something on the web, for example, or another file arbitrarily somewhere in the file system. It's not allowed to do IO operations during source generation. That's a limitation of the source generator that's not there for the command line tool, and it might be a reason why you wish to continue using the command line tool as part of your build process.

But anyway, the source generator works with files in the solution. And when it needs access to those files, you can open up the properties on the file in question and set its build action to C# analyzer additional file. You might be familiar with that for embedding resources, or applying content and getting that to be part of your final package. But in this case, you specify C# analyzer additional file. And if you have a look at what that's done in the project's csproj file, you can see it's added this AdditionalFiles item group, and it's also removed it from its default action. You don't need to do that, you can just add it to the AdditionalFiles item group, but it's nice to keep it tidy.
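
The csproj change described above can be sketched as follows, assuming TestPerson.json sits in the project root. The first item group removes the file from its default build action and the second exposes it to analyzers and source generators.

```xml
<ItemGroup>
  <!-- Optional tidy-up: take the file out of the default None item group. -->
  <None Remove="TestPerson.json" />
</ItemGroup>

<ItemGroup>
  <!-- Makes the schema visible to the source generator as an additional file. -->
  <AdditionalFiles Include="TestPerson.json" />
</ItemGroup>
```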

So now the source generator knows about our JSON schema file. So how do we get it to generate some code? The source generator has already done some work. If we look in the analyzers, so if we go to this tab here and pop open our source generator, you can see that it's generated this JsonSchemaTypeGeneratorAttribute file already. And that in fact is the attribute that we apply in order to be able to get our source generator to emit code. So what are we going to do there? Let's roll that up again. Okay. We're going to create a new file, and we'll create it in a Models folder, and we're going to call this TestPerson.cs, okay? So in the Models folder we've now got a TestPerson.cs, and let's just switch that into the simpler namespace format, and what we need to do is create a public readonly struct. So all our emitted types are public readonly structs. That's what helps with the high performance, low allocation model for the application, but we also need to make it partial.

Because what the source generator is going to do is create additional code behind files that provide all the functionality for validation and property access and conversion and so forth. So what we can do is add that JsonSchemaTypeGenerator attribute to our partial struct, and then what we need to do is pass it the URI to the type that we want to generate. And in this case, we're just generating the root type, so we don't need a fragment part on this TestPerson JSON file reference, and we're giving it the path relative to the file that this attribute is being applied to. So that's the TestPerson.json file. Having saved that, we can see what's happened inside our source generator.
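
A minimal sketch of the TestPerson.cs file described above is shown here. The namespace name is an assumption derived from the project and folder names, and the relative path assumes TestPerson.json lives one folder above Models. The file only compiles in a project that references the Corvus packages, because the attribute is supplied by the source generator.

```csharp
// Models/TestPerson.cs
namespace WorkingWithJsonSchema.Models; // assumed namespace, file-scoped format

using Corvus.Json;

// The path is relative to this source file; no fragment is needed
// because we are generating the root type of the schema.
[JsonSchemaTypeGenerator("../TestPerson.json")]
public readonly partial struct TestPerson
{
}
```

The struct body is left empty on purpose: the generator supplies the other partial declarations containing validation, property access and conversions.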

It's generated some TestPerson code behind. So let's see if we can use that. Let's go into the program, and we'll get rid of that little greeting, and we will create a TestPerson. Now there's a number of ways we could do that. We could parse some JSON, but I'm going to use the TestPerson.Create method. And you can see the Create method has got two arguments, the last name and the first name. What we do when we generate these Create methods is we put all the required properties first, sorted alphabetically, and then we put all the optional properties, again sorted alphabetically. This helps us with stability, but it also helps you to create valid instances of the type.

So the last name, let's create a last name called Adams, and a first name called Matthew, because that's me. So we've created a test person, and I can write that out to the console, and that will produce the JSON for that test person at the console. And I could also do some validity checking. So if it is valid, then we will write out that the test person is valid, if I can type it straight, and otherwise we will write out that it is not valid. There we go. So when we run that we should find that the test person is valid. There we go. There's the JSON created for it and there's the valid test person. Now what can we do to make that a little bit more complicated? Let's go and just edit the JSON. Here's the schema. Here's the first name and the last name. Let's change the first name so that we require a minimum length of 10. So the first name has got to be at least 10 characters long. A bit of an arbitrary choice, I think, but a minLength of 10.
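
The console program built up in this part of the walkthrough, before the minLength change, can be sketched roughly as below. The namespace and the exact console messages are assumptions; Create and IsValid are used as the narration describes them, and the code depends on the generated TestPerson type, so it will not compile on its own.

```csharp
// Program.cs in the console project (top-level statements)
using Corvus.Json;
using WorkingWithJsonSchema.Models; // assumed namespace of the generated type

// Required properties come first (lastName), then optional ones (firstName).
// Named arguments make the ordering explicit.
TestPerson testPerson = TestPerson.Create(lastName: "Adams", firstName: "Matthew");

// Writing the value to the console produces its JSON representation.
Console.WriteLine(testPerson);

// Validate the instance against the schema it was generated from.
if (testPerson.IsValid())
{
    Console.WriteLine("testPerson is valid");
}
else
{
    Console.WriteLine("testPerson is not valid");
}
```

Once the schema requires a first name of at least 10 characters, the same program takes the else branch, because "Matthew" is only seven characters long.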

Now, if we run this again, we'll see that we are not valid now. Now we could actually print out all the validation results, but I won't go into that in this video. We'll produce another one for that. So it's not valid, so what happened there? We'll now see that this firstName entity has been generated. This is no longer a string, so it's no longer using our standard JSON string implementation. It's now got a minLength of 10. So we generate this firstName entity with a validate function, and you'll see in here that we have a minLength of 10, and it's going to use that in the validation. So let's see what happens when we change that back again.

So let's make another change, and get the Solution Explorer out of the way. In the JSON schema here, let's do something more realistic. Let's say that if you have a first name present, then what we're going to do is require that you have at least one character, so you're not allowed an empty string. It's allowed to be missing, but it's not allowed to be empty. And let's say the max length is 50, a standard sort of constraint for strings, a very typical thing to want to do. So when we save that JSON schema file, you'll see that it reruns the code generation in the source generator and adds the new maxLength validation in. So when we come to run this up again, we can see that we're valid again, and our first name and last name are still represented in the JSON that we've produced.
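
After this final change, the schema might look something like the sketch below. As before, the title and description strings are assumptions; the minLength and maxLength values are the ones given in the narration.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "An example person",
  "type": "object",
  "required": [ "lastName" ],
  "properties": {
    "firstName": {
      "type": "string",
      "description": "The person's first name.",
      "minLength": 1,
      "maxLength": 50
    },
    "lastName": {
      "type": "string",
      "description": "The person's last name."
    }
  }
}
```

Because firstName is still absent from the required array, a person with no first name remains valid, while a present but empty first name does not.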

So that's a really quick introduction to Corvus.JsonSchema's code generator support, and I hope you see how this kind of schema first approach to JSON serialization and code generation can be a really productive way of generating your data types. I'm Matthew Adams again, thank you very much for joining me and listening to the end, and if you did enjoy it, don't forget to like and subscribe.

Thank you.