Corvus.Text.Json - Enhancing JSON Data Handling in .NET | endjin

Matthew Adams 25th June 2025

Open Source

In this technical talk, Matthew Adams explores Corvus.Text.Json, an experimental fork of System.Text.Json that introduces powerful new capabilities for working with JSON documents in .NET.

This isn't just an incremental update to Corvus.JsonSchema - it's a fundamentally different approach that combines the best of immutable and mutable JSON handling.

Key Topics Covered

🔧 Core Features

Dual Model Support: Both immutable (for thread-safe operations) and mutable (for easier manipulation) JSON document handling
Generic Element Types: Support for strongly-typed, code-generated JSON elements as first-class citizens
JSON Schema Integration: Built-in schema matching and validation capabilities
Performance Optimizations: Property maps for efficient lookups and zero-allocation operations

💡 Technical Highlights

How the genericized document system works over custom element types
The innovative workspace and builder pattern for safe document mutation
Performance improvements through UTF-8 backing storage and optimized property lookups
Deep equality comparisons and efficient document manipulation

🚀 Benefits

Simplified API compared to existing Corvus.JsonSchema implementations
Better performance than System.Text.Json's node-based approach
Seamless integration between mutable and immutable representations
Thread-safe document sharing with copy-on-write semantics

Who Should Watch

This talk is ideal for:

.NET developers working with JSON processing
Engineers interested in high-performance JSON manipulation
Anyone using or considering Corvus.JsonSchema
Developers looking for alternatives to System.Text.Json limitations

Chapters

00:00 - Introduction to Corvus Text JSON
00:54 - Immutable Model Overview
02:16 - Mutable Model Request
03:35 - Experiment with System.Text.Json
09:07 - Schema Matching and Validation
11:59 - Deep Equality and Property Map
17:21 - Mutability and Document Builders
26:36 - Creating and Manipulating JSON Documents
38:51 - Handling Mutable and Immutable Elements
44:37 - Summary and Conclusion

Transcript

So what I'm gonna talk about today is Corvus.Text.Json, which is an experiment that I've been working on for a couple of weeks, about six weeks maybe, as a V.Next of Corvus.JsonSchema code generation. Or at least it started out as being V.Next of Corvus.JsonSchema code generation.

Really, it's the "something else" of Corvus.JsonSchema. And what do I mean by the "something else"? It's really a different sort of usage model, and there are a load of other benefits to it under the covers in terms of performance really and familiarity. But primarily it's for a different usage model.

And the usage model of the existing system is an entirely immutable model, just like JsonElement is immutable. Just like the immutable collections, you have an immutable view over a JSON document. And then you can build different versions of that JSON document by taking an immutable JSON array, for example, and, say, adding an item to it, and you get what amounts to a copy of that array back.

The original is completely unchanged. Both the originally parsed document and indeed any intermediate version: if you're partway through building, and you add an item and then add another item, you're getting modified copies back each time. And there are good reasons for doing that. One of which is that it's excellent if you've got very large documents and you want to, say, parallelize processing of those things, 'cause you can guarantee those documents aren't changing under the covers, and then bring back your results at the end. You can make mutations in one path that don't affect any other instances or users of that document. And that's great. That's a very nice model.
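The immutable-collections analogy can be made concrete with a minimal sketch. It uses `System.Collections.Immutable` rather than anything from Corvus.JsonSchema, so it only illustrates the usage model, not the library's API.

```csharp
using System;
using System.Collections.Immutable;

// Each Add returns a new array; earlier versions are never modified.
ImmutableArray<int> original = ImmutableArray.Create(1, 2, 3);
ImmutableArray<int> withFour = original.Add(4);
ImmutableArray<int> withFive = withFour.Add(5);

Console.WriteLine(original.Length); // 3
Console.WriteLine(withFour.Length); // 4
Console.WriteLine(withFive.Length); // 5
```

Because `original` can never change, it can be handed to several threads without locking, which is the property the talk highlights for large documents.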

But a request that we often get, because it's the kind of thing that people expect if they're coming from the world of serialization, is, "We'd like a mutable one, please, where we can just make a load of changes and mutate the document as we go. And then when we are done, we're done." And I have a great deal of sympathy for that because it's a lot simpler an interaction model. And indeed System.Text.Json provides such a model with their JsonObject. It's called the node-based system.

And that model is unlike the struct-based JsonElement document world. It's a class-based and .NET value type-based model that you can mutate. You've got lists of things, and you can build objects and add properties to objects and remove them, and it just edits in place. And that's all great, and it's accepted quite widely. And there are also various performance issues, the same kind of thing we have in V1 of Corvus.JsonSchema, which is performance issues around that interface between the .NET world and the UTF-8 JSON world behind it.
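For comparison, here is a minimal sketch of the node-based mutable model in System.Text.Json itself (`System.Text.Json.Nodes`). The property names and values are made up for illustration.

```csharp
using System;
using System.Text.Json.Nodes;

JsonNode person = JsonNode.Parse("""{"name":"Ada","age":36}""")!;

person["age"] = 37;               // replace a value in place
person["city"] = "London";        // add a new property
person.AsObject().Remove("name"); // remove a property

Console.WriteLine(person.ToJsonString()); // {"age":37,"city":"London"}
```

Every edit happens on the one object graph, so there is no earlier version left to share safely with other threads.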

So what I've done in this experiment is taken a fork of System.Text.Json and created Corvus.Text.Json. And there are two sort of fundamental things that I've done within that. One of which is that I've genericized it over the element type. So this is our document, the Corvus.Text.Json equivalent of System.Text.Json.JsonDocument.

We've got this ParseJsonDocument of, and then this is the type of element that we produce natively from this document. A ParseJsonDocument of JsonElement is exactly equivalent to JsonDocument in System.Text.Json. And in fact, when you do all the benchmarking and performance work, 90% of the code inside ParseJsonDocument is exactly the same as in System.Text.Json's JsonDocument.

Only with some genericization. So the performance is the same, and the code is almost exactly the same. It's just genericized over this element type. The ParseJsonDocument is our world of immutable. These are all immutable documents. And so I'll just run this up so we can step through this as we go.

Pop a breakpoint there. And eventually we get some code here. Very familiar code here. Visual Studio's on a bit of a go-slow at the moment. So we've parsed a JSON string here into a document, a ParseJsonDocument of JsonElement. And I use exactly the same JsonElement APIs that we know and love, only it's now a Corvus.Text.Json JsonElement. We can go and get the root element of that document and get the property called "age" and write it out to the console. And there it is. And we can parse a second document here, so we've parsed those two documents.
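The on-screen code is not part of the transcript, but the System.Text.Json version of the same steps looks like the sketch below. The talk says the Corvus.Text.Json `JsonElement` exposes the same element APIs, so only the document type should differ; the JSON content here is invented.

```csharp
using System;
using System.Text.Json;

// Parse a JSON string into an immutable document.
using JsonDocument doc = JsonDocument.Parse("""{"name":"Ada","age":36}""");

// Get the root element, then the property called "age".
JsonElement age = doc.RootElement.GetProperty("age");

Console.WriteLine(age.GetInt32()); // 36
```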

And then now this is where things start to get interesting. I'm now parsing a JSON document into a different sort of element type. This element type is called Person, and this Person is a code-generated type. So here is my Person type. It implements this interface, which is called IJsonElement.

You will never see IJsonElement in your end user client code because every single one of these things implements everything in there explicitly. It's used to plumb into the genericized framework for document and so forth. And this Person looks very much like the generated code we have in Corvus.JsonSchema. There's a strongly typed Name property and a strongly typed Age property and a CompetedInYears property. All of these are generated code. It looks very similar to the kind of generated code you've got already, except, and this is the crucial part here, we don't have all the "are you backed by a JsonElement or are you backed by a .NET element type" code in here. We defer to our document implementation, whichever document it is that owns this element. We just defer to that document and say, "Go and get me the property value for this name, and get me it as an Age instance."

So this is exactly equivalent to the internals of JsonDocument that say, "Get me a named property value. Here's the ReadOnlySpan or string or whatever of the name, and get me a JsonElement out." It's exactly equivalent to that, only, just as we've abstracted element through IJsonElement, we have abstracted document through IJsonDocument, to allow us to plug in different types of JSON document for backing this element.

So here's our Person, and therefore we don't have the situation we had in Corvus.JsonSchema where we may be backed by a JsonElement or we may be backed by something else. That's all deferred into the world of document. And we've got exactly what you have in System.Text.Json JsonElement, which is that a JsonElement consists of a reference to its parent document and its handle inside that parent document.
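The shape described here, an element that is just a parent-document reference plus a handle, with the document generic over the element type, can be sketched with toy types. Everything named `Toy...` below is hypothetical and is not the Corvus.Text.Json API; the sketch assumes C# 11 or later for static abstract interface members.

```csharp
using System;

// A one-row "document" whose root is viewed as a strongly typed element.
var doc = new ToyDocument<ToyAge>(new[] { "36" });
Console.WriteLine(doc.RootElement.Value); // 36

// Hypothetical document abstraction: elements ask it for their data.
interface IToyDocument
{
    string GetRawText(int index);
}

// Hypothetical element abstraction: lets a generic document create elements.
interface IToyElement<TSelf> where TSelf : struct, IToyElement<TSelf>
{
    static abstract TSelf Create(IToyDocument parent, int index);
}

// The document owns the data and is generic over its natural element type.
sealed class ToyDocument<TElement> : IToyDocument
    where TElement : struct, IToyElement<TElement>
{
    private readonly string[] _rows;
    public ToyDocument(string[] rows) => _rows = rows;
    public TElement RootElement => TElement.Create(this, 0);
    public string GetRawText(int index) => _rows[index];
}

// The element holds no data of its own: only its parent and a handle.
readonly struct ToyAge : IToyElement<ToyAge>
{
    private readonly IToyDocument _parent;
    private readonly int _index;

    private ToyAge(IToyDocument parent, int index)
    {
        _parent = parent;
        _index = index;
    }

    public static ToyAge Create(IToyDocument parent, int index) => new(parent, index);

    // Every read defers to the owning document.
    public int Value => int.Parse(_parent.GetRawText(_index));
}
```

Because the element never stores its own value, a different `IToyDocument` implementation (for example a mutable one) can back the same element type without the element needing a "which backing am I using" branch.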

I won't go into the details of that in this talk today. We'll have another one on the internals of this later on. So I'm doing exactly the same thing here. I'm parsing out this Person. Here's our Person string, JSON string, and I'm parsing it out into a document ParseJsonDocument of Person.

And here's another one in which I've expressed the other names, look, you can see here, as a string with space-separated names rather than an array. I've got another one here where I've left out the name entirely. And I've got another one here where the age is minus seven, and hopefully you'll be able to see where this is going in a minute. And here we are, parsing another one out with a name that's not got a first name. So he's got a last name and other names, and a negative age, and some years he's competed in.

And now here's a thing you don't have in System.Text.Json, which is that all these elements have this IsSchemaMatch function, which is the equivalent of Validate. But Validate is a bit old-fashioned in JSON Schema terms. We should be talking about whether we are matching the schema. And if you match and your purpose of matching is validation, then you are validated. So it's just a more generic term for the same thing.

So now we can go through and say, "Is this root element a schema match for its schema? Is the actual instance we've got a schema match for the schema?" Because remember, we are conceptualizing these things as views over the underlying data. So we have a view of the underlying data of this root element, which is a Person. And we are asking if the data that the view is actually referencing is a schema match for this view.

And we will observe that yes, person b3 is a match. We can have a quick look at the schema if we want. I've got my models here. So here's my Person. It's required to have a Name. The Name is a PersonName, the Age is an Age, and there's CompetedInYears. PersonName is required to have a first name. It's got a NameComponent for the first name and a NameComponent for the last name, and its OtherNames are an OtherNames type.

Let's have a look at OtherNames here. OtherNames can either be a NameComponent, i.e., just a string basically, or an array of NameComponents. A NameComponent is a string between one and 256 characters. CompetedInYears is an array of Years. Year is a number that must be an Int32. And Age is a number that I've constrained to be between zero and 130. Obviously that's just for example purposes.

So that matches that schema. And b4 also matches that schema. And b5 is not going to match that schema because it's missing the name. And b6 will not match the schema because it's got an age which is negative. And b7 will not match that schema because it is both missing a first name and has an age which is negative. And we can see that is exactly the case here. We've got concepts of schema and schema matching layered into our fork of System.Text.Json.
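To make the pass and fail cases concrete, here is a hand-written check of a few of the constraints described above, using plain System.Text.Json. It is not the generated `IsSchemaMatch` code. The JSON property names (`name`, `firstName`, `age`), the inclusive bounds on age, and treating `age` as optional are assumptions, since the schema file itself is not in the transcript. String length is measured in UTF-16 code units here, which is only an approximation of JSON Schema's character count.

```csharp
using System;
using System.Text.Json;

Console.WriteLine(Check("""{"name":{"firstName":"Ada","lastName":"Lovelace"},"age":36}""")); // True
Console.WriteLine(Check("""{"age":36}"""));                            // False: name missing
Console.WriteLine(Check("""{"name":{"firstName":"Ada"},"age":-7}""")); // False: negative age

static bool Check(string json)
{
    using JsonDocument doc = JsonDocument.Parse(json);
    return IsPersonMatch(doc.RootElement);
}

// A NameComponent is a string of 1 to 256 characters.
static bool IsNameComponent(JsonElement e) =>
    e.ValueKind == JsonValueKind.String && e.GetString()!.Length is >= 1 and <= 256;

static bool IsPersonMatch(JsonElement person)
{
    if (person.ValueKind != JsonValueKind.Object) return false;

    // "name" is required, and it must contain a valid first name.
    if (!person.TryGetProperty("name", out JsonElement name) ||
        name.ValueKind != JsonValueKind.Object ||
        !name.TryGetProperty("firstName", out JsonElement first) ||
        !IsNameComponent(first))
    {
        return false;
    }

    // "age", when present, must be a number between 0 and 130.
    if (person.TryGetProperty("age", out JsonElement age))
    {
        if (age.ValueKind != JsonValueKind.Number) return false;
        double value = age.GetDouble();
        if (value < 0 || value > 130) return false;
    }

    return true;
}
```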

And we will go into some more detail on what's happening there. But that's now one of the core features. There's another feature that we have which is present in the current version of System.Text.Json, but wasn't for a long time. And that's this DeepEquals: basically do a deep comparison of one document, one JSON view, with another. And that's basically the moral equivalent of "does this JSON instance equal that JSON instance." It's not a simple stringy equality. It's a "do these things evaluate to the same values." So property ordering doesn't matter, though of course array ordering does, and the exact encoding of names doesn't matter if they decode to the same values.
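The System.Text.Json counterpart mentioned here is the static `JsonElement.DeepEquals` method, available from .NET 9. A small sketch of the two ordering rules, with invented JSON:

```csharp
using System;
using System.Text.Json;

using JsonDocument a = JsonDocument.Parse("""{"name":"Ada","tags":[1,2]}""");
using JsonDocument b = JsonDocument.Parse("""{"tags":[1,2],"name":"Ada"}""");
using JsonDocument c = JsonDocument.Parse("""{"name":"Ada","tags":[2,1]}""");

// Property order is not significant.
bool sameDespiteOrder = JsonElement.DeepEquals(a.RootElement, b.RootElement);

// Array order is significant.
bool sameArrays = JsonElement.DeepEquals(a.RootElement, c.RootElement);

Console.WriteLine(sameDespiteOrder); // True
Console.WriteLine(sameArrays);       // False
```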

So we've got that sort of deep equality implemented and that now works generically. So you can generically apply DeepEquals to any two JSON element types. Obviously in System.Text.Json there is only the one JsonElement type, so we've just had to genericize over that. Another feature that we've got is the PropertyMap. Before I say what PropertyMap does, maybe I should explain first how System.Text.Json works, which is that to look up a property in your JSON object inside your JsonElement, it iterates. There's a database inside, a table basically, of property name-value pairs. So you get rows in pairs: one row for the property name, and one or more rows for the property value.

We'll go into much more detail on that next time, but basically it iterates through the list of properties looking for a matching name, which is fine if you've got fewer than about 10 properties. You might recall a talk we did a wee while ago. If you've got fewer than about 10 properties and you are looking up one property name in that once, then that's fine, right? The performance is pretty good. But if you are doing multiple lookups, or you have a large number of properties, then it's more efficient to build a PropertyMap. That's basically a dictionary that enables us to do a hash lookup on the property name.

And you may or may not recall, I did an experiment a good few months back where I was looking at using a version of the UTF-8 string hashing algorithm that lets us do perfect hashes for short strings. And I've baked all of that into the JsonDocument implementation we've got here, so that you can do highly efficient property name lookups if you apply a PropertyMap to the element.
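The trade-off between scanning and hashing can be shown with plain System.Text.Json and an ordinary `Dictionary`. This is only a stand-in for the idea: unlike the PropertyMap described in the talk, it allocates a string per property name and does not use the UTF-8 hashing scheme.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

using JsonDocument doc = JsonDocument.Parse("""{"a":1,"b":2,"c":3}""");
JsonElement root = doc.RootElement;

// One-off lookup: GetProperty scans the object's properties for the name.
int once = root.GetProperty("c").GetInt32();

// Many lookups: build a name-to-value map once, then use hash lookups.
var map = new Dictionary<string, JsonElement>(StringComparer.Ordinal);
foreach (JsonProperty property in root.EnumerateObject())
{
    map[property.Name] = property.Value;
}

int sum = map["a"].GetInt32() + map["b"].GetInt32() + map["c"].GetInt32();

Console.WriteLine(once); // 3
Console.WriteLine(sum);  // 6
```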

And so what I've done here is said, "We're going to use a PropertyMap for lookup on both of these two elements." And then I'm just verifying that DeepEquals still works now that we're using the PropertyMap to do property lookups. And actually there's an optimization inside DeepEquals for when you have out-of-order properties while comparing two objects. DeepEquals, and this comes from the System.Text.Json implementation of DeepEquals, optimistically tries an in-order comparison, because the likelihood is that the properties are exactly the same in exactly the same order, because people tend to do common things like alphabetize or pick the properties from the same source or whatever.

So DeepEquals does an in-order property comparison first. As soon as it finds an out-of-order property, i.e., we don't get a match on the names as we iterate through the two in parallel, it then falls back to an out-of-order specialization that actually builds a dictionary. The System.Text.Json version allocates a dictionary on the fly. But what we do is fall back to calling our EnsurePropertyMap method, which is a zero-allocation approach to doing the same thing, because it's actually built over some rented byte buffers under the covers. And then of course, it hangs around thereafter.
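The optimistic in-order pass with a map-based fallback can be sketched on a toy model where an object is just an array of name and value pairs. This is an illustration of the strategy only, not the System.Text.Json or Corvus.Text.Json implementation; it assumes property names are unique within each object and it uses an allocating `Dictionary` for the fallback.

```csharp
using System;
using System.Collections.Generic;

var a = new (string Name, string Value)[] { ("a", "1"), ("b", "2"), ("c", "3") };
var b = new (string Name, string Value)[] { ("a", "1"), ("c", "3"), ("b", "2") };
var c = new (string Name, string Value)[] { ("a", "1"), ("c", "3"), ("b", "9") };

Console.WriteLine(PropertiesEqual(a, b)); // True: same values, different order
Console.WriteLine(PropertiesEqual(a, c)); // False: "b" differs

static bool PropertiesEqual((string Name, string Value)[] left, (string Name, string Value)[] right)
{
    if (left.Length != right.Length) return false;

    // Fast path: walk both sides in parallel while the names line up.
    int i = 0;
    for (; i < left.Length; i++)
    {
        if (left[i].Name != right[i].Name) break;          // out of order: fall back
        if (left[i].Value != right[i].Value) return false; // same name, different value
    }

    if (i == left.Length) return true;

    // Slow path: index the remaining right-hand properties by name.
    var map = new Dictionary<string, string>(StringComparer.Ordinal);
    for (int j = i; j < right.Length; j++)
    {
        map[right[j].Name] = right[j].Value;
    }

    for (int j = i; j < left.Length; j++)
    {
        if (!map.TryGetValue(left[j].Name, out string? value) || value != left[j].Value)
        {
            return false;
        }
    }

    return true;
}
```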

So should you do DeepEquals twice, because you, for example, use the property in two different locations and are comparing in that way, or you've got multiple operations to do, then it doesn't have to build the map a second time. It will use the PropertyMap 'cause it's already there, and that will just be faster for this out-of-order case or indeed any other lookups that you do. So that's another feature that Corvus.Text.Json has baked in. And of course, even if you're using none of the code gen, just the basic JsonElement stuff, you benefit from this too.

Then here we are writing out value document b3, which was one of our parsed values. We've written it out. And as you might expect, you get exactly—when you just emit it as a string, it attempts to present you with exactly what you passed in. And that's where we are with that now.

Now we come to mutability. Mutability in Corvus.Text.Json is based on the same types and the same backend modeling of types. Remember I talked about those rows? They're part of what's basically a MetadataDb, a metadata database that is produced when the reader parses the JSON and contains rows for all of the entities that it finds in that document: where to find their values and where to find the backing data. We'll have a deep dive into that later. But now we're going to use a mutable version of the JSON instance. So we're going to create a document builder for our element so that we can start manipulating it.

So there are a number of ways you can create a document builder, and this is creating a document builder from an existing element, right? And we create these document builders inside a workspace. A workspace is just basically some resources for us to be able to manage mutating these documents in a way that means we never have to hold onto .NET references to things outside of the workspace. So the workspace basically maps from document references to indices of documents it knows about, so that under the covers we can just do byte manipulation in safety. Because as soon as you get references involved, you're having to deal with GC fences and all sorts of nonsense. Whereas this thing holds references to all the documents we've got and maps them into integers, fundamentally indices in an array. We can use that integer as a handle inside our byte slicing databases under the covers in the backend.

So this workspace from the client's point of view is just the "this is the context in which I'm manipulating these documents," and then "I would like to create a builder from this JsonElement in this workspace." And it gives me a JsonDocumentBuilder, and a JsonDocumentBuilder is an IJsonDocument type like ParseJsonDocument was. It implemented IJsonDocument. JsonDocumentBuilder implements IJsonDocument, participates in the being a document world. So this is where we've diverged from System.Text.Json. We now have two different document types: the builder and the parsed document. And the builder actually is an IMutableJsonDocument, which adds extra capabilities. And its natural element type is a mutable—whatever the element type is .Mutable.

By convention, we generate code which embeds a mutable version of the type as a nested member of the immutable version. And that mutable version, if we go and have a look at this guy for a moment without getting into too much detail, is the one that has got the setters. Where are we? These are the getters. Why don't I just look in here? So here we go. Look, you've got set versions, just as you have the get item and this, that, and the other in the immutable version. You've got the setters on the mutable one, so the setters don't exist on the immutable one. You'll never get yourself into trouble. They only exist on the mutable version. And likewise, if we go and have a look at the Person we were looking at a moment ago, we will see that inside Person there is a mutable version of Person. And this, in addition to the getters, has got the SetName, the SetAge, and so forth.

At some point we'll have a look at why this type exists, but fundamentally this is where the union type has gone. You don't have to deal with the union types in the immutable world anymore. You deal with them inside the mutable world. They exist purely as ref structs on the stack purely for the lifetime of the process of actually calling set. And you can—in this case, you can see an Age is implicitly convertible from an int or an Age instance. So that's where that sort of setting comes in. We'll have a look at an example of that in a minute.

So we've created a builder here. I'll hit F10 a few times. And then we've created the builder, and I'll just demonstrate what happens when we do the same thing as before. I didn't need that ToString really, did I? If we go and do the same thing as we did before, then that's my raw value. Now, under the covers, it figured out that nothing had changed about this builder yet, so it literally just deferred to the backing value in the original document and the index for that value, and just got it to write it out. So the builder didn't need to do anything. It just deferred through to the backing value 'cause nothing had changed. But that won't last long.

What we're now going to do is change the age property on that builder, on the mutable element. And I've hit a breakpoint. Let's Shift+F11 out of that. I didn't particularly want to show you that unnecessary detail at this stage. We'll get back out eventually. Here we are. So we set the age property to 51, and now we're gonna write this out, and we'll see something interesting when we ToString this, which is that we've now modified that value, so we can no longer defer to the original object's ToString, which literally just goes and grabs the value out of the backing string.

What we do instead is the moral equivalent of writing to a Utf8JsonWriter, though actually it just goes straight to a string. We start with a StartObject, and for each of the properties we haven't changed, we defer to the backing object to write the whole property, right? But for this one here, we had our own backing value for that 51. So we have used our own backing to write the 51, but we've used the JsonDocument's backing to write the age property name. We didn't need to store the property name again. We just used the backing value from the original one and updated our value. And then for the rest of these things we iterated over the values and got the backing document to write those values out for us. We wrote it out in compact form because that's the default way we write strings out.
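The same write-out strategy can be imitated with public System.Text.Json APIs: unchanged properties are written by the original document, and only the changed value comes from somewhere else. This is a sketch of the idea rather than the builder's implementation; in particular, `property.Name` allocates a string, whereas the talk describes reusing the original UTF-8 name bytes. The JSON content is invented.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

using JsonDocument original = JsonDocument.Parse("""{ "name": "Ada", "age": 36, "city": "London" }""");

var buffer = new ArrayBufferWriter<byte>();
using (var writer = new Utf8JsonWriter(buffer))
{
    writer.WriteStartObject();
    foreach (JsonProperty property in original.RootElement.EnumerateObject())
    {
        if (property.NameEquals("age"))
        {
            // Changed: keep the original property name, write the new value.
            writer.WriteNumber(property.Name, 51);
        }
        else
        {
            // Unchanged: the original document writes the whole property.
            property.WriteTo(writer);
        }
    }
    writer.WriteEndObject();
}

string json = Encoding.UTF8.GetString(buffer.WrittenSpan);
Console.WriteLine(json); // {"name":"Ada","age":51,"city":"London"}
```

The output is compact even though the input had whitespace, because the writer's default options are not indented.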

So this has got the same characteristics of Corvus.JsonSchema of using the existing backing values. But we've avoided two things, one of which is we've avoided having to deal with property names where they already exist because we've been able to use the original property name. And we had all sorts of horrible dances around strings and string values and realizing strings when we were doing that in Corvus.JsonSchema. We avoid all of that. And we've also done something interesting when we write it out, which is gonna be the topic of another conversation, which is we've actually serialized that value—or rather formatted that value, I should say—into some backing storage that lives behind our document builder.

So it's actually been serialized as a JSON value in the backend, and that really simplifies our ability to deal with the complexities of multiple different numeric types and formatted types of various kinds because we're going straight to the UTF-8 backing store for those values and just pointing at that slice of backing store rather than trying to hang on to that .NET value for any time and then reason about it and try and figure out what the best way to compare this with that is. We don't have the—and make that all marry up with the way it works with the UTF-8 representation. We just live in UTF-8 representation land. Turned out that there were some benefits to that too for performance, which we'll see in a little bit.

So now I'm going to parse another JSON document with some slightly more complicated JSON in it, some nested values. And then I'm going to build a value from that—a builder from that value—and just confirm that I've not messed up writing that out, which we've got the raw value there again. And then—sorry, that's the raw value of—what am I doing? Sorry. I've set the five value complex. Now I've replaced this whole value here with the number 42. So that whole value there has now been replaced by the number 42. And there we go. We've replaced that whole complex value with the number 42.

Now, why is this interesting? This is an interesting part of the experiment because what we've done here is replaced this big hairy object, which is multiple things, with a single primitive value. So what we had—what we did when we did that was we replaced all the rows in the database that represent these things. So there's a row for StartObject, a row for that property, a row for that value, a row for that property, a row for that value, a row for that property. A row for StartArray, a row for that value. A—whoops, stored bad—a row for that value. A row for StartObject, a row for that property, a row for that value. A row for EndObject and a row for EndArray. And a row for EndObject for that object. All of those we've replaced with a single row and compacted the database when we did that.

And my concern was that this would be significantly slower than doing equivalent things in JsonObject in the System.Text.Json node world. But what we actually do is work out a delta between the number of rows we require for what we are writing in and the number of rows that we're replacing. And all we're then doing is a blit of a set of bytes, the remainder of the bytes in the thing. It turns out that it's not astronomically slower than doing the exact equivalent thing to something that's backed by an array under the covers, even though it's a little bit smaller. It is slower, but it's not that much slower.
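The delta-and-blit step can be sketched on a toy row table where each row is a single `int`. The real metadata rows are multi-byte records, so this shows the arithmetic only. `ReplaceRows` is a hypothetical helper, not part of either library.

```csharp
using System;

int[] rows = { 1, 2, 3, 4, 5, 6, 7 };
int used = rows.Length;

// Replace the four rows 2,3,4,5 (a "complex value") with the single row 42.
used = ReplaceRows(rows, used, start: 1, removeCount: 4, replacement: new[] { 42 });

string result = string.Join(",", rows.AsSpan(0, used).ToArray());
Console.WriteLine(result); // 1,42,6,7

static int ReplaceRows(int[] rows, int used, int start, int removeCount, ReadOnlySpan<int> replacement)
{
    // How many rows the table grows (positive) or shrinks (negative).
    int delta = replacement.Length - removeCount;
    if (used + delta > rows.Length)
    {
        throw new InvalidOperationException("Toy buffer is full.");
    }

    // One block copy shifts everything after the replaced span by delta.
    int tailStart = start + removeCount;
    Array.Copy(rows, tailStart, rows, tailStart + delta, used - tailStart);

    // Then the new rows are written into the gap.
    replacement.CopyTo(rows.AsSpan(start));
    return used + delta;
}
```

When the replacement has the same number of rows as the span it replaces, `delta` is zero and the tail does not move, which matches the cheap same-shape case described later in the talk.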

And it turns out that because we are then not having to do all the work of converting everything into that iterable form to write out to a writer, our process of doing these changes and then writing the results is faster than the process of doing the same set of objects and writing the results in JsonObject land. And that was a huge benefit 'cause I was expecting to pay a small cost for doing this, for the convenience and ease of being able to do it like this. And in fact actually the end-to-end process of manipulating these things and making changes and then producing the—serializing the result out, which is obviously what—it's entirely valueless if you don't write it somewhere afterwards—we turn out to be actually faster.

And of course in the case where you are replacing one simple value like a string or a number or whatever with another simple value, like another—like a different string or a number or—it doesn't matter what, boolean or whatever they are—you replace one row with one row, so nothing happens. Or if we replace an object with the same number of properties with a different object with the same number of properties, which is also an extremely common occurrence because we're simply changing—we're replacing one set of—one value with another value that's just multi-property—you don't have to pay that cost at all. You are simply writing the new values in. So in real scenarios, this turns out to be a lot faster than—for most cases even just at the property setting stage.

So that's replacing complex properties with simple properties. And in this case, I am walking the tree. And oh, I should remove these breakpoints. Really go away, breakpoint. I'll just remove all the breakpoints. Apologies for that 'cause no doubt we've now got a million. There we go.

So now we're just using standard JsonElement syntax. We're getting a property. We're getting another property. And you'll see I'm mixing—just for fun—I'm mixing up UTF-8 and string syntax here, and then I'm getting the second value in the array and setting a property on that second value in the array. And this is a property that didn't exist before, and then I'm setting the first item to null in that—I could have hung onto those things. Setting the second item to null in that array, and then writing out the results. And we should see—there we go, I've now got an array which consists of the four, and then null, null in the results set.
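The read side of this navigation, mixing string and UTF-8 property names and indexing into an array, works the same way on a System.Text.Json `JsonElement`, as the sketch below shows with invented JSON. The setters used in the demo exist only on the mutable Corvus.Text.Json elements, so they are not shown.

```csharp
using System;
using System.Text.Json;

using JsonDocument doc = JsonDocument.Parse("""{"outer":{"items":[{"id":1},{"id":2}]}}""");

JsonElement items = doc.RootElement
    .GetProperty("outer")    // UTF-16 string name
    .GetProperty("items"u8); // UTF-8 literal name

JsonElement second = items[1]; // second value in the array
int id = second.GetProperty("id"u8).GetInt32();

Console.WriteLine(id); // 2
```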

Now, here we go. This is testing WriteTo, so I'm just checking that you can do exactly what you'd normally do, and write to a Utf8JsonWriter. But you'll see that if I switch to .NET 10 here, things light up properly. There we go. You will see that I am not creating my own Utf8JsonWriter. One of the most allocation-heavy things that people do in their path-of-least-resistance code is newing up Utf8JsonWriters and buffers to write to, and just leaving them to go away, or disposing them, and then newing up another one.

So the other feature we've got in workspace is that you can rent a writer and buffer, and you can actually set your workspace up with default Utf8JsonWriter options, so that you can guarantee you're using the same options everywhere. You only have to set them up once. You can rent them. And it actually gives you a thread-local instance, because by and large you are just reading stuff, manipulating it, then writing it back out. So you never need more than one writer on a thread, which is actually how System.Text.Json does it internally. And we found a place to expose that to you so that you can gain the benefit of that writer and buffer model.

Here we are just writing the values out. And then we are returning the writer and buffer at the end. You probably do that in a finally block just to be on the safe side in real code. But here we go. We're just writing out the various different things we've got there. So then we should be able to see those. Here we go. We've written out all of those values that we've put together.
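A rough sketch of the rent-and-return pattern with a per-thread cached writer and buffer follows. `ToyWriterCache` is a hypothetical helper written against public System.Text.Json APIs; it is not the workspace API from the talk, and it handles only one rented writer per thread at a time.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

string json;
(Utf8JsonWriter writer, ArrayBufferWriter<byte> buffer) = ToyWriterCache.Rent();
try
{
    writer.WriteStartObject();
    writer.WriteNumber("age", 51);
    writer.WriteEndObject();
    writer.Flush();
    json = Encoding.UTF8.GetString(buffer.WrittenSpan);
}
finally
{
    // Returning in a finally block keeps the cache usable after an exception.
    ToyWriterCache.Return(writer, buffer);
}

Console.WriteLine(json); // {"age":51}

static class ToyWriterCache
{
    // Options are set once here so every rented writer behaves the same way.
    private static readonly JsonWriterOptions s_options = new() { Indented = false };

    [ThreadStatic] private static Utf8JsonWriter? t_writer;
    [ThreadStatic] private static ArrayBufferWriter<byte>? t_buffer;

    public static (Utf8JsonWriter Writer, ArrayBufferWriter<byte> Buffer) Rent()
    {
        ArrayBufferWriter<byte> buffer = t_buffer ?? new ArrayBufferWriter<byte>();
        Utf8JsonWriter writer;
        if (t_writer is null)
        {
            writer = new Utf8JsonWriter(buffer, s_options);
        }
        else
        {
            writer = t_writer;
            writer.Reset(buffer); // reuse the cached writer against the buffer
        }

        // Clear the slots so a nested Rent on this thread gets fresh instances.
        t_writer = null;
        t_buffer = null;
        return (writer, buffer);
    }

    public static void Return(Utf8JsonWriter writer, ArrayBufferWriter<byte> buffer)
    {
        buffer.Clear();
        t_writer = writer;
        t_buffer = buffer;
    }
}
```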

And I'm just checking a belt right now. So here's another builder model. This is—I'll just turn over that so that we can see the code again. This builder model is saying we're gonna create a builder completely from scratch. So we're gonna create a mutable document with our JsonDocumentBuilder completely from scratch, and then just build up as efficiently as we can an object for a value of some kind. And in this case it's an object. And we've got this BuildCallback mechanism that lets you build these things up so that we can do that with a degree of efficiency.

So this guy here is an object builder, and in my object builder I am adding some properties to my object. So I'm adding a property called name. My name is using another object builder which is this first name, last name. I'm adding these properties, other names. I'm using an array builder. And my array builder has got AddItems for just items rather than properties. And then I've added an age which—so I've got overloads for all of the same things as we have in set on JsonElement. We've got exactly the same equivalent things in builder.

And then here I'm creating another array. Now, the interesting thing about this, of course, is that this is all code, right? So this is as efficient as it can get. It's just static lambdas for these callbacks. So that'll be fairly efficient. I'll just validate that we actually write that out, which we do. But here's an example where I've got some data in from somewhere. So here's some years for CompetedInYears. And you'll see here I've got a non-static lambda, which is capturing those years. And I'm then iterating those years to add those items to my array.
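The difference between the static and the capturing callback can be seen in a small sketch that uses `Utf8JsonWriter` and a hypothetical `WriteArray` helper in place of the builder callbacks from the demo. The property names are assumptions.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

int[] years = { 2012, 2016, 2020 };

var buffer = new ArrayBufferWriter<byte>();
using (var writer = new Utf8JsonWriter(buffer))
{
    writer.WriteStartObject();

    // Static lambda: cannot capture locals, so no closure is allocated.
    WriteArray(writer, "fixed", static w =>
    {
        w.WriteNumberValue(1);
        w.WriteNumberValue(2);
    });

    // Capturing lambda: closes over `years`, which costs a closure allocation.
    WriteArray(writer, "competedInYears", w =>
    {
        foreach (int year in years)
        {
            w.WriteNumberValue(year);
        }
    });

    writer.WriteEndObject();
}

string json = Encoding.UTF8.GetString(buffer.WrittenSpan);
Console.WriteLine(json); // {"fixed":[1,2],"competedInYears":[2012,2016,2020]}

// Hypothetical helper: wraps a callback in array start and end tokens.
static void WriteArray(Utf8JsonWriter writer, string name, Action<Utf8JsonWriter> build)
{
    writer.WriteStartArray(name);
    build(writer);
    writer.WriteEndArray();
}
```

A common way to keep the data-driven case allocation-free is to pass the data as an explicit state argument to a static lambda, at the cost of a slightly wider callback signature.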

So the nice thing here is we can now add code into our object creation or entity creation logic, and keep it closely located to the thing we're actually adding. And of course you could implement functions for these things to do common types of activities, and you get the benefit of reuse in this kind of model.

And have I just stopped running? Yes, I have. Let's put a breakpoint on there and run again until we get to that point. So I'm just demonstrating here that we can set some values. These are the examples I was showing you there of how we can set those values from a builder, so we can do it in code. So for OtherNames, I'm creating an object here which specifically has just the overloads on it for the things that you need for that entity, so your IntelliSense will help you out with that. And it's got implicit conversions for that source thing that we saw, so that I can just use a string, or I can use the more complicated case.

And here's an example of that. This is using Person.CreateDocument, sorry, but with the Create function that's available on the strongly typed mutable builder that we've got here, so that I can just call this Create function on PersonName and pass in the values that I want. So this is the exact equivalent of the strongly typed Create in Corvus.JsonSchema, only now you get it through a builder callback, so it doesn't pollute the main type API. But it still gives you all the same kind of functionality that you had before, and it can write those things out.

Now, here's some interesting stuff. If I get my root element out of my doc builder, I've got a mutable Person out of it. But I can implicitly convert the mutable version to the immutable version, and that's absolutely fine. I can also, as you can see here, explicitly upcast from the immutable version to the mutable version. But that will throw if that immutable version did not come from a mutable version. So it's that general rule that you don't have an implicit conversion if it might throw. But you can have an explicit conversion and it will throw if you've done the wrong thing. And this is handy when you're wanting to pass in and out of some code that knows how to process an immutable entity and yet use it back in a mutable context.
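The conversion rule in that paragraph (implicit when it cannot fail, explicit when it might throw) can be sketched in plain C#. `MutableName` and `ImmutableName` are hypothetical stand-ins written for this example, not the generated Corvus types, and the owner-tracking field is an assumption about how such a check could be done.

```csharp
#nullable enable
using System;

// A value that lives in a (pretend) mutable document.
var mutableName = new MutableName("Ada");

// Mutable -> immutable cannot fail, so the conversion is implicit.
ImmutableName readOnlyView = mutableName;

// Immutable -> mutable might fail, so the caller has to write the cast.
MutableName roundTripped = (MutableName)readOnlyView;
Console.WriteLine(ReferenceEquals(roundTripped, mutableName)); // True

// This immutable value never came from a mutable one, so the cast throws.
var parsedOnly = new ImmutableName("Grace");
bool threw = false;
try
{
    _ = (MutableName)parsedOnly;
}
catch (InvalidOperationException)
{
    threw = true;
}
Console.WriteLine(threw); // True

// Hypothetical mutable type.
sealed class MutableName
{
    public MutableName(string value) => Value = value;

    public string Value { get; set; }

    public static implicit operator ImmutableName(MutableName source) =>
        new ImmutableName(source.Value, source);
}

// Hypothetical immutable view that remembers where it came from, if anywhere.
readonly struct ImmutableName
{
    private readonly MutableName? _owner;

    public ImmutableName(string value, MutableName? owner = null)
    {
        Value = value;
        _owner = owner;
    }

    public string Value { get; }

    public static explicit operator MutableName(ImmutableName source) =>
        source._owner ?? throw new InvalidOperationException(
            "This value did not come from a mutable document.");
}
```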

And then all the usual things you'd expect work: we can assign values that have come from other documents, and it will happily assign these existing instances. It won't create copies of those things. It just puts a row in the metadata DB that points at that original source, much as Corvus.JsonSchema used to. And that will all be quite happy. We should be able to see that. There we go. That all was quite happy.

We're near the end now, as you can see from the scroll bar. So now I'm parsing out another JSON document here, and I'm building a mutable document from the value of a property inside the document. So you're not constrained to doing this on a root element. You can do it anywhere inside the document. Now I've got a mutable document for that name value. I can just get the value out and write it out to the console as a mutable document. Now look at that formatting there. The rule of ToString is that it tries to leave things as close to the source as it possibly can, and that's exactly what it's done here. It has directly used the original backing for all the bits that it could, and wrapped that in its own object value backing. And that means you've picked up all the whitespace from the original document in the output string.

So that's something to bear in mind. It's less overhead to do that. And it's more in the spirit of System.Text.Json's ToString, where you get out whatever garbage you put in. It just might not be what you're expecting from ToString. If you want nicely formatted output, which ToString does not purport to be, then you should write it to a Utf8JsonWriter and control the output style, just as with System.Text.Json.
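The same behaviour can be seen in stock System.Text.Json, which is what this sketch uses (it does not touch the Corvus.Text.Json fork). The JSON text is made up for the example: `ToString()` on an object element hands back the source text with its whitespace, while `WriteTo` on a `Utf8JsonWriter` lets the writer options decide the layout.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Deliberately odd whitespace inside the nested object.
string source = "{\"name\":{ \"firstName\":   \"Ada\",\"lastName\":\"Lovelace\" }}";

using JsonDocument doc = JsonDocument.Parse(source);
JsonElement name = doc.RootElement.GetProperty("name");

// ToString() on an object element returns the original text, whitespace included.
string raw = name.ToString();

// Writing through a Utf8JsonWriter re-emits the tokens in the writer's own style.
using var stream = new MemoryStream();
using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions { Indented = false }))
{
    name.WriteTo(writer);
}
string compact = Encoding.UTF8.GetString(stream.ToArray());

Console.WriteLine(raw);     // { "firstName":   "Ada","lastName":"Lovelace" }
Console.WriteLine(compact); // {"firstName":"Ada","lastName":"Lovelace"}
```

Setting `Indented = true` instead gives pretty-printed output from the same element.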

And then that's just testing modification of this kind of document built from bits and pieces. So now here's an interesting thing. This last name here, I acquired the last name value, this JsonElement.Mutable last name. I got that property there, stashed it away, and then I have modified the document, right? I've modified the document that contains this item here. If I now try to use that cunningly stashed away last name object, then it's going to throw an exception, and the exception it throws is "Operation is not valid due to the current state of the object", an InvalidOperationException.

The reason it's throwing that exception is that internally it works just like Dictionary, for example: it maintains an internal version number that it increments each time the document is modified. If you break the usage model by hanging on to values inside a mutable document and then mutating the document, it's just like iterating a dictionary: if you hang on to a dictionary or collection iterator and then modify the collection, you get an InvalidOperationException, because the dictionary was mutated while you were hanging on to a value that depended on its state. And we support exactly the same thing here, absolutely throughout, for both the mutable and immutable elements. If the document has been mutated underneath it, you'll get an exception.
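The Dictionary behaviour used as the comparison here is easy to reproduce with the standard library alone. This sketch shows only the BCL side of the analogy, with made-up data; it says nothing about how the fork implements its own version check.

```csharp
using System;
using System.Collections.Generic;

var ages = new Dictionary<string, int> { ["Ada"] = 36, ["Grace"] = 85 };

// Hold on to an enumerator, then mutate the dictionary underneath it.
var enumerator = ages.GetEnumerator();
enumerator.MoveNext();

// Adding a new key bumps the dictionary's internal version number.
ages.Add("Alan", 41);

bool threw = false;
try
{
    // The enumerator compares its remembered version with the current one.
    enumerator.MoveNext();
}
catch (InvalidOperationException)
{
    threw = true;
}

Console.WriteLine(threw); // True
```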

I was considering whether this should really be a Debug.Assert, because really you should only see that if you've written incorrect code, so you should always be able to pick it up at development time. But because that's not how other sorts of mutable collections work, we're deliberately throwing the exception. And that's what that verifies.

So, in summary, we've got a model: a fork of System.Text.Json that now supports mutable and immutable representations of the data, with a builder model that's at least as efficient as, and mostly more efficient than, the original System.Text.Json, and that is genericized over entity types so that our code-generated types can participate as first-class citizens inside it. And we've layered in a couple of significant additional features: the PropertyMap generation, so we can do efficient property lookups for large entities, and the sharing of the backing data across documents in a workspace, so we can manipulate those documents.

Oh, and additionally, JSON Schema matching support is baked in as a sort of first-class concept. And it does all that in a low-to-no allocation fashion, including some really interesting performance improvements in JSON Schema validation that come from two things: first, being able to directly access that backing data in validation land, and second, some cunning performance improvements in how we capture context and state in validation.

And in my next talk about this, I will talk in detail about the validation process and the code-generated validation. And then in the one after that, I will talk about the innards of this and how we leverage what's in System.Text.Json to add those additional capabilities without adding any additional storage overhead requirements and what that metadata DB looks like. That's it.

Thank you very much. I hope that was interesting and that you've got some follow-up questions.