By Ian Griffiths, Technical Fellow I
.NET JsonElement Error Handling

At endjin, we are fans of System.Text.Json in the .NET runtime libraries. In particular, the JsonDocument and JsonElement types are able to provide astonishingly high performance. They are not quite as fast as the streaming Utf8JsonReader, but they come surprisingly close given how much easier they are to use. There's one small trap for the unwary, however: if your inputs aren't in exactly the form you expect, there are two subtly different ways in which JsonElement can tell you. This means that your code might not be as robust in the face of malformed input as you thought.

For more information about how to use these APIs, and why we like them, you could look at some of Matthew's blogs. He's written about how combining JSON Schema and code generation with System.Text.Json can offer the programmability benefits of conventional serialization but with the performance benefits of JsonElement. More recently, he wrote about how our open source Corvus.JsonSchema library, which embodies those techniques, enjoyed a 20% performance boost thanks to improvements in .NET 8.0. Or you could watch my talk about why JsonDocument and JsonElement are able to perform so much better than more conventional techniques.

Parsing properties

The JsonElement type can represent any part of a JSON document. The JsonDocument.RootElement property returns a JsonElement representing the entire document. In cases where the 'document' consists of a single value (e.g. null), that will be the only element, but if the document is an array or an object, you can then ask the root element to give you the array items, or the object properties.
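As a rough sketch of what that looks like in practice, the following inspects the root element's ValueKind and then enumerates array items or object properties accordingly. The class and method names (RootElementDemo, Describe) are made up for illustration; the JsonElement members it calls (ValueKind, EnumerateArray, EnumerateObject) are part of System.Text.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class RootElementDemo
{
    // Returns a short description of the root element of the given JSON text.
    public static string Describe(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement root = document.RootElement;

        switch (root.ValueKind)
        {
            case JsonValueKind.Array:
                // EnumerateArray yields one JsonElement per array item.
                var kinds = new List<string>();
                foreach (JsonElement item in root.EnumerateArray())
                {
                    kinds.Add(item.ValueKind.ToString());
                }
                return "array of [" + string.Join(", ", kinds) + "]";

            case JsonValueKind.Object:
                // EnumerateObject yields one JsonProperty per property.
                var names = new List<string>();
                foreach (JsonProperty property in root.EnumerateObject())
                {
                    names.Add(property.Name);
                }
                return "object with properties [" + string.Join(", ", names) + "]";

            default:
                // A single value such as null, true, 42, or "text".
                return "single value of kind " + root.ValueKind;
        }
    }
}
```

Calling RootElementDemo.Describe("null") reports a single value, whereas passing an object reports its property names.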

Typically, you'll have expectations about the structure of incoming JSON. For example, you might expect it to look something like this:

{
    "id": 1331,
    "action": "frobnicate"
}

We could write the following code to process data in that form:

using JsonDocument document = JsonDocument.Parse(input);
JsonElement idElement = document.RootElement.GetProperty("id");
JsonElement actionElement = document.RootElement.GetProperty("action");

long id = idElement.GetInt64();
if (actionElement.ValueEquals("frobnicate"u8))
{
    Frobnicate(id);
}

You might find this a little less convenient than conventional serialization, in which we define a .NET type whose structure resembles the JSON we expect to see, and then use JsonSerializer.Deserialize to convert the JSON into an instance of that type. But if you're familiar with these APIs (or you've watched and read the video and blogs linked to above) you'll know that the approach shown here places a much lower load on the garbage collector, and that it has the potential to provide much higher throughput in high-performance systems.

There are some subtle details here. The code carefully avoids ever asking to see the value of the action property, and instead just asks if it has a particular value, and this can enable JsonElement to avoid having to create a string object on the heap. The exact type of input matters here too. In some cases, JsonDocument will need to make a copy of the data, but for certain input types such as byte[] or ReadOnlyMemory<byte> it can avoid making any copies at all, working entirely with the JSON data in whatever input buffer it lies. You need to be careful with how you use JsonElement if you want the best performance, but when you get it right, the results can be impressive. However, performance isn't the main topic of this particular post.

There's one obvious problem with the code I've shown: it does not handle malformed input.

When properties go bad

What would happen if the preceding code snippet was handed JSON that looks like this?

{
    "id": 3.14159,
    "action": true
}

I've changed two things. The id property is still a number, but it's no longer an integer. From a JSON perspective, that's not actually a change in type: it considers both just to be numbers. I've also made a change that JSON does consider a change in type: I've replaced the action property's string value with the boolean true value.

With this input, the GetInt64 method will throw a FormatException, because that number is not an integer. This is idiomatically consistent with other parts of the .NET runtime libraries: if you were to pass the string "3.14159" to long.Parse it would throw the same type of exception.
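To see the parallel for yourself, a small sketch along these lines reports which exception type each call produces. The FormatExceptionDemo class and its method names are hypothetical; only JsonElement.GetInt64 and long.Parse are real library calls.

```csharp
using System;
using System.Globalization;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class FormatExceptionDemo
{
    // Calls GetInt64 on the "id" property and returns the name of the
    // exception type thrown, or "none" if the call succeeds.
    public static string GetInt64Outcome(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement idElement = document.RootElement.GetProperty("id");
        try
        {
            idElement.GetInt64();
            return "none";
        }
        catch (Exception x)
        {
            return x.GetType().Name;
        }
    }

    // Does the same for long.Parse, using the invariant culture so the
    // result does not depend on the machine's regional settings.
    public static string LongParseOutcome(string text)
    {
        try
        {
            long.Parse(text, CultureInfo.InvariantCulture);
            return "none";
        }
        catch (Exception x)
        {
            return x.GetType().Name;
        }
    }
}
```

Both GetInt64Outcome("{\"id\": 3.14159, \"action\": true}") and LongParseOutcome("3.14159") report FormatException.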

Depending on what kind of code we're writing, that might be good enough. If we're just making a little throwaway utility, the fact that JsonElement.GetInt64() notices the discrepancy and throws an exception might be all that we require. But if we're processing messages that might originate from sources outside of our control (e.g., we're writing a public-facing web API), we should be a bit more rigorous about detecting problems with the input.

Our preferred approach is to write a JSON Schema that defines precisely what is and is not acceptable. Our open source Corvus.JsonSchema library is designed for this approach: it can generate C# code that will validate a document against a JSON schema, and it also provides strongly-typed accessors for retrieving the data that the schema says will be present.

There are other options. Maybe you want to write your own code to check that the document looks like it should. (Don't underestimate how easy that is to get wrong. The formalism of JSON schema might seem initially offputting, but with anything non-trivial, it's almost always much easier to get right than rolling your own solution.)

Having seen that GetInt64() reports invalid input in the same way as long.Parse, you might wonder whether JsonElement offers an equivalent to long.TryParse. It does, and we can write this:

if (!idElement.TryGetInt64(out long id))
{
    Console.WriteLine("Invalid input: id should be an int64");
}
else if (actionElement.ValueEquals("frobnicate"u8))
{
    Frobnicate(id);
}

This correctly detects bad input of the kind shown above. So you might write this, test it, and be satisfied that your code is now robust against bad input. Unfortunately, you will have fallen into JsonElement's trap.

Different kinds of bad

Let's look at another slight tweak to the input:

{
    "id": "1331",
    "action": "frobnicate"
}

I've changed the id property from a number to a string. As it happens, the string is a perfectly valid decimal representation of the same number that my first example represented. That's definitely a string that could be parsed into a long. However, if I run it against either of the C# examples I've shown, they will both fail with an InvalidOperationException.

For me, the most surprising aspect of this was that an example of the Try-Parse pattern is throwing an exception because of unparsable input. Isn't the whole point of methods that follow this pattern that they don't throw exceptions when the input turns out not to be in the form you wanted?

There's a second surprise here which is more subtle, but which turns out to shed some light on the first one: in the first example, idElement.GetInt64() throws a different exception here. When the id property was a non-integer JSON number (3.14159), we got a FormatException, but in this case, where the id property is a JSON string, we get an InvalidOperationException.
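The three outcomes of TryGetInt64 can be observed with a sketch like the following. The TryGetTrapDemo class and its method are hypothetical names chosen for this illustration; the behaviour it exercises is the behaviour described above.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class TryGetTrapDemo
{
    // Calls TryGetInt64 on the "id" property and describes what happened:
    // it returned true, it returned false, or it threw.
    public static string TryGetInt64Outcome(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement idElement = document.RootElement.GetProperty("id");
        try
        {
            return idElement.TryGetInt64(out long id)
                ? "true: " + id
                : "false";
        }
        catch (Exception x)
        {
            return "threw " + x.GetType().Name;
        }
    }
}
```

With "id": 1331 this returns "true: 1331", with "id": 3.14159 it returns "false", and with "id": "1331" it returns "threw InvalidOperationException".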

Why is JsonElement using two different exceptions to report two slightly different forms of the same problem?

Boneheaded, exogenous, or vexing?

Many years ago, Eric Lippert (who was at the time a developer on the C# compiler team) wrote about the distinction between boneheaded, exogenous, and vexing exceptions. (He describes one more category, fatal, but that's not relevant here.) This distinction explains the different ways in which JsonElement reports errors here.

Eric describes an exception as boneheaded if it indicates a programming error. For example, passing null where an object is required, or passing a boolean value where a number is required. Statically typed languages such as C# can make it impossible to make certain boneheaded errors. (That's arguably the main point of static typing.) If you attempt to call int.Parse(false), the compiler will reject it, eliminating any possibility that the API might need to report such a mistake at runtime with an exception. More recently, the addition of nullable references to C# can often enable the compiler to block an inappropriate attempt to pass null at compile time. But before that feature was introduced, programming errors of that kind were detected at runtime, and reported with an ArgumentNullException, the paragon of boneheaded exceptions in .NET.

Eric uses the term exogenous for exceptions that represent unavoidable external factors. For example, disk drives sometimes fail, and there's no avoiding that.

A vexing exception is one that could and arguably should have been avoided with a different design choice. The 'arguably' here indicates that the distinction between these exception categories is subjective, and that's actually part of the problem in the JsonElement case here. You could argue that attempting to pass the string "banana" to int.Parse is boneheaded—that could never work. You could equally argue that if the text came from an external party and you can't control it, that makes this sort of error either exogenous (because the root causes are outside of your control) or vexing (because bad input is inevitable). Eric categorizes FileNotFoundException as exogenous, but you could argue that the problems it reports are just as fundamentally unavoidable as malformed external input, meaning that this is in fact a vexing exception, and .NET should have provided a File.TryOpen API for exactly the same reason that int offers TryParse.

(There is a distinction between bad input and a missing file. If you've already checked that input is valid, it's not suddenly going to become invalid. Conversely, the filesystem's contents can change at any time, so even if you've checked that a file exists, it might no longer exist by the time you try to open it. There are some situations where checks need to be performed at the exact instant that you try to perform an operation. However, nothing about this requires an exception-based design. A hypothetical File.TryOpen would enable us to deal with this without needing exceptions. After all, that's basically how the underlying OS APIs for opening files work.)

To label an exception as vexing is essentially to disagree with the API designer about this subjective judgement. Often there won't be a clear, indisputable dividing line, which is why some libraries offer two APIs for the same thing, one of which treats bad input as exceptional (e.g. int.Parse) and one which does not (e.g. int.TryParse). Even then there's still room for disagreement: you might choose int.Parse because you consider malformed input to be impossible unless you've made a programming error (a boneheaded scenario) or perhaps because you consider it to be exogenous. It's possible to imagine different scenarios in which one or other of those is a reasonable point of view, and there may well be scenarios in which reasonable people might disagree about the categorisation.

When JsonElement gets opinionated

APIs that offer both exception-throwing and Try... forms are essentially enabling you to decide which category a particular error belongs to. JsonElement offers both. If you believe that some JSON element would only contain something other than an integer in exceptional circumstances, you can call GetInt32. But if you think it likely that it might contain a number that is not a valid 32-bit signed integer, you can use TryGetInt32 instead.
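As an illustration of that choice, here is a minimal sketch with two hypothetical methods (ReadOrThrow and ReadOrNull, both invented for this example) that handle the same out-of-range number in the two different styles. It assumes the JSON document consists of a single number.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper for illustration only.
public static class ChooseYourCategoryDemo
{
    // Treats a number that does not fit in an Int32 as exceptional:
    // GetInt32 throws a FormatException, which propagates to the caller.
    public static int ReadOrThrow(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        return document.RootElement.GetInt32();
    }

    // Treats the same situation as an expected outcome:
    // TryGetInt32 returns false, and this method returns null.
    public static int? ReadOrNull(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        return document.RootElement.TryGetInt32(out int value) ? value : (int?)null;
    }
}
```

For the input 1000000000000, ReadOrThrow throws a FormatException whereas ReadOrNull returns null. Both methods return 42 for the input 42.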

But here's the catch. JsonElement is still making a non-negotiable distinction: it treats certain kinds of bad inputs as boneheaded. You only get to decide the category for specific errors.

Specifically, if the JSON type does not match the type you're looking for, JsonElement considers this to be a boneheaded mistake. So whether you call GetInt32 or TryGetInt32, if the JsonElement in question does not represent a number, you will get an InvalidOperationException from either of these.

It's certainly possible to defend this design decision. However, I was very surprised the first time I discovered that JsonElement.TryGetInt32 returns false for some kinds of bad input, but throws an InvalidOperationException for other kinds of bad input.

(That is the critical and, in my opinion, non-obvious point. It's the whole reason I wrote this post.)

My expectation was that bad input is bad input—I don't see a big qualitative difference between these two scenarios:

  1. the caller passed a floating point number when they should have passed an integer
  2. the caller passed a string when they should have passed an integer

From a C# perspective these both look wrong for the same reason. But JsonElement has decided that these are different because it takes a more JSON-centric view. Just in case you've not fully internalized the JSON specification, the introduction has this to say about the type system for JSON documents:

"JSON can represent four primitive types (strings, numbers, booleans, and null) and two structured types (objects and arrays)."

So from a JSON perspective, the values 1234 and 3.141 have the same type: they are both numbers. But "1234" is a completely different type: a string. So in the land of JSON, 1 above is not a type mismatch, but 2 is. JsonElement essentially characterises these as:

  1. right JSON type, but an unsuitable value
  2. wrong JSON type

It considers 1 to be the kind of error where you get to decide whether it's exceptional (GetInt32 throws a FormatException but TryGetInt32 just returns false). It considers 2 to be a boneheaded exception. It expects us to have checked the JSON type before attempting to parse the property. If you want to detect both kinds of error without causing exceptions, you need to write this:

if (idElement.ValueKind != JsonValueKind.Number ||
    !idElement.TryGetInt64(out long id))
{
    Console.WriteLine("Invalid input: id should be an int64");
}
...

And if you're OK with malformed input producing an exception, you need to be aware that you might get either a FormatException or an InvalidOperationException, depending on the specific way in which the input is malformed.
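If you find yourself repeating that two-part check, one option is to wrap it in a helper that also tells you which of the two categories the input fell into. This is only a sketch: the Int64ReadResult enum and SafeJsonReader class are hypothetical names, not part of System.Text.Json.

```csharp
using System.Text.Json;

// Hypothetical result type: distinguishes the two kinds of bad input.
public enum Int64ReadResult
{
    Ok,
    WrongJsonType,    // e.g. a JSON string where a number was expected
    UnsuitableValue,  // a JSON number that is not a valid Int64
}

// Hypothetical helper for illustration only.
public static class SafeJsonReader
{
    // Never throws for either kind of bad input, because it checks the
    // JSON type before asking TryGetInt64 to inspect the value.
    public static Int64ReadResult ReadInt64(JsonElement element, out long value)
    {
        if (element.ValueKind != JsonValueKind.Number)
        {
            value = default;
            return Int64ReadResult.WrongJsonType;
        }

        return element.TryGetInt64(out value)
            ? Int64ReadResult.Ok
            : Int64ReadResult.UnsuitableValue;
    }
}
```

A caller can then write SafeJsonReader.ReadInt64(idElement, out long id) and branch on the result, rather than relying on catching two different exception types.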

Showing my working

While researching this, I decided to perform some systematic tests to check that my understanding was correct. I wrote a small program that tries using 12 different inputs. The first five are all, from JSON's perspective, numbers, but some of them have characteristics (e.g., being negative, being a fraction, being too large for some representations) that mean they can't be represented by all .NET numeric types. There are also four inputs all of which are, from JSON's perspective, just strings, but some of which happen to be valid Base 64, or a GUID, or a datetime. There are also inputs representing JSON's boolean, null, and object types.

This table shows the result of trying to process these with JsonElement. The column indicates which Get method was called, determining the .NET type we want as output. Each cell in the table indicates the result. A tick (✓) indicates that the method succeeded, Inv indicates that it threw an InvalidOperationException, and Fmt indicates a FormatException.

input          sbyte  int16  int32  int64  byte  uint16  uint32  uint64  double  decimal  bool  string  guid  base64  datetime  datetimeoffset
42             ✓      ✓      ✓      ✓      ✓     ✓       ✓       ✓       ✓       ✓        Inv   Inv     Inv   Inv     Inv       Inv
-42            ✓      ✓      ✓      ✓      Fmt   Fmt     Fmt     Fmt     ✓       ✓        Inv   Inv     Inv   Inv     Inv       Inv
100000         Fmt    Fmt    ✓      ✓      Fmt   Fmt     ✓       ✓       ✓       ✓        Inv   Inv     Inv   Inv     Inv       Inv
1000000000000  Fmt    Fmt    Fmt    ✓      Fmt   Fmt     Fmt     ✓       ✓       ✓        Inv   Inv     Inv   Inv     Inv       Inv
42.0           Fmt    Fmt    Fmt    Fmt    Fmt   Fmt     Fmt     Fmt     ✓       ✓        Inv   Inv     Inv   Inv     Inv       Inv
true           Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      ✓     Inv     Inv   Inv     Inv       Inv
guid           Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   ✓       ✓     Fmt     Fmt       Fmt
base64         Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   ✓       Fmt   ✓       Fmt       Fmt
datetime       Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   ✓       Fmt   Fmt     ✓         ✓
string         Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   ✓       Fmt   Fmt     Fmt       Fmt
null           Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   ✓       Inv   Inv     Inv       Inv
object         Inv    Inv    Inv    Inv    Inv   Inv     Inv     Inv     Inv     Inv      Inv   Inv     Inv   Inv     Inv       Inv

This is consistent with what I described in this article, but I certainly found it helpful to see the results laid out like this, so I thought you might too.
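If you want to reproduce results like these, a probe along the following lines would do it. This is not the program used to produce the table, just a minimal sketch: the GetMethodProbe class and its members are hypothetical, although every Get method it calls is a real JsonElement method.

```csharp
using System;
using System.Text.Json;

// Hypothetical test harness for illustration only.
public static class GetMethodProbe
{
    // The Get methods to try, in the same order as the table's columns.
    private static readonly (string Name, Action<JsonElement> Call)[] Getters =
    {
        ("sbyte", e => e.GetSByte()),
        ("int16", e => e.GetInt16()),
        ("int32", e => e.GetInt32()),
        ("int64", e => e.GetInt64()),
        ("byte", e => e.GetByte()),
        ("uint16", e => e.GetUInt16()),
        ("uint32", e => e.GetUInt32()),
        ("uint64", e => e.GetUInt64()),
        ("double", e => e.GetDouble()),
        ("decimal", e => e.GetDecimal()),
        ("bool", e => e.GetBoolean()),
        ("string", e => e.GetString()),
        ("guid", e => e.GetGuid()),
        ("base64", e => e.GetBytesFromBase64()),
        ("datetime", e => e.GetDateTime()),
        ("datetimeoffset", e => e.GetDateTimeOffset()),
    };

    // Parses a JSON document consisting of a single value, calls every
    // Get method on it, and returns one outcome per method.
    public static string[] Probe(string json)
    {
        using JsonDocument document = JsonDocument.Parse(json);
        JsonElement element = document.RootElement;

        var results = new string[Getters.Length];
        for (int i = 0; i < Getters.Length; i++)
        {
            try
            {
                Getters[i].Call(element);
                results[i] = "✓";
            }
            catch (InvalidOperationException)
            {
                results[i] = "Inv";
            }
            catch (FormatException)
            {
                results[i] = "Fmt";
            }
        }

        return results;
    }
}
```

Joining the result of GetMethodProbe.Probe("-42") with spaces gives one row of outcomes in column order.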

Conclusion

JsonElement is a flexible, high-performance mechanism for processing JSON. It can convert JSON values into various .NET types, but it takes a potentially surprising view on the nature of errors, dividing malformed input into two different categories. This means that even the TryGet... methods might throw an exception if the input is not in the form you expect.

Our view is that the best way to deal with this is to use a formal mechanism for validating input, such as JSON schema, because it enables you to be certain that data is in the form you expect before you begin to process it. (Any exceptions that occur due to format mismatches are then necessarily boneheaded ones.) We make Corvus.JsonSchema available because we think it's the best way to handle bad inputs.

Ian has worked in various aspects of computing, including computer networking, embedded real-time systems, broadcast television systems, medical imaging, and all forms of cloud computing. Ian is a Technical Fellow at endjin, and a Microsoft MVP in Developer Technologies. He is the author of O'Reilly's Programming C# 10.0, and has written Pluralsight courses on WPF and the TPL. He's a maintainer of Reactive Extensions for .NET, Reaqtor, and endjin's 50+ open source projects. Technology brings him joy.