By Matthew Adams, Co-Founder · 4 min read
Introducing Corvus.Text.Json V5: TOON - Compact JSON for LLMs

At endjin, we maintain Corvus.JsonSchema, and in the previous post we looked at extended types - URIs, BigNumber, and NodaTime. This time we're crossing into AI territory: how do you feed structured data to an LLM without burning through your token budget?

LLMs process tokens, not bytes. Every {, }, ", and repeated property name costs tokens, and those tokens cost money and add latency. TOON (Token-Oriented Object Notation) is a compact text format that preserves the JSON data model while stripping extraneous detail and making the content easier for LLMs to interpret.

The problem: repeated property names

Consider a 100-row array of objects - a common pattern when you feed query results or catalogue data into an LLM:

[
  {"id": 1, "name": "Alice", "score": 95},
  {"id": 2, "name": "Bob", "score": 87},
  {"id": 3, "name": "Carol", "score": 91}
]

The property names id, name, and score are repeated on every row. The braces, colons, and quotes add overhead that carries no new information after the first row.

In TOON, the same data is a table:

[3]{id,name,score}:
  1,Alice,95
  2,Bob,87
  3,Carol,91

The field list appears once. Each row is a comma-delimited value list. For arrays with many rows, the token saving is substantial.
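To make the table layout concrete, here is a minimal sketch of how a uniform array could be turned into that shape using only System.Text.Json. It is not the Corvus implementation: EncodeUniformArray and FormatCell are hypothetical helpers, and the sketch assumes every row has the same fields as the first row, holds only primitive values, and needs no quoting or escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.Json;

// Hypothetical helper (not a Corvus API): emits the "[N]{fields}:" header
// followed by one comma-delimited line per row.
static string EncodeUniformArray(string json)
{
    using JsonDocument document = JsonDocument.Parse(json);
    JsonElement root = document.RootElement;
    if (root.ValueKind != JsonValueKind.Array || root.GetArrayLength() == 0)
    {
        throw new ArgumentException("Expected a non-empty JSON array.", nameof(json));
    }

    // The first row defines the field list, which is written exactly once.
    List<string> fields = root[0].EnumerateObject().Select(p => p.Name).ToList();

    var builder = new StringBuilder();
    builder.Append('[').Append(root.GetArrayLength()).Append("]{")
           .Append(string.Join(",", fields)).Append("}:");

    foreach (JsonElement row in root.EnumerateArray())
    {
        // Assumption: each row has every field in the list (GetProperty throws otherwise).
        IEnumerable<string> cells = fields.Select(f => FormatCell(row.GetProperty(f)));
        builder.Append('\n').Append("  ").Append(string.Join(",", cells));
    }

    return builder.ToString();
}

// Strings are written bare; numbers, booleans and null keep their JSON text.
static string FormatCell(JsonElement value) =>
    value.ValueKind == JsonValueKind.String ? value.GetString()! : value.GetRawText();

string json = """
    [
      {"id": 1, "name": "Alice", "score": 95},
      {"id": 2, "name": "Bob", "score": 87},
      {"id": 3, "name": "Carol", "score": 91}
    ]
    """;

string table = EncodeUniformArray(json);
Console.WriteLine(table);

// Character counts are only a rough proxy for tokens, but they show the trend.
Console.WriteLine($"JSON: {json.Length} chars, TOON: {table.Length} chars");
```

For the three-row sample this prints the same table as shown above. A real encoder also has to handle quoting, nested values, and non-uniform rows, which this sketch deliberately leaves out.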

Packages

| Package | Dependency | Use when |
| --- | --- | --- |
| Corvus.Text.Json.Toon | Corvus.Text.Json | You want ParsedJsonDocument<T> and the Corvus document model |
| Corvus.Toon.SystemTextJson | System.Text.Json only | You want TOON conversion without a dependency on Corvus.Text.Json |

Install with:

dotnet add package Corvus.Text.Json.Toon

or, for the lighter-weight package:

dotnet add package Corvus.Toon.SystemTextJson

Parsing TOON into a document

Parse TOON into the same pooled document model used by the rest of Corvus.Text.Json:

using Corvus.Text.Json;
using Corvus.Text.Json.Toon;

string toon = """
    name: Alice
    age: 30
    active: true
    scores[3]: 95,87,92
    """;

using ParsedJsonDocument<JsonElement> document = ToonDocument.Parse<JsonElement>(toon);
JsonElement root = document.RootElement;

Console.WriteLine($"Name:   {root.GetProperty("name").GetString()}");
Console.WriteLine($"Age:    {root.GetProperty("age").GetInt32()}");
Console.WriteLine($"Scores: {root.GetProperty("scores")}");

Output:

Name:   Alice
Age:    30
Scores: [95,87,92]

The returned ParsedJsonDocument<T> uses ArrayPool-backed memory - the same pooled lifetime model described in Part 4.

Converting TOON to JSON

When you need a JSON string (e.g. to pass to another API):

using Corvus.Text.Json.Toon;

string toon = """
    [2]{id,name,score}:
      1,Alice,95
      2,Bob,87
    """;

string json = ToonDocument.ConvertToJsonString(toon);
// [{"id":1,"name":"Alice","score":95},{"id":2,"name":"Bob","score":87}]

Converting JSON to TOON

The reverse direction detects uniform object arrays and emits them as tables automatically:

using Corvus.Text.Json.Toon;

string json = """[{"id":1,"name":"Alice","score":95},{"id":2,"name":"Bob","score":87}]""";
string toon = ToonDocument.ConvertToToonString(json);

Result:

[2]{id,name,score}:
  1,Alice,95
  2,Bob,87

Zero-allocation UTF-8 path

For hot paths, you can write TOON directly to an IBufferWriter<byte>. There is no intermediate string allocation:

using System.Buffers;
using Corvus.Text.Json.Toon;

ArrayBufferWriter<byte> buffer = new(256);
ToonDocument.ConvertToToon(
    """[{"id":1,"name":"Alice","score":95},{"id":2,"name":"Bob","score":87}]"""u8,
    buffer);

ReadOnlySpan<byte> utf8Toon = buffer.WrittenSpan;

This measured 0 B/op in benchmarks. Prefer the UTF-8 overloads whenever your input is already UTF-8 or your output destination accepts bytes.

Reader and writer options

Expanding dotted keys

By default, user.name is a literal property name. Enable path expansion to convert it into nested JSON:

using Corvus.Text.Json.Toon;

ToonReaderOptions options = new()
{
    ExpandPaths = ToonPathExpansion.Safe,
};

string json = ToonDocument.ConvertToJsonString(
    "user.name: Alice\nuser.age: 30",
    options);
// {"user":{"name":"Alice","age":30}}
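If it helps to picture what path expansion does, the following is a rough sketch of the idea using System.Text.Json.Nodes. It is an illustration only, not how Corvus implements ToonPathExpansion.Safe: ExpandDottedKeys is a hypothetical helper, and the conflict rule it applies (throw when a path segment already holds a non-object value) is our assumption for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

// Hypothetical helper (not a Corvus API): splits each key on '.' and
// builds the nested objects needed to hold the final segment.
static JsonObject ExpandDottedKeys(IEnumerable<KeyValuePair<string, JsonNode?>> flat)
{
    var root = new JsonObject();
    foreach ((string key, JsonNode? value) in flat)
    {
        string[] segments = key.Split('.');
        JsonObject current = root;

        // Walk (or create) an object for every segment except the last.
        for (int i = 0; i < segments.Length - 1; i++)
        {
            if (current[segments[i]] is not JsonObject child)
            {
                // Assumed rule for this sketch: never overwrite an existing non-object value.
                if (current.ContainsKey(segments[i]))
                {
                    throw new InvalidOperationException($"'{segments[i]}' is already a non-object value.");
                }

                child = new JsonObject();
                current[segments[i]] = child;
            }

            current = child;
        }

        // The last segment names the property that receives the value.
        current[segments[^1]] = value;
    }

    return root;
}

var flat = new List<KeyValuePair<string, JsonNode?>>
{
    new("user.name", JsonValue.Create("Alice")),
    new("user.age", JsonValue.Create(30)),
};

JsonObject expanded = ExpandDottedKeys(flat);
Console.WriteLine(expanded.ToJsonString());
// {"user":{"name":"Alice","age":30}}
```

With expansion switched off, the equivalent result would simply keep user.name and user.age as two literal top-level property names.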

Folding nested JSON keys

The reverse operation folds nested objects into dotted keys in TOON output:

using Corvus.Text.Json;
using Corvus.Text.Json.Toon;

ToonWriterOptions options = new()
{
    KeyFolding = ToonKeyFolding.Safe,
};

using ParsedJsonDocument<JsonElement> document =
    ParsedJsonDocument<JsonElement>.Parse("""{"user":{"name":"Alice"},"active":true}""");

JsonElement root = document.RootElement;
string toon = ToonDocument.ConvertToToon(in root, options);
// user.name: Alice
// active: true

All options

| Option | Default | Description |
| --- | --- | --- |
| ToonReaderOptions.Strict | true | Checks declared array counts and duplicate object keys |
| ToonReaderOptions.IndentSize | 2 | Spaces per indentation level |
| ToonReaderOptions.ExpandPaths | Off | Expands dotted keys into nested objects when Safe |
| ToonWriterOptions.IndentSize | 2 | Spaces per indentation level |
| ToonWriterOptions.Delimiter | Comma | Delimiter for arrays and tables (Comma, Pipe, or Tab) |
| ToonWriterOptions.KeyFolding | Off | Folds nested objects into dotted keys when Safe |
| ToonWriterOptions.FlattenDepth | int.MaxValue | Max path segments to fold |

Error handling

Invalid TOON input throws ToonException with a 1-based line and column location:

using Corvus.Text.Json.Toon;

try
{
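    // Strict mode (the default) checks declared counts: the header declares 2 items, but only 1 follows.
    ToonDocument.ConvertToJsonString("[2]: 1");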
}
catch (ToonException ex)
{
    Console.WriteLine(ex.Message);
    // Reports the line and column where parsing failed
}
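To see why an input like "[2]: 1" is rejected, here is a small sketch of the kind of declared-count check a strict reader might perform on a single inline array line. CheckDeclaredCount is a hypothetical helper written for this post, not part of the Corvus packages, and it throws a plain FormatException rather than ToonException. It also ignores quoted values that contain commas.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a Corvus API): validates one inline array line
// of the form "[N]: v1,v2,...".
static void CheckDeclaredCount(string line)
{
    Match match = Regex.Match(line, @"^\[(\d+)\]:\s*(.*)$");
    if (!match.Success)
    {
        throw new FormatException("Not an inline array line.");
    }

    int declared = int.Parse(match.Groups[1].Value);
    string body = match.Groups[2].Value;

    // Simplification: values are split on every comma, with no quoting rules.
    int actual = body.Length == 0 ? 0 : body.Split(',').Length;
    if (declared != actual)
    {
        throw new FormatException($"Declared {declared} items but found {actual}.");
    }
}

CheckDeclaredCount("[3]: 95,87,92"); // Counts match, so nothing is thrown

try
{
    CheckDeclaredCount("[2]: 1");
}
catch (FormatException ex)
{
    Console.WriteLine(ex.Message);
    // Declared 2 items but found 1.
}
```

The declared count is what lets a reader detect truncated or malformed input early, which is useful when the TOON text has been produced by an LLM.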

Corvus vs Cysharp

Cysharp's ToonEncoder is an established .NET package for encoding System.Text.Json values to TOON. The key difference: Cysharp is an encoder (JSON → TOON only), while Corvus packages are bidirectional converters. If you need to consume TOON and produce JSON, use Corvus. If you only need to serialize POCOs to TOON, Cysharp may be the simpler fit.

Benchmarks on a 100-row person array show Corvus is 1.04–1.74× faster for encoding, with the UTF-8 buffer path allocating 0 B/op compared to Cysharp's 368–648 B.

Next up

In the final post, we'll cover migration from V4, the production analyzers, and how to get started.

FAQs

What is TOON?
TOON (Token-Oriented Object Notation) is a compact text format for JSON-shaped data. It keeps the JSON data model but removes repeated punctuation and property names. Nested objects use indentation, and uniform arrays of objects become tables.

When should I use TOON instead of JSON?
Use TOON at boundaries where token count matters - LLM prompts, streaming to agents, compact wire formats. Keep JSON for storage, APIs, validation, and application contracts.

What is the difference between Corvus.Text.Json.Toon and Corvus.Toon.SystemTextJson?
Corvus.Text.Json.Toon integrates with the Corvus pooled document model (ParsedJsonDocument). Corvus.Toon.SystemTextJson depends only on System.Text.Json - use it when you don't need the full Corvus pipeline.

Matthew Adams

Co-Founder

Matthew was CTO of a venture-backed technology start-up in the UK & US for 10 years, and is now the co-founder of endjin, which provides technology strategy, experience and development services to its customers who are seeking to take advantage of Microsoft Azure and the Cloud.