Introducing Corvus.Text.Json V5: YAML 1.2 - Zero-Allocation Conversion
At endjin, we maintain Corvus.JsonSchema, and in the previous post we looked at JsonLogic for safe business rules.
Now let's talk about a format that sits alongside JSON in almost every modern development workflow: YAML.
YAML is everywhere
Kubernetes manifests, GitHub Actions workflows, Azure DevOps pipelines, Docker Compose files, Helm charts, OpenAPI specifications. Almost every infrastructure-as-code and CI/CD tool uses YAML as its primary configuration format.
If you validate, transform, or process configuration, you inevitably need to convert YAML to JSON, because schema validators, query languages, and processing tools all operate on JSON.
V5 includes a YAML 1.2 to JSON converter that does this with zero allocation on the hot path.
Quick start
Two packages are available:
# Full Corvus document model - when you want ParsedJsonDocument<T>, schema validation, etc.
dotnet add package Corvus.Text.Json.Yaml
# System.Text.Json only - when you want a lightweight JsonDocument, no Corvus dependencies
dotnet add package Corvus.Yaml.SystemTextJson
Parse YAML to a typed document
using Corvus.Text.Json;
using Corvus.Text.Json.Yaml;
string yaml = """
    name: Alice
    age: 30
    hobbies:
      - reading
      - cycling
    """;
using var doc = YamlDocument.Parse<JsonElement>(yaml);
JsonElement root = doc.RootElement;
Console.WriteLine(root.GetProperty("name").GetString()); // "Alice"
Console.WriteLine(root.GetProperty("age").GetInt32()); // 30
That gives you a ParsedJsonDocument<JsonElement>, the same pooled-memory document we discussed in post 4. From here you can validate against a schema, query with JMESPath or JSONata, mutate with a builder, or just read values.
Parse YAML directly to a strongly-typed element
This is where the real power of YAML-to-JSON conversion becomes clear. Because YamlDocument.Parse<T> is generic over any IJsonElement<T>, you can parse YAML directly into a schema-generated type. There is no intermediate untyped step: the result is strongly typed from the moment you access it, and schema validation is available on the same value.
Consider a Kubernetes Deployment manifest. You'd typically write it in YAML, but the Kubernetes API schema is published as JSON Schema. Generate your types from that schema, and then:
// Your YAML manifest - the format every Kubernetes user writes in
string manifest = """
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web-frontend
      labels:
        app: web
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
            - name: nginx
              image: nginx:1.27
              ports:
                - containerPort: 80
    """;
// Parse directly to the generated Deployment type
using var doc = YamlDocument.Parse<Deployment>(manifest);
Deployment deployment = doc.RootElement;
// Strongly-typed access - IntelliSense, compile-time safety, no casting
string name = (string)deployment.Metadata.Name; // "web-frontend"
int replicas = (int)deployment.Spec.Replicas; // 3
string image = (string)deployment.Spec.Template.Spec
    .Containers[0].Image; // "nginx:1.27"
// Schema validation is built in
bool isValid = deployment.EvaluateSchema();
The YAML bytes flow through the tokenizer into the document's pooled memory, and you get a typed view with full IntelliSense and schema validation. The same pattern works for any schema-defined format that people author in YAML: OpenAPI specifications, GitHub Actions workflows, Helm values files, Azure Resource Manager templates, and more.
Convert to a JSON string
string json = YamlDocument.ConvertToJsonString("key: value");
Console.WriteLine(json); // {"key":"value"}
Stream to a Utf8JsonWriter
For pipeline scenarios where you're writing directly to an output buffer:
using var stream = new MemoryStream();
using var writer = new Utf8JsonWriter(stream,
    new JsonWriterOptions { Indented = true });
YamlDocument.Convert("items:\n - one\n - two"u8, writer);
writer.Flush();
System.Text.Json only
If you don't need the Corvus document model:
using Corvus.Yaml;
string yaml = "name: Bob\nage: 25";
using JsonDocument doc = YamlDocument.Parse(yaml);
Console.WriteLine(doc.RootElement.GetProperty("name").GetString());
It uses the same tokenizer and achieves the same conformance, without any Corvus dependency.
How it works
The converter uses a custom ref struct tokenizer that operates directly on UTF-8 bytes. There's no intermediate object model. The tokenizer emits events (scalar, sequence start, mapping start, etc.) that the converter translates directly into Utf8JsonWriter calls.
This means the hot path allocates nothing. The YAML goes in as bytes, the JSON comes out through a writer, and the only allocations are the ones Utf8JsonWriter makes for its own output buffer (which is pooled if you configure it that way).
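As a rough illustration of that event-to-writer translation, the sketch below replays a hand-written list of events into a Utf8JsonWriter. The event tuples, the kind names ("map+", "seq+", and so on), and the Replay function are hypothetical stand-ins invented for this example, not the library's types; only the Utf8JsonWriter calls are real System.Text.Json API. Unlike the real converter, it allocates freely (strings, an array, a stack), so it only shows the shape of the mapping.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical event stream for a mapping with a string and a sequence.
// Kinds: "map+" / "map-" (mapping start/end), "seq+" / "seq-", "scalar".
var events = new (string Kind, string? Value)[]
{
    ("map+", null),
    ("scalar", "name"), ("scalar", "Alice"),
    ("scalar", "hobbies"),
    ("seq+", null), ("scalar", "reading"), ("scalar", "cycling"), ("seq-", null),
    ("map-", null),
};

static string Replay(IEnumerable<(string Kind, string? Value)> source)
{
    using var stream = new MemoryStream();
    using (var writer = new Utf8JsonWriter(stream))
    {
        // One entry per open container: true = mapping, false = sequence.
        var openContainers = new Stack<bool>();
        bool expectKey = false; // inside a mapping, is the next scalar a key?

        foreach (var (kind, value) in source)
        {
            switch (kind)
            {
                case "map+":
                    writer.WriteStartObject();
                    openContainers.Push(true);
                    expectKey = true;
                    break;
                case "seq+":
                    writer.WriteStartArray();
                    openContainers.Push(false);
                    expectKey = false;
                    break;
                case "map-":
                case "seq-":
                    if (openContainers.Pop()) writer.WriteEndObject();
                    else writer.WriteEndArray();
                    // The closed container was a value; a parent mapping wants a key next.
                    expectKey = openContainers.Count > 0 && openContainers.Peek();
                    break;
                case "scalar" when expectKey:
                    writer.WritePropertyName(value!);
                    expectKey = false;
                    break;
                case "scalar":
                    writer.WriteStringValue(value);
                    expectKey = openContainers.Count > 0 && openContainers.Peek();
                    break;
            }
        }
    } // disposing the writer flushes it to the stream

    return Encoding.UTF8.GetString(stream.ToArray());
}

Console.WriteLine(Replay(events));
// {"name":"Alice","hobbies":["reading","cycling"]}
```

Every scalar here is written as a JSON string; deciding whether a scalar is a number, boolean, or null is the job of the schema modes described later.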
Event streaming
The internal event model is also exposed as a public API. YamlDocument.EnumerateEvents calls your callback for each parse event, giving you zero-copy access to the raw UTF-8 data:
YamlDocument.EnumerateEvents(yamlBytes, static (in YamlEvent e) =>
{
    switch (e.Type)
    {
        case YamlEventType.Scalar:
            Console.WriteLine($"Scalar: {Encoding.UTF8.GetString(e.Value)}");
            break;
        case YamlEventType.MappingStart:
            Console.WriteLine("Object start");
            break;
        case YamlEventType.SequenceStart:
            Console.WriteLine("Array start");
            break;
    }
    return true; // continue parsing (return false to stop early)
});
Each YamlEvent is a ref struct whose spans point directly into the source buffer. The event types mirror the YAML specification: StreamStart/End, DocumentStart/End, MappingStart/End, SequenceStart/End, Scalar, and Alias. Events also carry line/column positions, anchor names, tags, and scalar styles. This is useful when you need to process YAML without converting to JSON at all. For example, you might extract specific values from a large file without parsing the whole thing.
JSON to YAML
Conversion works in both directions. YamlDocument.ConvertToYamlString takes a JSON element or raw UTF-8 JSON and produces YAML output:
string yaml = YamlDocument.ConvertToYamlString(
"""{"name": "Alice", "roles": ["admin", "user"]}""");
// name: Alice
// roles:
// - admin
// - user
There's also a streaming overload that writes to an IBufferWriter<byte> or Stream:
YamlDocument.ConvertToYaml(jsonElement, outputStream);
YamlWriterOptions controls the output format. IndentSize sets the indentation width, and SkipValidation disables structural validation for a small performance gain:
var options = new YamlWriterOptions { IndentSize = 4 };
string yaml = YamlDocument.ConvertToYamlString(json, options);
This works with both System.Text.Json.JsonElement and Corvus IJsonElement<T> types, so you can round-trip YAML through a ParsedJsonDocument. Once parsed, the document can be changed through a builder and written back out as YAML.
Utf8YamlWriter
For fine-grained control over YAML output, Utf8YamlWriter is a ref struct that writes directly to an IBufferWriter<byte> or Stream. Its API mirrors System.Text.Json.Utf8JsonWriter, so the programming model will feel familiar:
var bufferWriter = new ArrayBufferWriter<byte>();
using var writer = new Utf8YamlWriter(bufferWriter, new YamlWriterOptions { IndentSize = 2 });
writer.WriteStartMapping();
writer.WritePropertyName("name"u8);
writer.WriteStringValue("Alice"u8);
writer.WritePropertyName("roles"u8);
writer.WriteStartSequence();
writer.WriteStringValue("admin"u8);
writer.WriteStringValue("user"u8);
writer.WriteEndSequence();
writer.WriteEndMapping();
This produces:
name: Alice
roles:
- admin
- user
The writer supports block and flow collection styles. You can mix them in the same document. For example, use flow style for short inline sequences:
writer.WritePropertyName("tags"u8);
writer.WriteStartSequence(YamlCollectionStyle.Flow);
writer.WriteStringValue("v5"u8);
writer.WriteStringValue("release"u8);
writer.WriteEndSequence();
// tags: [v5, release]
When SkipValidation is false (the default), the writer validates structural correctness. Property names must precede values in mappings, containers must be properly closed, and you can't write a second root value. This catches mistakes at the point of the write call rather than producing silently broken output.
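Since the API mirrors Utf8JsonWriter, the same kind of check can be observed with System.Text.Json's own writer, which validates by default and throws InvalidOperationException when a value is written inside an object without a property name. The sketch below shows that JSON-side behaviour only. Which exception type Utf8YamlWriter throws isn't stated here, so treat the parallel as an assumption; ThrowsOnValueWithoutName is a helper made up for this example.

```csharp
using System;
using System.Buffers;
using System.Text.Json;

static bool ThrowsOnValueWithoutName(bool skipValidation)
{
    var buffer = new ArrayBufferWriter<byte>();
    using var writer = new Utf8JsonWriter(
        buffer, new JsonWriterOptions { SkipValidation = skipValidation });

    writer.WriteStartObject();
    try
    {
        // Structural mistake: a value inside an object with no property name.
        writer.WriteStringValue("Alice");
        return false;
    }
    catch (InvalidOperationException)
    {
        return true;
    }
}

Console.WriteLine(ThrowsOnValueWithoutName(skipValidation: false)); // True
Console.WriteLine(ThrowsOnValueWithoutName(skipValidation: true));  // False
```

With validation skipped, the JSON writer emits the malformed output without complaint, which is the trade-off the SkipValidation option name implies.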
Schema modes
The converter supports four YAML schema modes:
| Schema | Behaviour |
|---|---|
| Core (default) | YAML 1.2 Core Schema. Recognizes null, true/false, integers (decimal, 0o77, 0xFF), floats (decimal, .inf, .nan) |
| JSON | Strict JSON-only: only null, true/false, and JSON-style numbers |
| Failsafe | All scalars become JSON strings. No implicit type coercion |
| YAML 1.1 | Backward compatibility. Adds yes/no/on/off/y/n booleans, sexagesimal integers, and merge keys (<<) |
var options = new YamlReaderOptions
{
    Schema = YamlSchema.Core,
    DocumentMode = YamlDocumentMode.SingleRequired,
    DuplicateKeyBehavior = DuplicateKeyBehavior.Error,
};
using var doc = YamlDocument.Parse<JsonElement>(yaml, options);
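To make the Core row of the table concrete, here is a minimal sketch of how a plain scalar might be classified under the YAML 1.2 Core Schema. ClassifyCore is a hypothetical helper written for illustration, using the regular expressions from the YAML 1.2 specification's Core Schema section; it is not the library's tokenizer, which works on UTF-8 bytes without allocating.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the JSON-ish kind a plain (unquoted) scalar
// resolves to under the YAML 1.2 Core Schema.
static string ClassifyCore(string scalar)
{
    if (scalar is "" or "~" or "null" or "Null" or "NULL") return "null";
    if (scalar is "true" or "True" or "TRUE" or "false" or "False" or "FALSE") return "bool";
    if (Regex.IsMatch(scalar, @"^[-+]?[0-9]+$")) return "int";          // decimal
    if (Regex.IsMatch(scalar, @"^0o[0-7]+$")) return "int";             // octal
    if (Regex.IsMatch(scalar, @"^0x[0-9a-fA-F]+$")) return "int";       // hexadecimal
    if (Regex.IsMatch(scalar, @"^[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?$")) return "float";
    if (Regex.IsMatch(scalar, @"^[-+]?\.(inf|Inf|INF)$")) return "float";
    if (Regex.IsMatch(scalar, @"^\.(nan|NaN|NAN)$")) return "float";
    return "string";
}

foreach (var s in new[] { "30", "0o77", "0xFF", "1.5e3", ".inf", "true", "yes", "~", "nginx:1.27" })
{
    Console.WriteLine($"{s} -> {ClassifyCore(s)}");
}
```

Note that "yes" stays a string under Core; per the table, only the YAML 1.1 mode treats it as a boolean. Under Failsafe, every input above would be a string.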
Multi-document streams
YAML supports multiple documents in a single stream, separated by ---:
---
name: Alice
---
name: Bob
Set DocumentMode = YamlDocumentMode.MultiAsArray to wrap all documents in a JSON array:
var options = new YamlReaderOptions
{
    DocumentMode = YamlDocumentMode.MultiAsArray,
};
using var doc = YamlDocument.Parse<JsonElement>(multiDocYaml, options);
// Result: [{"name":"Alice"},{"name":"Bob"}]
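Once the documents are wrapped in an array, they can be consumed like any other JSON array. The sketch below starts from the JSON result shown in the comment and uses plain System.Text.Json, so it doesn't depend on the converter at all; ReadNames is a hypothetical helper for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: pull the "name" property out of each document in the array.
static List<string> ReadNames(string jsonArray)
{
    using JsonDocument doc = JsonDocument.Parse(jsonArray);
    var names = new List<string>();
    foreach (JsonElement document in doc.RootElement.EnumerateArray())
    {
        names.Add(document.GetProperty("name").GetString() ?? "");
    }
    return names;
}

List<string> result = ReadNames("""[{"name":"Alice"},{"name":"Bob"}]""");
Console.WriteLine(string.Join(", ", result)); // Alice, Bob
```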
All YAML features
The converter supports every YAML 1.2 feature:
- Scalar styles: plain, single-quoted, double-quoted, literal block (|), folded block (>)
- Collections: block and flow sequences, block and flow mappings
- Anchors and aliases: &anchor and *alias with billion-laughs protection
- Tags: !!str, !!int, !!float, !!null, !!bool, !!seq, !!map, and custom tags
- Multi-document: --- and ... document markers
- Comments: preserved in the event stream (ignored in JSON output)
Billion-laughs protection
The YAML "billion laughs" attack uses nested anchor/alias expansion to create exponentially large documents from tiny input. The converter enforces two configurable limits:
var options = new YamlReaderOptions
{
    MaxAliasExpansionDepth = 64, // Default
    MaxAliasExpansionSize = 1_000_000, // Default - max nodes from alias expansion
};
Expansion that exceeds either limit throws a YamlException.
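To see why a size limit is needed alongside a depth limit, consider the classic shape of the attack: each level defines an anchor whose sequence references the previous level's anchor several times. The sketch below only does the arithmetic for that shape, assuming a fan-out of 9 references per level. It doesn't call the converter, and the way the library counts nodes internally may differ.

```csharp
using System;

// Leaves produced by fully expanding `levels` nested anchors, where each
// level references the previous one `fanOut` times: fanOut ^ levels.
static long ExpandedLeafCount(int fanOut, int levels)
{
    long count = 1;
    for (int i = 0; i < levels; i++)
    {
        count = checked(count * fanOut); // throw rather than silently overflow
    }
    return count;
}

const long maxExpansionSize = 1_000_000; // the default quoted for MaxAliasExpansionSize

for (int depth = 1; depth <= 9; depth++)
{
    long leaves = ExpandedLeafCount(9, depth);
    string verdict = leaves > maxExpansionSize ? "over the size limit" : "ok";
    Console.WriteLine($"{depth} levels -> {leaves} leaves ({verdict})");
}
```

With this fan-out, six levels expand to 531,441 leaves and seven to 4,782,969, so the size limit is crossed at a nesting depth far below the default depth limit of 64.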
Conformance
The converter passes 100% of the JSON-testable cases in the yaml-test-suite. That means 279 valid and 94 error cases (373 of 402 total). The remaining 29 cases exercise YAML features with no JSON equivalent (complex keys, empty keys, bare tags) and don't provide JSON reference output.
Next up
In the next post, we'll look at JSON Patch. It provides RFC 6902 support with a fluent builder that operates directly on the mutable document model.