By Jonathan George, Software Engineer IV
Optimising .NET code: Avoiding allocations using Span<T>

In this series we're looking at ways to improve the performance and memory efficiency of our .NET code, using my solutions to Advent of Code puzzles as a testing ground. In the first post, I introduced 2022's Day 1 puzzle - Calorie Counting - and my initial LINQ-based solution:

public string Solve(string input)
{
    return input.Split(Environment.NewLine + Environment.NewLine)
        .Max(row => row.Split(Environment.NewLine).Select(int.Parse).Sum())
        .ToString();
}

The initial benchmark figures for this code looked like this:

| Method              | Mean     | Error    | StdDev   | Gen0   | Gen1   | Allocated |
|-------------------- |---------:|---------:|---------:|-------:|-------:|----------:|
| RunAgainstRealInput | 78.23 us | 0.512 us | 0.212 us | 7.6904 | 0.7324 | 126.43 KB |

In part 2, we went through a basic pass of trying to understand where the memory is being allocated, which showed us that the majority of our allocations come from calls to string.Split. And in part 3, we had a look at the impact of LINQ and found that if we want to get allocations as close to zero as possible we need to move away from it.

As a result, the code currently looks like this:

public string Solve(string input)
{
    string[] rucksacks = input.Split(Environment.NewLine + Environment.NewLine);

    int currentMaxElfLoad = 0;

    foreach (string rucksack in rucksacks)
    {
        int currentElfLoad = CountRucksackCalories(rucksack);

        if (currentElfLoad > currentMaxElfLoad)
        {
            currentMaxElfLoad = currentElfLoad;
        }
    }

    return currentMaxElfLoad.ToString();
}

private static int CountRucksackCalories(string rucksack)
{
    int sum = 0;

    string[] rucksackItems = rucksack.Split(Environment.NewLine);

    foreach (string item in rucksackItems)
    {
        sum += int.Parse(item);
    }

    return sum;
}

| Method              | Mean     | Error    | StdDev   | Gen0   | Gen1   | Allocated |
|-------------------- |---------:|---------:|---------:|-------:|-------:|----------:|
| RunAgainstRealInput | 59.84 us | 0.396 us | 0.371 us | 7.0190 | 0.7324 | 114.73 KB |

At the moment, we're making two calls to string.Split: first to split the input into "rucksacks", and then to split each rucksack into its individual lines. Each of these calls results in a significant chunk of memory being allocated.

The reason for this is that each of these calls results in strings being copied in memory. To avoid that, we could rewrite the code to only split once and detect when we're on a new rucksack by looking for empty lines.

public string Solve(string input)
{
    string[] contents = input.Split(Environment.NewLine);

    int currentMaxElfLoad = 0;
    int currentElfLoad = 0;

    foreach (string item in contents)
    {
        if (string.IsNullOrEmpty(item))
        {
            if (currentElfLoad > currentMaxElfLoad)
            {
                currentMaxElfLoad = currentElfLoad;
            }

            currentElfLoad = 0;
        }
        else
        {
            currentElfLoad += int.Parse(item);
        }
    }

    if (currentElfLoad > currentMaxElfLoad)
    {
        currentMaxElfLoad = currentElfLoad;
    }

    return currentMaxElfLoad.ToString();
}

| Method              | Mean     | Error    | StdDev   | Gen0   | Gen1   | Allocated |
|-------------------- |---------:|---------:|---------:|-------:|-------:|----------:|
| RunAgainstRealInput | 54.71 us | 0.808 us | 0.716 us | 4.8828 | 1.1597 | 80.39 KB  |

This is better again - we've saved ourselves just over 30 KB. The only way we could reduce this further is to avoid the copying we incur as a result of the remaining call to string.Split. This is where one of the new types introduced in .NET Core, Span<T>, comes into play.

Introducing Span<T>

Span<T> is a new value type which represents a contiguous region of memory. The memory in question can be allocated on the stack or the heap - or even, in some cases, external memory that isn't owned by the CLR - and you can create one from an existing array. Essentially, it's like a pointer with benefits.

Since a string is essentially an array of char, it will come as no surprise that it's possible to create a ReadOnlySpan<char> (which is exactly what it sounds like) from an existing string, using the AsSpan() method.
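
To make that concrete, here's a small illustrative snippet (not part of the puzzle solution) showing spans being created over an existing array and over part of an existing string. In neither case is any data copied - the spans simply point at the underlying memory:

int[] numbers = { 1, 2, 3, 4, 5 };

// A Span<int> over the middle three elements of the array - no copy is made,
// so writing through the span writes to the original array.
Span<int> middle = numbers.AsSpan(1, 3);
middle[0] = 42; // numbers is now { 1, 42, 3, 4, 5 }

// A ReadOnlySpan<char> over part of a string - again, no copy.
string text = "Calorie Counting";
ReadOnlySpan<char> word = text.AsSpan(8, 8); // "Counting"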

Why is this helpful? Firstly, many other parts of the framework have been modified to work with spans. For example, the int.Parse method we're using in our code can now take a ReadOnlySpan<char> as its argument. Instead of needing a string containing the number, we can pass our safe pointer to the portion of the string which contains that number, which results in no allocation at all.

Here's some code which demonstrates exactly that:

[MemoryDiagnoser]
public class ParsingIntsWithSpans
{
    private const string InputData = "This string contains the number 1532 at position 32";

    [Benchmark(Baseline = true)]
    public void IntParseSubstring()
    {
        string raw = InputData[32..36];

        int result = int.Parse(raw);
    }

    [Benchmark]
    public void IntParseSpan()
    {
        ReadOnlySpan<char> input = InputData.AsSpan();
        ReadOnlySpan<char> raw = input[32..36];

        int result = int.Parse(raw);
    }
}

The allocation in the first method comes from copying the relevant section of the InputData string into a new string. In the second method, we bypass that by obtaining a ReadOnlySpan<char> which points to the portion of InputData we care about.

| Method            | Mean      | Error     | StdDev    | Ratio | Gen0   | Allocated | Alloc Ratio |
|------------------ |----------:|----------:|----------:|------:|-------:|----------:|------------:|
| IntParseSubstring | 12.160 ns | 0.0448 ns | 0.0397 ns |  1.00 | 0.0019 | 32 B      | 1.00        |
| IntParseSpan      |  8.510 ns | 0.0056 ns | 0.0050 ns |  0.70 | -      | -         | 0.00        |

Unfortunately there isn't a native equivalent of the string.Split method that returns spans. Fortunately, this gap has been filled by Gérald Barré, who's published some code on his blog that does exactly what we need.

As it can no longer return an array of strings (since that is exactly what we need to avoid), the new SplitLines extension method returns a StringSplitEnumerator that allows us to move through the segments of the input data. This doesn't implement IEnumerator, because that would result in an allocation, but it does provide the GetEnumerator, MoveNext and Current members that a foreach loop needs.

The enumerator gives access to a LineSplitEntry, which contains the current line as a ReadOnlySpan<char> as well as its position in the input string. Both StringSplitEnumerator and LineSplitEntry are ref structs - a relatively recent addition to .NET whose instances are guaranteed to be allocated on the stack. We'll talk about these more in future posts.
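
To give a feel for the shape of that pattern, here's a heavily simplified sketch of how such an enumerator can be put together. This isn't Gérald Barré's actual implementation - it omits details like exposing the separator - but it shows how a pair of ref structs can drive a foreach loop without allocating:

public static class StringExtensions
{
    public static LineSplitEnumerator SplitLines(this string input) =>
        new LineSplitEnumerator(input.AsSpan());

    public ref struct LineSplitEnumerator
    {
        private ReadOnlySpan<char> remaining;

        public LineSplitEnumerator(ReadOnlySpan<char> input)
        {
            this.remaining = input;
            this.Current = default;
        }

        // foreach only needs GetEnumerator(), MoveNext() and Current - no interface required.
        public LineSplitEnumerator GetEnumerator() => this;

        public LineSplitEntry Current { get; private set; }

        public bool MoveNext()
        {
            if (this.remaining.Length == 0)
            {
                return false;
            }

            int index = this.remaining.IndexOfAny('\r', '\n');

            if (index == -1)
            {
                // Last line - no separator left.
                this.Current = new LineSplitEntry(this.remaining);
                this.remaining = ReadOnlySpan<char>.Empty;
                return true;
            }

            this.Current = new LineSplitEntry(this.remaining[..index]);

            // Skip a two-character \r\n separator, or a single \r or \n.
            bool isCrLf = this.remaining[index] == '\r'
                && index + 1 < this.remaining.Length
                && this.remaining[index + 1] == '\n';
            this.remaining = this.remaining[(index + (isCrLf ? 2 : 1))..];
            return true;
        }
    }

    public readonly ref struct LineSplitEntry
    {
        public LineSplitEntry(ReadOnlySpan<char> line)
        {
            this.Line = line;
        }

        public ReadOnlySpan<char> Line { get; }
    }
}

The important point is that every LineSplitEntry is just a view over the original input string - nothing is copied and nothing ends up on the heap.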

With that code safely embedded in the solution, let's have a look at how the code changes.

public string Solve(string input)
{
    int currentMaxElfLoad = 0;
    int currentElfLoad = 0;

    foreach (StringExtensions.LineSplitEntry entry in input.SplitLines())
    {
        if (entry.Line.Length == 0)
        {
            if (currentElfLoad > currentMaxElfLoad)
            {
                currentMaxElfLoad = currentElfLoad;
            }

            currentElfLoad = 0;
        }
        else
        {
            currentElfLoad += int.Parse(entry.Line);
        }
    }

    if (currentElfLoad > currentMaxElfLoad)
    {
        currentMaxElfLoad = currentElfLoad;
    }

    return currentMaxElfLoad.ToString();
}

As you can see, the code looks very similar. Instead of splitting the input and iterating over the resulting array, we're iterating over the result of calling input.SplitLines(), with each entry being a LineSplitEntry. This contains a Line property which is a ReadOnlySpan<char>.

The results of benchmarking that code speak for themselves:

| Method              | Mean     | Error    | StdDev   | Allocated |
|-------------------- |---------:|---------:|---------:|----------:|
| RunAgainstRealInput | 45.77 us | 0.140 us | 0.131 us | 32 B      |

We've gone from our original code, which took 78.23 us to run and allocated 126.43 KB of memory, to our reworked code, which takes just over half the time and allocates only 32 B of memory.

The performance gain is interesting. Around three fifths of it seems to have come from dropping LINQ, and the other two fifths from avoiding memory allocation - after all, we haven't come up with a new and innovative way to sum the contents of each rucksack and then find the max; we still have to add up every number and keep track of our current maximum.

That remaining 32 B is a result of calling currentMaxElfLoad.ToString() in the return statement. If I hadn't imposed that limitation on my code, we'd be able to get all the way down to zero.

But if, as I said in my previous post, the overhead of using LINQ is actually negligible, why do my results show getting rid of it as having had the biggest impact on performance? In reality, that's a side effect of the way I approached the problem; getting rid of LINQ early made it easier to introduce the new code for zero-allocation string splitting, and I knew it would be necessary to reduce allocations as far as possible. If I'd been happy to accept the minimal overhead and allocations that LINQ brings, but still brought in the Span<T> based code, we would still have seen significant performance improvements. But I'll leave the proof of that as an exercise for the reader.

Summary

We've learned a few things as part of working through this relatively simple problem.

  • Strings are a big source of memory allocations and you're probably copying them more than you think. Span<T> is the first big thing that can help here.
  • Reducing memory allocations will almost certainly give you a performance boost.
  • LINQ is great for writing simple code, but comes with a performance penalty (both speed and memory).
  • Writing performant code will most likely result in writing more code that's potentially harder to understand.

Finally, a word of warning: this work is addictive. Once you've started analysing memory usage and begun to cut down allocations in your code, it can become an obsession. That's the reason for this blog series - I'd like to get better at optimising code like this and I hope that others will get something out of my experience too.

In the next post we'll look at part 2 of the first day's problem and see what can be done to improve things there.
