Optimising .NET code: Hunting for allocations
In this series we're looking at ways to improve the performance and memory efficiency of our .NET code, using my solutions to Advent of Code puzzles as a testing ground. In the first post, I introduced 2022's Day 1 puzzle - Calorie Counting - and my initial LINQ-based solution:
public string Solve(string input)
{
return input.Split(Environment.NewLine + Environment.NewLine)
.Max(row => row.Split(Environment.NewLine).Select(int.Parse).Sum())
.ToString();
}
The initial benchmark figures for this code looked like this:
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 80.57 us | 1.596 us | 2.184 us | 7.6904 | 0.7324 | 126.43 KB |
Whilst this seems to be running pretty fast, it's allocating what seems like a relatively large amount of memory, 126 KB, to get to the answer. If we're going to try and reduce this, we need to know what that memory's being used for.
In this post, we're going to take a pretty basic approach to determining where the allocations come from - we'll look at more advanced techniques in a future post. For now though, we're going to start from an empty method and add the components of the solution back in, one at a time, and look at the benchmark report in each step.
Step 1 - empty method returning string.Empty
public string Solve(string input)
{
return string.Empty();
}
Method | Mean | Error | StdDev | Allocated |
---|---|---|---|---|
RunAgainstRealInput | 1.096 ns | 0.0018 ns | 0.0017 ns | - |
Excellent performance and no allocations at all - as you'd expect.
Step 2 - splitting the input into chunks
The first step is to split the input into "chunks", with each representing an "elf rucksack".
public string Solve(string input)
{
string[] rucksacks = input.Split(Environment.NewLine + Environment.NewLine);
return string.Empty;
}
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 13.51 us | 0.230 us | 0.192 us | 1.8921 | 0.1984 | 31.09 KB |
Immediately we see this causes a chunk of memory to be allocated. This is the result of creating our array of chunks; each segment of the input is copied into a new string. Strings are allocated on the heap, hence we suddenly see that we've allocated just over 30KB.
This is interesting because my input file is only about 13KB on disk. Why does it take up so much more space in memory?
Aside: String representation in memory
A quick check will tell us that the input is 12,718 characters long, including whitespace. On disk, the file is encoded as UTF-8, and the relatively simple nature of the file means that each character will only require a single byte on disk, explaining the ~13KB size. However, .NET strings represent text as a sequence of UTF-16 code units, meaning that each character takes up 2 bytes in memory.
But that still only adds up to 25.44KB. What about the rest?
To understand what's going on, we need to know a few things about what's going on under the covers when we create these strings. Firstly, a string is a reference type, meaning it's stored on the heap. When a string is created, a chunk of memory is set aside for it. That chunk of memory needs to contain all of the two-byte characters that make up the string, and it also contains some additional information; a 4-byte field called the SyncBlockIndex
, another 4-byte field which contains the type pointer and a final 4 bytes which contains the string length. Although .NET does not use null-terminated strings, they do have a 2-byte null terminator at the end to allow them to be passed to native code without need for copying. This means that the total memory allocated for each string is 4 + 4 + 4 + (length * 2) + 2
bytes. So when our 12,718-character input gets split into individual "rucksacks", of which there are 249, we need an additional 249 * 14 bytes of memory to store them.
So that brings us up to 28.24B. We're still missing just under 3KB though; we can account for a tiny amount of that because we're storing the strings in an array, which has its own overhead. The rest can be chalked up to things happening as part of the implementation of string.Split
.
Let's move onto the next step.
Step 3 - splitting each chunk into its component parts
Next, we'd like to see the impact of splitting each "rucksack" into its constituent parts. This is where things get a bit hairy - in the original code, we're not storing these anywhere, we're parsing the numbers then adding them up. The closest I can get to approximating this is as follows:
public string Solve(string input)
{
string[] rucksacks = input.Split(Environment.NewLine + Environment.NewLine);
var contents = rucksacks.Select(row => row.Split(Environment.NewLine).Count()).ToArray();
return string.Empty;
}
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 42.33 us | 0.235 us | 0.220 us | 7.0801 | 0.8545 | 115.74 KB |
As you'd expect we now have a bunch more allocations as a result of calling row.Split()
on each rucksack. This is because we now have both the original chunks and the new individual rows in memory.
Step 4 - parsing the contents as integers and calculating the result
public string Solve(string input)
{
string[] rucksacks = input.Split(Environment.NewLine + Environment.NewLine);
int result = rucksacks.Max(row => row.Split(Environment.NewLine).Select(int.Parse).Sum());
return string.Empty;
}
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 77.63 us | 0.735 us | 0.688 us | 7.6904 | 0.7324 | 126.4 KB |
We've had to allocate another ~12KB to do this, and it's not entirely obvious where this would be happening. Something to dig into shortly. We've also nearly doubled our execution time; this is unsurprising as this is the guts of the work.
Step 5 - remove the intermediate variable
public string Solve(string input)
{
int result = input.Split(Environment.NewLine + Environment.NewLine)
.Max(row => row.Split(Environment.NewLine).Select(int.Parse).Sum());
return string.Empty;
}
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 76.63 us | 0.531 us | 0.471 us | 7.6904 | 0.7324 | 126.4 KB |
Here we're not longer storing the results of the initial call to string.Split
as a string[]
variable. And this makes absolutely no difference; the memory still needs to be allocated to do the work, it doesn't matter whether we store it in a variable or not.
Aside - what is a variable?
Why doesn't it make any difference whether we store the results of string.Split
as a variable?
To understand that, we need to understand what a variable actually is. At this point, I suggest you read my colleague Liam's excellent blog post explaining variables, the stack and the heap.
Our array of strings is stored on the heap - arrays are reference types in C#, so are always stored on the heap (although there are some interesting tricks in this area which we'll return to later.)
This means that when we have something like this:
string[] myArray = new[] { "this", "is", "an", "array" };
what we end up with is a chunk of memory allocated on the heap to store the data, and a pointer to that data on the stack. This is why you can have multiple variables pointing to the same thing:
string[] myArray = new[] { "this", "is", "an", "array" };
string[] myArrayAgain = myArray;
string[] myArrayStrikesAgain = myArrayAgain;
In this code, myArray
, myArrayAgain
and myArrayStrikesAgain
are all separate things on the stack, but they all point to the same data on the heap.
This is why removing our intermediate variable in the code above made no difference to the amount of memory allocated on the heap. The data exists there regardless of how many things we have on the stack that point at it.
Let's demonstrate this further with some more benchmarks:
[MemoryDiagnoser]
public class Variables
{
private const string Input = "Lorem ipsum dolor sit amet..."; // Snipped here, but the full code is longer.
[Benchmark(Baseline = true)]
public void StringSplitAndAssignment()
{
var target = Input.Split(' ');
}
[Benchmark]
public void StringSplitWithDiscard()
{
_ = Input.Split(' ');
}
[Benchmark]
public void StringSplitWithMultipleAssignments()
{
var target = Input.Split(' ');
var target2 = target;
var target3 = target;
}
[Benchmark]
public void MultipleStringSplitAndAssignment()
{
var target = Input.Split(' ');
var target2 = Input.Split(' ');
var target3 = Input.Split(' ');
}
}
Method | Mean | Error | StdDev | Ratio | RatioSD | Gen0 | Gen1 | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
StringSplitAndAssignment | 1.132 us | 0.0171 us | 0.0160 us | 1.00 | 0.00 | 0.3471 | 0.0057 | 5.68 KB | 1.00 |
StringSplitWithDiscard | 1.155 us | 0.0208 us | 0.0194 us | 1.02 | 0.02 | 0.3471 | 0.0057 | 5.68 KB | 1.00 |
StringSplitWithMultipleAssignments | 1.133 us | 0.0188 us | 0.0176 us | 1.00 | 0.02 | 0.3471 | 0.0057 | 5.68 KB | 1.00 |
MultipleStringSplitAndAssignment | 3.385 us | 0.0293 us | 0.0245 us | 2.99 | 0.04 | 1.0414 | 0.0191 | 17.04 KB | 3.00 |
Here we can see that the first three methods which all carry out a single invocation of string.Split()
, all allocate exactly the same amount of memory regardless of how many variables we have for the result. The only method that actually results in extra memory being allocated is the third, and in this case each of the variables end up referring to a completely different array of strings on the heap.
Step 6 - return the result
And finally back to the original code:
public string Solve(string input)
{
return input.Split(Environment.NewLine + Environment.NewLine)
.Max(row => row.Split(Environment.NewLine).Select(int.Parse).Sum())
.ToString();
}
Method | Mean | Error | StdDev | Gen0 | Gen1 | Allocated |
---|---|---|---|---|---|---|
RunAgainstRealInput | 75.70 us | 0.538 us | 0.504 us | 7.6904 | 0.7324 | 126.43 KB |
Which causes what looks like an extra 0.03 KB to be allocated.
Summary
So, what did the above tell us?
- The majority of our allocations come from calls to
string.Split
. - There was some additional memory allocated at the point we parsed the integers and added them up, but we're not entirely clear where that came from.
- Storing the results of each step in intermediate variables made no difference to the amount of memory allocated.
- Converting the final result to a string resulted in a small amount of additional memory allocated at the end.
The obvious takeaway from this is that we need to look at the strings. But before we get into that, there's something else we need to have a look at, which we'll pick up in the next post.