C# 7.2 added support for ref struct types, and as I've discussed before, these are critical to achieving high throughput in certain kinds of code by minimizing GC overhead and copying. But you pay a price in flexibility. In that last article, I showed how to work around the fact that you cannot use a ref struct inside an async method. In this article I'll discuss an issue we ran into when writing tests for the Ais.Net parser I recently blogged about.
We have numerous tests for each AIS message type our parser supports. We typically define a test for each element of the message that our parser can extract, and we normally supply multiple examples. For example, in a Position Report Class A (one of the most common formats in which vessels report their location, heading, and other information about their progress), one of the fields is the Repeat Indicator, a common element found at the start of many AIS messages. The test looks like this:
You can see it in context here.
We're using SpecFlow, which enables us to write tests in the Cucumber language. It's more commonly used for test specifications that express requirements in terms of an application's business domain, and it's slightly unusual to use it for a unit test. However, we found it made for highly readable tests for this library. In particular, Scenario Outlines were very well suited to testing Ais.Net—they enable us to write a single test definition and then to define multiple sets of inputs and the corresponding expected outputs. Here you can see we've got 4 tests with obviously faked up payloads (evident from the fact that they're mostly 0), one for each of the possible values for this field, and then 4 examples taken from real messages transmitted by actual vessels, also covering all 4 possible values. We find Cucumber to be a more convenient and readable way to express this sort of thing than the data-driven features of any of the popular .NET test frameworks. It's one of the reasons we like SpecFlow a lot at endjin.
But I digress. The important point here is that we repeatedly execute the same step, "When I parse '<payload>' with padding <padding> as a Position Report Class A", with various different inputs. In fact, if you go and look at the full feature file you'll see that all the tests use that same step, because all the tests entail parsing a message.
SpecFlow will execute all the steps in this Scenario Outline once for each row from the Examples: table. We want to pass the first two columns of each row to the constructor for the message parser, like this:
Normally, when writing the code for this sort of test step, you'd just store the result of this expression either in the SpecFlow ScenarioContext or in a field of the step bindings class. However, we can't do that here because NmeaAisPositionReportClassAParser is a ref struct.
So that's the challenge this whole post is about: how can we write tests for a ref struct in the way we want, given the restrictions these types impose?
As you may recall from the previous blog posts linked to above, ref struct types have some desirable performance characteristics. They are a key part of the features added to C# 7.2 that make it possible to write libraries such as Ais.Net that can perform high speed parsing with minimal copying of data and very low GC overhead. But you pay for this efficiency in constraints: in particular a ref struct type can only live on the stack. (And in case you're having a knee-jerk "No, C# structs don't always live on the stack" reaction, yes, I know that, but ref struct types are an exception: they really do absolutely have to live on the stack in the current .NET runtime implementations.) This means we can't store it in a context or step binding object, because those live on the heap.
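To make the restriction concrete, here is a minimal sketch. PayloadParser and StepBindings are hypothetical stand-ins invented for illustration, not the real Ais.Net or SpecFlow types. The commented-out field is the kind of declaration the compiler rejects, while the local variable is fine.

```csharp
using System;

// Hypothetical stand-in for a parser; NOT the real Ais.Net type.
public readonly ref struct PayloadParser
{
    private readonly ReadOnlySpan<byte> payload;

    public PayloadParser(ReadOnlySpan<byte> payload) => this.payload = payload;

    public int Length => this.payload.Length;
}

// Hypothetical step bindings class (an ordinary class, so it lives on the heap).
public class StepBindings
{
    // Does not compile if uncommented: a ref struct cannot be a field
    // of a class, because that would put it on the heap.
    // private PayloadParser parser;

    // Fine: the parser is a local variable, so it exists only on the
    // stack for the duration of this call.
    public static int ParseAndMeasure(byte[] data)
    {
        var parser = new PayloadParser(data);
        return parser.Length;
    }
}
```

The only way to hand the parser to other code is to pass it down the stack as an argument, which is what the rest of this post builds on.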
For the test to work, we're going to need to execute the test specified by the Then clause in such a way that the NmeaAisPositionReportClassAParser under test is above it on the stack. This is not totally straightforward because the way SpecFlow works is that each step in a test is implemented as a method that is executed completely before moving on to the next step. SpecFlow requires our When clause to complete before it will start the Then clause.
We therefore need to defer the work specified in the When clause until SpecFlow is ready to run the Then clause. The code implementing our Then steps typically looks something like this:
This passes a callback containing the test to a helper method that will construct the parser in the manner previously described by the When clause, and then pass that into the callback. And the code for those When clauses typically looks like this:
So rather than constructing the parser, we create a callback which, when invoked, will construct the parser using whatever arguments the test requires. This is what enables the deferred operation. The When helper used here just stores the callback in a field, which the Then helper then uses when it's time to run the test for real:
The ParserMaker type here is a delegate type defined by the test class:
We need to define one of these for each parser type. You might be wondering why we don't just use a generic delegate type here, e.g. Func<NmeaAisPositionReportClassAParser>. It's because you cannot use a ref struct type as a generic type argument. The reason is that there are all sorts of constraints on what you can do with ref struct types, but if you could just plug them into any old generic type, that might let you bypass these restrictions. For example, suppose some generic type declares a variable of type T in an async method. If the compiler let you use a ref struct as the argument for that type parameter T, that would provide a sneaky way to use a ref struct in an async method. Since the compiler blocks use of ref struct in these situations for good reasons, it would be bad to be able to bypass the restrictions.
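The async scenario described above can be sketched as follows. Holder<T> and DelayThenReturnAsync are hypothetical names made up for this illustration. The method is perfectly legal for ordinary type arguments, which is exactly why the compiler has to stop a ref struct being supplied for T.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical generic type, for illustration only.
public static class Holder<T>
{
    public static async Task<T> DelayThenReturnAsync(T value)
    {
        // 'local' is still needed after the await, so the compiler stores it
        // in the async state machine, which may end up on the heap.
        T local = value;
        await Task.Yield();
        return local;
    }
}

public static class GenericRestrictionDemo
{
    // Fine for an ordinary type argument.
    public static int Run() =>
        Holder<int>.DelayThenReturnAsync(42).GetAwaiter().GetResult();

    // Rejected under the rule described above if uncommented:
    // Span<byte> is a ref struct, so it cannot be the type argument.
    // public static void Broken() =>
    //     Holder<Span<byte>>.DelayThenReturnAsync(default);
}
```

Nothing inside Holder<T> mentions a ref struct, so the only place the compiler can enforce the restriction is at the point where the type argument is supplied.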
The obvious way to fix this would be for C# to introduce a new kind of generic constraint. You could imagine writing class <T> where T : ref struct. Any type or method declared with such a constraint would prevent you from using the type parameter anywhere that a ref struct is not allowed, and with that guarantee in place it would then become safe to supply a ref struct as a type argument. Unfortunately, no such generic constraint exists at the time of writing. (And even if it did, it wouldn't enable us to use Func<T> because that type wouldn't have this constraint anyway.)
So we have to define a dedicated non-generic delegate type, something that's very rarely necessary.
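Putting the pieces together, the deferred-construction pattern might look roughly like the sketch below. This is an illustration rather than the actual Ais.Net test code: PayloadParser stands in for a real parser such as NmeaAisPositionReportClassAParser, and ParserSteps, ParserTest, WhenIParse, and ThenLengthShouldBe are hypothetical names. SpecFlow binding attributes are omitted so that the sketch has no dependencies.

```csharp
using System;
using System.Text;

// Hypothetical stand-in for a real parser; NOT the Ais.Net type.
public readonly ref struct PayloadParser
{
    private readonly ReadOnlySpan<byte> payload;

    public PayloadParser(ReadOnlySpan<byte> payload, uint padding)
    {
        this.payload = payload;
        this.Padding = padding;
    }

    public uint Padding { get; }

    public int Length => this.payload.Length;
}

public class ParserSteps
{
    // Dedicated delegate types, used because a ref struct cannot be
    // supplied as the type argument to Func<T> or Action<T> under the
    // rule described above.
    public delegate PayloadParser ParserMaker();
    public delegate void ParserTest(PayloadParser parser);

    private ParserMaker makeParser;

    // "When" step: records HOW to build the parser, but does not build it yet.
    public void WhenIParse(string payload, uint padding)
    {
        this.When(() => new PayloadParser(Encoding.ASCII.GetBytes(payload), padding));
    }

    // "Then" step: supplies the assertion as a callback.
    public void ThenLengthShouldBe(int expected)
    {
        this.Then(parser =>
        {
            if (parser.Length != expected)
            {
                throw new InvalidOperationException(
                    $"Expected length {expected} but was {parser.Length}");
            }
        });
    }

    private void When(ParserMaker maker) => this.makeParser = maker;

    private void Then(ParserTest test)
    {
        if (this.makeParser == null)
        {
            throw new InvalidOperationException("No When step has run");
        }

        // The parser is constructed here, so it lives on this method's
        // stack frame while the test callback runs further up the stack.
        PayloadParser parser = this.makeParser();
        test(parser);
    }
}
```

The field holds only a delegate, which is an ordinary heap object. The ref struct itself exists only as a local inside the Then helper, which is what keeps it on the stack.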
With these elements in place, we can write tests in the obvious way, with separate steps for describing what we want to do and what the outcome should be, while fitting into the constraints imposed by a ref struct.
The effect of the test steps above is as though we'd written this code:
That seems a lot simpler, and you might be wondering why we didn't just write that in the first place. But it's not really quite that simple: where do payload and padding come from here? In SpecFlow tests, we separate out the setup and the expected outcome: When steps define the setup, and Then steps define the particular thing we want our test to verify. This means that the inputs to the test normally aren't directly available to the step that performs the assertion. We could of course modify our feature file so that we pass everything into the Then step, but that would mean making the test specifications look weird just to work around some technical constraints. I prefer to keep test specification files as readable as possible, so that's not a good option. (This separation of concerns is, after all, part of the point of writing tests this way.) Or we could make the When step store those inputs in fields, so that the Then step has access to them, and can construct the NmeaAisPositionReportClassAParser itself. But I don't really like that either: while it leaves the feature file looking clean, it would make the associated step bindings harder to follow because we would have moved the setup out of the step that's supposed to be defining the setup.
So the advantage of this technique is that it enables feature files to read naturally, and for setup and test code to go where you'd expect it to, while fitting around the constraints imposed by ref struct types.