When C# 9.0 patterns go wrong: mechanism over intent
.NET 5.0 became available in November, which means that C# 9 is also now ready. This latest edition of C# continues to extend the pattern matching capabilities. Unfortunately this seems to have encouraged a wave of pattern abuse. In this post I talk about my love for C# patterns, and, more importantly, where I think they might be the wrong thing to use.
Patterns: a brief recap
The last few versions of C# have gradually been adding pattern matching constructs: expressions that perform a runtime test against a value, and may go on to extract certain information, e.g.:
if (o is string s)
{
Console.WriteLine($"A piece of string is {s.Length} characters long.");
}
That tests whether the variable o
refers to a string
, and if so, makes that available through an appropriately-typed variable s
.
C# 8 added a few more, including property patterns. These introduced support for nesting—you can put patterns within patterns, e.g.:
if (someShape is { Position: { X: 0, Y: double y } })
{
Console.WriteLine($"Shape is left-aligned, and {y} high");
}
This example uses a property pattern testing its input's Position
property, and that test is another property pattern that inspects the Position
property value's X
and Y
properties. Each of these defines a further pattern—a simple constant value pattern for X
(0
), while Y
is tested with a type pattern that makes the tested value available in a variable called y
. So as this shows, property patterns provide two distinct services: testing, and extracting values.
The first example has been around since C# 7, and property patterns arrived in C# 8. C# 9 has added some new ways of combining and modifying patterns. For example, we now have the not
keyword, which inverts a pattern. We can combine this with the existing null
pattern to form a new pattern, not null
. We also get and
and or
, enabling a new pattern to be formed by combining two other patterns. In addition to these new ways of combining patterns, we get some new relational patterns. I'm using one in this variation on my previous example (It's the >
in X: > 0
):
if (someShape is { Position: { X: > 0, Y: double y } })
{
Console.WriteLine($"Shape is to the right of the vertical axis, and {y} high");
}
(There's a C# 9 feature only tangentially related to patterns, by way of discards, but I have to mention it because it's one of my favourite small new features. Last year in a blog on Discards & Underscores I complained that the new _
discard syntax was unevenly supported. C# 9 improves this: the compiler now recognises discard lambda arguments, meaning we no longer have to write, say, (_,__) => true
, and can instead write (_,_) => true
.)
When patterns are used wisely, they can be an excellent way to make your code express its intent clearly. With simple type patterns, the code often says pretty directly what it means ("if o is a string..."). C# 9's new not
combinator arguably provides a more direct way to express a test that something is not null (if (x is not null)
) than the pitfall-laden traditional approach (if (x != null)
, with its operator overloading traps for the unwary). It is certainly less oblique than using a type test to achieve the same thing—if your intention is to test whether a variable is null
, I don't think that writing if (x is string)
makes it very clear that this was your intent. At any rate, nothing like as clear as if (x is not null)
.
I find I use property patterns far less often than the others. So far, the usage I've found most effective is when they can be used to express some sort of exemplar. For example, this seems reasonable to me:
bool isAtOrigin = shape is { Position: { X: 0, Y: 0 } };
Here, I've effectively used a property pattern to draw a picture of what objects of interest look like.
However, property patterns can also be used in ways that do not seems wise to me.
A disturbing new pattern
As C# 9 approached I started noticing a resurgence in interest in pattern matching on twitter. In particular, there was a new-found love for C# 8's property patterns. Recently, someone posted a link to this "Modern code review" pull request, and drew attention to changes from this sort of thing:
if (args.Length > 0)
to this sort of thing:
if (args is { Length: > 0 })
My first thought on reading this is: I wonder if the author deliberately changed the meaning here, or if they just like property patterns. These examples don't necessarily mean the same thing, although in some scenarios they do, which is exactly what I dislike about this style: I now have to solve a puzzle to work out a) what the code does, b) what the developer meant, and c) whether a) and b) are actually the same thing.
If you're wondering what's different here, it concerns nullability. In the first example, the developer is confident that args
is not null: args.Length > 0
would throw a NullReferenceException
if it were. (And if you've enabled nullable reference types, the compiler will warn you if it thinks your confidence is misplaced.) The second will simply not enter the body of the if
if args
is null, because args is { Length: > 0 }
will just evaluate to false.
That's a pretty dramatic change in runtime behaviour when handling null
: one throws an exception, while the other silently skips a section of code.
This particular example comes from a change to a sample in the Azure IoT repo, and from context, we can see that args
is a variable of type string[]
. (Or rather, we can't see that, because this example is using the new C# 9 support for top-level functions. But if you happen to have committed to memory how that works, you will know that args
is a string[]
made magically available despite not having been declared.) And as it happens, the argument passed into Main
(whether declared explicitly, or used implicitly as in this case) is never null
. So in this case (assuming you know enough about how C# top-level functions work) you can deduce that the fact that the new pattern-based version would handle null
differently is of no consequence, because it will never see a null
.
So from this analysis we can conclude that the developer probably didn't make this change in order to change null
handling. (Although we can't actually rule that out—perhaps they've misunderstood something. Perhaps I've misunderstood something.) But that was a lot of mental effort to expend in order to come to the conclusion that this change is a matter of style, not substance.
Here are some problems I see with the new version in general use:
- the syntax is more verbose
- it requires more context to understand (nullability issue described above)
- it is not always evident whether the increased permissiveness around null is a deliberate choice or an accidental side effect of the developer's style preferences
- I find the
: >
a distracting piece of syntax—the colon gets in the way and makes it look less like a comparison and more like an F# upcast operator
There doesn't seem to be a whole lot of upside that I can see. There's the frissant of getting to use a new language features, I suppose, but that's a pretty bad reason to do something in production code. From Twitter conversations, I gather that some people have some sort of mental model in which they want to state that a variable conforms to some particular specification, and they feel that patterns are a better match for that than the simple first example. I guess that may explain it, but I think it misses something important.
Express intent, not mechanism
In fact, neither the "before" nor the "after" above is ideal, because neither really states the point of the exercise. What would be better? Here's something from one of the earlier tweets that I noticed in this trend:
Old Hotness: !string.IsNullOrEmpty(x)
New Hotness: x is { Length: > 0}
Oh the humanity!
This is, in my view, entirely backwards. The item described as "Old Hotness" here is better than either of the alternatives shown above. It says what we're actually trying to achieve. We want to know whether we've got a non-empty string. That's why we're doing the test, so to have that stated in words is ideal. We know what the intent is here because the name of the method describes it. (Taking a broader view, this might not be going far enough—at some level I might want to express why I care whether there's a non-empty string here, so I might want something more along the lines of inputData.Validate()
, but within the implementation of validity checks, "is this null or empty?" is likely to be a reasonable thing to want to ask.)
args.Length > 0
is less good than !string.IsNullOrEmpty(arg)
because it expresses mechanism, not intent. And arg is { Length: > 0}
is worse still because not only does it say nothing about intent, it's not even the clearest way to express the mechanism. I have to decode that to the point where I can understand "Oh, it just wants to see if we've got a non-empty string". Why make people reading your code do all that work when your code could just have said that in the first place?
Did you notice that the data type in this example is not the same as in the earlier examples by the way? The earlier args is { Length: > 0 }
was actually testing to see if we've got a non-empty array, whereas x is { Length: > 0 }
is looking for a non-empty string. If you didn't, then I offer this as evidence in support of my view that this particular style of pattern matching often requires more context to understand than the alternatives. When you see !string.IsNullOrEmpty(x)
you don't need to think too hard to work out what type x
might be here.
Be helpful to your readers
You might be inclined to accuse me of bike shedding—of wasting energy on mere superficial style. I would disagree. Clarity of communication to human readers is of the highest importance in code, second only to providing correct instructions for the computer to follow. (You could even argue communication to those reading the code is the more important of the two: code that happens to do the right thing to do when run but which conveys the wrong thing to anyone reading it is highly likely to be broken when someone next changes it; code that does the reverse has a chance of being fixed when someone looks at it and spots the discrepancy. Faulty expression of intent will tend to increase buginess over time; clear expression of intent will tend to reduce it.)
For me, style isn't the point here. This is about clarity of communication. Of course, coding style has some relevance, just as prose style affects clarity in natural language. But the point is clarity; style should be mostly in support of and always in second place to clarity. The fact that I find x is { Length: > 0 }
weird-looking matters to me at an aesthetic level, but if it improved clarity I'd be all for it, and if it were merely neutral on clarity, I wouldn't be writing this post.
The reason it's a problem is because it is unnecessarily obscure. And that's because it pushes the focus towards the way the code works, and away from what the code is trying to achieve.