Lazy and once-only C# async initialization

By Ian Griffiths, Technical Fellow I, 10th January 2023

There are a couple of closely related optimizations that are often useful: lazy initialization, and "just once" initialization. These are conceptually pretty straightforward, although the details can get messy once multiple threads are involved. But what if the initialization work involves asynchronous operations?

Lazy initialization

Lazy initialization is an important performance technique. Putting off work until you know you need the results enables you to get on with more useful things in the meantime. And, once you've done the work, hanging onto the results may save you from repeating that work later. It's a pretty simple concept, but in multi-threaded environments it's surprisingly easy to get it wrong, and so .NET provides some helper types that handle the subtle race conditions for you: Lazy<T> and LazyInitializer. (The latter is more lightweight, but uses an optimistic concurrency policy that means it will occasionally execute your initialization code twice, and then discard one of the results. If you can't tolerate this, use Lazy<T>.)
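To make the difference between the two helpers concrete, here is a minimal sketch. The LazyHelpersDemo class, its members, and the "hello" value are hypothetical names invented for illustration; only Lazy<T> and LazyInitializer.EnsureInitialized come from .NET.

```csharp
using System;
using System.Threading;

public static class LazyHelpersDemo
{
    private static int lazyFactoryCalls;

    // Lazy<T> defaults to LazyThreadSafetyMode.ExecutionAndPublication, so the
    // factory runs at most once even if several threads race to read Value.
    private static readonly Lazy<string> LazyGreeting = new(() =>
    {
        Interlocked.Increment(ref lazyFactoryCalls);
        return "hello";
    });

    // LazyInitializer works directly against an ordinary field.
    private static string? initializerGreeting;

    public static int LazyFactoryCalls => Volatile.Read(ref lazyFactoryCalls);

    public static string GetWithLazy() => LazyGreeting.Value;

    // With this overload, racing threads may each run the factory, but only one
    // result is stored in the field, and every caller gets that stored result.
    public static string GetWithLazyInitializer() =>
        LazyInitializer.EnsureInitialized(ref initializerGreeting, () => "hello");
}
```

Calling GetWithLazy any number of times leaves LazyFactoryCalls at 1. The LazyInitializer version avoids allocating a wrapper object, which is the sense in which it is more lightweight.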

Just-once initialization

Lazy initialization incorporates a simpler idea that we could use in isolation: if you're going to do some expensive work, better to do it just once, instead of repeating it every time you need the results of that work. This idea is so simple that it might seem like it barely needs stating, but once you add in concurrency, it is the reason lazy initialization is surprisingly easy to get wrong.

Lazy<T> gives us just-once initialization, but what if we only want the just-once behaviour, and don't require laziness? Perhaps there's some work that our program is inevitably going to need to perform, so there's no particular benefit in deferring it, but we still want the just-once behaviour.

That shouldn't be so hard, right? We could do that in the constructor:

public class TypeWithExpensiveOneTimeWork
{
    private readonly ExpensivelyCalculatedResults data;

    public TypeWithExpensiveOneTimeWork()
    {
        this.data = ExpensiveWorker.PerformSlowWork();
    }

    public string DoSomething(int input) => data.GetResult(input);
}

There is one objection to this: some take the view that constructors shouldn't do non-trivial work. However, I'm not going to be dogmatic about that. Instead, I want to look at a variation on this: what if the slow work we need to perform just once involves asynchronous operations?

Async just-once eager initialization

If PerformSlowWork in the preceding example returned Task<T>, the simple approach I just showed wouldn't work. Constructors cannot be declared as async (they can't return a Task, because by definition they return an instance of the type they construct). We might be tempted to do this:

public class TerribleIdeaNeverEverDoThis
{
    private readonly ExpensivelyCalculatedResults data;

    public TerribleIdeaNeverEverDoThis()
    {
        this.data = ExpensiveWorker.PerformSlowWorkAsync().Result; // NOOOOOOOOOOOOOOOOO!
    }

    public string DoSomething(int input) => data.GetResult(input);
}

Don't do that.

In general, it's a really bad idea to retrieve the Result of a Task<T> unless you can be certain that the task has already completed (which it almost certainly won't have done in this example). There are a handful of exceptions to that rule, but they are specialized and tricky to get right. Unless you are in complete control of the context in which it runs, using Result in the way shown above risks causing a deadlock. (Reading Result on an unfinished task blocks the calling thread, and if the thread has ownership of anything required to complete the asynchronous work, that will prevent completion.)

However, there is a pretty simple way to get the same effect safely:

public class TypeWithExpensiveOneTimeAsyncWork
{
    private readonly Task<ExpensivelyCalculatedResults> dataTask;

    public TypeWithExpensiveOneTimeAsyncWork()
    {
        this.dataTask = ExpensiveWorker.PerformSlowWorkAsync(); // Note: no await
    }

    public async ValueTask<string> DoSomethingAsync(int input)
    {
        ExpensivelyCalculatedResults data = await dataTask.ConfigureAwait(false);
        return data.GetResult(input);
    }
}

Since in this scenario we know we will definitely need to perform the slow work, we don't need lazy behaviour, so we kick the work off immediately in the constructor. But we don't wait for it to finish there—we just store the resulting task. And then, any method that needs access to that expensive-to-obtain information can just await that task.

This works because you are allowed to await the same Task<T> any number of times. (Note that the field has to be a Task<T>, not a ValueTask<T>. You're only allowed to await a ValueTask<T> once.) Calls to DoSomethingAsync that occur before the expensive work is complete will wait at the await. (The method is suspended at that point; no thread is blocked.) If there are multiple concurrent calls to the method while we're in that state, that's fine: they'll all just wait, and when the expensive initialization completes, they will all become runnable simultaneously. (Whether they actually run concurrently at that point will typically be down to the task scheduler, which by default will defer to the thread pool.)
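As a rough illustration of several callers sharing one task, consider the following sketch. SharedTaskDemo, its PerformSlowWorkAsync stand-in, the 100ms delay and the value 42 are all assumptions made for the example, not part of the types shown earlier.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SharedTaskDemo
{
    private static int workCount;

    // Hypothetical stand-in for ExpensiveWorker.PerformSlowWorkAsync.
    private static async Task<int> PerformSlowWorkAsync()
    {
        Interlocked.Increment(ref workCount);
        await Task.Delay(100).ConfigureAwait(false);
        return 42;
    }

    public static async Task<(int[] Results, int WorkCount)> RunAsync()
    {
        // Start the work once and keep the task, as the constructor does.
        Task<int> dataTask = PerformSlowWorkAsync();

        // Five callers await the same task before it has completed.
        Task<int>[] callers = Enumerable.Range(0, 5)
            .Select(async i => await dataTask.ConfigureAwait(false) + i)
            .ToArray();

        int[] results = await Task.WhenAll(callers).ConfigureAwait(false);
        return (results, Volatile.Read(ref workCount));
    }
}
```

On a first call to RunAsync, all five callers receive a result derived from the same 42, and the work counter reports that the slow operation ran only once.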

In a program that expects to call that DoSomethingAsync method many times in succession, we would expect the first call to be slow (because it will have to wait for the expensive asynchronous initialization to complete) but subsequent calls will not have to wait because that data task will have completed, so the await won't need to wait. This is why I've made DoSomethingAsync return a ValueTask<string>. Asynchronous methods that you expect to complete without waiting most of the time are more memory-efficient if they return a ValueTask<T>, because a call that completes synchronously doesn't need to allocate a Task<T> to hold its result.
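The difference between the first call and later calls can be observed directly. This sketch uses a hypothetical FastPathDemo class in which a TaskCompletionSource<string> stands in for the expensive initialization, so the example can control exactly when it finishes.

```csharp
using System.Threading.Tasks;

public sealed class FastPathDemo
{
    private readonly TaskCompletionSource<string> initialization = new();

    // Hypothetical hook that lets the caller decide when initialization finishes.
    public void CompleteInitialization(string value) => initialization.SetResult(value);

    public async ValueTask<int> GetLengthAsync()
    {
        string data = await initialization.Task.ConfigureAwait(false);
        return data.Length;
    }

    public static (bool EarlyCompleted, bool LateCompleted) Demonstrate()
    {
        var demo = new FastPathDemo();

        // Called before initialization has finished: the method suspends at the await.
        ValueTask<int> early = demo.GetLengthAsync();
        bool earlyCompleted = early.IsCompleted;

        demo.CompleteInitialization("ready");

        // Called after initialization: the await finds a completed task, so the
        // method runs to the end synchronously and returns a finished ValueTask.
        ValueTask<int> late = demo.GetLengthAsync();
        bool lateCompleted = late.IsCompleted;

        return (earlyCompleted, lateCompleted);
    }
}
```

Demonstrate returns (false, true): only the call made before initialization finished had to suspend.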

Async lazy initialization

The eager asynchronous initialization just shown is good if you know you're definitely going to need the results of the work, and are likely to need it as early as possible. But in cases where your code might not need the results at all (e.g., you're writing a command line tool, and only certain command line flags will trigger the behaviour that needs this particular data), then lazy initialization is a better bet.

The Lazy<T> and LazyInitializer types mentioned earlier do not offer any direct support for asynchronous code. That's essentially because they don't need to. You can just use Lazy<Task<T>>, e.g.:

public class TypeWithExpensiveLazyAsyncWork
{
    private readonly Lazy<Task<ExpensivelyCalculatedResults>> dataTaskSource;

    public TypeWithExpensiveLazyAsyncWork()
    {
        this.dataTaskSource = new(() => ExpensiveWorker.PerformSlowWorkAsync());
    }

    public async ValueTask<string> DoSomethingAsync(int input)
    {
        ExpensivelyCalculatedResults data = await dataTaskSource.Value.ConfigureAwait(false);
        return data.GetResult(input);
    }
}

This avoids starting the expensive work until something asks for it. So this will give you "at most once" initialization—if the program never hits the code path that asks for the results of this expensive work, it will never be performed. But once something does ask, Lazy<T> will ensure that it only runs the code that builds the Task<T> once.
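A small sketch can show both halves of that behaviour: nothing starts until the first request, and concurrent requests share a single run. LazyAsyncDemo, LoadAsync, the 50ms delay and the "loaded" value are hypothetical names and values chosen for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class LazyAsyncDemo
{
    private int startCount;
    private readonly Lazy<Task<string>> dataTaskSource;

    public LazyAsyncDemo()
    {
        // Nothing runs here; the factory is only stored for later.
        dataTaskSource = new Lazy<Task<string>>(LoadAsync);
    }

    // Number of times the expensive work has been started.
    public int StartCount => Volatile.Read(ref startCount);

    // Hypothetical stand-in for the expensive asynchronous work.
    private async Task<string> LoadAsync()
    {
        Interlocked.Increment(ref startCount);
        await Task.Delay(50).ConfigureAwait(false);
        return "loaded";
    }

    public async ValueTask<string> GetAsync() =>
        await dataTaskSource.Value.ConfigureAwait(false);
}
```

StartCount stays at 0 until something calls GetAsync, and remains at 1 however many callers then ask for the data. One detail to be aware of: the factory runs synchronously on the first caller's thread up to its first incomplete await, so any slow synchronous preamble in the initialization code is paid for by that caller.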

What if your organization tolerates failure?

Although Blofeld might not approve, sometimes it is necessary to tolerate failure. The problem with the async techniques just shown is that if the initialization fails, you are stuck with a Task<T> that is in a faulted state, so every attempt to await it will throw an exception. To be able to recover from this, you would need to be able to reset the field.

Here's one way you could do that:

public class TypeWithRetriableExpensiveLazyAsyncWork
{
    private Lazy<Task<ExpensivelyCalculatedResults>> dataTaskSource;

    public TypeWithRetriableExpensiveLazyAsyncWork()
    {
        this.dataTaskSource = InitializeDataTaskSource();
    }

    private Task<ExpensivelyCalculatedResults> Data
    {
        get
        {
            Task<ExpensivelyCalculatedResults> result = this.dataTaskSource.Value;
            if (result.IsFaulted)
            {
                // Try one more time. If the underlying cause of the problem remains,
                // this will also fail, but each subsequent attempt to get the data
                // will kick off a new try.
                result = this.InitializeDataTaskSource().Value;
            }

            return result;
        }
    }

    private Lazy<Task<ExpensivelyCalculatedResults>> InitializeDataTaskSource()
    {
        return this.dataTaskSource = new(() => ExpensiveWorker.PerformSlowWorkAsync());
    }

    public async ValueTask<string> DoSomethingAsync(int input)
    {
        ExpensivelyCalculatedResults data = await this.Data.ConfigureAwait(false);
        return data.GetResult(input);
    }
}

In practice, error recovery behaviour is often application-specific, so you might need something more complex. But the basic idea of creating a new Lazy<Task<T>> (or new Task<T> if you need just-once but don't care about laziness) will remain.
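As one example of "something more complex", here is a sketch of a variation that swaps the field atomically, so that concurrent callers that all see the same failed task share a single retry instead of each starting their own. RetriableLazyAsync<T> is a hypothetical helper, not a .NET type. It assumes the factory reports failure through the returned task (as an async method does) rather than by throwing synchronously, and it treats cancellation as a failure too.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class RetriableLazyAsync<T>
{
    private readonly Func<Task<T>> factory;
    private Lazy<Task<T>> source;

    public RetriableLazyAsync(Func<Task<T>> factory)
    {
        this.factory = factory;
        this.source = new Lazy<Task<T>>(factory);
    }

    public Task<T> GetAsync()
    {
        Lazy<Task<T>> current = Volatile.Read(ref this.source);
        Task<T> task = current.Value;
        if (!task.IsFaulted && !task.IsCanceled)
        {
            // Either still running or completed successfully: share it.
            return task;
        }

        // Replace the failed source only if no other caller has done so already.
        var replacement = new Lazy<Task<T>>(this.factory);
        Lazy<Task<T>> witnessed =
            Interlocked.CompareExchange(ref this.source, replacement, current);

        // If the swap succeeded, use our replacement; otherwise use the one
        // another caller installed, so that only one new attempt is started.
        Lazy<Task<T>> winner = ReferenceEquals(witnessed, current) ? replacement : witnessed;
        return winner.Value;
    }
}
```

A caller that awaits GetAsync while an attempt is failing still sees that attempt's exception; it is the next call after the failure that kicks off a fresh attempt. Whether that is the right policy (as opposed to, say, limiting retries or adding a delay between them) is the application-specific part.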

Summary

There's no specific support in .NET for asynchronous lazy or once-only initialization, but you don't need it. A field of type Lazy<Task<T>> will do the job. And if you don't need the lazy part, you can get once-only async initialization by storing just a Task<T> in a field.
