C# Design Patterns - Iterator - Language Features | endjin

By Liam Mooney, Software Engineer I, 11th July 2024

The previous design patterns post covered the basics of the iterator pattern, showing how to create an iterator object from scratch and how to use it to iterate over the items in a collection.

This post will show you how the iterator pattern is built into C# through various types, which we'll use to refactor the example code presented in the previous post. Afterwards it covers some of the more advanced iteration-related features in C#, like the yield statement and the IAsyncEnumerable<T> interface.

IEnumerator<T>

Instead of writing our own iterator interface (IIterator<T>) as we did in the previous post, we could have used .NET's built-in one - IEnumerator<T>.

IEnumerator<T> is a little different to the interface we defined. First, it doesn't contain the word "iterator", and second, the methods are a little different. Here's the definition:

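// Simplified, flattened view. In .NET, MoveNext() and Reset() come from the
// non-generic IEnumerator, and Dispose() comes from IDisposable.
public interface IEnumerator<T>
{
    T Current { get; }

    bool MoveNext();

    void Reset();

    void Dispose();
}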

The members, Current and MoveNext(), make up the "core" of the interface; they capture the same functionality as the HasNext() and Next() methods from our custom IIterator interface from the first post.

Current is a property that acts as a getter for the current item in the collection. MoveNext() checks whether all items in the collection have already been seen. If so, it returns false; if not, it sets Current to the next item in the collection and returns true.

Comparing these with HasNext() and Next() from our custom IIterator, we see that the behaviour of HasNext() is captured by MoveNext(), and the behaviour of Next() is captured by MoveNext() & Current. MoveNext and Current provide better separation of concerns compared to HasNext and Next. Progressing through the collection is covered by MoveNext, and retrieving a value from the collection is covered by Current, whereas, with IIterator, both of these concerns are covered by Next. Anyway - the logic is the same, it's just arranged differently.
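To make that mapping concrete, here's a minimal sketch of an adapter that exposes an IEnumerator<T> through HasNext() and Next(). The IIterator<T> shape shown here is an assumption based on how the previous post is described above, and EnumeratorAdapter<T> is a hypothetical name used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Walk a list through the HasNext()/Next() style API.
var iterator = new EnumeratorAdapter<string>(new List<string> { "a", "b", "c" }.GetEnumerator());
var seen = new List<string>();
while (iterator.HasNext())
{
    seen.Add(iterator.Next());
}
Console.WriteLine(string.Join(", ", seen)); // a, b, c

// Assumed shape of the custom interface from the previous post.
public interface IIterator<T>
{
    bool HasNext();
    T Next();
}

// Hypothetical adapter: wraps an IEnumerator<T> behind HasNext()/Next().
public sealed class EnumeratorAdapter<T> : IIterator<T>
{
    private readonly IEnumerator<T> inner;
    private bool hasBuffered; // true when inner.Current holds an item not yet handed out

    public EnumeratorAdapter(IEnumerator<T> inner) => this.inner = inner;

    // HasNext() has to advance the enumerator to find out whether an item exists.
    public bool HasNext()
    {
        if (!hasBuffered)
        {
            hasBuffered = inner.MoveNext();
        }
        return hasBuffered;
    }

    // Next() is "advance if needed" followed by "read Current".
    public T Next()
    {
        if (!HasNext())
        {
            throw new InvalidOperationException("No more items.");
        }
        hasBuffered = false;
        return inner.Current;
    }
}
```

Notice that the adapter needs an extra flag to remember whether it has already advanced. That bookkeeping is the price of squeezing "advance" and "read" into a single Next() call.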

Now for the other methods in IEnumerator<T>. The Reset() method is intended to set the iterator to its initial position. However, this method is a hold-over from old versions of .NET, and Microsoft recommend throwing a NotSupportedException when implementing it. The Dispose() method enables cleaning up of resources, like closing database connections and disposing of file handles.
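As an illustration of where Dispose() earns its keep, here's a rough sketch of an enumerator that owns a resource. LineEnumerator is a hypothetical type invented for this example; it reads lines from a TextReader and closes the reader when it is disposed.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;

var reader = new StringReader("first\nsecond");
var lines = new List<string>();
using (var enumerator = new LineEnumerator(reader))
{
    while (enumerator.MoveNext())
    {
        lines.Add(enumerator.Current);
    }
} // Dispose() runs here and closes the reader.

Console.WriteLine(string.Join(", ", lines)); // first, second

// Hypothetical enumerator that owns a TextReader and releases it in Dispose().
public sealed class LineEnumerator : IEnumerator<string>
{
    private readonly TextReader reader;
    private string? current;

    public LineEnumerator(TextReader reader) => this.reader = reader;

    public string Current => current ?? throw new InvalidOperationException("Call MoveNext() first.");

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        current = reader.ReadLine(); // null once the input is exhausted
        return current is not null;
    }

    // Following the guidance above, Reset() is not supported.
    public void Reset() => throw new NotSupportedException();

    public void Dispose() => reader.Dispose();
}
```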

With IEnumerator<T> we can write code like this to iterate through the items in a collection:

using IEnumerator<T> enumerator = collection.GetEnumerator();
while (enumerator.MoveNext())
{
    T item = enumerator.Current;
    // Do something with item here.
}

Let's refactor our accountancy code sample to use IEnumerator<T> instead of IIterator<T>.

Refactoring to use IEnumerator<T>

`ExpenseAccountIterator` and `SalesAccountIterator`

ExpenseAccountIterator and SalesAccountIterator are concrete iterators that know how to iterate over ExpenseAccount and SalesAccount types, respectively. Let's refactor these to implement IEnumerator<T> instead of IIterator<T>.

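using System;
using System.Collections; // for the non-generic IEnumerator used by IEnumerator.Current
using System.Collections.Generic;

public class ExpenseAccountIterator: IEnumerator<Transaction>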
{
    private Transaction[] transactions;
    private int position = -1;
    private Transaction currentItem = default(Transaction);

    public ExpenseAccountIterator(Transaction[] transactions) => this.transactions = transactions;

    public bool MoveNext()
    {
        if(++position >= transactions.Length || transactions[position] == null)
        {
            return false;
        }
        else
        {
            currentItem = transactions[position];
        }
        return true;
    }

    public void Reset() { throw new NotSupportedException(); }

    void IDisposable.Dispose() {}

    public Transaction Current
    {
        get { return currentItem; }
    }

    Object IEnumerator.Current
    {
        get { return Current; }
    }
}

public class SalesAccountIterator: IEnumerator<Transaction>
{
    private List<Transaction> transactions;
    private int position = -1;
    private Transaction currentItem = default(Transaction);

    public SalesAccountIterator(List<Transaction> transactions) => this.transactions = transactions;

    public bool MoveNext()
    {
        if(++position >= transactions.Count)
        {
            return false;
        }
        else
        {
            currentItem = transactions[position];
        }

        return true;
    }

    public void Reset() { throw new NotSupportedException(); }

    void IDisposable.Dispose() {}

    public Transaction Current
    {
        get { return currentItem; }
    }

    Object IEnumerator.Current
    {
        get { return Current; }
    }
}

Apart from the additional methods, the code in these class definitions hasn't changed a great deal. Something worth noting is that the position field (which tracks the index of the current item in the iterator) is initialised to -1. This is necessary with IEnumerator<T> because MoveNext() increments position before assigning Current a value from the collection, so the first call needs to land on index 0.

I have followed the recommendation and implemented Reset() to throw a NotSupportedException. I haven't bothered implementing the Dispose() method as it's outside the scope of this post, plus we're not dealing with system resources in this example.

Next, let's update IAccount, the concrete account types, and Accountant to use the new iterators.

IAccount

The CreateIterator method on IAccount now needs to return an IEnumerator<Transaction> instead of an IIterator<Transaction>.

public interface IAccount
{
    void AddTransaction(string name, float amount, float taxRate, bool isReconciled);
    IEnumerator<Transaction> CreateIterator();
}

SalesAccount and ExpenseAccount

SalesAccount and ExpenseAccount need to be modified to properly implement the new version of IAccount.

public class SalesAccount: IAccount
{
    private List<Transaction> transactions;

    public SalesAccount()
    {
        transactions = new List<Transaction>();

        // Add Transaction items to transactions
    }

    public void AddTransaction(string name, float amount, float taxRate, bool isReconciled)
    {
        Transaction transaction = new Transaction(name, amount, taxRate, isReconciled);
        this.transactions.Add(transaction);
    }

    public IEnumerator<Transaction> CreateIterator()
    {
        return new SalesAccountIterator(transactions);
    }
}

public class ExpenseAccount: IAccount
{
    private static int maxItems = 3;
    private int numberOfItems = 0;

    private Transaction[] transactions;

    public ExpenseAccount()
    {
        transactions = new Transaction[maxItems];

        // Add Transaction items to transactions
    }

    public void AddTransaction(string name, float amount, float taxRate, bool isReconciled)
    {
        Transaction transaction = new (name, amount, taxRate, isReconciled);
        if (numberOfItems >= maxItems)
        {
            Console.WriteLine("Sorry, account is full! Can't add transaction to account");
        }
        else
        {
            transactions[numberOfItems] = transaction;
            numberOfItems += 1;
        }
    }

    public IEnumerator<Transaction> CreateIterator()
    {
        return new ExpenseAccountIterator(transactions);
    }
}

Accountant

Finally, Accountant now needs to work with IEnumerator<Transaction> to iterate over the Transaction items in an account and print the details of each. So, we need to update the PrintTransactions overload that takes an iterator to use Current and MoveNext() from IEnumerator<T> to perform iteration.

public class Accountant
{
    private IAccount expenseAccount;
    private IAccount salesAccount;

    public Accountant (IAccount expenseAccount, IAccount salesAccount)
    {
        this.expenseAccount = expenseAccount;
        this.salesAccount = salesAccount; 
    }

    public void PrintTransactions()
    {
        IEnumerator<Transaction> salesIterator = salesAccount.CreateIterator();
        IEnumerator<Transaction> expensesIterator = expenseAccount.CreateIterator();

        Console.WriteLine("ACCOUNT\n----\nSALES");
        PrintTransactions(salesIterator);
        Console.WriteLine("\nEXPENSES");
        PrintTransactions(expensesIterator);
    }

    public void PrintTransactions(IEnumerator<Transaction> iterator)
    {
        while(iterator.MoveNext())
        {
            Transaction transaction = iterator.Current;
            Console.WriteLine($"{transaction.Name}\n{transaction.Amount}\n{transaction.TaxRate}\n{transaction.IsReconciled}");
            Console.WriteLine();
        }
    }
}

IEnumerable<T>

Going a step further, we don't even need to use IEnumerator<T> ourselves since the collection types we're using to store items internally - List<Transaction> & Transaction[] - implement the IEnumerable<T> interface.

IEnumerable<T> has a single member - GetEnumerator(), with a return type of IEnumerator<T>. This interface essentially says: "I represent a collection of things that can be iterated over".

All collection types in the .NET base class library implement IEnumerable (the generic ones via IEnumerable<T>), therefore we can just ask them to provide an enumerator, and then use that to iterate over the items stored in them.
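To show how little the interface asks for, here's a small sketch of a custom type that implements IEnumerable<T>. Countdown is a hypothetical class made up for this example; it stores its values in an array and simply hands out the array's own enumerator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

var collected = new List<int>();
foreach (int n in new Countdown(3))
{
    collected.Add(n);
}
Console.WriteLine(string.Join(" ", collected)); // 3 2 1

// Hypothetical collection-like type: counts down from a starting value to 1.
public sealed class Countdown : IEnumerable<int>
{
    private readonly int[] values;

    public Countdown(int from)
    {
        values = new int[Math.Max(from, 0)];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = from - i;
        }
    }

    // The single member of IEnumerable<T>: hand back an enumerator.
    // Here we borrow the one that the array already provides.
    public IEnumerator<int> GetEnumerator() => ((IEnumerable<int>)values).GetEnumerator();

    // Required by the non-generic IEnumerable that IEnumerable<T> extends.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```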

To implement that in our accounting example we need to modify IAccount to expose an IEnumerable<Transaction> as a public property for consumers to use for iteration. This replaces the CreateIterator() method.

public interface IAccount
{
    void AddTransaction(string name, float amount, float taxRate, bool isReconciled);
    IEnumerable<Transaction> Transactions { get; }
}

Now, to have our account types implement the new version of IAccount, in each case we need to expose the internal collection of transactions as a property with type IEnumerable<T>. To do that we define a private field with the concrete collection type and a public property, Transactions, of type IEnumerable<T> with a getter that returns a reference to the collection of transactions.

This way, consumers only see the collection as IEnumerable<T>, meaning they can only iterate over the collection (which is exactly what we want), whilst internal code gets to see the concrete collection, meaning it can add items and so on, which is what we need.

public class SalesAccount: IAccount
{
    private List<Transaction> transactions;

    public IEnumerable<Transaction> Transactions => transactions;

    public SalesAccount()
    {
        this.transactions = new List<Transaction>();
        
        // Add Transaction items to transactions
    }

    public void AddTransaction(string name, float amount, float taxRate, bool isReconciled)
    {
        Transaction transaction = new Transaction(name, amount, taxRate, isReconciled);
        this.transactions.Add(transaction);
    }
}

public class ExpenseAccount: IAccount
{
    private static int maxItems = 3;
    private int numberOfItems = 0;

    private Transaction[] transactions;

    public IEnumerable<Transaction> Transactions => transactions;

    public ExpenseAccount()
    {
        this.transactions = new Transaction[maxItems];

        // Add Transaction items to transactions
    }

    public void AddTransaction(string name, float amount, float taxRate, bool isReconciled)
    {
        Transaction transaction = new (name, amount, taxRate, isReconciled);
        if (numberOfItems >= maxItems)
        {
            Console.WriteLine("Sorry, account is full! Can't add transaction to account");
        }

        else
        {
            this.transactions[numberOfItems] = transaction;
            numberOfItems += 1;
        }
    }
}

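We can expose the concrete collection types - List<T> and T[] - as IEnumerable<T> because they both implement IEnumerable<T>, so a reference to either one converts implicitly to the interface type. (Covariance is a related feature: it additionally lets an IEnumerable<T> of a derived type be used where an IEnumerable<T> of its base type is expected.)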
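Here's a short sketch of both conversions, using made-up variable names. One thing it also shows: the IEnumerable<T> reference is just another view of the same underlying list, so items added through the concrete type are visible when you enumerate.

```csharp
using System;
using System.Collections.Generic;

List<string> names = new() { "Ada", "Grace" };
string[] moreNames = { "Linus" };

// Implicit reference conversion: both types implement IEnumerable<string>.
IEnumerable<string> fromList = names;
IEnumerable<string> fromArray = moreNames;

// Covariance: a sequence of strings can be viewed as a sequence of objects.
IEnumerable<object> asObjects = fromList;

// The interface-typed references expose no Add method,
// but they still point at the same list.
names.Add("Margaret");

int count = 0;
foreach (object item in asObjects)
{
    count++;
}
Console.WriteLine(count); // 3
```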

Now let's update Accountant to work with the latest versions of our other types.

public class Accountant
{
    private IAccount expenseAccount;
    private IAccount salesAccount;

    public Accountant (IAccount expenseAccount, IAccount salesAccount)
    {
        this.expenseAccount = expenseAccount;
        this.salesAccount = salesAccount; 
    }

    public void PrintTransactions()
    {
        IEnumerable<Transaction> salesTransactions = salesAccount.Transactions;
        IEnumerable<Transaction> expenseTransactions = expenseAccount.Transactions;

        Console.WriteLine("ACCOUNT\n----\nSALES");
        PrintTransactions(salesTransactions);
        Console.WriteLine("\nEXPENSES");
        PrintTransactions(expenseTransactions);
    }

    public void PrintTransactions(IEnumerable<Transaction> transactions)
    {
        using IEnumerator<Transaction> iterator = transactions.GetEnumerator(); 

        while(iterator.MoveNext())
        {
            Transaction transaction = iterator.Current;
            Console.WriteLine($"{transaction.Name}\n{transaction.Amount}\n{transaction.TaxRate}\n{transaction.IsReconciled}");
            Console.WriteLine();
        }
    }
}

The foreach loop

You won't often see code working with IEnumerator<T> directly because C# provides a more convenient way - the foreach loop.

Instead of grabbing an IEnumerator<T> from an IEnumerable<T> manually, you just pass the IEnumerable<T> to the foreach loop and it selects each item in the collection one at a time.

Here's what the overloaded PrintTransactions() method looks like when using a foreach loop:

public void PrintTransactions(IEnumerable<Transaction> transactions)
{
    foreach (Transaction transaction in transactions)
    {
        Console.WriteLine($"{transaction.Name}\n{transaction.Amount}\n{transaction.TaxRate}\n{transaction.IsReconciled}");
        Console.WriteLine();
    }
}

The compiler just translates the foreach loop into code that works with the IEnumerator<T>, like we were doing previously. It also takes care of cleaning up resources by disposing of the enumerator when the loop finishes, so it's recommended over the manual approach.
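As a rough guide, the translation looks something like the sketch below. This is an approximation for illustration rather than the compiler's exact output, and the variable names are made up.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<string> items = new List<string> { "x", "y" };
var viaForeach = new List<string>();
var viaExpansion = new List<string>();

foreach (string item in items)
{
    viaForeach.Add(item);
}

// Roughly what the compiler produces for the loop above.
IEnumerator<string> enumerator = items.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        string current = enumerator.Current;
        viaExpansion.Add(current);
    }
}
finally
{
    enumerator.Dispose(); // runs even if the loop body throws
}

Console.WriteLine(string.Join(",", viaForeach) == string.Join(",", viaExpansion)); // True
```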

The yield statement and iterator methods

Methods that contain the yield return keywords are called iterator methods.

These methods return an IEnumerable<T> (an iterator method can also be declared to return an IEnumerator<T>). However, unlike collection types such as arrays and lists, these enumerables do not represent a collection of items stored in memory; instead they define how to generate items from a sequence when requested. In other words, they generate (or yield) items "on the fly".

The following example shows an iterator method that defines a sequence of consecutive integers between given start and end values.

IEnumerable<int> GenerateNumbers(int start, int end)
{
    for (int i = start; i <= end; i++)
    {
        yield return i;
    }
}

We can create a source for enumeration by calling the method, and then loop through it using foreach as you would for any other enumerable:

IEnumerable<int> numbersOneToThree = GenerateNumbers(1, 3);
foreach(int number in numbersOneToThree)
{
    Console.WriteLine(number);
}
// Displays the following output:
// 1
// 2
// 3

What are they doing?

Calling an iterator method does not start enumeration. It's only when you start looping through the IEnumerable<T> (i.e. when IEnumerator<T>.MoveNext() is called) that the code in the iterator method runs.

When the yield return statement is reached, an item is returned to the caller and execution of the method pauses. On the next iteration of the loop (or the next time MoveNext() is called), the execution of the method resumes from where it last paused until pausing again when the next yield return statement is reached.

It's easier to see this if we add some print statements to the iterator method and enumerate its items one at a time using IEnumerator<T>.

IEnumerable<int> GenerateNumbers(int start, int end)
{
    Console.WriteLine("Iterator started");
    for (int i = start; i <= end; i++)
    {
        Console.WriteLine($"Iterator about to yield {i}");
        yield return i;
        Console.WriteLine($"Iterator yielded {i}");
    }
    Console.WriteLine("Iterator ended");
}

IEnumerable<int> numbersOneToThree = GenerateNumbers(1, 3);
using IEnumerator<int> enumerator = numbersOneToThree.GetEnumerator();

Now we have an IEnumerator<T> we can grab each item from the sequence one at a time.

enumerator.MoveNext();
int number = enumerator.Current;
Console.WriteLine($"Returned to caller: {number}");
// Displays the following output:
// Iterator started
// Iterator about to yield 1
// Returned to caller: 1

You can determine from the displayed output that the first line in the above sample caused execution of the method up to but not including the line Console.WriteLine($"Iterator yielded {i}");. If we progress the enumerator once more, we'll see that this line executes first.

enumerator.MoveNext();
// Displays the following output:
// Iterator yielded 1
// Iterator about to yield 2

And we can see that the line containing yield return has executed by grabbing the current value, which we expect to be 2:

number = enumerator.Current; // reuses the variable declared earlier
Console.WriteLine($"Returned to caller: {number}");
// Displays the following output:
// Returned to caller: 2

Iterator methods provide a lazy approach to iteration. You can see from the above example that items are only generated when requested, whereas with regular collections the items already exist in memory.
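One consequence of this laziness is that an iterator method can describe a sequence that never ends, as long as the consumer stops asking at some point. Here's a small sketch; Squares() and the produced counter are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

int produced = 0;

// Hypothetical iterator method: an endless sequence of squares.
IEnumerable<long> Squares()
{
    for (long n = 1; ; n++)
    {
        produced++;          // count how many items were actually generated
        yield return n * n;
    }
}

var firstThree = new List<long>();
foreach (long square in Squares())
{
    firstThree.Add(square);
    if (firstThree.Count == 3)
    {
        break; // stop asking, so the iterator never runs again
    }
}

Console.WriteLine(string.Join(" ", firstThree)); // 1 4 9
Console.WriteLine(produced);                      // 3
```

Doing the same with a List<long> is impossible, because the list would have to hold every item before the loop could start.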

An IEnumerable based on an iterator method clearly works quite differently to one based on an in-memory collection. Under the covers iterator methods are based on state machines.
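To give a feel for what that means, here's a hand-written sketch of the kind of state machine that could stand in for the GenerateNumbers iterator method. It is a simplification for illustration, not the code the compiler actually emits, and NumberEnumerator is a made-up name.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

var results = new List<int>();
using (var enumerator = new NumberEnumerator(1, 3))
{
    while (enumerator.MoveNext())
    {
        results.Add(enumerator.Current);
    }
}
Console.WriteLine(string.Join(" ", results)); // 1 2 3

// Hand-written stand-in for a compiler-generated iterator class.
public sealed class NumberEnumerator : IEnumerator<int>
{
    private readonly int start;
    private readonly int end;
    private int i;     // the loop variable, hoisted into a field so it survives between calls
    private int state; // 0 = not started, 1 = paused at "yield return", 2 = finished

    public NumberEnumerator(int start, int end)
    {
        this.start = start;
        this.end = end;
    }

    public int Current { get; private set; }

    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:  // first call: run "int i = start"
                i = start;
                break;
            case 1:  // resuming after "yield return i": run "i++"
                i++;
                break;
            default: // already finished
                return false;
        }

        if (i <= end)    // the loop condition
        {
            Current = i; // "yield return i"
            state = 1;
            return true;
        }

        state = 2;
        return false;
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}
```

Each call to MoveNext() jumps to wherever the method last paused, which is why local variables have to live in fields.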

IAsyncEnumerable<T>

The key iteration-related interfaces - IEnumerable<T> and IEnumerator<T> - have asynchronous counterparts: IAsyncEnumerable<T> and IAsyncEnumerator<T>.

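These allow you to operate asynchronously with enumerables that take some time to produce items. Their APIs are basically the same as their non-async counterparts, except that the methods for advancing and disposing of the enumerator are named MoveNextAsync() and DisposeAsync(), and return ValueTask-based results that you await.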
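Here's a sketch of what working with IAsyncEnumerator<T> by hand looks like, mirroring the manual MoveNext()/Current loop from earlier. CountAsync() is a hypothetical async iterator that exists only to give us something to enumerate.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical async iterator used as the source.
async IAsyncEnumerable<int> CountAsync(int upTo)
{
    for (int i = 1; i <= upTo; i++)
    {
        await Task.Delay(10);
        yield return i;
    }
}

var received = new List<int>();

// GetAsyncEnumerator() is synchronous; MoveNextAsync() and DisposeAsync() are awaited.
await using (IAsyncEnumerator<int> enumerator = CountAsync(3).GetAsyncEnumerator())
{
    while (await enumerator.MoveNextAsync())
    {
        received.Add(enumerator.Current);
    }
}

Console.WriteLine(string.Join(" ", received)); // 1 2 3
```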

IAsyncEnumerable<T> is useful in scenarios that might involve asynchronous operations, like reading from a database, or making web requests.

We can create a sequence that takes some time to produce items by creating an iterator method containing a call to Task.Delay() before the yield return.

async IAsyncEnumerable<int> GenerateNumbersAsync(int start, int end)
{
    for (int i = start; i <= end; i++)
    {
        await Task.Delay(500);
        yield return i;
    }
}

You can loop through the items in an IAsyncEnumerable<T> using the async counterpart to the foreach loop, await foreach:

await foreach (int item in GenerateNumbersAsync(1, 5))
{
    Console.Write(item + " ");
}
// Displays the following output:
// 1 2 3 4 5

Can I still use IAsyncEnumerable<T> in non-async situations?

Yes. IAsyncEnumerator<T>.MoveNextAsync() returns a ValueTask<bool> rather than a Task<bool>, so it's designed to be efficient in both synchronous and asynchronous scenarios: when the next item is already available, the call can complete synchronously without allocating a Task. This means it is efficient in situations where items might already be in memory (and retrieved synchronously) or might not be. If you know all items are in memory, you should use IEnumerable<T>. Use IAsyncEnumerable<T> if items need to be or might need to be retrieved asynchronously.
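You can observe this by inspecting the ValueTask<bool> before awaiting it. The sketch below uses a hypothetical MixedAsync() iterator whose first item needs no waiting and whose second item sits behind a delay.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical source: the first item is immediately available, the second needs a real wait.
async IAsyncEnumerable<int> MixedAsync()
{
    yield return 1;
    await Task.Delay(200);
    yield return 2;
}

await using IAsyncEnumerator<int> enumerator = MixedAsync().GetAsyncEnumerator();

ValueTask<bool> first = enumerator.MoveNextAsync();
bool firstWasSynchronous = first.IsCompleted;   // nothing had to be awaited to reach the first yield
bool gotFirst = await first;

ValueTask<bool> second = enumerator.MoveNextAsync();
bool secondWasSynchronous = second.IsCompleted; // the delay is still pending at this point
bool gotSecond = await second;

Console.WriteLine($"{firstWasSynchronous} {secondWasSynchronous}");
```

On a typical run this prints "True False": the first step finishes before MoveNextAsync() even returns, while the second genuinely has to wait.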

For instance, consider a sequence with a caching mechanism. The first enumeration might be slow, involving asynchronous operations to fetch data, but subsequent enumerations should be fast, accessing the cache synchronously. Let's put together an example to illustrate this.

The following example defines a class that uses a Dictionary<int, T> to cache items of type T. The constructor takes a Func<int, Task<T>> which is used to fetch items asynchronously (e.g. calls to a database). The public EnumerateItemsAsync() is an iterator method - it uses the yield keyword and returns an IAsyncEnumerable<T>, allowing items to be enumerated one at a time. EnumerateItemsAsync() illustrates the earlier "items might be available in-memory" scenario: initially, items aren't cached, so each loop iteration calls _dataRetriever() (the slow operation) asynchronously; subsequent calls hit the cache, resulting in fast synchronous retrieval from the dictionary.

public class CachedAsyncEnumerable<T>
{
    private readonly Dictionary<int, T> _cache = new Dictionary<int, T>();
    private readonly Func<int, Task<T>> _dataRetriever;

    public CachedAsyncEnumerable(Func<int, Task<T>> dataRetriever)
    {
        _dataRetriever = dataRetriever;
    }

    public async IAsyncEnumerable<T> EnumerateItemsAsync()
    {
        for (int i = 1; i <= 5; i++)
        {
            T item;
            int itemId = i;
            if(!_cache.ContainsKey(itemId))
            {
                // If item not in cache, then perform slow (asynchronous) operation to get it
                item = await _dataRetriever(itemId);
                _cache[itemId] = item;
            }
            else
            {
                // If item in cache retrieve item quickly (synchronously)
                item = _cache[itemId];
            }

            yield return item;
        }
    }
}

A more realistic example

As a more realistic example let's say you wanted to request the contents of multiple webpages and perform some processing on each of the results.

Making requests over a network is a relatively slow operation and should be done asynchronously. Here's a method that takes a collection of URLs, requests the contents of each asynchronously, and counts the number of times the word "endjin" appears in each page.

public static async IAsyncEnumerable<string> CountNumberOfEndjinsInWebpagesAsync(IEnumerable<string> urls)
{
    using (var httpClient = new HttpClient())
    {
        foreach (string url in urls)
        {
            HttpResponseMessage response = await httpClient.GetAsync(url);
            string data = await response.Content.ReadAsStringAsync();
            
            // Find the number of occurrences of the substring "endjin"
            int count = data.Split("endjin").Length - 1;
            yield return $"Data from {url} contains {count} occurrences of the substring 'endjin'";
        }
    }
}

I can call it by passing an array of strings representing URLs, and enumerate the results using await foreach:

string[] endjinUrls =
{
    "https://endjin.com/blog/2024/01/analysing-wpf-performance-using-etw-and-perfview",
    "https://endjin.com/blog/2023/07/csharp-design-patterns-the-iterator-pattern"
};

await foreach (string result in CountNumberOfEndjinsInWebpagesAsync(endjinUrls))
{
    Console.WriteLine(result);
}
// Displays the following output:
// Data from https://endjin.com/blog/2024/01/analysing-wpf-performance-using-etw-and-perfview contains 115 occurrences of the substring 'endjin'
// Data from https://endjin.com/blog/2023/07/csharp-design-patterns-the-iterator-pattern contains 116 occurrences of the substring 'endjin'

@lg_mooney | @endjin

FAQs

What is an iterator method?

An iterator method is a method that uses the `yield` keyword. Like a regular in-memory collection, it provides a source for enumeration; however, instead of holding items in memory ready to be used, it defines how to generate items on the fly when they are requested.
