Internals to C# Iterators






If you are new to this series, I recommend reading my other posts from the Internals series as well. In this series, I am trying to cover basic C# programming and relate it to the compiled MSIL. In my previous post, while going through the internals of the foreach loop, I promised to cover iterators in more detail in my next post. It is time to cover the foundation on which the C# IEnumerable stands, and the iterators built on it.
If you ask me why I like to work in C# rather than VB.NET or other languages, I would point to some of the flexibility that I get with C#. Even though iterators are coming to VB.NET in its next version, C# is still the primary language that introduced yield.
In this post, I am going to demonstrate the basic feature behind C# iterators and also introduce you to the secret behind the yield keyword of C#.
The Basics
Before we start with C# iterators, let me explain what an iterator means exactly. Surprisingly, there are many who know nothing about IEnumerable; the next section is for them. If you already know about IEnumerable and IEnumerator, please skip the next section and read ahead.
IEnumerable and IEnumerator
.NET comes with two basic interfaces, IEnumerable and IEnumerator, which form the base for any collection. IEnumerable is an interface that defines a single method, GetEnumerator, which returns an IEnumerator. An IEnumerator, on the other hand, provides simple iteration over a collection. Implementing IEnumerable ensures that you can use the collection in a foreach loop in C# or For Each in VB.NET. If you look at MSDN, it says:
IEnumerator is the base interface for all enumerators. Enumerators only allow reading the data in the collection. Enumerators cannot be used to modify the underlying collection.
Initially, the enumerator is positioned before the first element in the collection. Reset also brings the enumerator back to this position. At this position, calling Current throws an exception. Therefore, you must call MoveNext to advance the enumerator to the first element of the collection before reading the value of Current.
Almost all the collections in the .NET class library implement IEnumerable, and hence you can iterate through the items a collection holds internally and use them. To know more about this, please go through my previous post on Loops and move to the Foreach section.
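To make the MSDN description concrete, here is a minimal sketch of roughly what a foreach loop does on your behalf: it asks the IEnumerable for an enumerator, calls MoveNext before reading Current, and disposes the enumerator at the end. The names ManualEnumerationDemo and Collect are made up for this illustration and are not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class ManualEnumerationDemo
{
    // Hypothetical helper: walks any sequence by hand, the way foreach roughly does.
    public static List<string> Collect(IEnumerable<string> source)
    {
        var result = new List<string>();

        // GetEnumerator hands back an enumerator positioned BEFORE the first element.
        using (IEnumerator<string> enumerator = source.GetEnumerator())
        {
            // MoveNext must be called before Current is read; it returns false at the end.
            while (enumerator.MoveNext())
            {
                result.Add(enumerator.Current);
            }
        }

        return result;
    }

    public static void Main()
    {
        var names = new List<string> { "alpha", "beta", "gamma" };
        foreach (string name in Collect(names))
        {
            Console.WriteLine(name);
        }
    }
}
```

The using block matters because IEnumerator<T> is disposable, and a foreach loop disposes the enumerator for you even when the loop exits early.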
Iterator
Iterators are one of the best features C# has ever had. C# 2.0 comes with a new keyword called yield, which lets you generate an IEnumerable instantly. An iterator in C# is actually a method or the get accessor of a property that returns IEnumerable without making you create the whole enumerable and enumerator by hand. The C# iterator block uses yield return to return each individual element of the block and yield break to end the enumeration. The return type of the iterator must be IEnumerable or IEnumerator (or their generic versions), and the compiler supplies the actual implementation.
Let me put a sample iterator implementation:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // Calling the iterator only creates the enumerable; its body has not run yet.
        var enumerable = new Program().GetEnumerated(10, 20);
        Console.WriteLine("After I got the Enumerable");
        foreach (int i in enumerable)
        {
            Console.WriteLine("Got i = {0}", i);
            Thread.Sleep(10);
        }
        Console.Read();
    }

    public IEnumerable<int> GetEnumerated(int start, int end)
    {
        Console.WriteLine("Starting Enumerating!!!");
        Stopwatch watch = new Stopwatch();
        watch.Start();
        for (int i = start; i <= end; i++)
        {
            Console.WriteLine("Value of watch = {0} before yield", watch.ElapsedTicks);
            // Execution pauses here until the caller asks for the next value.
            yield return i;
            Console.WriteLine("Value of watch = {0} after yield", watch.ElapsedTicks);
        }
        watch.Stop();
    }
}
In this implementation, I am using a Stopwatch to see what happens in the background. Let's see the output for the code above:
In the output console, you can see that "After I got the Enumerable" is printed before "Starting Enumerating!!!". Does that mean the function actually returns immediately after the call is made? Yes, you are right. To hand back an enumerable, the method does not need to enumerate the whole collection first. Looking at the next lines, you can see that after the caller gets the value of i, the value of watch increases considerably before the "after yield" line is printed. That means the method stops executing at the yield and waits until the enumerator is asked for its next value. Hence, your iterator runs as the program goes, and you can store the IEnumerator to fetch the data whenever it is required.
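The idea of storing the IEnumerator and fetching values only when they are needed can be sketched as follows. This is a small illustration, not part of the sample above: LazyIteratorDemo and CountUpTo are hypothetical names, and the iterator also shows yield break ending the sequence early.

```csharp
using System;
using System.Collections.Generic;

public static class LazyIteratorDemo
{
    // Hypothetical iterator: yields start, start + 1, ... and stops once limit is passed.
    public static IEnumerable<int> CountUpTo(int start, int limit)
    {
        for (int i = start; ; i++)
        {
            if (i > limit)
            {
                yield break;    // ends the enumeration; MoveNext returns false from now on
            }
            yield return i;     // hands one value to the caller and pauses here
        }
    }

    public static void Main()
    {
        // Keep the enumerator around and pull values only when we want them.
        using (IEnumerator<int> numbers = CountUpTo(1, 3).GetEnumerator())
        {
            numbers.MoveNext();
            Console.WriteLine("First value: {0}", numbers.Current);

            // ... other work could happen here; the iterator stays paused meanwhile ...

            while (numbers.MoveNext())
            {
                Console.WriteLine("Next value: {0}", numbers.Current);
            }
        }
    }
}
```

Each call to MoveNext runs the iterator body only as far as the next yield return, which is why the loop without an exit condition in CountUpTo is harmless here.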
The Internals
In fact, the C# compiler builds a state machine for each iterator. The state machine is actually a CompilerGenerated class which stores the local variables as fields of the class and the execution point as an integer state, among other things. Thus the state machine allows the block to pause and resume execution as and when required.
This is a very cool concept. Let me demonstrate the fact with an example:
public IEnumerable<int> GetFirst10Nos()
{
    for (int i = 0; i < 10; i++)
        yield return i;
}
This is the simplest possible method: it returns the first 10 numbers starting from 0. Now let's see what it looks like after compilation:
// Methods
public IEnumerable<int> GetFirst10Nos()
{
    <GetFirst10Nos>d__0 d__ = new <GetFirst10Nos>d__0(-2);
    d__.<>4__this = this;
    return d__;
}

// Nested Types
[CompilerGenerated]
private sealed class <GetFirst10Nos>d__0 :
    IEnumerable<int>, IEnumerable, IEnumerator<int>, IEnumerator, IDisposable
{
    // Fields
    private bool $__disposing;
    private bool $__doFinallyBodies;
    private int <>1__state;
    private int <>2__current;
    public Iteratordemo <>4__this;
    private int <>l__initialThreadId;
    public int <i>5__1;

    // Methods
    [DebuggerHidden]
    public <GetFirst10Nos>d__0(int <>1__state)
    {
        this.<>1__state = <>1__state;
        this.<>l__initialThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    private bool MoveNext()
    {
        bool CS$1$0000;
        try
        {
            this.$__doFinallyBodies = true;
            if (this.<>1__state == 1)
            {
                goto Label_0068;
            }
            if (this.<>1__state == -1)
            {
                return false;
            }
            if (this.$__disposing)
            {
                return false;
            }
            this.<i>5__1 = 0;
            while (this.<i>5__1 < 10)
            {
                this.<>2__current = this.<i>5__1;
                this.<>1__state = 1;
                this.$__doFinallyBodies = false;
                return true;
            Label_0068:
                if (this.$__disposing)
                {
                    return false;
                }
                this.<>1__state = 0;
                this.<i>5__1++;
            }
            this.<>1__state = -1;
            CS$1$0000 = false;
        }
        catch (Exception)
        {
            this.<>1__state = -1;
            throw;
        }
        return CS$1$0000;
    }

    [DebuggerHidden]
    IEnumerator<int> IEnumerable<int>.GetEnumerator()
    {
        if ((Thread.CurrentThread.ManagedThreadId ==
            this.<>l__initialThreadId) && (this.<>1__state == -2))
        {
            this.<>1__state = 0;
            return this;
        }
        Iteratordemo.<GetFirst10Nos>d__0 d__ = new Iteratordemo.<GetFirst10Nos>d__0(0);
        d__.<>4__this = this.<>4__this;
        return d__;
    }

    [DebuggerHidden]
    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator();
    }

    [DebuggerHidden]
    void IEnumerator.Reset()
    {
        throw new NotSupportedException();
    }

    [DebuggerHidden]
    void IDisposable.Dispose()
    {
        this.$__disposing = true;
        this.MoveNext();
        this.<>1__state = -1;
    }

    // Properties
    int IEnumerator<int>.Current
    {
        [DebuggerHidden]
        get
        {
            return this.<>2__current;
        }
    }

    object IEnumerator.Current
    {
        [DebuggerHidden]
        get
        {
            return this.<>2__current;
        }
    }
}
Well, basically the compiler generates a type that holds the state machine for you. The type is generated in such a way that it implements both IEnumerable and IEnumerator, so that it can produce the values and hold the state of the method within itself. Let me explain a few of its members:
- Our method actually creates a nested class <GetFirst10Nos>d__0 which holds the state machine and also implements IEnumerable and IEnumerator. Once our method is called, it creates a new object of this class and returns it. As the class implements IEnumerable, this causes no problem. I should remind you that no code from our method has been executed yet.
- When we use the IEnumerable in a foreach loop, the loop internally calls GetEnumerator. If you look closely, this method checks whether the call is made from the thread that created the object and also checks for the state to be -2. You can see that, while creating the object, our method passes the state as -2. Hence, if GetEnumerator is called for the first time and from the creating thread, the object sets its state to 0 and returns itself. Otherwise, that is, if an enumerator has already been fetched or the call comes from a different thread than the one which owns the object, GetEnumerator creates a new object. You should note that an object created from GetEnumerator is initialized to 0, which states that the enumerator is initialized.
- MoveNext, being the important part of the object, checks the value of the state, which indicates the various stages of the object:
  - 0 represents the stage before calling MoveNext.
  - -1 represents the end of the enumerator; MoveNext returns false.
  - -2 represents that no enumerator has been fetched yet (before the call to GetEnumerator).
  - 1 represents that the enumeration is running; MoveNext sets the value of this.<>2__current and returns true.
- For each subsequent request to MoveNext, the state is checked and the initial goto statement moves the control to Label_0068:, so the object keeps on running our code and producing numbers.
- Finally, when the condition of the while loop fails, the state is set to -1 and the execution terminates.
So, the state machine object is capable of producing numbers and also of pausing and resuming the method.
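Since the decompiled class is not valid C#, it may help to see the same idea written by hand. The following is a simplified sketch of an equivalent enumerator, assuming only the 0 / 1 / -1 states described above; it leaves out the thread check, the IEnumerable part and the disposing flags, and the names First10Enumerator and StateMachineDemo are hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written stand-in for the compiler-generated class (simplified, hypothetical names).
public sealed class First10Enumerator : IEnumerator<int>
{
    private int state;    // 0 = not started, 1 = paused at the yield, -1 = finished
    private int current;  // value handed out by the last successful MoveNext
    private int i;        // the loop variable, hoisted from a local into a field

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:
                i = 0;          // first call: run the loop initializer
                break;
            case 1:
                i++;            // resume just after the yield: run the loop increment
                break;
            default:
                return false;   // already finished
        }

        if (i < 10)
        {
            current = i;        // the "yield return i" step
            state = 1;          // remember where to resume
            return true;
        }

        state = -1;             // loop condition failed: the enumeration is over
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }

    public void Dispose() { state = -1; }
}

public static class StateMachineDemo
{
    public static void Main()
    {
        using (var numbers = new First10Enumerator())
        {
            while (numbers.MoveNext())
            {
                Console.WriteLine(numbers.Current);
            }
        }
    }
}
```

The switch on the state field plays the role of the goto in the generated MoveNext: it decides whether to start the loop or to continue from the point where the previous call returned.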
The member variables of the state machine represent:
- Locals, parameters, etc., which are created as member variables; for example, the local variable i is represented as <i>5__1.
- Two boolean variables, $__disposing and $__doFinallyBodies, which hold the state of disposing and of finally execution.
- The current value of the object, in <>2__current.
- The state the object is in, in <>1__state (even though the states are not given any enumerated names).
- The object which invoked the iterator, in <>4__this.
You should note that the variables, methods and types are named in such a way that the names are not valid C# identifiers, which eliminates any chance of clashing with another type or member of the same name that you declare in the assembly.
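You can observe some of this at run time with a little reflection. The sketch below is only an illustration with hypothetical names (GeneratedTypeDemo); the exact generated type name and the reuse of the first enumerator are compiler implementation details, so I am assuming behaviour like the decompiled code above, and it may differ between compiler versions.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

public static class GeneratedTypeDemo
{
    public static IEnumerable<int> GetFirst10Nos()
    {
        for (int i = 0; i < 10; i++)
            yield return i;
    }

    public static void Main()
    {
        IEnumerable<int> sequence = GetFirst10Nos();
        Type generated = sequence.GetType();

        // The name contains characters such as '<' and '>' that no C# identifier may use.
        Console.WriteLine("Generated type: {0}", generated.Name);
        Console.WriteLine("Marked [CompilerGenerated]: {0}",
            generated.IsDefined(typeof(CompilerGeneratedAttribute), false));

        // Assumption: the first GetEnumerator call on the creating thread reuses the object,
        // while a later call gets a fresh one, as in the decompiled GetEnumerator above.
        IEnumerator<int> first = sequence.GetEnumerator();
        IEnumerator<int> second = sequence.GetEnumerator();
        Console.WriteLine("First enumerator is the enumerable itself: {0}",
            ReferenceEquals(sequence, first));
        Console.WriteLine("Second enumerator is a new object: {0}",
            !ReferenceEquals(first, second));
    }
}
```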
Conclusion
Well, C# iterators are, to me, by far one of the best things in the .NET languages. It is really tedious to build each enumerator by hand. LINQ and other language features use iterators extensively to achieve the goal of making C# more reliable yet simple to write. I tried to demonstrate what actually happens behind the scenes for iterators. I hope you like this post and also read my other posts on Internals to .NET.
Thanks for reading.