Click here to Skip to main content
15,881,089 members
Articles / Programming Languages / C#

Lowering in the C# Compiler (and What Happens When You Misuse It)

Rate me:
Please Sign up or sign in to vote.
4.95/5 (19 votes)
26 May 2017CPOL5 min read 16.5K   5   13
Lowering in the C# compiler (and what happens when you misuse it)

Turns out that what I'd always thought of as "Compiler magic" or "Syntactic sugar" is actually known by the technical term "Lowering" and the C# compiler (a.k.a Roslyn) uses it extensively.

But what is it? Well, this quote from So You Want To Write Your Own Language? gives us some idea:

Lowering One semantic technique that is obvious in hindsight (but took Andrei Alexandrescu to point out to me) is called 'lowering' It consists of, internally, rewriting more complex semantic constructs in terms of simpler ones. For example, while loops and foreach loops can be rewritten in terms of for loops. Then, the rest of the code only has to deal with for loops. This turned out to uncover a couple of latent bugs in how while loops were implemented in D, and so was a nice win. It's also used to rewrite scope guard statements in terms of try-finally statements, etc. Every case where this can be found in the semantic processing will be win for the implementation.

-- by Walter Bright (author of the D programming language)

But if you're still not sure what it means, you can read Eric Lippert's post on the subject, Lowering in language design, which contains this quote:

A common technique along the way though is to have the compiler 'lower' from high-level language features to low-level language features in the same language.

As an aside, if you like reading about the Roslyn compiler source, you may like these other posts that I've written:

What Does 'Lowering' Look Like?

The C# compiler has used lowering for a while, one of the oldest or most recognised examples is when this code:

C#
using System.Collections.Generic;
public class C {
    public IEnumerable<int> M() 
    {
        foreach (var value in new [] { 1, 2, 3, 4, 5 })
        {
            yield return value;
        }
    }
}

is turned into this:

C#
public class C
{
    [CompilerGenerated]
    private sealed class <M>d__0 : IEnumerable<int>, IEnumerable, 
                                   IEnumerator<int>, IDisposable, IEnumerator
    {
        private int <>1__state;
        private int <>2__current;
        private int <>l__initialThreadId;
        public C <>4__this;
        private int[] <>s__1;
        private int <>s__2;
        private int <value>5__3;
        int IEnumerator<int>.Current
        {
            [DebuggerHidden]
            get
            {
                return this.<>2__current;
            }
        }
        object IEnumerator.Current
        {
            [DebuggerHidden]
            get
            {
                return this.<>2__current;
            }
        }
        [DebuggerHidden]
        public <M>d__0(int <>1__state)
        {
            this.<>1__state = <>1__state;
            this.<>l__initialThreadId = Environment.CurrentManagedThreadId;
        }
        [DebuggerHidden]
        void IDisposable.Dispose()
        {
        }
        bool IEnumerator.MoveNext()
        {
            int num = this.<>1__state;
            if (num != 0)
            {
                if (num != 1)
                {
                    return false;
                }
                this.<>1__state = -1;
                this.<>s__2++;
            }
            else
            {
                this.<>1__state = -1;
                this.<>s__1 = new int[] { 1, 2, 3, 4, 5 };
                this.<>s__2 = 0;
            }
            if (this.<>s__2 >= this.<>s__1.Length)
            {
                this.<>s__1 = null;
                return false;
            }
            this.<value>5__3 = this.<>s__1[this.<>s__2];
            this.<>2__current = this.<value>5__3;
            this.<>1__state = 1;
            return true;
        }
        [DebuggerHidden]
        void IEnumerator.Reset()
        {
            throw new NotSupportedException();
        }
        [DebuggerHidden]
        IEnumerator<int> IEnumerable<int>.GetEnumerator()
        {
            C.<M>d__0 <M>d__;
            if (this.<>1__state == -2 && this.<>l__initialThreadId == 
                                         Environment.CurrentManagedThreadId)
            {
                this.<>1__state = 0;
                <M>d__ = this;
            }
            else
            {
                <M>d__ = new C.<M>d__0(0);
                <M>d__.<>4__this = this.<>4__this;
            }
            return <M>d__;
        }
        [DebuggerHidden]
        IEnumerator IEnumerable.GetEnumerator()
        {
            return this.System.Collections.Generic.IEnumerable<System.Int32>.GetEnumerator();
        }
    }
    [IteratorStateMachine(typeof(C.<M>d__0))]
    public IEnumerable<int> M()
    {
        C.<M>d__0 expr_07 = new C.<M>d__0(-2);
        expr_07.<>4__this = this;
        return expr_07;
    }
}

Yikes, I'm glad we don't have to write that code ourselves!! There's an entire state-machine in there, built to allow our original code to be halted/resumed each time round the loop (at the 'yield' statement).

The C# Compiler and 'Lowering'

But it turns out that the Roslyn compiler does a lot more 'lowering' than you might think. If you take a look at the code under "/src/Compilers/CSharp/Portable/Lowering" (VB.NET equivalent here), you see the following folders:

Which correspond to some C# language features you might be familar with, such as 'lambdas', i.e., x => x.Name > 5, 'iterators' used by yield (above) and the async keyword.

However, if we look at bit deeper, under the 'LocalRewriter' folder we can see lots more scenarios that we might never have considered 'lowering', such as:

So a big thank you is due to all the past and present C# language developers and designers. They did all this work for us, imagine that C# didn't have all these high-level features, we'd be stuck writing them by hand, it would be like writing Java. :-)

What Happens When You Misuse It

But of course, the real fun part is 'misusing' or outright 'abusing' the compiler. So I set up a little twitter competition just how much 'lowering' could we get the compiler to do for us (i.e., the highest ratio of 'input' lines of code to 'output' lines).

It had the following rules (see this gist for more information):

  1. You can have as many lines as you want within method M()
  2. No single line can be longer than 100 chars
  3. To get your score, divide the '# of expanded lines' by the '# of original line(s)'
    1. Based on the default output formatting of https://sharplab.io, no re-formatting allowed!!
    2. But you can format the intput however you want, i.e., make use of the full 100 chars
  4. Must compile with no warnings on https://sharplab.io (allows C# 7 features)
    1. But doesn't have to do anything sensible when run
  5. You cannot modify the code that is already there, i.e., public class C {} and public void M()
    1. Cannot just add async to public void M(), that's too easy!!
  6. You can add new using ... declarations, these do not count towards the line count

For instance with the following code (interactive version available on sharplab.io):

C#
using System;
public class C {
    public void M() {
        Func<string> test = () => "blah"?.ToString();
    }
}

This counts as 1 line of original code (only code inside method M() is counted).

This expands to 23 lines (again only lines of code inside the braces (<span class="p">{</span>, }) of class C are counted.

Giving a total score of 23 (23 / 1).

C#
....
public class C
{
    [CompilerGenerated]
    [Serializable]
    private sealed class <>c
    {
        public static readonly C.<>c <>9;
        public static Func<string> <>9__0_0;
        static <>c()
        {
            // Note: this type is marked as 'beforefieldinit'.
            C.<>c.<>9 = new C.<>c();
        }
        internal string <M>b__0_0()
        {
            return "blah";
        }
    }
    public void M()
    {
        if (C.<>c.<>9__0_0 == null)
        {
            C.<>c.<>9__0_0 = new Func<string>(C.<>c.<>9.<M>b__0_0);
        }
    }
}

Results

The first place entry was the following entry from Schabse Laks, which contains 9 lines-of-code inside the M() method:

C#
using System.Linq;
using Y = System.Collections.Generic.IEnumerable<dynamic>;

public class C {
    public void M() {
((Y)null).Select(async x => await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await await
await await await await await await await await await await await await await await await x.x()());
    }
}

This expands to an impressive 7964 lines of code (yep you read that right!!) for a score of 885 (7964 / 9). The main trick he figured out was that adding more lines to the input increased the score, i.e., is scales superlinearly. Although it you take things too far the compiler bails out with a pretty impressive error message:

error CS8078: An expression is too long or complex to compile

Here's the Top 6 top results:

SubmitterEntryScore
Schabse Lakslink885 (7964 / 9)
Andrey Dyatlovlink778 (778 / 1)
arlzlink755 (755 / 1)
Andy Gocke *link633 (633 / 1)
Jared Parsons *link461 (461 / 1)
Jonathan Chamberslink384 (384 / 1)

* = member of the Roslyn compiler team (they're not disqualified, but maybe they should have some kind of handicap applied to 'even out' the playing field?)

Honourable Mentions

However, there were some other entries that whilst they didn't make it into the Top 6, are still worth a mention due to the ingenuity involved:

The post Lowering in the C# Compiler (and what happens when you misuse it) first appeared on my blog Performance is a Feature!

This article was originally posted at http://www.mattwarren.org/atom.xml

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
United Kingdom United Kingdom
This member has not yet provided a Biography. Assume it's interesting and varied, and probably something to do with programming.

Comments and Discussions

 
GeneralMy vote of 5 Pin
D V L15-Jun-17 20:07
professionalD V L15-Jun-17 20:07 
GeneralRe: My vote of 5 Pin
matt warren16-Jun-17 1:11
matt warren16-Jun-17 1:11 
GeneralMy vote of 5 Pin
User 1106097927-May-17 0:27
User 1106097927-May-17 0:27 
GeneralRe: My vote of 5 Pin
matt warren30-May-17 1:21
matt warren30-May-17 1:21 
QuestionYou had me in the first paragraph! Pin
Dewey26-May-17 17:22
Dewey26-May-17 17:22 
AnswerRe: You had me in the first paragraph! Pin
matt warren30-May-17 1:22
matt warren30-May-17 1:22 
QuestionFascinating! Pin
Marc Clifton26-May-17 2:26
mvaMarc Clifton26-May-17 2:26 
AnswerRe: Fascinating! Pin
matt warren30-May-17 1:38
matt warren30-May-17 1:38 
QuestionMaybe interesting but almost unreadable - please check encoding and HTML formatting!! Pin
softexpert25-May-17 22:14
softexpert25-May-17 22:14 
AnswerRe: Maybe interesting but almost unreadable - please check encoding and HTML formatting!! Pin
matt warren25-May-17 22:54
matt warren25-May-17 22:54 
GeneralRe: Maybe interesting but almost unreadable - please check encoding and HTML formatting!! Pin
softexpert25-May-17 23:59
softexpert25-May-17 23:59 
GeneralRe: Maybe interesting but almost unreadable - please check encoding and HTML formatting!! Pin
matt warren26-May-17 0:40
matt warren26-May-17 0:40 
GeneralRe: Maybe interesting but almost unreadable - please check encoding and HTML formatting!! Pin
softexpert26-May-17 23:36
softexpert26-May-17 23:36 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.