Reweaving IL code with Mono.Cecil

26 May 2014
The first in a series of tutorials about reweaving C# code using Mono.Cecil

Table of contents

  1. Introduction
  2. Mono.Cecil
  3. Faking return values
  4. Throwing exceptions
  5. Reweaving properties
  6. Using the solution
  7. Conclusion
  8. Update history

Introduction

In the past month I had the opportunity to get to know Mono.Cecil and use it extensively in our programs. While learning it, however, I found that most of the available Mono.Cecil tutorials target an older version of the library and rely on classes that have since been removed or replaced (like AssemblyFactory or CilWorker).

This lack of coverage is the reason why I started writing tutorials for Mono.Cecil. My aim is to create a series in which the first two tutorials focus on testing and the other two on the use of IL code weaving for Aspect Oriented Programming.

If you are new to IL coding, you can check out my post about the basics of IL programming.

Mono.Cecil library

Mono.Cecil is a very powerful library for inspecting and reweaving IL code. Though the target language of these articles is C#, don’t forget that with Cecil we can inspect or reweave assemblies written in any .NET compliant language, or even generate IL code from scratch.

As a consequence, the class hierarchy maps to the concepts of IL programming rather than to those of the C# language, and it does so in a straightforward way. If you know IL, getting started with Mono.Cecil will be quick and easy. But don’t worry: I will introduce the concepts of the intermediate language in parallel with Cecil, so an in-depth knowledge of IL programming is not a prerequisite.

Let's start our tour with a bird’s-eye view of Mono.Cecil’s class hierarchy:

Image 1

In this structure there is a highlighted ModuleDefinition, which will be our point of entry for getting the types and methods of an assembly (it is exposed through the AssemblyDefinition’s MainModule property). This module is always part of the module collection (the Modules property), and most of the time it is the only element.

The reason for this is simple: in Visual Studio one project is compiled into exactly one module contained in one assembly. To create multi-module assemblies we have to either use the related options of the C# compiler (csc) explicitly or use the Assembly Linker.
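To make the relationship between assembly, modules and types more tangible, here is a minimal sketch that walks the hierarchy described above. The class name ModuleTour and the idea of inspecting the program's own assembly when no path is given are my assumptions for illustration, not part of the example solution.

```csharp
using System;
using Mono.Cecil;

public static class ModuleTour
{
    // Prints every module of an assembly together with its types and methods.
    public static void Print(string assemblyPath)
    {
        var assembly = AssemblyDefinition.ReadAssembly(assemblyPath);
        Console.WriteLine("Main module: " + assembly.MainModule.Name);

        // For a typical Visual Studio project this loop runs exactly once.
        foreach (ModuleDefinition module in assembly.Modules)
        {
            Console.WriteLine("Module: " + module.Name);
            foreach (TypeDefinition type in module.Types)
            {
                Console.WriteLine("  Type: " + type.FullName);
                foreach (MethodDefinition method in type.Methods)
                    Console.WriteLine("    Method: " + method.Name);
            }
        }
    }

    public static void Main(string[] args)
    {
        // Assumption: without an argument we inspect the assembly this program lives in.
        Print(args.Length > 0 ? args[0] : typeof(ModuleTour).Assembly.Location);
    }
}
```

Note that the Types collection of a module only lists top-level types; nested types hang off their declaring TypeDefinition.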

Faking return values

We will begin with the obligatory Hello World example: faking the return value of a static function. While doing so we will use a simplified interface that resembles mocking libraries like Moq. The steps we need to take are straightforward:

  1. Load the assembly
  2. Get the method we want to overwrite
  3. Inject code at the beginning of the method’s body
  4. Save the assembly
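Before going through the library code piece by piece, the four steps can be sketched in one place. This is only an illustration under my own assumptions: FourSteps, FakeStringReturn, Greeter and the output file name are hypothetical, and the result is written to a separate file so the original assembly stays untouched.

```csharp
using System;
using System.IO;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical target: a static method whose return value we fake.
public static class Greeter
{
    public static string Greet() { return "Hello World!"; }
}

public static class FourSteps
{
    public static void FakeStringReturn(string inputPath, string outputPath,
        string typeName, string methodName, string fakeValue)
    {
        // 1. Load the assembly
        var assembly = AssemblyDefinition.ReadAssembly(inputPath);

        // 2. Get the method we want to overwrite
        var method = assembly.MainModule.Types
            .Single(t => t.Name == typeName)
            .Methods.Single(m => m.Name == methodName);

        // 3. Inject code at the beginning of the method's body
        var il = method.Body.GetILProcessor();
        var first = method.Body.Instructions.First();
        il.InsertBefore(first, il.Create(OpCodes.Ldstr, fakeValue));
        il.InsertBefore(first, il.Create(OpCodes.Ret));

        // 4. Save the assembly (to a different path than the one we read from)
        assembly.Write(outputPath);
    }

    public static void Main()
    {
        var input = typeof(Greeter).Assembly.Location;
        var output = Path.Combine(Path.GetTempPath(), "Rewoven.dll");
        FakeStringReturn(input, output, "Greeter", "Greet",
            "All your bases are belong to us");
        Console.WriteLine("Rewoven copy written to " + output);
    }
}
```

The running program is not affected by this; only the copy on disk contains the faked method.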

This is what faking a return value looks like with our example library:

C#
//Example 1. Faking return value - primitives
ilCodeWeaver.Setup(() => HelloMessages.GetHelloWorld())
            .Returns("All your bases are belong to us");  

First we load the AssemblyDefinition in the constructor, without forgetting to store the assembly path (as it won’t be stored in the AssemblyDefinition instance):

C#
public ILCodeWeaver(string assemblyPath)
{
    _assemblyPath = assemblyPath;
    _assemblyDefinition = AssemblyDefinition.ReadAssembly(assemblyPath); 
} 

I emphasize this simple piece of code because before Cecil’s 0.9 version an assembly was read with the AssemblyFactory class. That class has been removed completely, yet most of the few existing tutorials still use it along with other deprecated classes.

After getting the assembly we need to infer the declaring type and method info of the method we want to overwrite. Here you can see how easy it is to mix Reflection with Mono.Cecil.

C#
public SetupContext Setup(Expression<Action> expression)
{
    var methodCall = expression.Body as MethodCallExpression;
    var methodDeclaringType = methodCall.Method.DeclaringType;
 
    var type = _assemblyDefinition.MainModule.Types
        .Single(t => t.Name == methodDeclaringType.Name);
    var method = type.Methods
        .Single(m => m.Name == methodCall.Method.Name);
 
    return new SetupContext {  
        MainModule = _assemblyDefinition.MainModule,
        Method = method,
    };
}  
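The Reflection half of this method can be tried without Cecil at all. The following sketch only uses the base class library to show what the expression tree gives us; the HelloMessages class here is a stand-in I wrote for the one in TestLibrary, and ExpressionDemo and Describe are hypothetical names.

```csharp
using System;
using System.Linq.Expressions;

// Stand-in for the TestLibrary class used in the article.
public static class HelloMessages
{
    public static string GetHelloWorld() { return "Hello World!"; }
}

public static class ExpressionDemo
{
    // Extracts the declaring type name and method name from a call expression.
    // The lambda is never executed; only its expression tree is inspected.
    public static Tuple<string, string> Describe(Expression<Action> expression)
    {
        var methodCall = expression.Body as MethodCallExpression;
        if (methodCall == null)
            throw new ArgumentException("Expected a method call.", "expression");

        return Tuple.Create(
            methodCall.Method.DeclaringType.Name,
            methodCall.Method.Name);
    }

    public static void Main()
    {
        var names = Describe(() => HelloMessages.GetHelloWorld());
        Console.WriteLine(names.Item1 + "." + names.Item2);
        // prints "HelloMessages.GetHelloWorld"
    }
}
```

These two names are exactly what Setup feeds into the Single lookups on the Cecil side. Because the lookup is by name only, it would throw if a type had several overloads of the same method.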

Finally, we insert two IL instructions to return the value of our choice (currently limited to strings).

C#
public void Returns(object returnObject)
{
    var returnString = returnObject as string;
    
    //Get the site of code injection
    var ilProcessor = Method.Body.GetILProcessor();
    var firstInstruction = ilProcessor.Body.Instructions.First();
 
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
        OpCodes.Ldstr, returnString));
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Ret));
} 

As you can see, this only injects code at the beginning of the function. The reason for leaving the existing code alone is that replacing it would take more work: we would also have to clean up the locals and so on. Consider the following use of faking:

C#
ilCodeWeaver.Setup(() => HelloMessages.GetSumMessage(5,11))
            .Returns("the sum of x and y is none of your concern");  

If we replaced the GetSumMessage function’s body, then its IL code would look like this:

MSIL
.method public hidebysig static string GetSumMessage (int32 x,  int32 y) cil managed 
{
    .maxstack 2
    .locals init (
        [0] int32
    ) 
    IL_0000: ldstr "the sum of x and y is none of your concern"
    IL_0005: ret
} // end of method HelloMessages::GetSumMessage 

This would compile and run without a problem. The unused local int variable is merely inelegant in this case, but the approach would cause us problems later on (for example when injecting multiple conditional fakes).

Throwing exceptions

In the following example we will manipulate a function so that it throws an exception instead of doing its original job. To accomplish this we have to instantiate a new Exception object and insert the throw OpCode after that. The resulting IL code will look like this:

MSIL
IL_0000: newobj instance void [mscorlib]System.Exception::.ctor()
IL_0005: throw  

To create this code, we first have to get hold of a MethodReference to the parameterless constructor of the Exception class. The easiest way to do that is to obtain the constructor through Reflection and then import it into our main module. After that we use the ILProcessor as before to insert the object creation and throw OpCodes:

C#
public void Throws()
{
    //Obtain the class type through reflection
    //Then import it to the target module
    var reflectionType = typeof(Exception);
    var exceptionCtor = reflectionType.GetConstructor(new Type[]{});
 
    var constructorReference = MainModule.Import(exceptionCtor);
 
    //Get the site of code injection
    var ilProcessor = Method.Body.GetILProcessor();
    var firstInstruction = ilProcessor.Body.Instructions.First();
 
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
        OpCodes.Newobj, constructorReference));
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Throw));
} 

Now that we know how to create an object we will create a function which throws a custom exception with its required arguments.

C#
public void Throws<TException>(params object[] arguments) where TException : Exception
{
    var reflectionType = typeof(TException);
    var argumentTypes = arguments.Select(a => a.GetType()).ToArray();
    var exceptionCtor = reflectionType.GetConstructor(argumentTypes);
    var constructorReference = MainModule.Import(exceptionCtor);
 
    //Get the site of code injection
    var ilProcessor = Method.Body.GetILProcessor();
    var firstInstruction = ilProcessor.Body.Instructions.First();
 
    //Load arguments to the evaluation stack
    foreach (var argument in arguments)
    {
        ilProcessor.InsertBefore(firstInstruction,
                    ilProcessor.CreateLoadInstruction(argument));
    }
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
        OpCodes.Newobj, constructorReference));
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Throw));
} 

Basically there is very little difference from what we have done before: we use the same way of getting the required constructor via Reflection, but now we've extended it to get constructors with parameters as well. After obtaining the required constructor reference by importing it using the MainModule, we insert our IL code at the beginning of the method as usual.
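The constructor lookup itself is plain Reflection, so it can be explored in isolation. Below is a small sketch using only the base class library; ConstructorLookup and Find are hypothetical names, and the explicit null check is my addition.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ConstructorLookup
{
    // Mirrors the lookup in Throws<TException>: the runtime types of the
    // arguments decide which constructor overload is selected.
    public static ConstructorInfo Find<TException>(params object[] arguments)
        where TException : Exception
    {
        var argumentTypes = arguments.Select(a => a.GetType()).ToArray();
        var ctor = typeof(TException).GetConstructor(argumentTypes);

        // GetConstructor returns null when no public constructor matches.
        if (ctor == null)
            throw new MissingMethodException(typeof(TException).FullName, ".ctor");
        return ctor;
    }

    public static void Main()
    {
        var withArgs = Find<ArgumentException>("Value is invalid", "amount");
        Console.WriteLine(withArgs.GetParameters().Length); // prints 2

        var parameterless = Find<InvalidOperationException>();
        Console.WriteLine(parameterless.GetParameters().Length); // prints 0
    }
}
```

One limitation follows directly from calling GetType on every argument: passing null as a constructor argument fails with a NullReferenceException before any IL is generated.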

The new part is the CreateLoadInstruction extension of the ILProcessor which is a simplification over the methods creating load instructions:

C#
public static Instruction CreateLoadInstruction(this ILProcessor self, object obj)
{
    if (obj is string)
        return self.Create(OpCodes.Ldstr, obj as string);
    else if (obj is int)
        return self.Create(OpCodes.Ldc_I4, (int)obj);
 
    throw new NotSupportedException();
}  
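The same pattern extends naturally to other constant types. The following is a sketch of how that might look, not code from the example solution: the name CreateLoadInstructionEx and the choice of supported types are my assumptions.

```csharp
using System;
using Mono.Cecil.Cil;

public static class ILProcessorLoadExtensions
{
    // Picks the load OpCode that matches the runtime type of the constant.
    public static Instruction CreateLoadInstructionEx(this ILProcessor self, object obj)
    {
        if (obj is string)
            return self.Create(OpCodes.Ldstr, (string)obj);
        if (obj is int)
            return self.Create(OpCodes.Ldc_I4, (int)obj);
        if (obj is bool)
            // On the evaluation stack a bool is just the int32 value 1 or 0.
            return self.Create((bool)obj ? OpCodes.Ldc_I4_1 : OpCodes.Ldc_I4_0);
        if (obj is long)
            return self.Create(OpCodes.Ldc_I8, (long)obj);
        if (obj is double)
            return self.Create(OpCodes.Ldc_R8, (double)obj);

        throw new NotSupportedException(
            "No load instruction known for " + (obj == null ? "null" : obj.GetType().FullName));
    }
}
```

Each OpCode expects a specific operand type, so the casts matter: handing a long to Ldc_I4, for example, is rejected by Cecil when the instruction is created.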

Reweaving properties

Replacing the code of a getter method of a property is very similar to what we used for faking the return value of a static function. First we create a new class ReweavePropContext that will hold the PropertyDefinition instance.

C#
public ReweavePropContext SetupProp(Expression<Func<string>> expression)
{
    var memberExpression = expression.Body as MemberExpression;
    var declaringType = memberExpression.Member.DeclaringType;
    var propertyType = memberExpression.Member;
 
    var typeDef = _assemblyDefinition.MainModule.Types
        .Single(t => t.Name == declaringType.Name);
    var propertyDef = typeDef.Properties
        .Single(p => p.Name == propertyType.Name);
 
    return new ReweavePropContext
    {
        MainModule = _assemblyDefinition.MainModule,
        Property = propertyDef,
    };
}

To get a certain property we just have to access the Properties collection of the TypeDefinition instance and pick the PropertyDefinition we want to use. After that, in the reweaving function, we inject the new return value:

C#
public void Returns(object returnValue)
{
    var getterMethod = Property.GetMethod;
    var returnString = returnValue as string;
 
    //Get the site of code injection
    var ilProcessor = getterMethod.Body.GetILProcessor();
    var firstInstruction = ilProcessor.Body.Instructions.First();
 
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
        OpCodes.Ldstr, returnString));
    ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Ret));
}

To obtain the getter function we just need to access the PropertyDefinition’s GetMethod property, which gives us a MethodDefinition again. From here on we can use the same code we used before for overwriting a method's return value.

Before we begin reweaving the setter of a property, first let’s examine its IL code:

MSIL
.method public hidebysig specialname instance void set_Occupation 
    (string 'value') cil managed {
    //omitted code that indicates this is a compiler generated code
    .maxstack 8
    IL_0000: ldarg.0
    IL_0001: ldarg.1
    IL_0002: stfld string TestLibrary.Person::'<Occupation>k__BackingField'
    IL_0007: ret
}

Going through this IL code, the first thing we stumble upon is that two instructions load arguments 0 and 1 onto the evaluation stack (ldarg.0 and ldarg.1). But hey, the method signature declares only one argument: the value we want to set our property to. So why are there two load instructions when we have only one argument? The answer is that every method marked with the instance keyword receives an implicit argument 0, which is the instance itself. This becomes clear when we take a look at a call to the Occupation property’s setter:

MSIL
IL_00c9: ldloc.2
IL_00ca: ldstr "Programmer"
IL_00cf: callvirt instance void [TestLibrary]TestLibrary.Person::set_Occupation(string) 

Here the ldloc.2 instruction loads the local variable with index 2 (the Person instance) onto the evaluation stack, then the ldstr instruction loads the string Programmer onto the evaluation stack as well. Finally the callvirt instruction calls the Occupation setter, whose two arguments receive the values loaded onto the evaluation stack (the instance and the string).
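If you want to look at such IL listings from code rather than from a disassembler, Cecil can print them as well. This is a minimal sketch with hypothetical names (IlDump, Print); the HasThis property reports whether the method receives the implicit instance argument.

```csharp
using System;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

public static class IlDump
{
    // Prints the IL of one method, looked up by simple type and method name.
    public static void Print(string assemblyPath, string typeName, string methodName)
    {
        var method = AssemblyDefinition.ReadAssembly(assemblyPath)
            .MainModule.Types.Single(t => t.Name == typeName)
            .Methods.Single(m => m.Name == methodName);

        // HasThis is true for instance methods: argument 0 is then the instance.
        Console.WriteLine(method.FullName + " (HasThis = " + method.HasThis + ")");

        foreach (Instruction instruction in method.Body.Instructions)
            Console.WriteLine("    " + instruction); // e.g. "IL_0000: ldarg.0"
    }

    public static void Main(string[] args)
    {
        // Assumption: assembly path, type name and method name come from the command line.
        if (args.Length == 3)
            Print(args[0], args[1], args[2]);
        else
            Console.WriteLine("Usage: IlDump <assembly> <type> <method>");
    }
}
```

Called with the path to TestLibrary.dll, Person and set_Occupation, it should print a listing along the lines of the setter shown earlier. Running it before and after weaving is a quick way to check what was injected.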

The reweaving function in this case will be different as we won’t just insert new instructions but replace the load instructions that place the new value onto the evaluation stack:

C#
public void Sets(object valueToSet)
{
    var setterMethod = Property.SetMethod;
    var stringValue = valueToSet as string;
 
    //Get the load instructions to replace
    var ilProcessor = setterMethod.Body.GetILProcessor();
    var argumentLoadInstructions = ilProcessor.Body.Instructions
        .Where(l => l.OpCode == OpCodes.Ldarg_1)
        .ToList();
 
    foreach (var instruction in argumentLoadInstructions)
    {
        //Create a fresh instruction for every replacement: a single
        //Instruction instance should not appear in the body more than once
        ilProcessor.Replace(instruction,
            ilProcessor.Create(OpCodes.Ldstr, stringValue));
    }
}

With this we have rewritten the usage of the value variable in the whole setter function, not only where we set the automatically created backing field but all other possible usages as well.

Using the solution

The example solution contains four projects and from these the most interesting is the ILCodeWeaving project as this is the one that contains the actual logic for reweaving the code of a .NET dll. The code that will be overwritten is located in the TestLibrary project. The other two projects are for running the example (TestRunnerConsole) and for calling the reweaving code (TestWeaverConsole).

Let's build the solution and run the TestRunnerConsole. Now you should see the following output:

Image 2

To overwrite the TestLibrary.dll you only need to start the TestWeaverConsole project and let it run. When it finishes it will simply close and the next time you run the TestRunnerConsole you will see the output generated using the dll we've rewoven:

Image 3

Now we can see that the output is different, apart from the example titles. One other thing to note is that the TestLibrary.dll is copied into the TestDlls folder upon build, and that dll is referenced instead of the project. The reason for this is to be able to open both the rewoven and the original dlls in the IL code reader of our choice, such as ILSpy.

Conclusion

 

Just dive into the code and find out how easy it is to reweave code with Mono.Cecil. I strongly encourage you to tweak this code and experiment with it. I hope you liked this article; I intend to continue writing about Mono.Cecil and its usage both for testing and for Aspect Oriented Programming.

If you have any questions, just ask in the comments and I will gladly answer them!

 

Update history

  • 2013.12.13 - Corrected parts of the article.
  • 2014.05.25 - Added the "Using the solution" part to the article.
  • 2015.10.21 - Added reference to my blogpost about IL coding. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer
Hungary
Well, like many of you I'm mainly a .NET web developer most acquainted with ASP.NET MVC, but I consider myself an omnivore: I like the whole stack of programming, from Assembly programming through C# to UX design. I know that focusing on a lot of things may stop you from becoming an expert in one particular area; however, I think I have learned a lot from the paradigms applied in different fields.

For my other posts check out my blog at: http://dolinkamark.wordpress.com
