Table of contents
- Introduction
- Mono.Cecil library
- Faking return values
- Throwing exceptions
- Reweaving properties
- Using the solution
- Conclusion
- Update history
Introduction
Over the past month I had the opportunity to get to know Mono.Cecil and use it extensively in our programs. While learning it, however, I found that most Mono.Cecil tutorials target an older version of the library and rely on classes that have since been removed or replaced (like AssemblyFactory or CilWorker).
This lack of coverage is why I started writing tutorials for Mono.Cecil. My aim is to create a series in which the first two tutorials focus on testing and the other two on the use of IL code weaving for Aspect Oriented Programming.
If you are new to IL coding, you can check out my post about the basics of IL programming.
Mono.Cecil library
Mono.Cecil is a very powerful library for inspecting and reweaving IL code. Though the target language of these articles is C#, don’t forget that with Cecil we can inspect or reweave code written in any .NET language, or even generate IL code from scratch.
As a consequence, its class hierarchy maps to the concepts of IL rather than those of C#, and it does so in a straightforward way, so if you know IL, getting started with Mono.Cecil will be easy and fast. But don’t worry: I will introduce the concepts of the intermediate language in parallel with Cecil, so in-depth knowledge of IL programming is not a prerequisite.
Let’s start our tour with a bird’s-eye view of Mono.Cecil’s class hierarchy:
In this structure the highlighted ModuleDefinition is our point of entry for getting the types and methods of an assembly; it is exposed through the AssemblyDefinition’s MainModule property. This module is always part of the module collection (the Modules property), and most of the time it is the only element.
The reason for this is simple: Visual Studio compiles each project into exactly one module contained in one assembly. To create a multi-module assembly we have to either use the relevant options of the C# compiler (csc) or use the Assembly Linker.
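The hierarchy described above can be walked with a few nested loops. The following is a minimal sketch, assuming the caller supplies the path of an assembly on disk; the AssemblyWalker class and its DumpStructure method are names made up for this illustration and are not part of the example solution.

```csharp
using System;
using Mono.Cecil;

// Hypothetical helper for illustration only.
public static class AssemblyWalker
{
    public static void DumpStructure(string assemblyPath)
    {
        var assembly = AssemblyDefinition.ReadAssembly(assemblyPath);
        Console.WriteLine("Assembly: " + assembly.Name.Name);

        // Usually this collection holds nothing but the main module.
        foreach (ModuleDefinition module in assembly.Modules)
        {
            var marker = module == assembly.MainModule ? " (main module)" : "";
            Console.WriteLine("  Module: " + module.Name + marker);

            // Types lists the top-level types of the module.
            foreach (TypeDefinition type in module.Types)
            {
                Console.WriteLine("    Type: " + type.FullName);
                foreach (MethodDefinition method in type.Methods)
                    Console.WriteLine("      Method: " + method.Name);
            }
        }
    }
}
```

Calling AssemblyWalker.DumpStructure with the path of a compiled dll should print one indented line per module, type and method.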
Faking return values
We will begin with the obligatory Hello World example: faking the return value of a static function. While doing so we will use a simplified interface that resembles mocking libraries like Moq. The steps we need to take are straightforward:
- Load the assembly
- Get the method we want to overwrite
- Inject code at the beginning of the method’s body
- Save the assembly
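Put together, the four steps above could look roughly like the sketch below. It is a condensed illustration, not the code of the example library: ReturnValueFaker and FakeStringReturn are hypothetical names, the type and method are looked up by simple name, and the result is written to a separate output path (an assumption made here so the input file is never overwritten while it is being read).

```csharp
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical helper for illustration only.
public static class ReturnValueFaker
{
    public static void FakeStringReturn(string inputPath, string outputPath,
        string typeName, string methodName, string fakeValue)
    {
        // 1. Load the assembly.
        var assembly = AssemblyDefinition.ReadAssembly(inputPath);

        // 2. Get the method we want to overwrite (by simple name).
        var type = assembly.MainModule.Types.Single(t => t.Name == typeName);
        var method = type.Methods.Single(m => m.Name == methodName);

        // 3. Inject "ldstr <fakeValue>; ret" at the beginning of the body.
        var il = method.Body.GetILProcessor();
        var first = method.Body.Instructions.First();
        il.InsertBefore(first, il.Create(OpCodes.Ldstr, fakeValue));
        il.InsertBefore(first, il.Create(OpCodes.Ret));

        // 4. Save the modified assembly.
        assembly.Write(outputPath);
    }
}
```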
This is how faking a return value looks with our example library:
    ilCodeWeaver.Setup(() => HelloMessages.GetHelloWorld())
        .Returns("All your bases are belong to us");
First we load the AssemblyDefinition in the constructor, without forgetting to store the assembly path (as it won’t be stored in the AssemblyDefinition instance):
    public ILCodeWeaver(string assemblyPath)
    {
        _assemblyPath = assemblyPath;
        _assemblyDefinition = AssemblyDefinition.ReadAssembly(assemblyPath);
    }
This simple piece of code is worth emphasizing because before Cecil 0.9 an assembly was read with the AssemblyFactory class. That class has since been completely removed, yet most of the few existing tutorials still use it along with other deprecated classes.
After getting the assembly we need to determine the declaring type and the method we want to overwrite. Here you can see how easy it is to mix Reflection with Mono.Cecil.
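    public SetupContext Setup(Expression<Action> expression)
    {
        // The lambda body is the call we want to fake, e.g. HelloMessages.GetHelloWorld().
        var methodCall = expression.Body as MethodCallExpression;
        var methodDeclaringType = methodCall.Method.DeclaringType;

        // Match the Reflection type and method to their Cecil definitions by name.
        // Single() throws if the name is missing or ambiguous (overloads, for example).
        var type = _assemblyDefinition.MainModule.Types
            .Single(t => t.Name == methodDeclaringType.Name);
        var method = type.Methods
            .Single(m => m.Name == methodCall.Method.Name);

        return new SetupContext {
            MainModule = _assemblyDefinition.MainModule,
            Method = method,
        };
    }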
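The Reflection half of Setup can be tried on its own, without Cecil. The sketch below only uses System.Linq.Expressions; ExpressionInspector and DescribeCall are hypothetical names, and the explicit check for a non-call lambda is an addition made for this illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper for illustration only.
public static class ExpressionInspector
{
    // Returns "<declaring type name>.<method name>" for a lambda such as
    // () => HelloMessages.GetHelloWorld().
    public static string DescribeCall(Expression<Action> expression)
    {
        var call = expression.Body as MethodCallExpression;
        if (call == null)
            throw new ArgumentException("Expected a lambda whose body is a method call.", "expression");

        return call.Method.DeclaringType.Name + "." + call.Method.Name;
    }
}
```

For example, ExpressionInspector.DescribeCall(() => Math.Abs(-1)) yields "Math.Abs". The lambda is never executed; only its expression tree is inspected.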
Finally, we insert two IL instructions that return the value of our choice (currently limited to strings).
    public void Returns(object returnObject)
    {
        var returnString = returnObject as string;
        var ilProcessor = Method.Body.GetILProcessor();
        var firstInstruction = ilProcessor.Body.Instructions.First();

        // Push the fake string and return immediately; the original body is never reached.
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
            OpCodes.Ldstr, returnString));
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Ret));
    }
As you can see, this only injects code at the beginning of the function. We leave the existing code untouched because cleaning it up (changing locals, etc.) would take more work. Think about the following usage of faking:
    ilCodeWeaver.Setup(() => HelloMessages.GetSumMessage(5,11))
        .Returns("the sum of x and y is none of your concern");
If we replaced the GetSumMessage function’s body, then its IL code would look like this:
    .method public hidebysig static string GetSumMessage (int32 x, int32 y) cil managed
    {
        .maxstack 2
        .locals init (
            [0] int32
        )
        IL_0000: ldstr "the sum of x and y is none of your concern"
        IL_0005: ret
    }
This would compile and run without a problem. The unused local int variable is merely inelegant here, but this approach would cause us problems later (when injecting multiple conditional fakes, for example).
Throwing exceptions
In the following example we will manipulate a function so that it throws an exception instead of doing its original work. To accomplish this we have to instantiate a new Exception object and insert the throw opcode after that. The resulting IL code will look like this:
    IL_0000: newobj instance void [mscorlib]System.Exception::.ctor()
    IL_0005: throw
To create this code we first have to get hold of a MethodReference to the parameterless constructor of the Exception class. The easiest way to do that is to get the constructor via Reflection and import it into our main module. Then, using the ILProcessor as before, we insert the object creation and throw opcodes:
    public void Throws()
    {
        // Find the parameterless constructor via Reflection...
        var reflectionType = typeof(Exception);
        var exceptionCtor = reflectionType.GetConstructor(new Type[]{});
        // ...and import it so the module being rewoven can reference it.
        // (Cecil 0.10 and later name this method ImportReference.)
        var constructorReference = MainModule.Import(exceptionCtor);

        var ilProcessor = Method.Body.GetILProcessor();
        var firstInstruction = ilProcessor.Body.Instructions.First();
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
            OpCodes.Newobj, constructorReference));
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Throw));
    }
Now that we know how to create an object, let's write a function that throws a custom exception, passing it the required constructor arguments.
    public void Throws<TException>(params object[] arguments) where TException : Exception
    {
        // Pick the constructor whose parameter types match the runtime types of the arguments.
        var reflectionType = typeof(TException);
        var argumentTypes = arguments.Select(a => a.GetType()).ToArray();
        var exceptionCtor = reflectionType.GetConstructor(argumentTypes);
        var constructorReference = MainModule.Import(exceptionCtor);

        var ilProcessor = Method.Body.GetILProcessor();
        var firstInstruction = ilProcessor.Body.Instructions.First();

        // Push the constructor arguments onto the evaluation stack in declaration order.
        foreach (var argument in arguments)
        {
            ilProcessor.InsertBefore(firstInstruction,
                ilProcessor.CreateLoadInstruction(argument));
        }
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
            OpCodes.Newobj, constructorReference));
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Throw));
    }
Basically there is very little difference from what we have done before: we get the required constructor via Reflection in the same way, but now we've extended the lookup to constructors with parameters as well. After obtaining the constructor reference by importing it through the MainModule, we insert our IL code at the beginning of the method as usual.
The new part is the CreateLoadInstruction extension method of the ILProcessor, which simplifies the creation of load instructions:
    public static Instruction CreateLoadInstruction(this ILProcessor self, object obj)
    {
        if (obj is string)
            return self.Create(OpCodes.Ldstr, obj as string);
        else if (obj is int)
            return self.Create(OpCodes.Ldc_I4, (int)obj);
        throw new NotSupportedException();
    }
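The same pattern extends to other constant types that have a dedicated load opcode. The variant below is only a sketch: the name CreateConstantLoad and the choice of supported types are assumptions made for this illustration, not part of the example library.

```csharp
using System;
using Mono.Cecil.Cil;

// Hypothetical extension for illustration only.
public static class ConstantLoadExtensions
{
    public static Instruction CreateConstantLoad(this ILProcessor self, object obj)
    {
        if (obj is string)
            return self.Create(OpCodes.Ldstr, (string)obj);
        if (obj is int)
            return self.Create(OpCodes.Ldc_I4, (int)obj);
        if (obj is bool)
            // Booleans are represented as the int32 constants 1 and 0 on the stack.
            return self.Create((bool)obj ? OpCodes.Ldc_I4_1 : OpCodes.Ldc_I4_0);
        if (obj is long)
            return self.Create(OpCodes.Ldc_I8, (long)obj);
        if (obj is float)
            return self.Create(OpCodes.Ldc_R4, (float)obj);
        if (obj is double)
            return self.Create(OpCodes.Ldc_R8, (double)obj);

        throw new NotSupportedException(
            "No constant load is implemented for " + obj.GetType().FullName);
    }
}
```

Note that this only covers the case where the constructor parameter has exactly the argument's type; passing a value type to a parameter declared as object would additionally need a box instruction, which is not handled here.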
Reweaving properties
Replacing the code of a property's getter method is very similar to what we did for faking the return value of a static function. First we create a new class, ReweavePropContext, that holds the PropertyDefinition instance.
    public ReweavePropContext SetupProp(Expression<Func<string>> expression)
    {
        // The lambda body is a property access, e.g. person.Occupation.
        var memberExpression = expression.Body as MemberExpression;
        var declaringType = memberExpression.Member.DeclaringType;
        var member = memberExpression.Member;

        // Match the Reflection type and property to their Cecil definitions by name.
        var typeDef = _assemblyDefinition.MainModule.Types
            .Single(t => t.Name == declaringType.Name);
        var propertyDef = typeDef.Properties
            .Single(p => p.Name == member.Name);

        return new ReweavePropContext
        {
            MainModule = _assemblyDefinition.MainModule,
            Property = propertyDef,
        };
    }
To get a certain property we just have to look it up in the Properties collection of the TypeDefinition instance. After that, in the reweaving function, we inject the new return value:
    public void Returns(object returnValue)
    {
        var getterMethod = Property.GetMethod;
        var returnString = returnValue as string;
        var ilProcessor = getterMethod.Body.GetILProcessor();
        var firstInstruction = ilProcessor.Body.Instructions.First();
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(
            OpCodes.Ldstr, returnString));
        ilProcessor.InsertBefore(firstInstruction, ilProcessor.Create(OpCodes.Ret));
    }
To obtain the getter function we just need to access the PropertyDefinition’s GetMethod property, which gives us a MethodDefinition again. From here on we can use the same code we used before for overwriting a method’s return value.
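Because the getter is an ordinary MethodDefinition, the injection logic could be shared between methods and properties. The following sketch shows one possible way to do that; ReturnInjector and both InjectStringReturn overloads are hypothetical names, and the check for a missing getter is an addition made for this illustration.

```csharp
using System;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical helper for illustration only.
public static class ReturnInjector
{
    // Inserts "ldstr <value>; ret" before the first instruction of any method body.
    public static void InjectStringReturn(MethodDefinition method, string value)
    {
        var il = method.Body.GetILProcessor();
        var first = method.Body.Instructions.First();
        il.InsertBefore(first, il.Create(OpCodes.Ldstr, value));
        il.InsertBefore(first, il.Create(OpCodes.Ret));
    }

    // A property getter is reached through PropertyDefinition.GetMethod.
    public static void InjectStringReturn(PropertyDefinition property, string value)
    {
        if (property.GetMethod == null)
            throw new InvalidOperationException("Property " + property.Name + " has no getter.");

        InjectStringReturn(property.GetMethod, value);
    }
}
```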
Before we begin reweaving the setter of a property, first let’s examine its IL code:
    .method public hidebysig specialname instance void set_Occupation
        (string 'value') cil managed {
        .maxstack 8
        IL_0000: ldarg.0
        IL_0001: ldarg.1
        IL_0002: stfld string TestLibrary.Person::'<Occupation>k__BackingField'
        IL_0007: ret
    }
Going through this IL code, the first thing we stumble upon is that two instructions load arguments 0 and 1 onto the evaluation stack (ldarg.0 and ldarg.1). But hey, the method signature has only one argument: the value we want to set our property to. So why are there two load instructions when we have only one argument? The answer is that every method marked with the instance keyword has an implicit argument 0, which is the instance itself. This becomes clear when we take a look at a call to the Occupation property’s setter:
    IL_00c9: ldloc.2
    IL_00ca: ldstr "Programmer"
    IL_00cf: callvirt instance void [TestLibrary]TestLibrary.Person::set_Occupation(string)
Here the ldloc.2 instruction loads the local variable with index 2 (the Person instance) onto the evaluation stack, then the ldstr instruction loads the string Programmer onto the evaluation stack as well. Finally, the callvirt instruction calls the Occupation setter, whose two arguments receive the values loaded onto the evaluation stack (the instance and the string).
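The implicit instance argument can also be observed from plain C#, without touching IL. The sketch below binds the setter to an open instance delegate, whose first parameter takes the place of argument 0. The Person class here is a stand-in with the same shape as TestLibrary.Person, and InstanceArgumentDemo is a hypothetical name.

```csharp
using System;
using System.Reflection;

// Stand-in for TestLibrary.Person, assumed to have a string auto-property.
public class Person
{
    public string Occupation { get; set; }
}

// Hypothetical helper for illustration only.
public static class InstanceArgumentDemo
{
    public static string SetThroughOpenDelegate(Person person, string occupation)
    {
        // set_Occupation declares one parameter (the value)...
        MethodInfo setter = typeof(Person).GetProperty("Occupation").GetSetMethod();

        // ...yet it can be bound to a delegate with two parameters:
        // the first one is the instance (argument 0), the second the value (argument 1).
        var openSetter = (Action<Person, string>)Delegate.CreateDelegate(
            typeof(Action<Person, string>), setter);

        openSetter(person, occupation);
        return person.Occupation;
    }
}
```

Calling InstanceArgumentDemo.SetThroughOpenDelegate(new Person(), "Programmer") returns "Programmer", which mirrors the two values pushed before the callvirt instruction above.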
The reweaving function in this case is different: we don’t just insert new instructions but replace the load instructions that place the new value onto the evaluation stack:
    public void Sets(object valueToSet)
    {
        var setterMethod = Property.SetMethod;
        var stringValue = valueToSet as string;
        var ilProcessor = setterMethod.Body.GetILProcessor();

        // Copy the matches to a list first, because the body is modified in the loop.
        var argumentLoadInstructions = ilProcessor.Body.Instructions
            .Where(l => l.OpCode == OpCodes.Ldarg_1)
            .ToList();
        foreach (var instruction in argumentLoadInstructions)
        {
            // Create a fresh instruction for every replacement: one Instruction
            // instance must not appear in the body more than once.
            ilProcessor.Replace(instruction,
                ilProcessor.Create(OpCodes.Ldstr, stringValue));
        }
    }
With this we have rewritten every use of the value argument in the whole setter function, not only where the automatically created backing field is set but all other possible usages as well.
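One caveat: the filter in Sets only matches the short ldarg.1 form. An argument can also be loaded with the ldarg or ldarg.s opcodes, which carry the parameter as an operand. The following sketch shows how such loads might be collected as well. SetterInspector and FindValueLoads are hypothetical names, and the code assumes the setter of a non-indexed instance property, so that the value is the first declared parameter.

```csharp
using System.Collections.Generic;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;

// Hypothetical helper for illustration only.
public static class SetterInspector
{
    public static List<Instruction> FindValueLoads(MethodDefinition setter)
    {
        // Assumption: instance setter without index parameters, so the
        // 'value' argument is the first entry of the Parameters collection.
        ParameterDefinition valueParameter = setter.Parameters[0];

        return setter.Body.Instructions
            .Where(i => i.OpCode == OpCodes.Ldarg_1
                || ((i.OpCode == OpCodes.Ldarg || i.OpCode == OpCodes.Ldarg_S)
                    && ReferenceEquals(i.Operand, valueParameter)))
            .ToList();
    }
}
```

The returned list could then be fed into the same replacement loop that Sets uses.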
Using the solution
The example solution contains four projects. The most interesting is the ILCodeWeaving project, as it contains the actual logic for reweaving the code of a .NET dll. The code that will be overwritten is located in the TestLibrary project. The other two projects are for running the example (TestRunnerConsole) and for calling the reweaving code (TestWeaverConsole).
Let's build the solution and run the TestRunnerConsole. Now you should see the following output:
To overwrite TestLibrary.dll you only need to start the TestWeaverConsole project and let it run. When it finishes it simply closes, and the next time you run the TestRunnerConsole you will see the output generated using the dll we've rewoven:
Now we can see that the output is different, except for the example titles. One other thing to note is that TestLibrary.dll is copied into the TestDlls folder upon build, and that copy is referenced instead of the project. The reason for this is to be able to open both the rewoven and the original dlls in the IL code reader of our choice, such as ILSpy.
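If no IL viewer is at hand, Cecil itself can print the instructions of a method, which is a quick way to compare the original and the rewoven dll. This is only a sketch; MethodDumper and DumpMethod are hypothetical names, and the type is looked up by simple name as in the rest of the article.

```csharp
using System;
using System.Linq;
using Mono.Cecil;

// Hypothetical helper for illustration only.
public static class MethodDumper
{
    public static void DumpMethod(string assemblyPath, string typeName, string methodName)
    {
        var assembly = AssemblyDefinition.ReadAssembly(assemblyPath);
        var type = assembly.MainModule.Types.Single(t => t.Name == typeName);

        // Print every overload with that name that has a body.
        foreach (var method in type.Methods.Where(m => m.Name == methodName && m.HasBody))
        {
            Console.WriteLine(method.FullName);
            foreach (var instruction in method.Body.Instructions)
                Console.WriteLine("  " + instruction); // one line per IL instruction
        }
    }
}
```

Running it once against the original dll and once against the copy in the TestDlls folder should show the injected instructions at the top of the rewoven method.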
Conclusion
Just dive into the code and find out how easy it is to reweave code with Mono.Cecil. I strongly encourage you to tweak this code. I hope you liked this article, and I intend to continue writing articles about Mono.Cecil and its usage both for testing and for Aspect Oriented Programming.
If you have any questions, just ask in the comments and I will gladly answer them!
Update history
- 2013.12.13 - Corrected parts of the article.
- 2014.05.25 - Added the "Using the solution" part to the article.
- 2015.10.21 - Added reference to my blogpost about IL coding.
Well, like many of you I'm mainly a .NET web developer most acquainted with ASP.NET MVC, but I consider myself an omnivore: I like the whole stack of programming, from Assembly programming through C# to UX design. I know that focusing on a lot of things may stop you from becoming an expert in one particular area; however, I think I have learned a lot from the paradigms applied in different fields.
For my other posts check out my blog at: http://dolinkamark.wordpress.com