Basic Concepts
- Serialization: A mechanism to transform the state of an object into a persistable format.
- Deserialization: The reverse mechanism, restoring the state of an object from a persisted format.
- Binary serialization: Serialization technique to transform the state of an object into a binary stream.
Problem
Microsoft .NET provides a binary serializer in the System.Runtime.Serialization.Formatters.Binary
namespace. Here is a simple example of how that works:
using System;
using System.Text;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
namespace ConsoleApplication1
{
    [Serializable]
    class TestClass
    {
        public String Name;
    }
    class Program
    {
        static void Main(string[] args)
        {
            TestClass oT = new TestClass();
            oT.Name = "Hello Bin Serializer";
            MemoryStream ms = new MemoryStream();
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(ms, oT);
            Console.WriteLine(ms.Length); // size of the serialized stream in bytes
        }
    }
}
The binary stream generated by this formatter is 173 bytes long, and when converted to characters looks like:
"\0\0\0\0????\0\0\0\0\0\0\0\f\0\0\0JConsoleApplication1,
Version=1.0.0.0, Culture=neutral,
PublicKeyToken=null\0\0\0ConsoleApplication1.TestClass\0\0\
0Name\0\0\0\0\0\0Hello Bin Serializer\v"
That is 173 bytes just to serialize a class with a single string field!
That may not be suitable for high performance applications with a low memory budget.
Solution
The good news is that you can write a custom surrogate class that can serialize and deserialize your class into/from a binary stream.
The bad news is that you need to write this surrogate for every class you need to serialize.
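To make the idea concrete, here is a minimal sketch of what such a hand-written surrogate could look like for TestClass. The names TestClassSurrogate and SurrogateDemo are made up for this illustration, and the layout (type name first, then the field) is an assumption rather than a required format. BinaryWriter stores each string as a length prefix followed by its UTF-8 bytes, so this layout takes 30 bytes for the type name plus 21 bytes for the field.

```csharp
using System;
using System.IO;

namespace ConsoleApplication1
{
    public class TestClass
    {
        public string Name;
    }

    // Hand-written surrogate (hypothetical name): it knows the layout of
    // TestClass and nothing else.
    public static class TestClassSurrogate
    {
        public static void Serialize(BinaryWriter writer, TestClass obj)
        {
            writer.Write(typeof(TestClass).FullName); // type tag, length-prefixed
            writer.Write(obj.Name);                   // the only field
        }

        public static TestClass Deserialize(BinaryReader reader)
        {
            reader.ReadString();                      // skip the type tag
            return new TestClass { Name = reader.ReadString() };
        }
    }

    public static class SurrogateDemo
    {
        public static void Main()
        {
            var ms = new MemoryStream();
            var writer = new BinaryWriter(ms);
            TestClassSurrogate.Serialize(writer, new TestClass { Name = "Hello Bin Serializer" });
            writer.Flush();
            Console.WriteLine(ms.Length); // prints 51

            ms.Position = 0;
            TestClass copy = TestClassSurrogate.Deserialize(new BinaryReader(ms));
            Console.WriteLine(copy.Name); // prints Hello Bin Serializer
        }
    }
}
```

The code is short, but every new field means touching both methods, and every new class means writing another pair.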
And this is where .NET IL and the System.Reflection.Emit.ILGenerator come to the rescue. Using this class and a good working knowledge of MSIL, you can auto generate serialization surrogates on the fly. Here are the basic steps:
- Define a custom attribute that you can use to tag the class you want to generate serialization surrogates for.
- During the startup of your assembly, walk through all the types that have this custom attribute and generate a serialization surrogate for them.
- Create a class with an interface similar to the binary formatter that internally delegates the call to the IL serialization surrogate.
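The first two steps can be sketched as follows. This is only an illustration under assumptions: the attribute is declared here as HiPerfSerializableAttribute, and Tagged, Untagged, SurrogateRegistry, and FindTaggedTypes are hypothetical names, not taken from the reference implementation. In a real setup, each type found would be handed to the surrogate generator.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Marker attribute used to tag classes that should get a generated surrogate.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class HiPerfSerializableAttribute : Attribute { }

[HiPerfSerializable]
public class Tagged
{
    public string Name { get; set; }
}

public class Untagged { }

public static class SurrogateRegistry
{
    // Walks every type in the assembly and keeps the ones carrying the attribute.
    public static List<Type> FindTaggedTypes(Assembly assembly)
    {
        var result = new List<Type>();
        foreach (Type t in assembly.GetTypes())
        {
            if (t.IsDefined(typeof(HiPerfSerializableAttribute), false))
                result.Add(t);
        }
        return result;
    }

    public static void Main()
    {
        // At startup: list the types that need a surrogate.
        foreach (Type t in FindTaggedTypes(typeof(Tagged).Assembly))
            Console.WriteLine(t.FullName);
    }
}
```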
Sounds pretty easy, but unfortunately the hardest part is coding the serialization surrogate in IL. Well, don't lose heart yet, for I will show you how to write one and also give you a reference implementation for free. What do you say? It is a good deal, right?
OK then, let's get started. First, let me give a quick tutorial on IL.
Quick IL Tutorial
Write a simple hello world app in C#:
class HelloIL
{
    public static void Main()
    {
        System.Console.WriteLine("Hello IL");
    }
}
Compile it and then open the application in ILDASM (in Visual Studio, go to Tools/ILDasm). Double click the Main node to see the IL, which looks like this:
.method public hidebysig static void Main() cil managed
{
    .entrypoint
    .maxstack 8
    IL_0000: nop
    IL_0001: ldstr "Hello IL"
    IL_0006: call void [mscorlib]System.Console::WriteLine(string)
    IL_000b: nop
    IL_000c: ret
}
Here is where the fun begins. Using classes in the System.Reflection.Emit namespace such as AssemblyBuilder, ModuleBuilder, TypeBuilder, MethodBuilder, and ILGenerator, you can generate IL at runtime in any .NET app.
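As a small warm-up, the same three meaningful instructions from the listing (ldstr, call, ret) can be emitted at runtime. The sketch below uses DynamicMethod, which is a lighter-weight alternative to building a whole dynamic assembly; the names EmitHello and BuildHello are made up for this example.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class EmitHello
{
    // Builds a delegate whose body is the IL: ldstr "Hello IL"; call WriteLine; ret.
    public static Action BuildHello()
    {
        var method = new DynamicMethod("Hello", typeof(void), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();
        MethodInfo writeLine =
            typeof(Console).GetMethod("WriteLine", new[] { typeof(string) });

        il.Emit(OpCodes.Ldstr, "Hello IL");          // push the string literal
        il.EmitCall(OpCodes.Call, writeLine, null);  // Console.WriteLine(string)
        il.Emit(OpCodes.Ret);                        // return

        return (Action)method.CreateDelegate(typeof(Action));
    }

    public static void Main()
    {
        Action hello = BuildHello();
        hello(); // prints Hello IL
    }
}
```

The nop instructions from the compiler output are omitted because they are only debugging padding.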
Now let's come back to creating our serialization surrogates using this namespace. Here are the steps.
Steps to Write an IL Binary Serializer
- Define an interface that our dynamic serialization surrogate will implement:
public interface IHiPerfSerializationSurrogate
{
void Serialize(BinaryWriter writer, object graph);
object DeSerialize(BinaryReader reader);
}
- Using the AssemblyBuilder class, create a dynamic assembly within the current app domain:
AssemblyBuilder myAsmBuilder = Thread.GetDomain().DefineDynamicAssembly(
new AssemblyName("SomeName"),
AssemblyBuilderAccess.Run);
- Within this assembly, now define a module:
ModuleBuilder surrogateModule =
myAsmBuilder.DefineDynamicModule("SurrogateModule");
- Within the module, now define your custom serialization surrogate:
TypeBuilder surrogateTypeBuilder = surrogateModule.DefineType(
"MyClass_EventSurrogate", TypeAttributes.Public);
- Make this type an implementation of IHiPerfSerializationSurrogate:
surrogateTypeBuilder.AddInterfaceImplementation
(typeof(IHiPerfSerializationSurrogate));
- Now define the Serialize method within the surrogate:
Type[] dpParams = new Type[] { typeof(BinaryWriter), typeof(object) };
MethodBuilder serializeMethod = surrogateTypeBuilder.DefineMethod(
"Serialize",
MethodAttributes.Public | MethodAttributes.Virtual,
typeof(void),dpParams);
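- And then, for each public property, emit IL that calls its getter and writes the value:
ILGenerator serializeIL = serializeMethod.GetILGenerator();
// pi is the PropertyInfo being processed; EventType is the type being serialized.
MethodInfo mi = EventType.GetMethod("get_" + pi.Name);
// Helper that is expected to return the BinaryWriter.Write overload for the property type.
MethodInfo brWrite = GetBinaryWriterMethod(pi.PropertyType);
serializeIL.Emit(OpCodes.Ldarg_1);                     // push the BinaryWriter
serializeIL.Emit(OpCodes.Ldloc, tpmEvent);             // push the local holding the object
serializeIL.EmitCall(OpCodes.Callvirt, mi, null);      // call the property getter
serializeIL.EmitCall(OpCodes.Callvirt, brWrite, null); // writer.Write(value)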
- Define the DeSerialize method within the surrogate. Its parameter list must match the interface, so it takes a single BinaryReader:
Type[] readParams = new Type[] { typeof(BinaryReader) };
MethodBuilder deserializeMthd = surrogateTypeBuilder.DefineMethod(
    "DeSerialize",
    MethodAttributes.Public | MethodAttributes.Virtual |
    MethodAttributes.HideBySig | MethodAttributes.Final |
    MethodAttributes.NewSlot,
    typeof(object),
    readParams);
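- And now, for each property, emit IL that reads the value and calls the setter:
ILGenerator deserializeIL = deserializeMthd.GetILGenerator();
MethodInfo setProp = EventType.GetMethod("set_" + pi.Name);
// brRead is the BinaryReader.ReadXxx method that matches pi.PropertyType.
deserializeIL.Emit(OpCodes.Ldloc, tpmRetEvent);            // push the object being rebuilt
deserializeIL.Emit(OpCodes.Ldarg_1);                       // push the BinaryReader
deserializeIL.EmitCall(OpCodes.Callvirt, brRead, null);    // reader.ReadXxx()
deserializeIL.EmitCall(OpCodes.Callvirt, setProp, null);   // call the property setter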
- Emit the serializing surrogate:
Type HiPerfSurrogate = surrogateTypeBuilder.CreateType();
- Now that we have a high performance serialization surrogate, it is time to use it. Here is how:
IHiPerfSerializationSurrogate surrogate =
    (IHiPerfSerializationSurrogate)Activator.CreateInstance(HiPerfSurrogate);
BinaryWriter binaryWriter = new BinaryWriter(serializationStream);
binaryWriter.Write(eventType.FullName); // write the type name ahead of the payload
surrogate.Serialize(binaryWriter, obj);
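Putting the steps together, the following is a compact end-to-end sketch, not the reference implementation. It makes several assumptions: Person, SurrogateFactory, and SurrogateDemo are hypothetical names, the target type has a public parameterless constructor, every public property has a getter and a setter, and every property type has a matching BinaryWriter.Write overload and BinaryReader.ReadXxx method (string, int, and similar primitives). It uses the static AssemblyBuilder.DefineDynamicAssembly method instead of Thread.GetDomain().

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;

public interface IHiPerfSerializationSurrogate
{
    void Serialize(BinaryWriter writer, object graph);
    object DeSerialize(BinaryReader reader);
}

// Hypothetical sample type used only for this demo.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class SurrogateFactory
{
    public static IHiPerfSerializationSurrogate Create(Type target)
    {
        AssemblyBuilder asm = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SurrogateAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = asm.DefineDynamicModule("SurrogateModule");
        TypeBuilder tb = module.DefineType(target.Name + "_Surrogate", TypeAttributes.Public);
        tb.AddInterfaceImplementation(typeof(IHiPerfSerializationSurrogate));

        MethodAttributes attrs = MethodAttributes.Public | MethodAttributes.Virtual;
        ILGenerator ser = tb.DefineMethod("Serialize", attrs, typeof(void),
            new[] { typeof(BinaryWriter), typeof(object) }).GetILGenerator();
        ILGenerator de = tb.DefineMethod("DeSerialize", attrs, typeof(object),
            new[] { typeof(BinaryReader) }).GetILGenerator();

        // Serialize prologue: src = (T)graph;
        LocalBuilder src = ser.DeclareLocal(target);
        ser.Emit(OpCodes.Ldarg_2);
        ser.Emit(OpCodes.Castclass, target);
        ser.Emit(OpCodes.Stloc, src);

        // DeSerialize prologue: dst = new T();
        LocalBuilder dst = de.DeclareLocal(target);
        de.Emit(OpCodes.Newobj, target.GetConstructor(Type.EmptyTypes));
        de.Emit(OpCodes.Stloc, dst);

        foreach (PropertyInfo pi in target.GetProperties())
        {
            MethodInfo write = typeof(BinaryWriter).GetMethod("Write", new[] { pi.PropertyType });
            MethodInfo read = typeof(BinaryReader).GetMethod(
                "Read" + pi.PropertyType.Name, Type.EmptyTypes);

            // writer.Write(src.Property);
            ser.Emit(OpCodes.Ldarg_1);
            ser.Emit(OpCodes.Ldloc, src);
            ser.EmitCall(OpCodes.Callvirt, pi.GetGetMethod(), null);
            ser.EmitCall(OpCodes.Callvirt, write, null);

            // dst.Property = reader.ReadXxx();
            de.Emit(OpCodes.Ldloc, dst);
            de.Emit(OpCodes.Ldarg_1);
            de.EmitCall(OpCodes.Callvirt, read, null);
            de.EmitCall(OpCodes.Callvirt, pi.GetSetMethod(), null);
        }

        ser.Emit(OpCodes.Ret);
        de.Emit(OpCodes.Ldloc, dst);   // return dst;
        de.Emit(OpCodes.Ret);

        return (IHiPerfSerializationSurrogate)Activator.CreateInstance(tb.CreateType());
    }
}

public static class SurrogateDemo
{
    public static void Main()
    {
        IHiPerfSerializationSurrogate surrogate = SurrogateFactory.Create(typeof(Person));
        var ms = new MemoryStream();
        var writer = new BinaryWriter(ms);
        surrogate.Serialize(writer, new Person { Name = "Ada", Age = 36 });
        writer.Flush();

        ms.Position = 0;
        var copy = (Person)surrogate.DeSerialize(new BinaryReader(ms));
        Console.WriteLine(copy.Name + " " + copy.Age + " (" + ms.Length + " bytes)");
        // prints: Ada 36 (8 bytes)
    }
}
```

Both methods are generated inside one loop, so the properties are written and read back in the same order. In a real serializer the surrogate would be created once per type and cached, since emitting IL is far more expensive than running it.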
Results
using System;
using System.Text;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
namespace ConsoleApplication1
{
    [Serializable]
    [ILSerialization.HiPerfSerializable]
    public class TestClass
    {
        public String Name;
    }
    class Program
    {
        static void Main(string[] args)
        {
            int len = int.Parse(args[0]); // number of iterations
            TestClass oT = new TestClass();
            oT.Name = "Hello Bin Serializer";
            // StartNew() returns a stopwatch that is already running.
            System.Diagnostics.Stopwatch w = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < len; i++)
            {
                MemoryStream ms = new MemoryStream();
                BinaryFormatter bf = new BinaryFormatter();
                bf.Serialize(ms, oT);
                ms.Close();
            }
            w.Stop();
            Console.WriteLine("Time elapsed .net binary serializer= "
                + w.ElapsedMilliseconds);
            w = System.Diagnostics.Stopwatch.StartNew();
            for (int i = 0; i < len; i++)
            {
                MemoryStream ms = new MemoryStream();
                ILSerialization.Formatters.HiPerfBinaryFormatter hpSer =
                    new ILSerialization.Formatters.HiPerfBinaryFormatter();
                hpSer.Serialize(ms, oT);
                ms.Close();
            }
            w.Stop();
            Console.WriteLine("Time elapsed IL hi perf serializer= "
                + w.ElapsedMilliseconds);
        }
    }
}
Serializing the TestClass defined in the Problem section gives the following results:
- Byte stream size: 51 bytes, less than one third of the 173 bytes produced by the .NET binary serializer
- Performance: about 5 times faster (for 1,000,000 runs, the .NET serializer took 6602 ms and our high performance serializer took 1261 ms)
Reference Implementation
"HiPerf_IL_CustomSerializer" is a reference implementation of the high performance binary serializer. It is about 5 times faster than the .NET binary serializer and produces a serialized stream about one third the size.