Background
Over the years I have written articles about architecture problems that are so common that they are often treated as "good practices", especially because they are present in big, well-known frameworks.
In those articles I usually pick one of those well-known frameworks to explain the problems and to propose solutions, but I rarely provide working solutions, simply because I can't change third-party closed-source libraries. It doesn't matter that a problem could be solved by a single line of code when I can't put that line of code in the right place.
The biggest flaw I see in most frameworks is the lack of delegation or the incorrect use of delegation, which directly affects their expandability and their capacity to interact with external objects.
For example, the Convert class in .NET seems to be the solution to convert objects from one type to another, but it actually only knows how to do some basic conversions. Its "expansion" support requires the types to provide their own conversions by implementing IConvertible, without any possibility of "attaching" conversions to already existing types. This limits the number of places where we can count on Convert, as we may need to convert enums to some other types (and enum declarations can't add interface implementations of their own), or we may be handed already created instances that are easy to convert but that don't implement IConvertible and whose source code we simply can't change.
The TypeConverter is somewhat better, somewhat the same. The conversions are done by a different type, so the object that's going to be converted doesn't need to know anything about possible conversions, which is a good thing. The TypeConverter may be chosen by the type itself (which is similar to implementing the conversion directly on the type, but still lets each class have a single responsibility and also works with enums) or it can be set/replaced on properties that want a different TypeConverter. Yet, the TypeConverter also needs to know all the valid conversions and, in practice, most TypeConverters only convert their single target type to and from string. Nothing else.
Wouldn't it be better if applications could simply register conversions from one type to another as they see fit? bool to Visibility (used frequently in WPF) comes to mind as something that I would like to register as a default conversion, for example.
Well, that's the purpose of the Expandable Frameworks. They are mostly containers for some kind of action and they can be completely configured by the application.
Introduction
Together with this article I am providing three Expandable Frameworks. Those frameworks are for Binary Serialization, Data-Type Conversion and for Inversion of Control (IoC). All of them have a pretty similar and expandable architecture.
I already published some ancient versions of them in other articles but the new ones are built as Portable Class Libraries targeting .NET Framework 4.5, Windows 8, Windows Phone 8.1 and Windows Phone Silverlight 8. They were also built on the idea of being simple to understand and to reimplement by looking only at the main interfaces.
Overview of the Frameworks' Architecture
Considering the architecture of the frameworks is pretty similar, I will describe it only once.
All those frameworks are built as interface first, implementation later. Literally.
This is why you will see that there are 2 libraries for each. The real purpose of the framework is almost entirely abstract, being represented by a main interface (IIocContainer, IConversionContainer or IBinarySerializationContainer, according to each framework), by the supporting delegate types, by item-specific interfaces (like the converter or serializer for a particular type) and by a global entry point. This is the base library. Code that only wants to consume these frameworks (not to instantiate them) doesn't need to reference the "implementation" libraries, only the base ones (and actually the default implementation doesn't need to be used at all, being completely replaceable... great for testing/mocking).
Then, there are the libraries with the default implementations of the frameworks. Those libraries contain a thread-safe implementation of the container (concurrent and lock-free on reads) and a thread-unsafe version with some decorators, so you can make the thread-unsafe version thread-bound or even thread-isolated (each thread with its own instance). Except for the Inversion of Control container, which is completely configured by the user applications, they also come with some default actions, like item serializers for the most common types or the most basic conversions. This is not a mistake: only the most common and portable conversions/item serializers come by default. The purpose is not to have all conversions/serializers but to support registering new conversions/serializers for specific items and to access them through a common API.
All these implementation libraries reference another library. In particular, they need the ThreadSafeGetOrCreateValueDictionary, which is a kind of reduced and safer version of the ConcurrentDictionary, and the (ThreadSafe)PriorizableCollection, which is used to run the searchers based on their priority.
Note: I will explain the purpose of a searcher later.
Containers and some explanations
There's no secret: These frameworks are expandable because they are containers that can be configured at run-time. That is, they don't do the action themselves, they simply allow you to register the required actions/items, which can be implemented by completely unrelated classes and assemblies.
Yet, when I presented expandable solutions in the past I received criticisms like:
- "Oh, if we need to write every conversion ourselves it would be a nightmare. It is good to have classes that have default conversions";
- "You probably never worked in a big project. Imagine how hard it would be to register a serializer for every new serializable type you have in your application. It is much better to simply mark classes with [Serializable] and let the framework do the job for us."
Well, I never said that attributes, default conversions or other simple ways of providing hints together with the types can't be used. I was only pointing out that a framework should not depend on them. The target types may be built to be used by those frameworks, having all the traits the frameworks expect, or they may not know about those frameworks at all, yet that doesn't mean they should be incompatible with them. Saying that we can always create adapters is far from the truth, as we may be receiving "unknown" objects and we would need some kind of expandable adapter generator to deal with them, so we would need this kind of solution anyway (and it would also be terrible if we had really big object trees).
In any case, the default implementations that I am providing are actually capable of serializing objects marked with the [Serializable] attribute and of converting types using their TypeConverters, but that's not the strong point. I could provide those as separate libraries, as they are independent of how the container is implemented. Yet, I really believe most users will want both the default containers and the default actions, which is why they are shipped together.
Initializing the Frameworks with all default capabilities
The conversion and serialization frameworks come with some default converters and item serializers, yet you can still create an empty container by doing something like:
var container = new ThreadSafeConversionContainer();
Then you can of course register the default converters and searchers by hand (they are all public classes inside namespaces like Pfz.ExpandableContainers.BinarySerialization.ItemSerializers and Pfz.ExpandableContainers.BinarySerialization.Searchers).
Of course, I imagine that's not what most developers want. So, if your purpose is to create a fully configured container, use the ConversionContainerFactory. It can be done like this:
var factory = new ConversionContainerFactory();
GlobalConversionContainer.Instance = factory.CreateThreadSafeContainer();
By doing this, instead of registering all the converters and searchers by hand, you select all of them by default, with the option to leave out some specific ones.
I will not show how to do it for the Binary Serialization framework, but it is equivalent. The IoC container doesn't have that kind of factory because there's no default action for an IoC container.
Direct registration versus searchers
You may have seen that I talked about registering conversions and searchers in the previous topic.
Actually, registering a conversion from string to int can be done directly, like this:
container.Register(StringToInt32Converter.Instance);
Or, if you don't have a class and prefer writing it through delegates, it can be like this:
container.Register<string, int>((str) => int.Parse(str));
But how do you register a conversion from an IEnumerable<T> to an array of type T[], considering that many classes are IEnumerable<T> and that the T generic parameter can be any type?
There's no method to register generic types and, when looking for a converter from List<int> to int[], a registration made for IEnumerable<int> will not be used. It would be possible to create specific methods to register and search for abstract or generic types, but this would probably complicate the search logic and would still be incomplete. What about lazy-loading conversions from external DLLs? Should we create specific methods for that too?
Well, all of those can be achieved through the use of Searchers. Every time a converter for a specific input/output type is not found, the container runs the searchers in order of priority (and, in case of equal priorities, in order of registration) to ask for a converter for the given input/output types. If a Searcher is able to load/find/create a converter for that specific request, it only needs to set that converter on the searcher args and everything will work fine. So, a Searcher may recognize that the class implements an interface it supports and provide a converter for it.
I know, a specific registration method for generic types would probably be easier to use than writing searchers. Yet such methods can be written on top of the searchers, while it wouldn't be possible to write external DLL searchers on top of a generic registration method, for example. So, the provided solution is the more expandable one, and the others can be created without requiring new changes to the framework.
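To make the searcher idea concrete, here is a minimal sketch of a container that falls back to prioritized searchers, plus a searcher that answers any IEnumerable<T> to T[] request. Everything in it (MiniConversionContainer, SearchArgs, SetResult, EnumerableToArraySearcher) is a hypothetical stand-in written for this illustration, not the API of the provided libraries, and it is neither thread-safe nor complete.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical stand-ins for illustration; these are NOT the library's real types.
public sealed class SearchArgs
{
    public SearchArgs(Type inputType, Type outputType)
    {
        InputType = inputType;
        OutputType = outputType;
    }

    public Type InputType { get; }
    public Type OutputType { get; }
    public bool HasResult { get; private set; }
    public Func<object, object> Result { get; private set; }

    // Setting a result (even null) ends the search.
    public void SetResult(Func<object, object> converter)
    {
        Result = converter;
        HasResult = true;
    }
}

public sealed class MiniConversionContainer
{
    private readonly Dictionary<(Type, Type), Func<object, object>> _converters =
        new Dictionary<(Type, Type), Func<object, object>>();
    private readonly List<(int Priority, Action<SearchArgs> Search)> _searchers =
        new List<(int Priority, Action<SearchArgs> Search)>();

    public void Register<TIn, TOut>(Func<TIn, TOut> converter)
    {
        _converters[(typeof(TIn), typeof(TOut))] = value => converter((TIn)value);
    }

    public void RegisterSearcher(Action<SearchArgs> search, int priority)
    {
        _searchers.Add((priority, search));
    }

    public Func<object, object> TryGetConverter(Type inputType, Type outputType)
    {
        if (_converters.TryGetValue((inputType, outputType), out var cached))
            return cached;

        var args = new SearchArgs(inputType, outputType);
        // OrderByDescending is stable, so equal priorities run in registration order.
        foreach (var searcher in _searchers.OrderByDescending(s => s.Priority))
        {
            searcher.Search(args);
            if (args.HasResult)
                break;
        }

        // Cache the outcome (including an explicit null) so the search is not repeated.
        if (args.HasResult)
            _converters[(inputType, outputType)] = args.Result;
        return args.Result;
    }
}

public static class EnumerableToArraySearcher
{
    public static void Search(SearchArgs args)
    {
        if (!args.OutputType.IsArray || args.OutputType.GetArrayRank() != 1)
            return;

        Type elementType = args.OutputType.GetElementType();
        if (!typeof(IEnumerable<>).MakeGenericType(elementType).IsAssignableFrom(args.InputType))
            return;

        // Close the generic helper over the element type and expose it as a converter.
        MethodInfo helper = typeof(EnumerableToArraySearcher)
            .GetMethod(nameof(ToArrayOf), BindingFlags.NonPublic | BindingFlags.Static)
            .MakeGenericMethod(elementType);
        args.SetResult((Func<object, object>)helper.CreateDelegate(typeof(Func<object, object>)));
    }

    private static object ToArrayOf<T>(object source)
    {
        return ((IEnumerable<T>)source).ToArray();
    }
}

public static class SearcherDemo
{
    public static int[] Run()
    {
        var container = new MiniConversionContainer();
        container.Register<string, int>(str => int.Parse(str));
        container.RegisterSearcher(EnumerableToArraySearcher.Search, 0);

        // No direct registration exists for List<int> -> int[]; the searcher supplies it.
        var converter = container.TryGetConverter(typeof(List<int>), typeof(int[]));
        return (int[])converter(new List<int> { 1, 2, 3 });
    }
}
```

A generic registration helper could be layered on top of RegisterSearcher in this sketch, which is the direction the argument above points to: the searcher is the general mechanism and the convenience methods are built over it.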
Just to show a different case, the conversions from string to all nullable types are supported by a Searcher that actually looks for a converter from string to the non-nullable struct and then generates a converter that first checks for null and, if the value is not null, uses the found converter. Note how special this case is, as it is not a simple registration from string to all nullable types. Only when there's a converter for the non-nullable type will the searcher provide a result.
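The nullable case can be sketched as a composite search: ask for the base converter first and only then build the null-checking wrapper. In this sketch the lookup delegate is a hypothetical stand-in for "ask the container for a converter", and StringToNullableSearcher is an illustrative name, not a class from the provided libraries.

```csharp
using System;

public static class StringToNullableSearcher
{
    // Returns a converter from string to Nullable<T>, or null when it cannot help.
    // `lookup` is an assumed callback that asks the container for an existing converter.
    public static Func<object, object> Search(
        Type inputType,
        Type outputType,
        Func<Type, Type, Func<object, object>> lookup)
    {
        if (inputType != typeof(string))
            return null;

        Type underlying = Nullable.GetUnderlyingType(outputType);
        if (underlying == null)
            return null; // the output is not a Nullable<T>

        Func<object, object> inner = lookup(typeof(string), underlying);
        if (inner == null)
            return null; // no converter for the non-nullable type, so no result

        // Check for null first; otherwise delegate to the converter that was found.
        return value => value == null ? null : inner(value);
    }
}

public static class NullableDemo
{
    public static bool Run()
    {
        // Pretend the container only knows how to convert string to int.
        Func<Type, Type, Func<object, object>> lookup = (input, output) =>
            input == typeof(string) && output == typeof(int)
                ? (Func<object, object>)(s => int.Parse((string)s))
                : null;

        var toNullableInt = StringToNullableSearcher.Search(typeof(string), typeof(int?), lookup);
        var toNullableGuid = StringToNullableSearcher.Search(typeof(string), typeof(Guid?), lookup);

        int? parsed = (int?)toNullableInt("42");
        int? missing = (int?)toNullableInt(null);

        // string -> Guid? yields nothing because no string -> Guid converter exists.
        return parsed == 42 && missing == null && toNullableGuid == null;
    }
}
```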
The container methods
From what we already saw about the containers, we know that all of them have a Register and a RegisterSearcher method.
The IoC container has a Register that allows you to specify the Type that will be used on future searches and a delegate to create/return the value being registered.
The Binary Serialization container has a Register that allows you to register a specific item serializer (like the serializer for int, or for string). The item serializer is expected to work with a single type and not to support many types.
The Conversion container has a Register that allows you to register a Converter, which deals with a specific input type and a specific output type only.
The RegisterSearcher method receives a delegate to do the search and a priority. In all cases the delegate receives an args object with all the needed information (the requested Type for the IoC and Serialization containers, the input and output types for the Conversion container) and a property to set the result. Setting the result not only stops the search (so lower priority searchers will not run) but also caches the result, so the searchers will not be run twice with the same input parameters. Setting null as the result is valid: it forces the search to stop and caches the fact that no result is available.
Finally, there are the container specific methods:
- IoC Container
Get<T>: You request an item of a given type. If a delegate was already registered, it will be executed to either create a new instance of the requested type or to return a possible singleton instance.
Of course, if one is not registered, then the Searchers will be run and, if a delegate is given, it will be cached and also executed to generate a result.
If a result can't be generated, an exception is thrown. Note that null can be returned if a delegate that returns null is provided. A really low priority searcher can be used to return null when a value is not found, so you can avoid exceptions in your application, if that's desirable.
- Conversion Container
TryGetConverter (generic and non-generic): At this moment there's no global equivalent to Convert.ChangeType (but it is pretty easy to create one in another class and redirect it to this call). Users are expected to request a converter from one type to another and then use it to do one or more conversions.
There are both generic and non-generic versions of the method because both are common scenarios. We either ask to convert a string to int (for example) or we ask to convert an object of unknown type to some other type by providing the Type objects at run-time.
- Binary Serialization Container
This is the most complex container. Maybe a version with only CreateSerializer would be easier to understand but, considering performance and even security, it has the following methods:
- CreateSerializer: Creates a serializer/deserializer bound to a stream. Note that creating a serializer and calling serialize 2 or more times will probably give different results than creating a new serializer every time, as a kind of "compression" happens on consecutive calls;
- RegisterDefaultDataType: Both the serializer and the deserializer must agree about the default data-types, registering them in the same order. Either Type itself is a default data-type, allowing the serialization of the type information needed when handling any kind of object, or all the types that are going to be serialized must be registered as default types. Default types are sent by index, not by serializing their type information, thus reducing the size of the serialized data and avoiding a possibly unsafe search for the type;
- TryGetItemSerializer (generic and non-generic): Similar to what happens in the conversion container, you may want to get a particular item serializer so you can reuse it many times. It is faster to call the specific item serializer many times than to always call Serialize again, which needs to search for the type to be serialized.
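The Get<T> behaviour described for the IoC container (direct registration first, then searchers, then an exception, with an optional low priority searcher that returns null) can be sketched as follows. MiniIocContainer and IocSearchArgs are hypothetical names used only for this illustration; the real IIocContainer may differ in its signatures, and this sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for illustration; these are NOT the library's real types.
public sealed class IocSearchArgs
{
    public IocSearchArgs(Type requestedType)
    {
        RequestedType = requestedType;
    }

    public Type RequestedType { get; }
    public bool HasResult { get; private set; }
    public Func<object> Result { get; private set; }

    public void SetResult(Func<object> creator)
    {
        Result = creator;
        HasResult = true;
    }
}

public sealed class MiniIocContainer
{
    private readonly Dictionary<Type, Func<object>> _creators = new Dictionary<Type, Func<object>>();
    private readonly List<(int Priority, Action<IocSearchArgs> Search)> _searchers =
        new List<(int Priority, Action<IocSearchArgs> Search)>();

    public void Register<T>(Func<T> creator)
    {
        _creators[typeof(T)] = () => creator();
    }

    public void RegisterSearcher(Action<IocSearchArgs> search, int priority)
    {
        _searchers.Add((priority, search));
    }

    public T Get<T>()
    {
        if (!_creators.TryGetValue(typeof(T), out var creator))
        {
            var args = new IocSearchArgs(typeof(T));
            foreach (var searcher in _searchers.OrderByDescending(s => s.Priority))
            {
                searcher.Search(args);
                if (args.HasResult)
                    break;
            }

            if (args.HasResult)
            {
                // A null delegate is cached too, meaning "no result is available".
                creator = args.Result;
                _creators[typeof(T)] = creator;
            }
        }

        if (creator == null)
            throw new InvalidOperationException("No registration found for " + typeof(T).FullName);

        return (T)creator();
    }
}

public static class IocDemo
{
    public static bool Run()
    {
        var container = new MiniIocContainer();
        var shared = new List<string>();

        // Singleton-style registration: the delegate always returns the same instance.
        container.Register<IList<string>>(() => shared);

        // Lowest-priority fallback: hand out a delegate that returns null instead of failing.
        container.RegisterSearcher(args => args.SetResult(() => null), int.MinValue);

        bool sameInstance = ReferenceEquals(container.Get<IList<string>>(), shared);
        bool missingIsNull = container.Get<IDisposable>() == null;
        return sameInstance && missingIsNull;
    }
}
```

Without the fallback searcher, the second Get call in this sketch would throw, which matches the default behaviour described in the list above.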
Binary Serialization Security
There are many reasons to avoid using binary serialization as a sharing mechanism, be it local or remote. The first one is that it is version dependent. Add a new field to an object and any previously serialized data of that type will be considered corrupted because the size of the data doesn't match (maybe the item serializer can deal with this, but it is up to the item serializer to do it). This doesn't happen with XML Serialization, for example, as it will simply leave that new field/property with its default value.
But the worst thing is the fact that it is usually a security problem if you use it to load external data or to do remote communications, at least if we don't take the necessary precautions.
Registering Type itself as a default type gives us the possibility to serialize really complex objects without problems, as the type information of any object that's not of a default type will be serialized so that the deserializer is able to find the type again. The risk here is that, when loading, any Type (and consequently any assembly) identified in the stream will be loaded. This may be used to force an application to load conflicting versions of the same DLL in parallel or to load an unknown number of libraries, and who knows what the static constructors of random classes and assemblies may be doing?
So, the first thing for safety is to avoid registering the Type information itself as a default type. But, in this case, absolutely all types that may be serialized must be registered and in the same order by the serializer and the deserializer. This can be quite difficult (if not impossible, in some cases).
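The "default types are sent by index" idea can be illustrated with a small table shared by both sides. DefaultTypeTable, WriteType and ReadType are hypothetical names for this sketch (only the RegisterDefaultDataType name comes from the container described earlier), and the real wire format of the library is not shown here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper for illustration; NOT the library's real implementation.
public sealed class DefaultTypeTable
{
    private readonly List<Type> _types = new List<Type>();
    private readonly Dictionary<Type, int> _indexes = new Dictionary<Type, int>();

    // Both sides must call this with the same types in the same order.
    public void RegisterDefaultDataType(Type type)
    {
        _indexes.Add(type, _types.Count);
        _types.Add(type);
    }

    public void WriteType(BinaryWriter writer, Type type)
    {
        if (!_indexes.TryGetValue(type, out int index))
            throw new InvalidOperationException(type.FullName + " is not a registered default type.");

        // Only the index goes to the stream, never the type or assembly name.
        writer.Write(index);
    }

    public Type ReadType(BinaryReader reader)
    {
        int index = reader.ReadInt32();

        // An unknown index is rejected; nothing is ever loaded by name.
        if (index < 0 || index >= _types.Count)
            throw new InvalidDataException("Unknown default type index " + index);

        return _types[index];
    }
}

public static class DefaultTypeDemo
{
    public static Type Run()
    {
        var writerSide = new DefaultTypeTable();
        var readerSide = new DefaultTypeTable();
        foreach (var table in new[] { writerSide, readerSide })
        {
            table.RegisterDefaultDataType(typeof(int));
            table.RegisterDefaultDataType(typeof(string));
        }

        using (var stream = new MemoryStream())
        {
            var writer = new BinaryWriter(stream);
            writerSide.WriteType(writer, typeof(string));
            writer.Flush();

            stream.Position = 0;
            return readerSide.ReadType(new BinaryReader(stream));
        }
    }
}
```

If the two sides registered the types in a different order, the same index would map to a different type on the reading side, which is why the order of registration is part of the contract.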
That's not all, though. It is still possible to attack the serializer, at least with the purpose of causing a denial of service, by giving a really big size for any variable sized data. Strings, arrays and collections in general fall into this category. If you look at the default implementation, there's no limit on the size of these objects. This means that we can serialize strings of up to 2 GB. Yet, is this really what's supposed to happen in a communication scenario?
Note that even if you build your own remoting architecture, limiting the size of the received packets to 4 KB (for example), the deserializer will probably read the size value (even if it is 2 GB), allocate that amount of memory and only then try to read the rest of the contents (at which point it will fail). By then it's too late, as allowing the allocation of 2 GB to happen is already the problem for a server.
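One way to avoid that allocation, sketched below, is for an item serializer to validate the declared size against a limit before allocating anything. BoundedStringSerializer and its maxByteCount parameter are assumptions made for this example; the item serializers shipped with the libraries do not apply such a limit, as stated above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical item serializer for illustration; NOT one of the provided serializers.
public static class BoundedStringSerializer
{
    public static void Write(BinaryWriter writer, string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        writer.Write(bytes.Length);
        writer.Write(bytes);
    }

    public static string Read(BinaryReader reader, int maxByteCount)
    {
        int length = reader.ReadInt32();

        // Reject the declared size BEFORE allocating a buffer for it.
        if (length < 0 || length > maxByteCount)
            throw new InvalidDataException("Declared string size " + length + " exceeds the allowed limit.");

        byte[] bytes = reader.ReadBytes(length);
        if (bytes.Length != length)
            throw new EndOfStreamException("The stream ended before the declared size was read.");

        return Encoding.UTF8.GetString(bytes);
    }
}

public static class BoundedReadDemo
{
    public static bool Run()
    {
        var stream = new MemoryStream();
        var writer = new BinaryWriter(stream);
        BoundedStringSerializer.Write(writer, "hello");
        writer.Write(int.MaxValue); // forged size prefix with no payload behind it
        writer.Flush();
        stream.Position = 0;

        var reader = new BinaryReader(stream);
        bool roundTrip = BoundedStringSerializer.Read(reader, 4096) == "hello";

        bool rejected = false;
        try
        {
            BoundedStringSerializer.Read(reader, 4096);
        }
        catch (InvalidDataException)
        {
            rejected = true; // refused without attempting a ~2 GB allocation
        }

        return roundTrip && rejected;
    }
}
```

The same check would be needed for arrays and collections, and a sensible limit depends on the communication scenario.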
So, it is possible to use binary serialization to share data and for communication purposes, but be aware that it is naturally unsafe. Use it only for local communications, not over the internet, or take care of every detail. The provided item serializers for variable sized data aren't safe to be used across the internet.
No Unregister Methods
Those of you who already read some of my past articles, like the Expandable IoC container or Actionless Frameworks, may notice that the current implementations are not as complete as what I proposed back then.
The Expandable IoC Container article presented code quite similar to the one I am providing now, with 2 main differences: It didn't support priorities for the searchers, executing them in the order they were registered (so the new one is better), and both the searchers and direct registrations could be unregistered. The version I am providing now doesn't have any Unregister method.
In the Actionless Frameworks I went much further, talking about global and local configurations and their interactions and about notification events, and I said that the more complete the framework is, the better. So, why am I doing things that are less complete now?
The answer is that I had to make a choice. The most complete solution is actually harder to understand, which makes it harder to implement, to use and to debug, and it is, in general, less predictable.
For example, when registering a composite searcher (a searcher that uses the container itself to find a base item) a new object is created to hold the information about the container to be used. As this object is never returned to the user, how would the user request to unregister that particular composite searcher?
Even if you know the answer (give that particular instance back to the user or make the helper object comparable), what should happen to items that were found thanks to a searcher when the searcher itself is unregistered? Should those items be unregistered too?
In the opposite situation, if we do a search and an item is found, can we unregister that particular item without unregistering the searcher? Would this unregister call actually prevent the item from being found again when the searcher runs again? If it does, will it also prevent a result from coming from a different searcher (like a lower priority one)?
All those can be decided and done (and I actually did it), but those complications would make new implementations much harder to write and would probably introduce lots of compatibility issues between implementations. So, to avoid all those complications and keep the container code readable, I opted to leave out the Unregister methods and make the implementations much, much simpler.
Users will also be sure that as soon as they find a result, that result will not change except if an explicit call is made to replace it. Nothing else. So the results are much more predictable.
Quality of the Code
These frameworks were extensively tested before I converted them to Portable Class Libraries. For the Portable Class Libraries I needed to adapt many things and I only did basic testing. So, it is possible that things may not work because of platform-specific traits or simply because I have used completely wrong namespaces or similar.
As for possible incompatibilities, I am counting on the fact that if you receive two references to the same type or method info, they will actually be the same instance (or will at least compare as equal using the default comparer). If this is not the case, then direct registrations will not work and registrations based on searchers will keep creating new instances all the time and registering them, consuming more memory and causing lots of other related problems.
In my tests everything went fine using .NET and Windows Store apps. I know that things would not work if I were targeting Silverlight 5 (that's one of the main reasons to avoid Silverlight 5 in these libraries) and I simply didn't test with Windows Phone Silverlight 8, but I hope it is still fine.
So, do your own tests before trusting these libraries and use them at your own risk.
The Sample Application
By downloading the libraries you will also receive a sample Console application. The only thing the application does is instantiate the containers with their default configuration, add a single extra item (that is, a single extra conversion, binary item serializer or actual IoC item) and then request it.
That is, running the application is pointless; its only purpose is to show how to set up the frameworks and use them for the most basic actions. So, see its source and don't expect it to be useful otherwise.