Introduction
When Microsoft brought out the Managed Extensions to C++ with VS.NET 7, C++ programmers received them with mixed reactions. While most people were happy that they could continue using C++, nearly everyone was unhappy with the ugly and twisted syntax offered by Managed C++. Microsoft obviously took the feedback very seriously and decided that the MC++ syntax wasn't going to be much of a success.
On October 6th 2003, ECMA announced the creation of a new task group to oversee development of a standard set of language extensions that create a binding between the ISO standard C++ programming language and the Common Language Infrastructure (CLI). It was also made known that this new set of language extensions would be known as the C++/CLI standard, which would be supported by the VC++ compiler starting with the Whidbey release (VS.NET 2005).
Problems with the old syntax
- Ugly and twisted syntax and grammar - All those double underscores weren't exactly pleasing to the eye.
- Second class CLI support - Compared to C# and VB.NET, MC++ used contorted workarounds to provide CLI support; for example, it didn't have a for-each construct to enumerate .NET collections.
- Poor integration of C++ and .NET - You couldn't use C++ features like templates on CLI types and you couldn't use CLI features like garbage collection on C++ types.
- Confusing pointer usage - Both unmanaged C++ pointers and managed reference pointers used the same *-based syntax, which was quite confusing because __gc pointers were totally different in nature and behavior from unmanaged pointers.
- The MC++ compiler could not produce verifiable code
What C++/CLI gives us?
- Elegant syntax and grammar - This gives a natural feel for C++ developers writing managed code and allows a smooth transition from unmanaged coding to managed coding. All those ugly double underscores are gone now.
- First class CLI support - CLI features like properties, garbage collection and generics are supported directly. And what's more, C++/CLI allows us to use these features on native unmanaged classes too.
- First class C++ support - C++ features like templates and deterministic destructors work on both managed and unmanaged classes. In fact C++/CLI is the only .NET language where you can *seemingly* declare a .NET type on the stack or on the native C++ heap.
- Bridges the gap between .NET and C++ - C++ programmers won't feel like a fish out of water when they attack the BCL
- The executable generated by the C++/CLI compiler is now fully verifiable.
Hello World
using namespace System;
void _tmain()
{
    Console::WriteLine("Hello World");
}
Well, that doesn't look a lot different from the old syntax, except that now you don't need to add a reference to mscorlib.dll because the Whidbey compiler implicitly references it whenever you compile with /clr (which now defaults to /clr:newSyntax).
Handles
One major confusion in the old syntax was that we used the * punctuator with unmanaged pointers and with managed references. In C++/CLI Microsoft introduces the concept of handles.
void _tmain()
{
    String^ str = "Hello World";
    Console::WriteLine(str);
}
The ^ punctuator (pronounced as cap) represents a handle to a managed object. According to the CLI specification a handle is a managed object reference. Handles are the new-syntax equivalent of __gc pointers in the MC++ syntax. Handles are not to be confused with pointers and are totally different in nature from pointers.
How handles differ from pointers?
- Pointers are denoted using the * punctuator, while handles are denoted using the ^ punctuator.
- Handles are managed references to objects on the managed heap; pointers just point to a memory address.
- Pointers are stable and GC cycles do not affect them; handles may end up pointing to different memory locations as the GC compacts memory.
- For pointers, the programmer must delete explicitly or else suffer a leak. For handles, delete is optional.
- Handles are type-safe while pointers are most definitely not. You cannot cast a handle to a void^.
- Just as new returns a pointer, gcnew returns a handle.
Instantiating CLR objects
void _tmain()
{
    String^ str = gcnew String("Hello World");
    Object^ o1 = gcnew Object();
    Console::WriteLine(str);
}
The gcnew keyword is used to instantiate CLR objects and it returns a handle to the object on the CLR heap. The good thing about gcnew is that it allows us to easily differentiate between managed and unmanaged instantiations.
Basically, the gcnew keyword and the ^ operator offer just about everything you need to access the BCL. But obviously you'd need to create and declare your own managed classes and interfaces.
Declaring types
Each CLR type declaration is prefixed with an adjective that describes what sort of type it is. The following are examples of type declarations in C++/CLI:
- CLR types
  - Reference types
      ref class RefClass{...};
      ref struct RefClass{...};
  - Value types
      value class ValClass{...};
      value struct ValClass{...};
  - Interfaces
      interface class IType{...};
      interface struct IType{...};
  - Enumerations
      enum class Color{...};
      enum struct Color{...};
- Native types
    class Native{...};
    struct Native{...};
using namespace System;
interface class IDog
{
    void Bark();
};
ref class Dog : IDog
{
public:
    void Bark()
    {
        Console::WriteLine("Bow wow wow");
    }
};
void _tmain()
{
    Dog^ d = gcnew Dog();
    d->Bark();
}
There, the syntax is now so much neater to look at than the old syntax, where the above code would have been strewn with double-underscored keywords like __gc and __interface.
Boxing/Unboxing
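Boxing is implicit (yaay!) and type-safe. A bit-wise copy is performed and an Object is created on the CLR heap. Unboxing is explicit - just do a reinterpret_cast and then dereference. (Compilers released after this preview also provide safe_cast, as in safe_cast<int>(o), for checked unboxing.)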
void _tmain()
{
    int z = 44;
    Object^ o = z;
    int y = *reinterpret_cast<int^>(o);
    Console::WriteLine("{0} {1} {2}",o,z,y);
    z = 66;
    Console::WriteLine("{0} {1} {2}",o,z,y);
}
The Object o is a boxed copy and does not actually refer to the int value type, which is obvious from the output of the second Console::WriteLine: o and y still show 44 after z has been changed to 66.
When you box a value-type, the returned object remembers the original value type.
void _tmain()
{
    int z = 44;
    float f = 33.567f;
    Object^ o1 = z;
    Object^ o2 = f;
    Console::WriteLine(o1->GetType());
    Console::WriteLine(o2->GetType());
}
Thus you cannot try and unbox to a different type.
void _tmain()
{
    int z = 44;
    float f = 33.567f;
    Object^ o1 = z;
    Object^ o2 = f;
    int y = *reinterpret_cast<int^>(o2);
    float g = *reinterpret_cast<float^>(o1);
}
If you do attempt to do so, you'll get a System.InvalidCastException. Talk about perfect type-safety! If you look at the IL generated, you'll see the MSIL box instruction in action. For example :-
void Box2()
{
    float y=45;
    Object^ o1 = y;
}
gets compiled to :-
.maxstack 1
.locals (float32 V_0, object V_1)
ldnull
stloc.1
ldc.r4 45.
stloc.0
ldloc.0
box [mscorlib]System.Single
stloc.1
ret
According to the MSIL docs, "The box instruction converts the 'raw' valueType (an unboxed value type) into an instance of type Object (of type O). This is accomplished by creating a new object and copying the data from valueType into the newly allocated object."
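For readers who want to experiment without a /clr compiler, the boxing behaviour described in this section (a copy placed on the heap, a remembered type, and a checked unbox) can be modelled in plain ISO C++. This is only a rough sketch of the idea, not how the CLR implements it: Box, unbox_after_source_changes and unbox_to_wrong_type_throws are hypothetical names invented here, and std::bad_cast merely stands in for System.InvalidCastException.

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// Hypothetical ISO C++ model of a boxed value; not part of C++/CLI or the BCL.
class Box {
    struct Holder {
        virtual ~Holder() {}
        virtual const std::type_info& type() const = 0;
    };

    template <class T>
    struct Impl : Holder {
        T value;
        explicit Impl(const T& v) : value(v) {}
        const std::type_info& type() const override { return typeid(T); }
    };

    std::shared_ptr<Holder> held_;

public:
    // "Boxing": allocate a heap object and copy the value into it.
    template <class T>
    Box(const T& v) : held_(std::make_shared<Impl<T> >(v)) {}

    // The box remembers the type it was created from.
    const std::type_info& type() const { return held_->type(); }

    // "Unboxing": checked against the remembered type, throws on a mismatch.
    template <class T>
    T unbox() const {
        if (held_->type() != typeid(T)) throw std::bad_cast();
        return static_cast<const Impl<T>*>(held_.get())->value;
    }
};

// Mirrors the first example: box z, change z, then unbox.
int unbox_after_source_changes() {
    int z = 44;
    Box o = z;                        // boxes a copy of z
    assert(o.type() == typeid(int));  // the box remembers it holds an int
    z = 66;                           // changes only the original variable
    return o.unbox<int>();            // the boxed copy is unaffected
}

// Mirrors the mismatched example: a float box unboxed as int.
bool unbox_to_wrong_type_throws() {
    float f = 33.567f;
    Box o = f;
    try {
        (void)o.unbox<int>();         // float in the box, int requested
    } catch (const std::bad_cast&) {
        return true;
    }
    return false;
}
```

Calling unbox_after_source_changes() gives back the value that was boxed, not the later value of z, and unbox_to_wrong_type_throws() reports that the mismatched unbox was rejected.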
Conclusion
Alright, so why would anyone want to use C++/CLI when they can use C#, J# and that VB thingie for writing .NET code? Here are the four reasons I gave during my talk at DevCon 2003 in Trivandrum (Dec 2003).
- Compile existing C++ code to IL (/clr magic)
- Deterministic destruction
- Native interop support that outmatches anything other CLI languages can offer
- All those underscores in MC++ are gone ;-)
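To make the deterministic destruction point concrete, here is a small ISO C++ sketch of the scope-based cleanup that C++ programmers rely on, and which, as noted earlier, C++/CLI also makes available on managed classes. Tracer and run_scope are hypothetical names used only for this illustration; the sketch uses native types, so it shows the C++ model itself and not the managed implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical type that records when it is constructed and destroyed.
class Tracer {
    std::vector<std::string>& log_;
public:
    explicit Tracer(std::vector<std::string>& log) : log_(log) {
        log_.push_back("acquire");
    }
    ~Tracer() { log_.push_back("release"); }
};

// Returns the order of events around a nested scope.
std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        Tracer t(log);                // constructor runs here
        log.push_back("work");
    }                                 // destructor runs here, at the closing brace
    log.push_back("after scope");
    return log;
}
```

The log returned by run_scope() shows "release" recorded before "after scope": cleanup happens at a known point in the code, with no waiting for a garbage collector.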