Introduction
This started with a simple question in the C# forum:
As a self taught amateur in c#, I have never found a reason to implement a struct instead of a class.
Since I am self taught (No formal education) I would like to know what advantage would be gained by using a struct in lieu of a class? More importantly why when and where would it benefit my programs?
And I started to reply...then a while later realized this was a lot more involved than a simple "use it then, don't use it then" type answer.
So, when one of the replies suggested a permanent record of this, I thought it was a good idea. So, here it is: the difference between struct and class in one place.
Background
A struct is the same as a class, except one is a Value type and the other is a Reference type.
The end.
Well, no - it's a simple statement, and it's both true and complete... but it doesn't explain anything, and it doesn't really mean anything either.
So what is the difference? What does it mean? And when should I use a struct, and when a class?
That's a much bigger question than you probably thought - it involves a lot of background before you can make a decision to use one or the other. So, let's have a look at the background...
Difference between value type and reference type
Struct and Class have one huge difference: struct is a value type, class is a reference type. What that means is simple to describe, but harder to grasp the significance of.
So let's start by defining one of each:
public class MyClass
{
public int I;
public int J;
}
public struct MyStruct
{
public int I;
public int J;
}
The two types are identical, except that one is a class and one is a struct. That means that if you declare an instance of each in your code then you get a reference and a value, but the code can look the same:
public void UseClassAndStruct()
{
MyClass mc = new MyClass();
mc.I = 1;
mc.J = 2;
MyStruct ms = new MyStruct();
ms.I = 1;
ms.J = 2;
}
Or slightly different:
public void UseClassAndStruct()
{
MyClass mc = new MyClass();
mc.I = 1;
mc.J = 2;
MyStruct ms;
ms.I = 1;
ms.J = 2;
}
Because you don't have to use the new keyword with structs. If you do, then the struct constructor is called and every field starts at its default value; if you don't, it isn't, and you must assign every field yourself before you can read the struct as a whole. Unlike a class, the name of the struct is the struct itself, it is not a "pointer" to an instance.
And that's important, because that is the whole point: a struct is the object, and class is a reference to the object.
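As a rough sketch of the two declaration styles (the class and method names here are my own, made up for illustration), this shows what you get with and without new:

```csharp
using System;

public struct MyStruct
{
    public int I;
    public int J;
}

public static class DefiniteAssignmentDemo
{
    // Without new: no constructor runs, so every field must be assigned
    // before it is read. Remove the ms.J line and this stops compiling.
    public static int SumWithoutNew()
    {
        MyStruct ms;
        ms.I = 1;
        ms.J = 2;
        return ms.I + ms.J;
    }

    // With new: the fields start at their defaults (0 for int).
    public static int SumWithNew()
    {
        MyStruct ms = new MyStruct();
        return ms.I + ms.J;
    }

    public static void Main()
    {
        Console.WriteLine(SumWithoutNew()); // 3
        Console.WriteLine(SumWithNew());    // 0
    }
}
```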
Creating a reference instance
When you create a class variable:
MyClass mc;
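That allocates memory on the stack to hold a reference to a MyClass instance in future; it does not create an instance of MyClass. You will have seen the result of that before: when you try to use any property or method of a class and you get an "Object reference not set to an instance of an object" exception (a NullReferenceException) and your program crashes, it's because you created a variable, but didn't create or assign an actual instance to it. (For a local variable like this one the compiler spots the mistake and refuses to compile; the runtime exception turns up with class fields, and with anything that has been set to null.)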
It's a bit like referring to your car in terms of the parking space you normally use: if your wife takes the car and parks it somewhere else, then the parking space is empty, and your journey dies when you try to drive away... The space is the variable, the car is the instance.
You have to explicitly create an instance of the class by using the new keyword:
mc = new MyClass();
What this does is create a new instance of MyClass on the heap, and assign the reference to it to the variable mc. This is important, because the stack and the heap are different "types" of memory: the heap is a big "lump" of memory which is managed by the Garbage Collector, and all classes, methods and threads share it. The stack, on the other hand, is specific to a thread, and everything a method puts on the stack is discarded when you exit that method. That means the mc variable is lost, but the data it references is not: if you have copied the reference to a variable outside the method, then you can still access it from the rest of your program.
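A minimal sketch of that last point, assuming a static field called kept as the "variable outside the method" (the field and the method names are invented for this example):

```csharp
using System;

public class MyClass
{
    public int I;
    public int J;
}

public static class LifetimeDemo
{
    // Lives outside any one method call, so it outlives the local below.
    private static MyClass kept;

    public static void CreateAndKeep()
    {
        MyClass mc = new MyClass(); // instance on the heap, reference in the local mc
        mc.I = 42;
        kept = mc;                  // copies the reference, not the object
    }                               // mc is discarded here; the instance is not

    public static int ReadKept()
    {
        CreateAndKeep();
        return kept.I;              // still reachable through the copied reference
    }

    public static void Main()
    {
        Console.WriteLine(ReadKept()); // 42
    }
}
```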
Creating a value instance
What happens when you create a struct variable is different: the actual struct is immediately created on the stack and is available directly for the lifetime of the method. But it will be thrown away when the method exits. So you don't need the new unless you want to use a struct constructor to initialise your fields. You've used this a lot - probably without noticing:
int i;
double j;
Point p;
All create value types.
That's trivial, isn't it?
So that's it? Not quite. Remember I said that "the name of the struct is the struct itself, it is not a 'pointer' to an instance"? That has a big effect, which again you probably have used a lot, and not really noticed. Think about this:
int i = 3;
int j = i;
i = 4;
Console.WriteLine("{0}:{1}", i, j);
What does that produce?
Obviously, it produces a string "4:3" - anything else would make coding very, very difficult!
But...what if we do that with reference types?
MyClass i = new MyClass();
i.I = 3;
MyClass j = i;
i.I = 4;
Console.WriteLine("{0},{1}", i.I, j.I);
This time, it prints "4,4" because i and j are references to the same instance in memory, instead of being separate, self-contained value types.
The same thing happens if we use our structs:
MyStruct i = new MyStruct();
i.I = 3;
MyStruct j = i;
i.I = 4;
Console.WriteLine("{0},{1}", i.I, j.I);
This time, we get "4,3" again.
When you assign a reference type variable to another reference type variable, it copies the reference, not the object.
When you assign a value type variable to another value type, it copies the content of the object, not a reference to the object.
Passing variables to methods is affected as well
So when you call a method with a reference type, a copy of the reference is passed, and any changes you make affect the one and only object.
If you call it with a value type such as a struct, a copy of the value is passed, and any changes you make will not be reflected back:
public void ClassMod(MyClass mc)
{
mc.I += 100;
Console.WriteLine(mc.I);
}
public void StructMod(MyStruct ms)
{
ms.I += 100;
Console.WriteLine(ms.I);
}
MyClass mc = new MyClass();
mc.I = 3;
ClassMod(mc);
MyStruct ms = new MyStruct();
ms.I = 3;
StructMod(ms);
Console.WriteLine("{0},{1}", mc.I, ms.I);
What do we get?
103
103
103,3
Within the method, everything works the same.
But outside...the changes we made to the struct inside the method affect the copy, not the original.
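If you do want a method to change the caller's struct, C# has the ref keyword for exactly that. This is only a sketch, and StructModByRef and the two helper methods are names I made up for it:

```csharp
using System;

public struct MyStruct
{
    public int I;
    public int J;
}

public static class RefDemo
{
    // Receives a copy: the caller's struct is untouched.
    public static void StructMod(MyStruct ms)
    {
        ms.I += 100;
    }

    // Receives the caller's variable itself: the change is visible outside.
    public static void StructModByRef(ref MyStruct ms)
    {
        ms.I += 100;
    }

    public static int ByValue()
    {
        MyStruct ms = new MyStruct();
        ms.I = 3;
        StructMod(ms);
        return ms.I;
    }

    public static int ByRef()
    {
        MyStruct ms = new MyStruct();
        ms.I = 3;
        StructModByRef(ref ms);
        return ms.I;
    }

    public static void Main()
    {
        Console.WriteLine("{0},{1}", ByValue(), ByRef()); // 3,103
    }
}
```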
So, what does that do for us? Why use one or the other?
So what does this do for us in practice? What are the advantages of a struct over a class?
Speed, under certain conditions. Structs can be a lot slower to use if you aren't careful: if you have a large struct, just calling a method and passing it as a parameter means it must be copied, which takes time. But...if they are small (16 bytes or less) and you use a lot of them then they can be a lot faster than a reference type, because the heap is not involved. Every time you create a reference type instance, memory has to be allocated on the heap, an object header set up, and a reference to it returned - and later on the Garbage Collector has to work out that the instance is dead and reclaim the space. Across a lot of small objects, that adds up to quite a lot of time! A value type in contrast takes almost no work to allocate: moving the stack pointer by the size of the struct is pretty much all you have to do (and you have to do that for reference types as well so you have somewhere to store the reference to the heap memory!)
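If you want to see this for yourself, a crude timing sketch along these lines will do. PointClass, PointStruct and the loop count are all invented for the example, and the numbers you get will depend heavily on the runtime, the JIT and the build settings, so treat it as an experiment rather than a benchmark:

```csharp
using System;
using System.Diagnostics;

public class PointClass { public int X; public int Y; }
public struct PointStruct { public int X; public int Y; }

public static class AllocationTimingDemo
{
    public static long SumClasses(int count)
    {
        long total = 0;
        for (int n = 0; n < count; n++)
        {
            PointClass p = new PointClass(); // one heap allocation per iteration
            p.X = n;
            p.Y = 1;
            total += p.X + p.Y;
        }
        return total;
    }

    public static long SumStructs(int count)
    {
        long total = 0;
        for (int n = 0; n < count; n++)
        {
            PointStruct p = new PointStruct(); // no heap allocation
            p.X = n;
            p.Y = 1;
            total += p.X + p.Y;
        }
        return total;
    }

    public static void Main()
    {
        const int count = 10000000;

        Stopwatch sw = Stopwatch.StartNew();
        long classTotal = SumClasses(count);
        long classMs = sw.ElapsedMilliseconds;

        sw = Stopwatch.StartNew();
        long structTotal = SumStructs(count);
        long structMs = sw.ElapsedMilliseconds;

        Console.WriteLine("class:  {0} ms (total {1})", classMs, classTotal);
        Console.WriteLine("struct: {0} ms (total {1})", structMs, structTotal);
    }
}
```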
It's never that easy, is it?
There is one bit I missed out here: boxing. This was deliberate, because it's a bit difficult to explain...
What happens when you have a value type and you want to store it with other types in a mixed List (for example)?
List<object> mixedList = new List<object>();
You can add any object perfectly happily:
mixedList.Add("Hello there");
mixedList.Add(new Form1());
because object is a class that all reference types derive from. But...what happens here:
mixedList.Add(12);
Um...12 is an int which is a value type, and so isn't derived from object...is it?
Yes, it is: all value types derive from a special class called System.ValueType, which derives from object. What happens is that the value is "boxed" - an object is created on the heap to hold the value, the value is copied into it, and the reference to the boxed value is added to the list. When you cast it back to the original value type it is "unboxed" and you have a value type again. This is not a fast process and is one of the reasons you don't use structs for everything!
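Here is a small sketch of boxing and unboxing in action. The method names are my own, and it only shows that the box holds a copy; it says nothing about how slow the process is:

```csharp
using System;
using System.Collections.Generic;

public static class BoxingDemo
{
    // The box holds its own copy of the value, so later changes
    // to the original variable do not reach it.
    public static int BoxedCopyIsIndependent()
    {
        int i = 3;
        object boxed = i;  // boxing: the value is copied into a new heap object
        i = 4;             // changes the local only
        return (int)boxed; // unboxing: copies the value back out
    }

    // Pulls the ints back out of a mixed list, unboxing each one.
    public static int SumInts(List<object> items)
    {
        int total = 0;
        foreach (object item in items)
        {
            if (item is int)
            {
                total += (int)item;
            }
        }
        return total;
    }

    public static void Main()
    {
        List<object> mixedList = new List<object>();
        mixedList.Add("Hello there");
        mixedList.Add(12); // boxed on the way in
        mixedList.Add(30); // boxed on the way in

        Console.WriteLine(BoxedCopyIsIndependent()); // 3
        Console.WriteLine(SumInts(mixedList));       // 42
    }
}
```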
So, to copy from MSDN:
"CONSIDER defining a struct instead of a class if instances of the type are small and commonly short-lived or are commonly embedded in other objects.
AVOID defining a struct unless the type has all of the following characteristics:
- It logically represents a single value, similar to primitive types (int, double, etc.).
- It has an instance size under 16 bytes.
- It is immutable.
- It will not have to be boxed frequently."
The immutable bit is not enforced - it is just a recommendation, and the compiler will happily let you write a mutable struct like the MyStruct above.
But...it's a very good idea to make structs immutable: it causes a lot less confusion.
The DateTime struct does it by having no public setters at all: methods such as AddDays return a new DateTime, so it is obvious that you don't change the original when you work with a copy. (Point, by contrast, has public X and Y setters, so it is mutable - which is exactly why changing a copy of a Control's Location catches people out.)
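As a sketch of what an immutable struct can look like, here is a hypothetical ImmutablePoint (my own type for illustration, not System.Drawing.Point): the fields are readonly, there are no setters, and "changing" it means making a new one.

```csharp
using System;

public struct ImmutablePoint
{
    private readonly int x;
    private readonly int y;

    public ImmutablePoint(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Read-only properties: there is no way to alter an existing value.
    public int X { get { return x; } }
    public int Y { get { return y; } }

    // Returns a new value instead of modifying this one.
    public ImmutablePoint Offset(int dx, int dy)
    {
        return new ImmutablePoint(x + dx, y + dy);
    }
}

public static class ImmutableDemo
{
    public static void Main()
    {
        ImmutablePoint a = new ImmutablePoint(1, 2);
        ImmutablePoint b = a.Offset(10, 0);
        Console.WriteLine("{0},{1}", a.X, b.X); // 1,11
    }
}
```

Because there is nothing to assign to, nobody can modify a copy by accident and then wonder why the original didn't move.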
And there's more...structs aren't just on the stack.
All of the above has talked about structs being created on the stack, but that isn't quite accurate, because you can easily use structs on the heap as well: by embedding them inside classes. Again, you've used this a lot in the past: every Control has a Location property, which is a Point, and thus a struct. The data for the Point is included within the body of the class instance, and is allocated space on the heap along with the rest of the class data.
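A minimal sketch of a struct living inside a class instance, using a made-up Holder class. Note that even on the heap the copy rules are unchanged: assigning the embedded struct to a local still gives you a separate copy.

```csharp
using System;

public struct MyStruct
{
    public int I;
    public int J;
}

public class Holder
{
    // Stored inline, as part of each Holder instance on the heap.
    public MyStruct Embedded;
}

public static class EmbeddedDemo
{
    public static Holder Create()
    {
        Holder h = new Holder();
        h.Embedded.I = 7; // writes straight into the field inside the instance
        return h;         // the struct data survives: it is part of the instance
    }

    public static int CopyIsStillACopy()
    {
        Holder h = Create();
        MyStruct copy = h.Embedded; // copies the content out of the instance
        copy.I = 99;                // changes the local copy only
        return h.Embedded.I;
    }

    public static void Main()
    {
        Console.WriteLine(Create().Embedded.I); // 7
        Console.WriteLine(CopyIsStillACopy());  // 7
    }
}
```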
And this is interesting...because every array in .NET is a reference type, even if the data it is an array of is a value type.
So an array of integers is a reference type, and is allocated on the heap, not the stack:
int[] myArrayOfInt = new int[100];
Does this make a difference? Yes, yes it can...
How big is the stack? Seriously, how big is it? Clearly, it's smaller than the heap, but how much data can you get on it? Well...
The default stack size for a .NET Windows application is 1 MB - but, it's only 256KB for 32-bit ASP.NET apps and 512KB for 64-bit ASP.NET apps, which is a significant difference if you are writing a web site... It can be changed: but it's not trivial - you have to modify the PE header of the executable for Windows apps, or you can change the settings in IIS for web apps if you are the admin. The Thread class has a constructor overload that takes a stack size, so that's pretty simple, if you use the full Thread model rather than the simpler BackgroundWorker.
1MB is not a lot, when you think about it: in terms of 32-bit integers, that's only 256K values, so if your struct has 16 integers - which is not a lot - you could only get 16K of them on the stack. And there is a lot of other stuff on there already: references, return addresses, that kind of thing, so it doesn't take that much to exhaust it. And for a 32-bit web site...you only get 4K of your structures on there in total.
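The arithmetic above, plus the Thread constructor overload that takes a stack size, can be sketched like this. The helper names and the 4MB figure are my own choices for illustration, and the calculation ignores everything else that shares the stack:

```csharp
using System;
using System.Threading;

public static class StackBudgetDemo
{
    // Upper bound only: ignores return addresses, other locals and so on.
    public static int StructsThatFit(int stackBytes, int structBytes)
    {
        return stackBytes / structBytes;
    }

    // Runs some recursive work on a thread created with an explicit stack size.
    public static int RunWithStackSize(int maxStackSize)
    {
        int result = 0;
        Thread worker = new Thread(() => { result = Depth(1000); }, maxStackSize);
        worker.Start();
        worker.Join();
        return result;
    }

    private static int Depth(int n)
    {
        return n == 0 ? 0 : 1 + Depth(n - 1);
    }

    public static void Main()
    {
        const int structBytes = 16 * sizeof(int); // 16 integers = 64 bytes

        Console.WriteLine(StructsThatFit(1024 * 1024, structBytes)); // 16384 (16K)
        Console.WriteLine(StructsThatFit(256 * 1024, structBytes));  // 4096 (4K)
        Console.WriteLine(RunWithStackSize(4 * 1024 * 1024));        // 1000
    }
}
```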
The heap is a lot bigger, so you can store a lot more on there, and having all arrays be reference types means that you can declare some HUGE arrays without running out of memory.
But...there is always a but...
Did you know .NET has a maximum size of an object? It does, you know - 2GB is the total limit. No single object can ever exceed 2GB. Loads and loads of space...
But...if you have an array of class instances, each individual class instance is created on the heap separately, so the array adds only the overhead needed to store the references: 32 bits on a 32-bit OS, or 64 bits on a 64-bit OS (which means that the maximum size of an array is different as well: roughly 512 * 1024 * 1024 entries of 32 bits, or half that for 64-bit apps).
But...for value types (including structs), the array size is the number of elements multiplied by the size of the element - so as your struct grows in size, the fewer of them you can get into an array before you run out of memory. So think very, very carefully before you start creating big structs: the MS suggested limit for structs of 16 bytes is actually a pretty good idea.
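To put rough numbers on that, here is a small sketch that divides the 2GB limit by the element size. MaxElements is a made-up helper, and it ignores the array's own header, so the real ceilings will be slightly lower:

```csharp
using System;

public static class ArrayLimitDemo
{
    public const long TwoGigabytes = 2L * 1024 * 1024 * 1024;

    // Rough upper bound on the element count for a given element size.
    public static long MaxElements(int elementBytes)
    {
        return TwoGigabytes / elementBytes;
    }

    public static void Main()
    {
        Console.WriteLine(MaxElements(4));  // 536870912: 32-bit references or ints
        Console.WriteLine(MaxElements(8));  // 268435456: 64-bit references
        Console.WriteLine(MaxElements(16)); // 134217728: a 16 byte struct
        Console.WriteLine(MaxElements(64)); // 33554432: a struct of 16 integers
    }
}
```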
So, when do I use which?
For most applications, only ever consider using a struct if it meets Microsoft recommendations: Small, Immutable and you aren't going to have to Box it. Generally speaking use a class for nearly everything.
But, for special cases, consider a struct even if it doesn't meet those criteria if you need to save processing time in a tight loop - the lack of dereferencing that is involved in using a struct rather than a class can be a significant time saver. But beware! Passing structs to methods, or any other form of copying can cause a serious processing overhead.
My Thanks to...
The original poster of an excellent question: David C# Hobbyist. - without whom, this would never have been born.
And of course, his question: Struct vs Class
MSDN: without which we would all have nothing to confuse us with accurate but frequently unhelpful information, and inaccurate examples.
History
Original Version
2014 Feb 23 Spelling and grammar errors in "My Thanks to" section