Introduction
I had the need for a dynamically growing, sparsely populated array. "Sparse" implies that there will be a lot of elements containing empty values between the ones that contain non-empty values. The .Net collections namespace doesn't contain anything that meets this need. The ArrayList class dynamically grows, but it would have elements allocated for the empty and non-empty values alike. I plan to use this code on devices that have limited memory, so this wouldn't do. I ended up making my own class to satisfy this need. While my initial need for this code was for byte arrays, I made the code generic so that it can be used with other data types.
How Big is this Class?
Since I will be talking about memory allocation, I'll also need to talk about how to get a sense of how much memory an instance of something costs. This won't account for 100% of the memory consumed by allocating the object, but it will be close enough for you to start making judgments about whether one object costs more or less memory than another.
There are a number of predefined value types for which the size is well known. Here are some of the most common ones.
Type | Size (in bytes)
---|---
byte | 1
int | 4
short | 2
float | 4
double | 8
char | 2
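If you want to double-check these numbers, C#'s sizeof operator reports the size of the predefined value types directly:

Console.WriteLine(sizeof(byte));   // 1
Console.WriteLine(sizeof(int));    // 4
Console.WriteLine(sizeof(short));  // 2
Console.WriteLine(sizeof(float));  // 4
Console.WriteLine(sizeof(double)); // 8
Console.WriteLine(sizeof(char));   // 2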
If you build a struct, it is a value type too. The size of a struct will be about the sum of the sizes of its members. Here is an example.
struct Location
{
    public int LocationID;
    public double Latitude;
    public double Longitude;
    public double Elevation;
}
The size of this structure is about 28 bytes: 4 for the int plus 8 for each of the three doubles (the runtime may add a few bytes of padding for alignment). Now what happens if you add a reference type (something that is defined as a class instead of a struct)? How big will it be? I'll add a string to the previous example.
struct Location
{
    public int LocationID;
    public string LocationName;
    public double Latitude;
    public double Longitude;
    public double Elevation;
}
There are two elements of memory to consider for the reference field: the size of the reference itself and the size of the object to which it points. Think of the LocationName field in the example above as holding a memory address for the area where the string is stored. The size of this memory address depends on the processor architecture the code is running against. If the code is JIT compiled for a 32-bit system, the reference will be 32 bits (4 bytes) in size. If it is JIT compiled for a 64-bit system, the reference will be 64 bits (8 bytes) in size. When I am working with just a desktop I do my calculations based on 64-bit systems, but the code in this article will run on Windows Phone, so I will take both into consideration. Every object also carries a header, which adds around 8 bytes. The other element of memory to consider is the size of the string itself. If the string has not yet been assigned and the field is null, then this second element has a size of zero. If the string is assigned a value, then the second element is whatever memory is consumed by the string.
It is possible for multiple structs to refer to the same instance of a reference type. When this occurs, you'll want to make sure that you don't count the memory taken up by that instance multiple times.
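For example, two Location values can share a single string instance; counting the string's memory once per struct would overstate the total:

string sharedName = "Midtown Office";
Location a = new Location { LocationName = sharedName };
Location b = new Location { LocationName = sharedName };
// a and b each hold their own reference (4 or 8 bytes),
// but the character data for the string exists only once on the heap.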
Now let's add an array and see what that does to our memory calculations. The array itself is a reference type, so the amount of memory its reference consumes depends on the processor architecture.
struct Location
{
    public int LocationID;
    public string LocationName;
    public double Latitude;
    public double Longitude;
    public double Elevation;
    public int[] SomeRandomArray;
}
The size of the memory associated with the array will be the sum of the sizes of the elements that it contains. If the array contains value types (the above contains the value type int), then the memory allocated when the array is initialized is sizeof(int) (4 bytes) multiplied by the number of elements in the array. If the array contains reference types, then the memory consumed by the array itself will be the size of a reference (4 bytes on a 32-bit system, 8 bytes on a 64-bit system) multiplied by the number of elements that the array can hold, plus the array's own object header (around 8 bytes). This doesn't include the size of the individual instantiated instances of the elements it contains.
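As a rough sketch of that arithmetic (64-bit references assumed, object headers left out):

int[] ints = new int[100];        // element storage: 100 * sizeof(int) = 400 bytes
string[] names = new string[100]; // reference storage: 100 * 8 = 800 bytes,
                                  // plus whatever each string costs once assigned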
What happens if I change any of the above from a struct to a class? Once a type is made into a reference type, it is also going to have an object header. For a struct, all of the memory that will be used by the immediate members is allocated when you declare a variable. With a class, only the memory for the reference (4 bytes on 32-bit, 8 bytes on 64-bit) is allocated; the memory needed for the members of a reference type is not allocated until there is a call to new. Also note that the memory for instances of a reference type is allocated on the heap, while a value type declared as a local variable is allocated on the stack.
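A minimal sketch of the difference, using a hypothetical LocationClass that mirrors the struct above:

Location spot;              // struct: storage for every field exists right away
LocationClass site;         // class: nothing allocated yet beyond the reference variable
site = new LocationClass(); // the call to new allocates the members on the heap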
A Look at Sparse and Contiguous Memory Allocation
If we take a look at how memory is allocated for a sparse array and a contiguous (normal) array, the reason I needed this will be more clear. As an oversimplified example, let's say that an array of 40 elements must be allocated. Regular arrays allocate memory contiguously; the bytes associated with the array will be in the same block of memory. Let's say our sparse array allocates blocks of memory for five elements at a time. In the illustration below, the marked blocks represent areas in which there is non-empty data.
[Illustration: a contiguous array allocates all 40 elements (indices 00-39) as one block, while the sparse array allocates only the five-element chunks that contain non-empty data.]
At a glance the sparse array is taking up less memory than the conventional array. There are still some empty elements within the structure. This is because it's allocating memory for a range of indices: if there's even a single non-empty element within a five-element range, then there's an allocation for all five blocks. Whether or not this results in less memory being allocated overall is going to depend on the usage patterns for the sparse array. It will work best when the populated elements are clustered closely to each other. Can we get rid of the empty elements altogether? We could, by reducing the size of the chunks allocated. The smaller the chunks, the lower the opportunity for empty positions within chunks to exist. If memory was allocated for single elements (a chunk that only holds one element), then we would only have populated elements in the list. But is this better?
It may be. But to get a better answer to this question, there's a cost that has so far remained invisible for the sparse array. There's a structure for holding each chunk that looks like the following in my implementation.
public class ArrayChunk<T>
{
    // Virtual index of the first element stored in this chunk
    public int StartIndex { get; set; }

    // Virtual index of the last element stored in this chunk,
    // or -1 if the chunk holds no data
    public int EndIndex
    {
        get
        {
            if (Data == null)
                return -1;
            return StartIndex + Data.Length - 1;
        }
    }

    // The elements of the chunk itself
    public T[] Data { get; set; }
}
In addition to the array that holds the elements of the chunk itself, there is also a field to hold the start index for the chunk. The start index consumes 4 bytes of memory, so there is an overhead of no less than 4 bytes for each chunk allocated. Consider a conventional and a sparse array, both of which are fully populated with 100 bytes of data, and assume that the sparse array allocates memory 10 bytes at a time. The memory consumed by the conventional array will be about 100 bytes. The memory consumed by the sparse array will be around 140 bytes. If I reduced the chunks to only holding 2 bytes of data each, then we end up needing at least 6 bytes for each chunk. For the fully populated 100-element collection, that translates to no less than 300 bytes across 50 chunks. With results like that, one may wonder why even bother with the sparse array. But consider another scenario. Let's say that only 20 elements of the array are populated, with 10 elements at the beginning of the array and 10 at the end. For the conventional array the memory consumed will still be at least 100 bytes. For the sparse array it is around 28 bytes. Something that may become apparent is that the best case scenarios for this sparse array occur when it is partially populated and the populated data items are clustered close to each other.
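Those numbers can be reproduced with a small helper. This is a back-of-the-envelope sketch that counts only element storage plus the 4-byte start index, ignoring object headers and the Data reference, so real numbers run higher:

// populatedChunks * (bytes of data per chunk + 4-byte StartIndex)
static int EstimateSparseBytes(int populatedChunks, int chunkSize)
{
    return populatedChunks * (chunkSize * sizeof(byte) + sizeof(int));
}

// Fully populated, 100 bytes in 10-byte chunks: 10 * (10 + 4) = 140 bytes
// 10 bytes at each end, 10-byte chunks:          2 * (10 + 4) =  28 bytes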
Growing
A conventional array cannot grow; once allocated, its size is fixed. There are other collection classes within .Net that can grow, such as the List<T> derived classes or the MemoryStream class. My understanding is that once the memory buffer for any of these classes is consumed, it will allocate a new memory buffer twice the size of the one it had, copy all of its data to the new buffer, and then discard the old buffer to be reclaimed by garbage collection later. In trying to confirm this I found the source code for the MemoryStream class. The code of interest is below.
private bool EnsureCapacity(int value) {
    if (value < 0)
        throw new IOException("IO.IO_StreamTooLong");
    if (value > _capacity) {
        int newCapacity = value;
        if (newCapacity < 256)
            newCapacity = 256;
        if (newCapacity < _capacity * 2)
            newCapacity = _capacity * 2;
        Capacity = newCapacity;
        return true;
    }
    return false;
}

// Gets and sets the capacity (number of bytes allocated) for this stream.
// The capacity cannot be set to a value less than the current length
// of the stream.
//
//| <include file='doc\MemoryStream.uex' path='docs/doc[@for="MemoryStream.Capacity"]/*' />
public virtual int Capacity {
    get {
        if (!_isOpen) __Error.StreamIsClosed();
        return _capacity - _origin;
    }
    set {
        if (!_isOpen) __Error.StreamIsClosed();
        if (value != _capacity) {
            if (!_expandable) __Error.MemoryStreamNotExpandable();
            if (value < _length) throw new ArgumentOutOfRangeException("value", "ArgumentOutOfRange_SmallCapacity");
            if (value > 0) {
                byte[] newBuffer = new byte[value];
                if (_length > 0) Buffer.BlockCopy(_buffer, 0, newBuffer, 0, _length);
                _buffer = newBuffer;
            }
            else {
                _buffer = null;
            }
            _capacity = value;
        }
    }
}
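The doubling is easy to observe from the outside. Starting from an empty MemoryStream, the first write allocates the 256-byte minimum, and crossing that boundary doubles the buffer:

using System;
using System.IO;

class CapacityDemo
{
    static void Main()
    {
        MemoryStream stream = new MemoryStream();
        for (int i = 0; i < 300; i++)
        {
            stream.WriteByte(0);
        }
        // 300 bytes written: the 256-byte buffer was doubled to 512
        Console.Out.WriteLine("Capacity: {0}", stream.Capacity);
    }
}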
This behavior is just fine for typical scenarios, but I will be working with buffers that are relatively large in comparison to the memory available on the devices on which my code will run. So I'd prefer to keep a lower ceiling on the largest single allocation that occurs within the programs I have in mind. It's also worth mentioning that the .Net runtime treats "large" objects differently than it does small ones. For more information take a look at The Dangers of the Large Object Heap. Large objects (around 85,000 bytes or larger) are allocated on a separate heap from small objects (under 85,000 bytes). During garbage collection the .Net garbage collector will try to compact the objects in the small object heap into contiguous memory. Objects in the LOH (Large Object Heap) are not as easily addressed. From the referenced article:
Large objects pose a special problem for the runtime: they can’t be reliably moved by copying as they would require twice as much memory for garbage collection. Additionally, moving multi-megabyte objects around would cause the garbage collector to take an unreasonably long time to complete.
.NET solves these problems by simply never moving large objects around. After large objects are removed by the garbage collector, they leave behind holes in the large object heap, thereby causing the free space to become fragmented. When there’s no space at the end of the large object heap for an allocation, .NET searches these holes for a space, and expands the heap if none of the holes are large enough. This can become a problem. As a program allocates and releases blocks from the large object heap, the free blocks between the longer-lived blocks can become smaller. Over time, even a program that does not leak memory, and which never requires more than a fixed amount of memory to perform an operation, can fail with an
OutOfMemoryException
at the point that the largest free block shrinks to a point where it is too small for the program to use.
There are improvements in the LOH in .Net 4.5. Also note that on devices with more constrained memory (such as Windows Phone) there is no LOH. But there are still advantages to avoiding fragmented memory conditions.
Collection of Chunks
While the code I've written is mean to be a type of collection class the code is still dependent on a collection class for holding onto the chunks that it has. It is possible to use one of the List<T>
classes for this or a Dictionary<T1,T2>
for this. I've decided to go with the List<T>
. Now doesn't it look dubious that I talked about the memory usage patterns of the List<T>
class and now I'm using it in my underlying implementation! Isn't that just going to cause the same problem that I described above with memory fragmentation? Well, no, at least not as severe. The List<T>
contains references to ArrayChunk
instances. So these will either be 4 bytes or 8 bytes. Let's assume the worst case which is 8 bytes. To grace the large object boundaries the array list would need to grow to more than 10,000 elements ( (85,000 / 8)=num of items needed to make the ArrayList large. 85,000 is the large object size and 8 is the amount of bytes needed to store a reference). The number of elements needed to make the array list this size is going to depend on how many elements you allow to be stored in each ArrayChunk
. When the array does become large enough to occupy the LOH area the scenario is still better than what would happen with a conventional array since the block of memory occupying the LOH is smaller than the block of memory that would have been occupied by a contiguous array of the elements.
For what the sparse array is capable of doing in the code presented with this write-up, a LinkedList<T> would have been suitable (actually more suitable); the access pattern is primarily stepping forward and backward through the list (I wrote the code making the assumption that reads and writes will tend to be clustered close to each other). I used the List<T> because it seems to be a better fit for some modifications I plan to make to this code in the future. I won't go into the details of those plans now since they could change. Consequently, if those plans do change, then I may swap out the List<T> for the LinkedList<T>. When I do make changes I want to ensure that I don't break any of the existing behaviour of the class. The project for this code also contains a few unit tests to validate that the behaviour of the SparseArray<T> doesn't vary from expectations.
Testing
There's a small test project included with the code. It will become more important as time goes on because as I make additions to this code, I want to ensure that I don't break any of the behaviours that are already present. Right now the tests are just checking the virtual size of the array, ensuring that memory chunks are allocated or deallocated as expected, and that data is preserved.
What is EmptyValue For?
In the sparse array, cells that have not been written to are treated as though they exist and contain some default value. That default value is stored in EmptyValue. There are multiple uses for this member. If there is an attempted read from an unallocated cell, EmptyValue is returned. When the sparse array is searching for chunks to deallocate, it will check the contents of each chunk to see if all of its elements are equal to EmptyValue. When a new SparseArray is instantiated, the EmptyValue field is initialized by calling the default constructor/initializer for the type that it hosts. For the numeric types this ends up being a zero value. If this isn't appropriate for the data type that you are using, there is a delegate named IsEmptyValue to which you can assign a method that returns true to indicate that a value or instance should be considered empty and false otherwise.
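As a sketch of how that might look, assuming IsEmptyValue accepts a method that takes an element and returns bool (check the accompanying source for the exact delegate signature):

// Treat -1, rather than the default 0, as the "empty" marker
SparseArray<int> readings = new SparseArray<int>();
readings.IsEmptyValue = value => (value == -1);
// Condense() can now reclaim chunks whose elements are all -1
readings.Condense();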
Using the Code
Use of this code is not unlike how you might use a strongly typed fixed-size array. The main difference is that the upper bound of the SparseArray<T> grows as needed to accommodate writes to positions within the list. Reading from locations beyond the upper bound will not result in an exception: if someone reads from an index that is above the size of the array, the array remains unchanged. But if someone writes to a location above the upper limit, then the array will update its reported size and allocate an ArrayChunk if needed.
Class Interface
Properties

| Name | Type | Description |
|---|---|---|
| ChunkSize | int | Indicates the number of elements that each ArrayChunk can hold |
| Length | int | Returns the virtual size of the sparse array |
| ChunkCount | int | Returns the total number of chunks that the sparse array has |
| this[int index] | (generic) | Indexer for accessing the contents of the array |

Methods

| Name | Description |
|---|---|
| Condense() | Searches through the sparse array and removes the chunks that contain only empty values |
| ctor() | Default constructor; creates a sparse array with a chunk size of 256 elements |
| ctor(int size) | Creates a sparse array with a chunk size determined by the size parameter passed |
| ctor(int size, int chunkCount) | Creates a sparse array with a chunk size determined by the size parameter passed; chunkCount may be used as a hint for preallocating some memory |
Usage Example
SparseArray<int> myArray = new SparseArray<int>();
myArray[15] = 23;     // writes extend the virtual size as needed
myArray[1024] = 2;
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);
// reading beyond the upper bound returns EmptyValue and leaves the array unchanged
Console.Out.WriteLine("Value in position 2048 : {0}", myArray[2048]);
Console.Out.WriteLine("Virtual size of the array: {0}", myArray.Length);
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);
myArray[1024] = 0;    // 0 is the default EmptyValue for int
myArray.Condense();   // releases chunks that now contain only empty values
Console.Out.WriteLine("Chunk count: {0}", myArray.ChunkCount);