Threads, Processes, Memory Allocation, and Workstation Mode vs. Server Mode

15 Mar 2015, CPOL, 4 min read
What you may not realize about memory allocation and threads, and a little-known thing called "Server Mode"

Introduction

A few years ago I converted a single-threaded C++ application to a multithreaded one.  The application relied heavily on the STL, and each thread did a lot (and I mean a lot) of memory allocations and deallocations.  While that could have been optimized, the problem is (or was at that time) that the STL hides the memory allocations from you inside the various collection classes that my code was using, and I really wasn't interested in replacing the allocator.

Now, the interesting thing was that the multithreaded version was considerably slower than the single-threaded application.  This was very puzzling at first, because the work could be neatly divided across the available cores, the work items did not communicate with each other, and the only synchronization with the main thread was "here's some work to do", where the actual work easily accounted for 99% of the processing time compared to the queue locking mechanism.

So, I did some digging and discovered that the default memory allocator in C++ was, while thread safe, effectively a single-threaded function.  In other words, while one thread was requesting or releasing an allocation, any other thread trying to allocate or release memory had to wait.  Now, this was totally unacceptable, and my solution was to launch separate physical processes, one per CPU, to do the work, and to use pipes to communicate between the main application thread and those processes.  Again, because the communication overhead was so low, this was not an issue.  The result was finally what I expected to see: each core was now utilized at 100%, and the overall processing time dropped in proportion to the number of cores.  If I had 4 cores, the work took 1/4 of the time.  And by the way, we're talking about an analysis that could take days, if not weeks, on a single-core CPU.

I've always been curious how .NET behaved in a high allocation environment.  The results are documented in this article, and (no peeking) special thanks (though he'll never know it) to Craig Peters for his post on StackOverflow.

What Does .NET Do?

We'll first make sure things are working right.

Testing Non-Allocating Threads

Let's write a simple test case that doesn't do allocation, but instead just computes, over and over, the factorial of 100.  First, the setup:

static int FACTORIAL_OF = 100;

// Requires: using System; using System.Collections.Generic; using System.Threading;
static void ThreadTest()
{
  List<Thread> threads = new List<Thread>();
  List<Worker> workers = new List<Worker>();
  // One worker thread per logical processor.
  int n = Environment.ProcessorCount;

  for (int i = 0; i < n; i++)
  {
    Worker worker = new Worker(FactorialTest);
    Thread thread = new Thread(worker.DoWork);
    workers.Add(worker);
    threads.Add(thread);
  }

  threads.ForEach(t=>t.Start());

  Console.WriteLine("Press ENTER key to stop...");
  Console.ReadLine();

  // Ask every worker to stop, then wait for its thread to finish.
  workers.ForEach(w=>w.RequestStop());
  threads.ForEach(t=>t.Join());

  Console.WriteLine("Done");
}
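
The Worker class used above is not reproduced in the text. As a minimal sketch of what it might look like, assuming only what the setup code needs (a constructor taking the work to do, a DoWork method suitable as a thread entry point, and a RequestStop method), something along these lines would work. The field names and the use of a volatile flag are my assumptions, not necessarily how the original class is written:

```csharp
using System;
using System.Threading;

// Hypothetical reconstruction of the Worker class used by ThreadTest.
public class Worker
{
    private readonly Action work;

    // volatile so the write made by RequestStop on the main thread
    // is observed by the loop running on the worker thread.
    private volatile bool stopRequested;

    public Worker(Action work)
    {
        this.work = work;
    }

    // Thread entry point: repeat the work item until asked to stop.
    public void DoWork()
    {
        while (!stopRequested)
        {
            work();
        }
    }

    // Called from another thread to end the loop in DoWork.
    public void RequestStop()
    {
        stopRequested = true;
    }
}
```

With this shape, `new Worker(FactorialTest)` and `new Thread(worker.DoWork)` compile as written in the setup code, and each thread simply runs the same work item over and over until ENTER is pressed.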

And of course, the work to do:

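static void FactorialTest()
{
  // double rather than decimal: 100! is about 9.3e157, which overflows
  // decimal (maximum about 7.9e28) but fits in a double.
  double f = 1;

  // Start at 1; starting at 0 would make every product 0.
  for (int i = 1; i <= FACTORIAL_OF; i++)
  {
    f = f * i;
  }
}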

On my 8-core system, I see what I expect to see: all processors at 100%:

Image 1

Testing Allocations in Threads

Now let's try the same thing, but allocating 10,000 blocks of 16K each on the heap (not the stack), which we immediately discard for the next round of 10,000 allocations.  Instead of initializing a worker thread to compute factorials, we tell it to do memory allocations:

Worker worker = new Worker(AllocationTest);

Implemented as:

static int ALLOCATIONS = 10000;
static int ALLOCATION_SIZE = 16384;

static void AllocationTest()
{
  // Console.WriteLine(AppDomain.CurrentDomain.FriendlyName);
  object[] objects = new object[ALLOCATIONS];

  for (int i = 0; i < ALLOCATIONS; i++)
  {
    objects[i] = new byte[ALLOCATION_SIZE];
  }
}

Here's the result:

Image 2

Oh my.  Only 33% CPU utilization, and only four of the cores are actually doing anything.  So, we've learned that, as with C++, memory management in .NET blocks other threads when we allocate heavily.  No big surprise, really.  By the way, the memory usage never exceeded about 1GB.  Remember, we're allocating 10,000 blocks of 16K each, or about 164MB per thread (10,000 x 16,384 bytes), so on my 8-core system this would amount to about 1.3GB at most, which is in line with the bouncing around I saw in the memory usage.
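
One way to check whether garbage collection is involved in the stall is to count collections while the workload runs. The sketch below is only an illustration: it relies on GC.CollectionCount(0), which reports how many generation 0 collections have happened in the process so far, and the names GcCounter and CountGen0Collections are made up for this example.

```csharp
using System;

// Hypothetical helper, not part of the original test suite.
static class GcCounter
{
    const int Allocations = 10000;
    const int AllocationSize = 16384;

    // Runs the allocation workload `rounds` times and returns how many
    // generation 0 collections the process performed in the meantime.
    public static int CountGen0Collections(int rounds)
    {
        int before = GC.CollectionCount(0);

        for (int r = 0; r < rounds; r++)
        {
            object[] objects = new object[Allocations];

            for (int i = 0; i < Allocations; i++)
            {
                objects[i] = new byte[AllocationSize];
            }
        }

        return GC.CollectionCount(0) - before;
    }
}
```

Calling `GcCounter.CountGen0Collections(10)` from a worker and printing the result gives a rough idea of how often the collector runs. A high count alongside low CPU utilization would be consistent with the threads spending much of their time waiting on collections, though it does not prove it. Note that the count is process-wide, so collections triggered by other threads are included.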

Testing Allocations in Separate Processes

So let's try running this as separate processes.  Here's the code (including a very ungraceful Kill call to the processes):

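// Requires: using System; using System.Collections.Generic; using System.Diagnostics;
static void ProcessTest()
{
  List<Process> processes = new List<Process>();
  // One worker process per logical processor.
  int n = Environment.ProcessorCount;

  for (int i = 0; i < n; i++)
  {
    Process p = Process.Start("ProcessWorker.exe");
    processes.Add(p);
  }

  Console.WriteLine("Press ENTER key to stop...");
  Console.ReadLine();

  // Kill is asynchronous, so wait for each process to actually exit.
  processes.ForEach(p => { p.Kill(); p.WaitForExit(); });

  Console.WriteLine("Done");
}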
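
ProcessWorker.exe itself is not shown here. As a rough sketch, assuming it does nothing more than run the same allocation workload in an endless loop until the parent kills it, it could look like the following. The class layout and the return value of AllocationTest are my assumptions for illustration:

```csharp
using System;

// Hypothetical stand-in for ProcessWorker.exe.
static class ProcessWorker
{
    const int Allocations = 10000;
    const int AllocationSize = 16384;

    // Same workload as AllocationTest in the threaded version.
    // Returns the number of blocks allocated in this round.
    internal static int AllocationTest()
    {
        object[] objects = new object[Allocations];

        for (int i = 0; i < Allocations; i++)
        {
            objects[i] = new byte[AllocationSize];
        }

        return objects.Length;
    }

    static void Main()
    {
        // Runs until the parent process calls Process.Kill().
        while (true)
        {
            AllocationTest();
        }
    }
}
```

Because each instance is its own process, each one gets its own managed heap and its own garbage collector, which is the whole point of this test.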

Here are the results:

Image 3

Ah, now, because each test is running in its own process, the memory allocations do not block across processes.

Workstation Mode vs. Server Mode

As I mentioned at the start of the article, thanks to Craig Peters for this gem.  Let's go back to the threaded allocation code, but now we'll add this to our app.config file:

<configuration>
  <runtime>
    <gcServer enabled="true"/>
  </runtime>
</configuration>

And the result:

Image 4

Oh my, look at that.  We're getting on average 75% CPU utilization, and each core is mostly busy doing its thing.
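
If you want to confirm that the setting was actually picked up, the runtime exposes it through System.Runtime.GCSettings. The helper below is a small sketch; GcModeReport and Describe are names invented for this example:

```csharp
using System;
using System.Runtime;

// Hypothetical helper for reporting which GC flavor is active.
static class GcModeReport
{
    // Describes the garbage collector mode of the current process.
    public static string Describe()
    {
        string mode = GCSettings.IsServerGC ? "Server" : "Workstation";

        return mode + " GC, latency mode " + GCSettings.LatencyMode
            + ", " + Environment.ProcessorCount + " logical processors";
    }
}
```

Printing `GcModeReport.Describe()` at the top of ThreadTest shows whether the process is really running in Server mode. As far as I know, the runtime falls back to the workstation collector on a single-processor machine even when gcServer is enabled, so this check is worth a line of output.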

Conclusion

I think you can reach your own conclusion here.  If you have threads that are very memory allocation intensive, you are probably being deceived into thinking that you are gaining much improvement over a single-threaded application.  If you don't mind a 25% loss in performance, setting the garbage collector to Server mode is a neat trick.  But if you really want to maximize your performance, create separate processes.  Of course, all of this is irrelevant if the worker thread is doing something that doesn't require allocation and garbage collection of memory!  Regardless, this little suite of tests should give you some pause when considering how to design a multithreaded application with all those other fancy features we have in C# now, such as Task, async/await, and so forth.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
Architect Interacx
United States
Blog: https://marcclifton.wordpress.com/
Home Page: http://www.marcclifton.com
Research: http://www.higherorderprogramming.com/
GitHub: https://github.com/cliftonm

All my life I have been passionate about architecture / software design, as this is the cornerstone to a maintainable and extensible application. As such, I have enjoyed exploring some crazy ideas and discovering that they are not so crazy after all. I also love writing about my ideas and seeing the community response. As a consultant, I've enjoyed working in a wide range of industries such as aerospace, boatyard management, remote sensing, emergency services / data management, and casino operations. I've done a variety of pro-bono work for non-profit organizations related to nature conservancy, drug recovery and women's health.
