
The Performance of System.Xml - Insert Operations

1 Sep 2006, CPOL, 3 min read
The results of some performance tests of the insert-like operations of System.Xml
Sample Image - XmlInsertPerfomanceTest.jpg

Introduction

In this post, I will analyze the ways in which you can add nodes to an XmlDocument and discuss which way is the fastest, using the .NET 1.1 Framework.

How Can We Add Nodes to an XmlDocument?

There are four methods for adding nodes to an XmlDocument, all inherited from XmlNode:

  • XmlDocument.InsertAfter
  • XmlDocument.InsertBefore
  • XmlDocument.AppendChild
  • XmlDocument.PrependChild
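
As a quick illustration of where each method puts a node, here is a minimal sketch. The class name, the element names and the tiny inline document are assumptions made for the example; they are not part of the test program.

```csharp
using System;
using System.Text;
using System.Xml;

public static class InsertMethodsDemo
{
    // Starts from <root><b/></root>, adds one element with each of the four
    // methods, and returns the resulting child names in document order.
    public static string ChildOrder()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><b/></root>");
        XmlNode root = doc.DocumentElement;
        XmlNode b = root.FirstChild;

        root.AppendChild(doc.CreateElement("d"));       // becomes the last child
        root.PrependChild(doc.CreateElement("a"));      // becomes the first child
        root.InsertAfter(doc.CreateElement("c"), b);    // placed right after <b/>
        root.InsertBefore(doc.CreateElement("a2"), b);  // placed right before <b/>

        StringBuilder names = new StringBuilder();
        foreach (XmlNode child in root.ChildNodes)
        {
            if (names.Length > 0) names.Append(",");
            names.Append(child.Name);
        }
        return names.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(ChildOrder());   // prints "a,a2,b,c,d"
    }
}
```

InsertAfter and InsertBefore take a reference node as their second argument, while AppendChild and PrependChild always work at the ends of the child list.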

So, which method is the fastest?

To answer that question, I've created a test program that uses the content of an XML file (11KB) to create an XmlDocument in memory. The user then chooses which method to test. The result is reported in nanoseconds per call.

How Did You Make the Performance Test?

One could easily say that the easiest way to make a performance test is something like this:

C#
DateTime before = DateTime.Now;
// do the test...
TimeSpan elapsed = DateTime.Now.Subtract(before);  // end time minus start time

Unfortunately, this code ignores the other threads running on your system. Our performance test application isn't running alone and, most importantly, it isn't running all the time: the scheduler can preempt it in the middle of the measurement.

Hence, a good performance test must take these issues into account.

My performance test uses the following approach:

  1. There are two threads. One thread is sleeping for a certain period of time. I like to call it the TimerThread. While it's sleeping, the other thread is invoking the method to test, over and over again, until the first thread wakes up.
  2. The number of times that the method was invoked is registered in a variable.
  3. The TimerThread executes the following code: (timeThatWasSleeping*1000000)/counter. Here timeThatWasSleeping is the sleep time in milliseconds, and the counter variable holds the number of times that the method under test was called.
  4. The time in nanoseconds that the TimerThread calculates is the average time that one call of the method takes.
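
The arithmetic in step 3 can be sketched as a small helper. The class and method names are hypothetical, and the numbers in Main are illustrative, not measured. The sketch uses 64-bit arithmetic because 5000 * 1000000 is larger than int.MaxValue (2,147,483,647).

```csharp
using System;

public static class TimingMath
{
    // Converts a sleep interval in milliseconds and a call count into the
    // average number of nanoseconds per call (1 ms = 1,000,000 ns).
    public static long NanosecondsPerCall(int sleptMilliseconds, int calls)
    {
        if (calls <= 0) throw new ArgumentOutOfRangeException("calls");
        // Widen to long before multiplying so the product cannot overflow.
        return ((long)sleptMilliseconds * 1000000L) / calls;
    }

    public static void Main()
    {
        // Illustrative input: 100,000,000 calls during a 5000 ms sleep.
        Console.WriteLine(NanosecondsPerCall(5000, 100000000));   // prints 50
    }
}
```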

To minimize scheduler interference in the tests, I've given ThreadPriority.Highest to the thread that runs the test. Even so, the scheduler can still preempt that thread, so running the tests multiple times can produce different values.

Some Test Code

Let's look at some code, shall we?

The application is configured using some static variables.

C#
static int counter = 0;            // number of times the method under test was called
static volatile bool run = false;  // true while a test is running; read and written by different threads
static int timeMilisec = 5000;     // how long each test runs, in milliseconds
static XmlDocument xmlDoc = new XmlDocument();
static object mon = new object();  // Object used for synchronization
static Thread thTimer;

The Main() method is responsible for setting up the "test thread" and the "timer thread" based on the test that the user chose.

C#
static void Main(string[] args)
{
    string cmd = String.Empty;
    do
    {
        lock(mon)
        {
            if(run)
                Monitor.Wait(mon); // wait for the test to end.
        }
        Console.WriteLine("XmlDocument.AppendChild: a");
        Console.WriteLine("XmlDocument.InsertAfter: b");
        Console.WriteLine("XmlDocument.InsertBefore: c");
        Console.WriteLine("XmlDocument.PrependChild: d");
        Console.WriteLine("Quit: q");
        Console.Write("Enter your choice:");
        cmd = Console.ReadLine();
                    
        using (StreamReader sr = new StreamReader("news.xml"))
        {
            xmlDoc.LoadXml(sr.ReadToEnd());
        }
            
        run = true;

        Thread thw = null;
        switch(cmd)
        {
            case "a":
                thw = new Thread(new ThreadStart(TestXmlDocAppendChild));
                break;
            case "b":
                thw = new Thread(new ThreadStart(TestXmlDocInsertAfter));
                break;
            case "c":
                thw = new Thread(new ThreadStart(TestXmlDocInsertBefore));
                break;
            case "d":
                thw = new Thread(new ThreadStart(TestXmlDocPrependChild));
                break;
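            case "q":
                return;
            default:
                // Unknown choice: no test was started, so clear the flag and ask again.
                Console.WriteLine("Unknown choice.");
                run = false;
                continue;
        }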
        Console.WriteLine("\nTest started...");
        thTimer = new Thread(new ThreadStart(ThreadTimer));
        thw.Priority = System.Threading.ThreadPriority.Highest;
        thw.Start();
        thTimer.Start();
    }
    while(!cmd.Equals("q"));
}

For example, the thread that is testing the AppendChild method calls that method as many times as it can until the timer thread clears the run flag.

C#
private static void TestXmlDocAppendChild()
{
    while(run)
    {
        xmlDoc.AppendChild(xmlDoc.FirstChild);
        counter++;
    }
}
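
The other three test methods are not shown in this article. A minimal sketch of how they could look is given below, assuming they follow the same pattern of re-inserting a node that is already in the tree. The class name, method names and inline document are hypothetical. When a node that is already in the tree is inserted again, the DOM first removes it from its old position, so the number of children stays constant.

```csharp
using System;
using System.Text;
using System.Xml;

public static class MoveNodeSketch
{
    public static XmlNode LoadRoot()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><a/><b/><c/></root>");
        return doc.DocumentElement;
    }

    public static void InsertAfterOnce(XmlNode root)
    {
        root.InsertAfter(root.FirstChild, root.LastChild);    // first child moves to the end
    }

    public static void InsertBeforeOnce(XmlNode root)
    {
        root.InsertBefore(root.LastChild, root.FirstChild);   // last child moves to the front
    }

    public static void PrependChildOnce(XmlNode root)
    {
        root.PrependChild(root.LastChild);                    // last child moves to the front
    }

    // Concatenates the child element names in document order.
    public static string Order(XmlNode root)
    {
        StringBuilder sb = new StringBuilder();
        foreach (XmlNode child in root.ChildNodes) sb.Append(child.Name);
        return sb.ToString();
    }

    public static void Main()
    {
        XmlNode root = LoadRoot();
        InsertAfterOnce(root);
        Console.WriteLine(Order(root));   // prints "bca"
        InsertBeforeOnce(root);
        Console.WriteLine(Order(root));   // prints "abc"
        PrependChildOnce(root);
        Console.WriteLine(Order(root));   // prints "cab"
    }
}
```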

The timer thread executes the following code:

C#
private static void ThreadTimer()
{
    Thread.Sleep(timeMilisec);
    run = false;
    // Use 64-bit arithmetic: 5000 * 1000000 does not fit in a 32-bit int.
    Console.WriteLine("Result: " + ((long)timeMilisec * 1000000L) / counter + "ns\n\n");
    counter = 0;
    lock(mon)
        Monitor.PulseAll(mon);
}

The Results

  • AppendChild - 38ns
  • InsertAfter - 12ns
  • InsertBefore - 13ns
  • PrependChild - 14ns

Conclusion

The InsertAfter, InsertBefore and PrependChild methods have basically the same performance, at 12 to 14 ns per call. AppendChild is slower: at 38 ns, it takes roughly three times as long.

One interesting result is the difference between AppendChild and PrependChild. The first "adds the specified node to the end of the list of child nodes, of this node" and the second "adds the specified node to the beginning of the list of child nodes for this node." Hence, in these tests, it was faster to insert nodes at the beginning of the list of child nodes than at the end.

The Challenge

These tests were run on an Intel Pentium M processor at 1500 MHz with 512 MB of RAM, running Windows XP Pro. I would like to challenge everyone who runs these tests to post the results here.

I hope that you can improve your code with the help of this performance test.

History

  • 1st September, 2006: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer, Advantis
Portugal
Bruno Coelho is a Software Engineer from Portugal.

He has worked with .NET from version 1.1 through 3.5, applying the hard-won skills from ISEL, where he earned a degree in Informatics Engineering, to design critical applications using the best practices and design patterns in the industry.

His current interests are performance, best practices and design patterns.

Comments and Discussions

 
Measurement Principles
Alois Kraus, 13-Oct-06 5:36
Hi,

this is an interesting approach to measuring performance. When you claim that something is twice as fast as another API call, you need to get stable relative performance numbers (e.g. Claim: String.Format is 1.33 times faster than StringBuilder.AppendFormat). There is no need to switch to a different thread and play dirty tricks with the OS scheduler to get these results. DateTime.Now is fine if the test duration is long enough (2-3s). Stopwatch is better, as it lets you measure down to the nanosecond level. The shorter the test is, the less accurate it becomes because of random artefacts. The mitigation is very easy: run the test in a loop and use the mean value, so that the noise cancels itself out.

You do state in your article that the numbers change from run to run on your PC. This is a sign that your measured times are not stable and should not be published anyway. Something that cannot be reproduced is of no value, because how do you want to prove that your numbers are correct? How big is the deviation if you let the tests run 100 times?

Yours,
Alois Kraus
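
The loop-and-mean approach suggested in the comment above can be sketched as follows. This is an illustration, not the article's test program: it assumes .NET 2.0 or later (where System.Diagnostics.Stopwatch is available), and the class name, method names, inline document, and run and iteration counts are all hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Xml;

public static class StopwatchSketch
{
    // Times `iterations` AppendChild calls on a small document and returns
    // the average number of nanoseconds per call.
    public static double MeasureAppendChild(int iterations)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><a/><b/><c/></root>");
        XmlNode root = doc.DocumentElement;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            root.AppendChild(root.FirstChild);   // move the first child to the end
        }
        sw.Stop();

        // Convert Stopwatch ticks to nanoseconds using the timer frequency.
        double totalNs = sw.ElapsedTicks * (1e9 / Stopwatch.Frequency);
        return totalNs / iterations;
    }

    // Repeats the measurement and returns the mean; the standard deviation
    // across runs is returned through the out parameter.
    public static double MeanOfRuns(int runs, int iterations, out double stdDev)
    {
        double[] samples = new double[runs];
        double sum = 0;
        for (int r = 0; r < runs; r++)
        {
            samples[r] = MeasureAppendChild(iterations);
            sum += samples[r];
        }
        double mean = sum / runs;

        double squares = 0;
        foreach (double s in samples) squares += (s - mean) * (s - mean);
        stdDev = Math.Sqrt(squares / runs);
        return mean;
    }

    public static void Main()
    {
        double stdDev;
        double mean = MeanOfRuns(100, 100000, out stdDev);
        Console.WriteLine("Mean: " + mean + " ns, std dev: " + stdDev + " ns");
    }
}
```

Reporting the standard deviation next to the mean gives a direct answer to the question of how much the numbers vary between runs.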

