High Resolution Timing

19 Dec 2019 · CPOL · 3 min read
A simple, header-only class for high resolution timing

Introduction

This is a very simple, header-only class that can be used for high-resolution timing.

Background

I first posted an article about this class at this site's predecessor (codeguru.com) about twenty years ago. Since then, I have seen many implementations of timers using the QueryPerformance... family of functions, but none of them works the way I like. This one differs from those in its simplicity of use: one timer instance can be used for an entire multi-threaded application.

The methods of this class, once initialized, are fully reentrant. Since one instance can suffice, there should be only one initialization and that is at construction. This class retains only three pieces of information that are all set at construction so it is very lightweight. They are the frequency of the counter, the period of the counter, and a flag indicating the availability of the performance counter. I have never encountered a desktop or laptop machine where it is not available but the flag is there for verification. While only one instance of this timer is necessary, it is light enough that construction is a short process so multiple instances can be used if preferred.

This class works by reading the Windows performance counter, converting that reading to seconds, and subtracting a previously obtained time from it. The conversion multiplies the tick count by the period of the counter, which is the reciprocal of its frequency. The result is the time that has elapsed since the earlier call of the Since method. Over the years, I have found the counter frequency to range from about 1 MHz in early Windows NT releases to 10 MHz on current machines.
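The tick-to-seconds arithmetic can be sketched on its own, without any Windows calls. The helpers `TicksToSeconds` and `ElapsedSeconds` below are hypothetical names used only for illustration and are not part of Elapsed.h. The sketch subtracts the raw tick counts first and converts afterwards, which is equivalent up to rounding to converting each reading to seconds and then subtracting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of Elapsed.h: convert a raw tick count to
// seconds, given the counter frequency in ticks per second.
inline double TicksToSeconds( std::int64_t ticks, std::int64_t frequency )
{
    assert( frequency > 0 );

    // the period is the reciprocal of the frequency, in seconds per tick
    const double period = 1.0 / static_cast<double>( frequency );
    return static_cast<double>( ticks ) * period;
}

// Hypothetical helper: elapsed seconds between two raw counter readings.
inline double ElapsedSeconds( std::int64_t beginTicks,
                              std::int64_t endTicks,
                              std::int64_t frequency )
{
    return TicksToSeconds( endTicks - beginTicks, frequency );
}
```

For example, with a 10 MHz counter, readings of 1,000 and 26,000 ticks are 25,000 ticks apart, so `ElapsedSeconds( 1000, 26000, 10000000 )` comes to roughly 0.0025 seconds.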

Using the Code

Here is a listing of the class. As mentioned, it is a header-only class.

C++
//
// Elapsed.h - a simple class for high resolution timing
//
// © Copyright 1998-2019 by Rick York
// This class is free for use in any and all applications for any purpose.
//
// To use this class, include this header and do the following:
//
// Elapsed timer;                              // instantiate an object
// double start = timer.Since( 0 );            // start timing
// ...        // timed code goes here
// double elapsed = timer.Since( start );      // get elapsed time in seconds
// printf( "elapsed time was %.3f seconds\n", elapsed );
//
// Note that by doing the timing like this, one timer can be used many times
// simultaneously since it is only the delta in time that is being measured
// and the starting time is kept by the application, not in the class.
// If it were, then one instance could time only one thing. This implementation
// can time multiple things by multiple threads with one instance.
//
// One more thing - timing always has a certain amount of overhead. It is best
// to time events that last much longer than that overhead. For this class, the
// overhead is on the order of nanoseconds so if events taking at least microseconds
// are timed, then the overhead is a very small fraction of the total time taken.

#pragma once
#define ELAPSED_H
// #include "Elapsed.h"

#include <Windows.h>       // for LARGE_INTEGER, INT64, and the QueryPerformance... functions

class Elapsed
{
public :
    Elapsed()        // constructor
    {
        // get the frequency of the performance counter and determine its period

        LARGE_INTEGER li;
        if( QueryPerformanceFrequency( &li ) )
        {
            m_Available = true;
            m_Frequency = li.QuadPart;
            m_Period = 1.0 / (double)m_Frequency;    // period is in seconds
        }
    }

    // obtain elapsed time in seconds

    inline double Since( double begin=0 )
    {
        // get current performance counter value, convert to seconds,
        // and return the difference from the begin argument

        LARGE_INTEGER endtime;
        QueryPerformanceCounter( &endtime );
        return ( endtime.QuadPart * m_Period ) - begin;
    }

    // returns true if the counter is available

    bool    IsAvailable() const              { return m_Available; }

    // return the counter frequency

    INT64   GetFrequency() const             { return m_Frequency; }

    // return the period of the counter

    double  GetPeriod() const                { return m_Period; }

protected :
    bool    m_Available                      { false };
    double  m_Period                         { 0 };
    INT64   m_Frequency                      { 0 };
};

Here is a snippet of code that shows it in action. This illustrates how one can measure the overhead of calling the Since() function to measure the elapsed time.

C++
void MeasureTimerOverhead( int loopCount )
{
    double temp = 0;   // this is to prevent the calls to Since from being optimized away

    Elapsed timer;
    double start = timer.Since( 0 );   // acquire initial counter value

    for( int n = 0; n < loopCount; ++n )
    {
        temp = timer.Since( start );
    }

    double elapsed = timer.Since( start );   // obtain the elapsed time in seconds

    printf( "time for Since calls was %9.6f seconds\n", elapsed );

    // determine the average time per loop iteration and scale to microseconds

    double average = 1.0E6 * elapsed / loopCount;
    printf( "average time for Since : %9.6f microseconds\n", average );
}

Executing this function on my machine, an i9-9900X at 3.5GHz, for 40M loops results in an average time of 0.0163 microseconds or about 16 nanoseconds. This includes the time taken to increment and compare the loop's iteration counter but other experiments indicate this is a small fraction (< 1%) of the total time.
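The header comment notes that one instance can time several things at once because the caller, not the class, holds each starting time. Here is a minimal sketch of that idea. So that it builds on any platform, it uses a stand-in class, `PortableElapsed`, backed by `std::chrono::steady_clock` rather than the performance counter. That class, the `Overlap` struct, and the `TimeOverlapping` function are assumptions made for illustration and are not part of Elapsed.h.

```cpp
#include <cassert>
#include <chrono>

// Portable stand-in for Elapsed (an assumption for illustration): it offers
// the same Since( begin ) contract but reads std::chrono::steady_clock.
class PortableElapsed
{
public :
    // seconds since the clock's epoch, minus the caller-supplied begin time
    double Since( double begin = 0 ) const
    {
        using namespace std::chrono;
        const duration<double> now = steady_clock::now().time_since_epoch();
        return now.count() - begin;
    }
};

// Hypothetical result type: two intervals measured with one timer instance.
struct Overlap
{
    double outer;
    double inner;
};

// Time two overlapping intervals; each start value lives in the caller.
inline Overlap TimeOverlapping( const PortableElapsed & timer, int loopCount )
{
    assert( loopCount >= 0 );
    volatile double sink = 0;   // keeps the busy loops from being removed

    const double outerStart = timer.Since( 0 );
    for( int n = 0; n < loopCount; ++n )
        sink = sink + n;

    const double innerStart = timer.Since( 0 );   // second interval starts later
    for( int n = 0; n < loopCount; ++n )
        sink = sink + n;

    Overlap result;
    result.inner = timer.Since( innerStart );
    result.outer = timer.Since( outerStart );
    return result;
}
```

Because the timer holds no per-measurement state, the same pattern applies when the two start values belong to different threads: each thread keeps its own `double` and passes it back to `Since`.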

Points of Interest

My testing has demonstrated that the call to the timer's Since method has very low overhead. However, since the counter's frequency is in the megahertz range, I consider the timer's accuracy to be in the range of tens of microseconds. My reasoning is that a 10 MHz counter has a period of 0.1 μs, which is 1% of 10 μs, so you will be reasonably accurate when timing intervals of 10 μs (0.01 ms) and longer. Yes, I know this is not a rigorous analysis, but I consider it useful to know at least the order of magnitude of the accuracy. I rarely need to measure less than tenths of milliseconds, so this timer has been very useful for me. I hope it is for you too.
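The rule of thumb above can be written as a one-line calculation: the fraction of a measured interval that a single counter tick represents. The helper name `RelativeResolution` is hypothetical, and the sketch covers only tick granularity, not call overhead or scheduling noise.

```cpp
#include <cassert>

// Hypothetical helper: the fraction of a measured duration that one counter
// tick represents (0.01 means one tick is 1% of the interval).
inline double RelativeResolution( double frequencyHz, double durationSeconds )
{
    assert( frequencyHz > 0 && durationSeconds > 0 );

    const double period = 1.0 / frequencyHz;   // seconds per tick
    return period / durationSeconds;
}
```

With a 10 MHz counter, `RelativeResolution( 10.0e6, 10.0e-6 )` is about 0.01, the 1% figure quoted above, and a 1 ms interval brings it down to about 0.0001.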

History

  • 19th December, 2019: Initial submission

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
United States
I work on an industrial HPC project that can run on either the CPU or the GPU. I usually use whichever one has the most horsepower on a given machine. It's written with CUDA with very few ifdefs used. My company is quite large, in the top five in our industry in North America, and I work in a small group with just five programmers.
