Basic Instrumentation and Profiling Framework for C++

Paul M Watt

Rate me:

5.00/5 (15 votes)

9 Oct 2008CPOL27 min read

76.5K

979

Macro based framework written for C++ applications that will allow basic instrumentation and profiling features

Introduction

I have been developing user level applications for Windows CE based devices for about 5 years now. The quality of the development tools for this platform has improved considerably in those 5 years. However, the tool quality still pales in comparison to what is available to a Windows Desktop developer. A debugger exists for these mobile devices, however it is not always possible to use the debugger. Many times when debugging an application a developer will be reduced to using MessageBox or writing data to a log file.

For some of the projects that I have helped develop, the developers have created a series of C pre-processor macros for instrumenting their code. These macros can be selectively compiled into the application when instrumentation reporting is desired. Normally the output is written to a log file. Typically a macro is created to indicate when a function is entered, when it is left, the input parameters and the return values, and even reporting macros for certain checkpoints in the code.

I like the verbose output that a report like this provides, however, just about every macro scheme that I have encountered was just as verbose. Quite often these macro entries interfere with the readability of the code, they are cumbersome to fill out. They use the printf mechanism to report data values, and because the C Preprocessor cannot overload macros, parameter reporting macros often have multiple versions defined for the number of parameters to be reported. The name often ends with a "_x" where x is the number of parameters to be reported. These basic solutions are also generally not threadsafe, and often severely impact the performance of the application because of IO operations with the file system.

I decided to try to harness the power of C++, and create an instrumentation framework that exhibits these characteristics:

Very little work to instrument code with
Brief syntax
Automatically report any information that can be determined by the compiler
Portable to C++ compilers for Windows Desktop and Windows CE
Simple message syntax
Thread-safe
Takes advantage of the stack to automatically instrument scope level
Configurable reporting level
Eliminate cumbersome macro entries
Minimal impact on application performance
Portability possible for other platforms

This article documents the framework that I have developed. I was able to accomplish most of the goals listed above. Some of the goals were only able to be mere improvements over the code that inspired this framework. For most of the goals however, I believe I have created a much better model that will make it possible to easily instrument the code.

Along the way, I discovered that I could not only instrument the code with basic output logging, but I could also include a basic profiling framework for very little extra effort. Here is a list of the profiling features that this framework provides:

Function Entry Count
Function Usage Time
Total Function Usage
Ability to Pause and Resume the usage clock to account for system blocking operations

Framework

Let's first describe the framework for the readers that are more interested in using this code rather than how it works. The system is contained in a single header file called FnTrace.h. There are approximately ten to fifteen macros designed to be used by the developer to assist in code implementation. Here are the names of these commands:

FN_CHECKPOINT
FN_ENABLE_PROFILE
FN_INSTRUMENT_LEVEL
FN_INSTRUMENT
FN_INSTRUMENT_THREAD
FN_ENTRY
FN_FUNC
FN_FUNC_RETVAL
FN_FUNC_PARAMS
FN_OUTPUT
FN_PARAM
FN_PAUSE
FN_RESUME
FN_RETVAL

Each of these macros provide a specific purpose, and not all of them are required to be used. The selection of macros will be based on what the developer would like to accomplish.

Requirements

This section describes the basic requirements for the FnTrace framework:

Visual Studio 2005 or greater for best results
Boost Preprocessor Library
C++ source code

Usage

This section will describe the actions that are required in order to enable function instrumentation for your application. All of the macros that are intended for direct use will also be documented and explained.

Activate Instrumentation

In order to activate instrumentation, the following macro will need to be defined before the FnTrace header is included in a file. If you are using precompiled headers, I would suggest that you add the include the header file in your precompiled header.

C++

#define FN_INSTRUMENT

Declare Default Log Level (Optional)

You can declare your desired instrumentation log levels for your application. If you do not define these log levels, the default level of FN_5 will be used, which is the maximum level of detail. There is also a MACRO that will be described later that can be called which will allow the level of logging to be changed at runtime.

C++

#define FN_FUNC_LEVEL_DEF FN_3

FN_FUNC_LEVEL_DEF:This MACRO defines the default log level for reporting function level operations. This includes function entry and exit, profiling, and general output statements. If the log level is less than the defined level for a function, the statements found in that function will not be reported to the instrumentation log.

C++

#define FN_PARAM_LEVEL_DEF FN_5

FN_PARAM_LEVEL_DEF:This MACRO defines the default log level for reporting function parameter values. If the log level is less than the defined level for a function, the parameter output statements found in that function will not be reported to the instrumentation log. If this log level is set to a value less than the function log level, it will automatically be moved to the same level as the FN_FUNC_LEVEL_DEF value.

#Include "FnTrace.h"

Include the FN Trace framework header in each of the files that you intend to instrument. If you are using precompiled headers, I would suggest that you add the include the header file in your precompiled header.

Change Instrumentation Levels at Runtime

The MACRO in this section can be used to change the instrumentation level dynamically at runtime. This could be attached to a UI option or part of a message processor in your main windows WndProc.

C++

FN_INSTRUMENT_LEVEL(F,P);

F: The new function trace level, FN_0 through FN_5
P: The new parameter trace level, FN_0 through FN_5

Initialize Thread Instrumentation

For each thread that will be instrumented, the following entry will need to be defined at the beginning of the thread function to initialize the thread state for the framework:

C++

FN_INSTRUMENT_THREAD;

Enable Profiling (Optional)

If function profiling is desired, then declare the following MACRO at the top of your main function.

C++

FN_ENABLE_PROFILE(1);

You can turn function profiling off at a later point in the program at run time by calling the same macro with a value of 0.

C++

FN_ENABLE_PROFILE(0);

Enable Instrumentation for a Function

For each function that you would like to instrument in your application, add the following MACRO to the top of the function scope inside of the braces:

C++

FN_FUNC(retType, level);

ex:
bool IsEmpty()
{
    FN_FUNC(bool, FN_1);
    ...
}

retType: The type of the return value for the function. For functions that have a void return type, use the FN_VOID define as the return type.
level: The tracing level of this function.

Notify Framework of Return Value

The FN Framework is capable of reporting the value that is returned by a function when the function exits. In order to enable this feature, a programming convention will need to be followed. You will need to declare the macro from this section, as well as allocate space for a return value. The name of the return variable is passed to the macro, and at the function exit, this variable should be in the return call.

C++

FN_RETVAL(retVar);

ex:
bool IsEmpty()
{
    ...
    bool retVal = false;
    FN_RETVAL(retVal);
    ...
    retVal = true;
    return retVal;
}

retVar: The name of variable that will be returned when the function exits.

Enable Function Instrumentation with Return Value

This section is a combination of the previous two macros. This macro can be defined, and make this a one-step process.

C++

FN_FUNC_RETVAL(retType, retVar, level);

ex:
bool IsEmpty()
{
    bool retVal = false;
    FN_FUNC_RETVAL(bool, retVal, FN_1);
    ...
    retVal = true;
    return retVal;
}

retType: The type of the return value for the function. For functions that have a void return type, use the FN_VOID define as the return type.
retVar: The name of variable that will be returned when the function exits.
level: The tracing level of this function.

Enable Full Function Instrumentation

For each function that you would like to instrument in your application, declare the MACRO from this section at the top of the function scope after the braces. This macro will initialize the appropriate state to report the name of the function, each of the parameters that you indicate, and report the return value when the function exits. This MACRO will also initiate profiling for this function if you have enabled it for the framework.

In order to use this version of the MACRO, you will need Visual Studio 2005 or greater because of the variadic pre-processor operator ...

C++

FN_FUNC_PARAMS(retType, retVar, level, n, ...);

ex:
BOOL SetRect(LPRECT lprc, int xLeft, int yTop, int xRight, int yBottom)
{
    BOOL retVal = FALSE;
    FN_FUNC_RETVAL(BOOL, retVal, FN_1, 5, lprc, xLeft, yTop, xRight, yBottom);
    ...
    retVal = TRUE;
    return retVal;
}

retType: The type of the return value for the function. For functions that have a void return type, use the FN_VOID define as the return type.
retVar: The name of variable that will be returned when the function exits.
level: The tracing level of this function.
n: The number of parameters to be reported.
...: All of the parameters to be reported in a comma delimited list.

Any parameter type can be reported as long as there is a conversion available to the ostream operator<< for its type. Therefore in the example above, somewhere in the application, a definition must be declared for the RECT struct of the following:

C++

// Sample OStream implementation for the pointer to RECT struct
inline std::wostream& operator<<(std::wostream& os, const RECT* _Val)
{
    if (_Val)
    {
        os << "RECT(" << _Val;
        os << "): left " << _Val->left;
        os << ", top " << _Val->top;
        os << ", right " << _Val->right;
        os << ", bottom " << _Val->bottom;
    }
    else
    {
        //C: A Null pointer was passed in.
        os << L"(null);
    }

	return os;
}

// output:
// RECT(0034EF5A): left 0, top 0, right 640, bottom 480

Report Messages in Function Scope

Use the FN_OUTPUT MACRO to report any messages to the instrumentation log in the current function scope. The output mechanism for the FnTrace framework is based on the C++ ostream objects. Output strings are formed the same way you would write messages to cout. The default path for output is reported to OutputDebugString, which can be captured by your debugger or an external application running on your machine that will listen to the debug stream.

C++

FN_OUTPUT(output);

ex:
TCHAR windowName[100];
::_tcscpy(windowName, _T("Test Application"));
...
FN_OUTPUT(L"The current window name is: " << windowName);

output: The message to report to the instrumentation log formatted in an ostream.

The output stream can be redirected from OutputDebugString to the std::clog by declaring this macro before including the FnTrace header.

C++

#define FN_OUTPUT_CLOG

Parameter Reporting

You can use the following macro to report the name and value of a single variable or function parameter.

C++

FN_PARAM(param);

ex:
int x = 100;
FN_PARAM(x);

//result:
//x = 100

param:The variable for which to display the output.

Selective Function Instrumentation

Some functions are called so frequently that if you were to log every single entrance into the function, it would quickly fill your buffers. In this scenario, you would have too much information and it would no longer be beneficial to instrument your application because there would be too much data to sift through without automated tools.

A perfect example of this is a WndProc function for a user defined window in the WIN32 API. Most of the time this function is called, the default Windows message handler is perfectly acceptable. In this particular scenario, you are most likely not concerned with the entrance of this function. However, you may still be interested in hitting certain points in the WndProc function, such as when the user triggers a WM_COMMAND message that you have handled.

These next two MACROs are meant to be used in conjunction to provide selective instrumentation for a function. The first macro will prepare a function scope for instrumentation, the second MACRO is an instrumentation checkpoint. If a checkpoint is reached, then data will be printed out. If the function is entered, but a checkpoint is never reached, then no data will be reported for the function.

FN_ENTRY: Use this MACRO in place of the FN_FUNC_x class macros. This MACRO will declare function state:

C++

FN_ENTRY(level);

level: The tracing level of this function.
FN_CHECKPOINT: Use this MACRO to declare a checkpoint in the current function scope.

The FN_ENTRY: MACRO must have been declared at some point in the current function scope before this MACRO otherwise a compiler error is reported.

C++

FN_CHECKPOINT(label, level, n, ...);

ex:
    ...
    case IDM_HELP_ABOUT:
        {
            FN_CHECKPOINT("IDM_HELP_ABOUT", FN_5, 4, hWnd, message, wParam, lParam);
            DialogBox(g_hInst, (LPCTSTR)IDD_ABOUTBOX, hWnd, About);
        }
        break;
    ...

label: The label to be reported for this checkpoint.
level: The tracing level of this function.
n: The number of parameters to be reported.
...: All of the parameters to be reported in a comma delimited list.

Any parameter type can be reported as long as there is a conversion available to the ostream operator<< for its type. Refer to the FN_FUNC_PARAMS MACRO for more details.

Pause and Resume Profile Timing

If function profiling is enabled, this MACRO can be used to stop and start the timer for a function. This could be useful when the function calls a kernel function that blocks for a long period of time, such as GetMessage, or WaitForSingleObject.

C++

FN_PAUSE();
FN_RESUME();

ex:
    // Main message loop:
    FN_PAUSE();
    while (GetMessage(&msg, NULL, 0, 0))
    {
        FN_RESUME();
        if (!TranslateAccelerator(msg.hwnd, hAccelTable, &msg))
        {
            TranslateMessage(&msg);
            DispatchMessage(&msg);
        }

        FN_PAUSE();
    }

Implementation

The FnTrace Framework is built around a set of objects that are created on the stack when a tracing scope is entered. The constructor of the object will log important startup state and the destructor will report exit state before the function exits. There are three types of objects that are created to facilitate the framework. I will briefly describe each of the objects and their purpose later in this section.

All of the object definitions and global variables are declared within the namespace fn{...} in order to prevent cluttering of the global namespace. This fact should be completely transparent to a developer using the FnTrace framework because all access is done through the MACROs. No objects should be called directly.

The following sub-sections will document the basic path I followed to arrive with the current implementation. I will indicate my initial motivation for a design, what possibly worked and what failed, why I changed it, and I will indicate what the final solution was for each of the major components of the framework.

Message Reporting

While developing the framework, I used cout in my MACROs to quickly get the framework up and running. My intention was then to write a message queue or packet object to transmit the calls to a different process. That external process could then handle the synchronization of the messages, as well as the final storage to a file or some other medium for analyzing. The message queue system was also going to be portable across platforms, starting with Windows CE, and Windows Desktop, but moving on to Linux and other platforms.

Once I reached the point in my development to create this object, I remembered the Win32 function OutputDebugString. This function can be used to report messages to a debugger attached to your application. I may well end up creating that portable message queue class in the future, but for now, by using OutputDebugString I am able to ignore an entire chunk of code while I am solely developing with the WIN32 API.

The one downside, is that I need to have a debugger attached to my application while it is running to listen for the debug strings. The good news is that there are plenty of debugging tools that you can run on your system that are not actually debuggers, but tools that listen to the OutputDebugString data for your application. One such example is Debug View by Windows SysInternals.

The only challenge in making this work was to create a std::wostringstream object for each thread, to record the state of a thread. Then when the data was written to the OutputDebugString function, the read buffers of the ostream had to be emptied, or set back to the beginning. For more details, look at the definition to the FN_OUTPUT MACRO.

In an eye towards portability, I also made it possible to redirect the output to the std::clog. However, since I have not used this yet, no effort has been made to make this path thread-safe.

Function Name

My initial vision for this framework was to find a way to declare a macro that could extract all of the information that it needed at compile time from the declaration of the function, or behind the scene in the disassembly of the program. I did not like the fact that all of the basic instrumentation MACRO implementations that I had seen required the developer to type in the name of the function. This made instrumenting the application cumbersome and error prone. If you cut and paste, it would be too easy to either forget to fill in the correct name of the function when you first instrument, or change the name in each place the instrumentation MACRO is used in a function if the function name changes.

My first thought was to use the compiler defined MACROS __FILE__ and __LINE__ to report the current file name and line number respectively. I am already well aware of these macros from a clickable REMINDER MACRO that I, as well as many other developers use. My intention was to write an analyzer tool that would use the file name and line number, as well as the map file generated by the compiler and create a slick UI to more easily analyze an instrumentation trace. The benefit of doing it this way, is that I could write short cryptic messages to indicate state, and they could be post-processed at analyzing time in order to hopefully reduce the log size and impact on the application while it is executing.

The first reason I dumped this idea, is that it was another piece of software that I would have to write. Sadly it is not the reason that I did not go this route. The real reason is that I found out about some Microsoft Specific pre-processor macros, namely __FUNCSIG__. This macro reports the name of the function, as well as the calling convention, return type, and parameter types of the function. There was no need for me to write a decoding tool anymore. I have all of the information that I needed to extract the name of the function, the return type and all of the parameters.

Return Value

Another aspect that I have not liked about the instrumentation MACROs I have seen up until now, is that in order to report a return value of a function, a MACRO such as this would have to be reported at each and every return point in the function:

C++

    ...
    INSTRUMENT_OUT_1("FunctionA: returns %s", name);
    return name;
}

Again, the name of the function would have to be typed in, a cumbersome printf format string is required, and this is a copy of the MACRO that is required at each and every return point. If one of the points is missed, or added at a later date, the exit of the function will be missed.

In order to tackle this, I created the an object on the stack at the entry of the function. During the destructor of this object, I was going to search the stack or the function registers for the return value, and report it. I first went looking at the EAX register, which is the conventional register that values are returned in for a function in the x86 architecture. I found out quickly that this would not work because of the fact that the destructor itself is in a different stack frame than the function that I am instrumenting and the EAX register currently has nothing to do with the desired return value, and I could not find a reliable way to walk the stack for this value. That would make this solution very fragile and non-portable. I would also have to write decode the assembly for each architecture I was trying to run on for the Windows CE platform.

One intermediate solution to this problem was to use a MACRO that redefines the return keyword. That didn't work for obvious reasons, so I tried a MACRO such as this RETURN(retVal). Inside the MACRO, the variable retVal would be assigned to the retval stored in the functions implementation object, and would be printed out when the destructor was called. The developer would need to use this macro everywhere that it returned from a function call. This is still error prone and looks a little funny.

My final solution, was to pre-declare a return variable, and store a pointer to that value in my function stack object. Then when the developer assigns the return value to the return variable, I have access to the value in my stack object. This solution requires the user to follow a convention, however, it is a small price to pay for a big reward.

C++

    bool retVal = false;
    FN_RETVAL(retVal);
    ...
    if (x < minVal)
    {
        return retVal;
    }
    ...
    retVal = true;
    return retVal;
}

Parameter Values

I probably spent most of my development time working on a simple and robust way to report the parameter values for a function call. I really did not like the way that most TRACE macros are defined in the form:

C++

#define TRACE_1(msg, p0)
#define TRACE_2(msg, p0, p1)
#define TRACE_3(msg, p0, p1, p2)
#define TRACE_4(msg, p0, p1, p2, p3)
#define TRACE_5(msg, p0, p1, p2, p3, p4)
#define TRACE_6(msg, p0, p1, p2, p3, p4, p5)
#define TRACE_7(msg, p0, p1, p2, p3, p4, p5, p6)

Just to get the ball rolling, I declared this macro:

C++

#define FN_PARAM(param)   FN_OUTPUT(#param << " = " << ##param##)

This required a FN_PARAM definition for each parameter to be reported, but to me it felt better than the multiple numbered MACROs from above with the printf format string. However, it was still cumbersome, and very verbose, much worse than the method I was trying to avoid. This MACRO still exists in the framework, but it is left as a convenience and some of the other MACROs actually call this MACRO to do their work.

I started investigating the Stack Frames assembly, and I looked at the x86 Base Pointer, EBP (also known as Frame Pointer). From here, I could walk up the stack, and the first parameter was going to be EBP-8h, and the second parameter EBP-12h and so on. As I started down this path, I quickly realized I would be back in to the printf dilemma, because I would no longer have the type information at compile time. I was going to have to do some extra processing to discover the type at runtime.

I was already using the __FUNCSIG__ compiler macro to get the function definition, so I thought that I would go ahead and use the defined values in the function definition to decode the types and generate output from that path. At this point I decided to add a second object on the stack in the function declaration MACRO, only this second object would be static. This would allow the object to only be created once when the function was first called. This way, it could hold on to all of the type data that I calculated for the function, and I would only have to do it once. This turned out to be much more difficult than I had first thought. However, it was still a great step, because it led to the extra profiling capabilities that the framework now provides.

After realizing the deficiencies of the automatic stack method, I thought about what other problems could arise. A major problem with automatically trying to grab the parameters, is that many times a parameter could be an "out only" parameter. This would mean the data is invalid coming into the function and worthless to report to the log. When this new factor was coupled with the previous fact that type information was now gone, it seemed to me that there was too much work in this new method for very little gain.

The final step in this path was discovering the Boost pre-processor library. This library is a very cool set of expanded pre-processor macros. Some of the macros are quite complicated and it took me a bit of time to sift through the documentation and find the few that I actually needed. I settled on the ARRAY based macros that allowed me convert the variadic entries into an ARRAY. Then I was able to use the control and loop definitions from Boost to enumerate through each parameter that was defined for output in my macros.

Ironically, I decided to provide the cumbersome form of macros that inspired this framework in the beginning, but only to help support compilers that do not support the variadic pre-processor MACRO ... that was added to Visual Studio 2005. This MACRO is just like the one available in C++ for use in its functions.

Thread Safety

Thread Safety is provided by allocating a set of variables to each thread through Thread-Local Storage (TLS). The TLS mechanism in the Windows Desktop compiler turns out to be a much simpler implementation than the path I had to take to add support for the Windows CE counterpart. The Windows Desktop supports a compiler-defined definition __declspec(thread) for a variable that will allocate space for the variable for each thread that is created. For now this is convenient, however, this form does have limitations, and since I had to go through the work to support CE, I may make the CE method the only one that I used for my Windows based implementation. My decision will be based on the effects of performance after I analyze the framework in detail.

For Windows CE based systems, I had to use the WIN32 APIs designed to provide TLS support:

TLSAlloc
TLSGetValue
TLSSetValue
TLSFree

Currently there are four values that are allocated for each thread in the Windows implementation of the framework:

bool isInstrumentThread: This value is initialized when a thread is created. The value is set to true when the FN_INSTRUMENT_THREAD macro is defined.
unsigned long indentLevel: This value indicates the current indent level for the active thread. As each function is entered on the stack, the indent level is incremented to make viewing the call stack much easier in the instrumentation log.
std::wostringstream *pInstLog: When the default output method is used, this ostream object is initialized to help convert the stream data into output. This conversion object was chosen because of its ease of use and ability to preserve type information for parameters.
std::stack<timeEdge> *pTimeStack: This object was created to help facilitate the function profiling provided by the framework. This parameter is further detailed in the next section.

Each of these parameters is conditionally initialized during the thread creation macro, based on the compiler and other #defines. The objects need to be stored as pointers because of the nature of TLS storage. The thread object destructor will release the memory for each of these variables when the thread exits.

A series of helper macros have been created and utilized to abstract the access to the TLS variables as much as possible in an attempt to reduce the impact that the access to these variables will have on the overall performance on programs instrumented with this framework.

Profiling

Profiling was really added as an after thought because all of the pieces fell into place that it would not take much more effort to add this feature. Currently, the profiling features are very primitive, and provide a very limited amount of detail about the application. In fact, the timer resolution is reported in milliseconds, therefore the actual use for the timing functionality is quite limited. Therefore, at this point, this feature is really about proof-of-concept. All of the profiling features are currently facilitated by a static template object that is created for each function the first time the function is called.

Function Use Count

The first feature that was quite simple to implement was the ability to count the number of times that a function is entered. This is accomplished incrementing a counter in the static function object found in each function. There is nothing else to this feature. What is the value in knowing how many times a particular function is called?

Indicate which functions would benefit from conversion to inline or optimization.
Give insight to problems in situations where a function is called more or less than you expect it to be called.

Function Execution Time

This feature keeps track of the total amount of time that the CPU spends within the scope of a particular function. More precisely, the amount of time that the local function state object exists on the stack for each particular instantiation of a function. This feature was a bit tricky to complete and make entirely useful. The main issue was the fact that a single function call may call into other functions. Time spent in these "callee" functions should not necessarily be counted toward the execution time of "caller". If it were not this way, your applications main function would have the most time spent in it, and when you add up the total running time of the program, it would be greater than the actual time spent in the program.

My first attempt was to simply create a stack object in the TLS, that would allow the start time of each function to be pushed onto the stack. Then when the function exited, the item at the top would be popped off the stop and subtracted from the current time. This would indicate the total usage of the current function. This is when I realized the first issue I described above. The usage for each function lower on the stack was continually reported as a much larger value than I had expected.

What I needed to do was create a mechanism that would keep track of the usage of each function further down the stack. Then when I got to the end of a function, I would subtract the total time spent further down the stack, from the current execution time for the current function. This concept seemed pretty straight forward when I first started. To help facilitate this, I changed the stack object to use an stl pair. This allowed me to couple two items, the start time of the function call, and the total delay. The start time was built up the stack, and the delay was passed down from the "callee" to the "caller" through the TLS stack.

One last issue to tackle was to allow the profiling to account for function calls that this framework was not actually managing, such as blocking calls or calls into libraries that are time consuming. In order to give the developer some sort of control over the timing instrumentation I added the FN_PAUSE and FN_RESUME macros to be placed around calls such as WaitForSingleObject. This was easily implemented by using the code that was already used to calculate when the current function had returned and was being removed from the stack. The time spent in the PAUSE state counts as a delay for the "caller", however this time is not counted as time spent for the current function "callee".

The function timing feature has much potential to be useful. Before it can reach its full potential I will need to convert the time to a high-precision timer, much higher than millisecond resolution. Here are a few possible uses for measuring the execution time of your applications functions:

Determine which functions require the most processing time. Couple this with analyzing what is happening in the function to determine if it is necessary to optimize its performance.
Maintain records of previous runs to determine what effects on performance your changes are creating.

Bug: Profiling Summary Does not Print to Log

When I first created all of the tools, I wrote the data to cout and the order of the header includes was quite different. Therefore, I was getting my program profiling summary written to stdout when the application was destroying all of the static function objects. However, after I started writing a sample application and placed the FnTrace.h file in the pre-compiled header, I believe I changed the initialization order of some of the objects and therefore the destruction order. Therefore the stdout object is no longer valid when the objects are being destroyed, and there is no place to accept the reported data. This is a bug I hope to fix in the near future.

Demo

The demo application is simply an example of how to use the different macros. It is an application that is compatible with both the Windows Desktop and Window Mobile devices. It is the default application that is generated by the Visual Studio 2005 project wizard, that I have instrumented with the FnTrace framework. The sample image at the top of the article is an example of the data that will be reported to the output window when this application is run through the Visual Studio 2005 debugger.

Output

Here is a sample of the output to expect from the demo application:

C++

3068454610: int __cdecl WinMain(struct HINSTANCE__ *,struct HINSTANCE__ *,wchar_t *,int)
3068454610:   hInstance = B6DE9E56
3068454610:   hPrevInstance = 00000000
3068454610:   lpCmdLine =
3068454610:   nCmdShow = 5
3068454610:   int __cdecl InitInstance(struct HINSTANCE__ *,int)
3068454610:     hInstance = B6DE9E56
3068454610:     nCmdShow = 5
3068454610:     unsigned short __cdecl MyRegisterClass(struct HINSTANCE__ *,wchar_t *)
3068454610:       hInstance = B6DE9E56
3068454610:       szWindowClass = INSTRUMENT
3068454610:       elapsed:	51ms
3068454610:     return (50540)
3068454610:     long __cdecl WndProc(struct HWND__ *,unsigned int,unsigned int,long)
3068454610:       WM_CREATE
3068454610:         hWnd = 7C079270
3068454610:         message = 1
3068454610:         wParam = 0
3068454610:         lParam = 470153100
3068454610:       elapsed:	122ms
3068454610:     leave;
3068454610:     long __cdecl WndProc(struct HWND__ *,unsigned int,unsigned int,long)
3068454610:       WM_ACTIVATE
3068454610:         hWnd = 7C079270
3068454610:         message = 6
3068454610:         wParam = 1
3068454610:         lParam = 0
3068454610:       elapsed:	133ms
3068454610:     leave;
3068454610:     elapsed:	988ms
3068454610:   return (1)
3068454610:   long __cdecl WndProc(struct HWND__ *,unsigned int,unsigned int,long)
3068454610:     IDM_OK
3068454610:       hWnd = 7C079270
3068454610:       message = 273
3068454610:       wParam = 40000
3068454610:       lParam = 2080871584
3068454610:       long __cdecl WndProc(struct HWND__ *,unsigned int,unsigned int,long)
3068454610:         WM_ACTIVATE
3068454610:           hWnd = 7C079270
3068454610:           message = 6
3068454610:           wParam = 0
3068454610:           lParam = 0
3068454610:         elapsed:	136ms
3068454610:       leave;
3068454610:       long __cdecl WndProc(struct HWND__ *,unsigned int,unsigned int,long)
3068454610:         WM_DESTROY
3068454610:           hWnd = 7C079270
3068454610:           message = 2
3068454610:           wParam = 0
3068454610:           lParam = 0
3068454610:         elapsed:	114ms
3068454610:       leave;
3068454610:     elapsed:	0ms
3068454610:   leave;
3068454610:   elapsed:	0ms
3068454610: return (0)

Conclusion

This is a work in progress that was inspired by the need to instrument production level application code, that would remain in place and would not reduce the readability of the code. This framework is built around the mechanics available in C++, and attempt to be as non-intrusive as possible.

I have just completed my first pass of this framework, and I will now attempt to use it. I will be making improvements to this article if I find deficiencies in the framework. I welcome any comments or suggestions to help improve on its design and implementation.

In the future, I hope to improve the profiling reporting functionality as well as include features to record and report code coverage paths and other useful metrics.

History

8^th October, 2008: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Paul M Watt

Engineer

United States

I am a software architect and I have been developing software for nearly two decades. Over the years I have learned to value maintainable solutions first. This has allowed me to adapt my projects to meet the challenges that inevitably appear during development. I use the most beneficial short-term achievements to drive the software I develop towards a long-term vision.

C++ is my strongest language. However, I have also used x86 ASM, ARM ASM, C, C#, JAVA, Python, and JavaScript to solve programming problems. I have worked in a variety of industries throughout my career, which include:
• Manufacturing
• Consumer Products
• Virtualization
• Computer Infrastructure Management
• DoD Contracting

My experience spans these hardware types and operating systems:
• Desktop
o Windows (Full-stack: GUI, Application, Service, Kernel Driver)
o Linux (Application, Daemon)
• Mobile Devices
o Windows CE / Windows Phone
o Linux
• Embedded Devices
o VxWorks (RTOS)
o Greenhills Linux
o Embedded Windows XP

I am a Mentor and frequent contributor to CodeProject.com with tutorial articles that teach others about the inner workings of the Windows APIs.

I am the creator of an open source project on GitHub called Alchemy[^], which is an open-source compile-time data serialization library.

I maintain my own repository and blog at CodeOfTheDamned.com/[^], because code maintenance does not have to be a living hell.

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.