When an application encounters an error, the last thing you want is for the reporting routine to fail, or to render an incomplete report, because it ran out of string space or couldn't get more from the heap because the system is starving for memory. My solution to this problem has quietly and privately evolved over the last ten years or so; I am finally sufficiently confident of its value, and have a little time, to move it into a DLL and write an article about it.
The design of the library takes into account a number of factors.
- According to "STRINGTABLE Resource," in the Microsoft Windows Platform SDK, a string resource "must be no longer than 4097 characters." Accordingly, any valid string will fit into a buffer of that many
TCHARs, plus one for the terminal null character. Hence, buffers designated for use as destinations for
LoadString are allocated as arrays of 4098
unsigned short elements.
- Since error messages usually involve limited amounts of text being substituted into a string, I still use the old
swprintf functions, which aren't supposed to use more than 1024 bytes of output buffer. (See wsprintf function in the MSDN library.) Hence, the
sprintf buffers are allocated as arrays of 1024 bytes.
- Over the past two years of usage, I have found that few programs need more than 3 buffers each for use by
sprintf. Being a conservative sort of person, though, I built my DLL with 5 buffers of each kind.
- There are numerous applications, including those related to error reporting, in which substitution tokens are more useful than the cryptic
printf tokens, not to mention that a replace function can operate on strings of arbitrary length, and the replacement string is specified once only, rather than once for each replacement, as you must if you use printf or
sprintf to perform the replacement.
- For many other reasons, which could easily fill another article, I studiously avoided using templates, frameworks, and external libraries, other than my own, to eliminate, as much as possible, the risk of a heap allocation sneaking in by the back door, and to otherwise keep the code "lean and mean," in the interest of low overhead, robust error reporting.
- The public functions use the
__declspec(dllimport) calling convention, but there is no
.DEF file. Though I have many libraries that use them, I have discovered an annoying side effect, which is that functions imported from a library that has one display only their ordinals in
dumpbin import reports. While ordinals are efficient for the loader, they are inefficient for us carbon units, for the same reason that most people don't use IP addresses directly when they have a choice.
- Functions that return strings have ANSI (narrow character) and Unicode (wide character) versions, for which I used the generic text mappings defined in
TCHAR.H, so that the source of both is virtually identical. Rather than maintain two identical copies of the function bodies, I put them into
.INL files, which are
#included into source files that supply only the appropriate character encoding directive, header inclusions, function prototype, and closing brace.
- This project dispenses with precompiled headers, which cause more trouble than they are worth when some, but not all, modules define the
_UNICODE preprocessor symbols.
- Along the same lines, there is one test program, for which two configurations are defined, one with
UNICODE defined, and the other without.
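The fixed-buffer idea behind the design points above can be sketched in portable C. This is a minimal illustration, not the library's actual source: every name beginning with demo_ or DEMO_ is hypothetical, and wchar_t stands in for the wide character TCHAR. Only the sizes (5 buffers, 4098 characters each) come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical names; only the sizes are taken from the article. */
#define DEMO_RES_BUFFER_COUNT   5       /* five buffers of each kind */
#define DEMO_RES_BUFFER_CHARS   4098    /* 4097 characters plus the terminal null */

/* Static storage: reserved when the module loads, never taken from the heap. */
static wchar_t s_resBuffers[DEMO_RES_BUFFER_COUNT][DEMO_RES_BUFFER_CHARS];

/* Number of buffers; a caller's index must be less than this value. */
static unsigned int demo_buffer_count(void)
{
    return DEMO_RES_BUFFER_COUNT;
}

/* Capacity of each buffer, in characters, including the terminal null. */
static size_t demo_buffer_chars(void)
{
    return DEMO_RES_BUFFER_CHARS;
}

/* Return the address of buffer `index`, or NULL when the index is out of range. */
static wchar_t *demo_get_buffer(unsigned int index)
{
    if (index >= DEMO_RES_BUFFER_COUNT)
        return NULL;
    return s_resBuffers[index];
}

/* Copy `text` into buffer `index`, truncating if it is too long.
   Returns the number of characters stored, or 0 for a bad index. */
static size_t demo_store(unsigned int index, const wchar_t *text)
{
    wchar_t *dest = demo_get_buffer(index);
    size_t len;

    assert(text != NULL);               /* a NULL source is a programming error */
    if (dest == NULL)
        return 0;
    len = wcslen(text);
    if (len > DEMO_RES_BUFFER_CHARS - 1)
        len = DEMO_RES_BUFFER_CHARS - 1;
    wmemcpy(dest, text, len);
    dest[len] = L'\0';
    return len;
}
```

Because the storage is static, a routine built this way cannot fail for lack of heap memory; the only failure it can report is a bad buffer index.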
This package contains a good bit of material. This section offers guidance, in the form of inventories of the directories that comprise the package and the DLLs and link libraries included in it.
The following table lists and describes the directories in the package.
|Directory Name ||Abstract |
Library Exerciser and Demonstration program
Headers for libraries, including the article subject library
Link libraries required to build the project
Notes and reference documents
Satellite DLL of string resources
DLL binaries and listings
Scripts used in the post-build step, and the
Release build of test stand program configured to use all Unicode strings, and, therefore, to test the Unicode routines
Release build of test stand program configured to use all ANSI strings, and, therefore, to test the ANSI routines
Release build of satellite DLL of string resources
Scripts called in the post-build step to update the test stand program directories
The following table describes the dynamic link libraries, some of which are required to use the library, and all of which are used by the demonstration program.
|Library Name ||Abstract |
The string manipulation routines in this library do things that I wish the frameworks did well or at all. Although I eventually found equivalents for some in MFC, that puts them out of reach for programs that support only __stdcall, at best, which includes VBA and even robust scripting languages, such as WinBatch.
This library exports a small set of convenience routines for constructing logo banners from version resources. This is one of my oldest libraries of Windows API wrapper routines, all of which are implemented in straight C, and export via __stdcall.
This fairly new library provides ready access to information about the current process and modules that are loaded into it, such as, for example, the fully qualified name of the file from which they were loaded, the directory name from which a specified module loaded, and the fully qualified NetBIOS name of the user that owns the current process.
This library exports a single function (well, two, because it has ANSI and Unicode implementations) that uses memcpy to provide efficient, safe appending and copying of strings. The safety refers to the fact that the routines use HeapReAlloc when necessary to expand the destination buffer to ensure that it can accommodate the requested copy or append operation.
This library, which is about the same age as P6VersionInfo.dll, exports routines that employ substitution tokens to format the members of a SYSTEMTIME structure into a human readable date. The banner routines defined in P6VersionInfo.dll use them to display the current date.
The most important routines in this library are the three that wrap HeapAlloc, HeapReAlloc, and HeapFree in Structured Exception Handling blocks, so that they can return system status codes if they encounter a problem. HeapFree is also protected by a preceding call to HeapSize, whose return value is used to detect that the specified pointer didn’t come from the specified heap.
In addition to a handful of Windows NT command scripts and the executable programs upon which they depend, the scripts directory contains a couple of Microsoft .NET assemblies upon which
Date2FN.exe depends. Due to the way that .NET assemblies load, they must stay with
Using the code
INCLUDE of both packages contain a standard C/C++ header file,
FixedStringBuffers.H, which declares the required constants and exported routines. The library include file pulls two additional headers, both found in the same directory, into the compilation. After carefully weighing the risks, I decided that the safest way to deliver the package was to identify the header file dependencies, leave them all in place, and recommend that you install everything in the
INCLUDE directory into a directory of your own choosing, so long as it meets a single requirement: it must belong to the list of directories named in your
INCLUDE environment variable. This is how they are installed on my development machines, which enables the preprocessor to find them, since those are the directories that are searched for include files whose names appear in angle brackets. The other headers are required to build the library and the demonstration program.
|Name ||Abstract |
|Define const typedefs that I haven't found anywhere in the Platform SDK headers, but use to secure arguments against accidental changes that might adversely affect the calling application if they were to be reflected back into it. |
|Define string resource IDs and associated application status codes for conditions that occur frequently in most applications. The resource strings live in |
WWStandardErrorMessages.dll, which the main DLL expects to find in the directory from which it loads. Hence, I deposited a copy in the Debug and Release directories of the main DLL.
FixedStringBuffersTestStand of both packages contains a like named program, whose main routine is defined in
FixedStringBuffersTestStand.C, with additional routines defined in
FB_LoadStringFromNamedDLLA.CPP. Between them, these two source files demonstrate the full capabilities of the library.
To help you navigate the library, the following table summarizes the main worker routines, all of which have ANSI and Unicode (wide character) implementations.
|Name ||Return ||Abstract ||Buffer |
Use this routine to report errors via message box (for any program) or console (for a character mode program), returning the specified status code, unless a further error, such as a missing resource string, prevents the original error from being reported.
Use this routine to directly format the message for a system status code. The return value is a pointer to the string, ready to use as you see fit.
Use this routine to load a string from a module for which you have a valid
HMODULE, or from the first module that was loaded into the process address space. A NULL module handle signifies the process module. Use this routine to load strings from modules that are already mapped into the process, either as executable or data-only DLLs.
Use this routine to load a string from a module for which you have a file name. The specified module is mapped into the address space of the calling process, the requested string is read into the buffer, and the module is unloaded.
Use this routine to format a string of up to 4097 characters (the maximum supported length of a resource string). The input string, text to find, and replacement text may come from anywhere, but the new string always comes from a single dedicated buffer that belongs to the DLL.
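To make the last routine in the table concrete, here is a rough sketch of token replacement into a single static output buffer. It is not the algorithm used by FB_Replace, whose source is not shown here; demo_replace and DEMO_OUT_CHARS are hypothetical names, and narrow characters are used to keep the example portable. The sketch assumes that the input string does not live in the output buffer itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_OUT_CHARS 4098             /* 4097 characters plus the terminal null */

static char s_out[DEMO_OUT_CHARS];      /* the single dedicated output buffer */

/* Replace every occurrence of `token` in `input` with `repl`.
   Returns the static buffer, or NULL if an argument is invalid or the
   result would not fit. On success, *newLen (unless NULL) receives the
   length of the new string, in characters. `input` must not point into
   the output buffer. */
static const char *demo_replace(const char *input, const char *token,
                                const char *repl, unsigned int *newLen)
{
    size_t tokLen, replLen, tail, used = 0;
    const char *scan, *hit;

    if (input == NULL || token == NULL || repl == NULL || token[0] == '\0')
        return NULL;
    tokLen = strlen(token);
    replLen = strlen(repl);
    scan = input;
    while ((hit = strstr(scan, token)) != NULL) {
        size_t lead = (size_t)(hit - scan);     /* text before the token */
        if (used + lead + replLen >= DEMO_OUT_CHARS)
            return NULL;                        /* would overflow the buffer */
        memcpy(s_out + used, scan, lead);
        used += lead;
        memcpy(s_out + used, repl, replLen);
        used += replLen;
        scan = hit + tokLen;                    /* resume after the token */
    }
    tail = strlen(scan);
    if (used + tail >= DEMO_OUT_CHARS)
        return NULL;
    memcpy(s_out + used, scan, tail + 1);       /* tail plus the terminal null */
    used += tail;
    assert(used < DEMO_OUT_CHARS);
    if (newLen != NULL)
        *newLen = (unsigned int)used;
    return s_out;
}
```

Note that the replacement text is supplied once, however many times the token occurs, which is the advantage over printf style formatting that the design notes mention.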
In addition to the main worker routines, a number of service routines return useful information from the DLL, including the number of each type of buffer that it supports, the sizes of the various types of buffers, and their machine addresses. The following table summarizes these routines.
|Name ||Return ||Abstract ||Buffer |
Get the address of the buffer into which the resource string specified as input to FB_ReportErrorViaStaticBuffer was loaded.
Emergency Message Resource String buffer
Get the address of the buffer used by FB_ReportErrorViaStaticBuffer when it must use sprintf to construct the finished message.
Emergency Message sprintf output buffer
Get the address of one of the output buffers designated for use as
sprintf output buffers.
In the unlikely event that one of the worker routines returns
NULL to indicate that an error occurred, pass the status code returned by GetLastError into this routine. The returned string translates the status code into English, and provides as much information as it can about why the error happened.
Emergency Message sprintf output buffer
Get the number of
sprintf buffers. The index that you pass into
FB_GetSprintFBuffer must be less than the returned value.
Get the number of resource string buffers. Your index (
puintBufferID) in any call to
FB_LoadString must be less than the returned value.
Get the size, in bytes, of each
sprintf buffer. This is mostly FYI, since the
sprintf family of routines don't ask how much room they have, and won't use more than 1024 bytes, which happens to be how big these buffers are.
Get the size, in
TCHARs, of each resource string buffer. This is mostly FYI, since these routines supply the information to
LoadString, and the buffers accommodate the maximum supported length of a string resource, 4097 characters.
Copying Strings from the Buffers
The fourth argument to
plpuintLength, a pointer to the location of an unsigned integer which, unless
NULL, receives the character count returned by the underlying
LoadString system routine.
- If you intend to use the strings in situ, you can save 4 bytes of storage in your program and a few machine cycles in the DLL by passing
NULL. However, the argument must always be tested for null, and it takes only two machine instructions to return the value through the supplied pointer.
- For the same reason,
FB_Replace has a fourth argument, named
puintNewLength, to emphasize that it reports the length of the new string.
The fastest way to copy a string from a fixed buffer into one of your own is to call
CopyMemory (which calls
memcpy under the hood), passing the address of your own buffer as the first argument, the address returned by
FB_Replace as the second argument, and the character count times
sizeof ( TCHAR ) as the third argument. Failure to multiply the character count as I just described will get only half of your buffer copied out if it is composed of Unicode characters.
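The byte count arithmetic is easy to get wrong, so here is a small portable sketch of the copy. It uses memcpy and wchar_t in place of CopyMemory and a wide TCHAR, and demo_copy_out is a hypothetical helper, not part of the library. Unlike the bare multiplication described above, this version also copies the terminal null and checks the destination size first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Copy `charCount` characters plus the terminal null from `src` to `dest`.
   `destChars` is the capacity of `dest` in characters.
   Returns the number of BYTES copied, or 0 if `dest` is too small. */
static size_t demo_copy_out(wchar_t *dest, size_t destChars,
                            const wchar_t *src, size_t charCount)
{
    size_t bytes;

    assert(dest != NULL && src != NULL);
    if (charCount + 1 > destChars)
        return 0;
    /* memcpy counts bytes, not characters, so scale by the character size. */
    bytes = (charCount + 1) * sizeof(wchar_t);
    memcpy(dest, src, bytes);
    return bytes;
}
```

Passing the raw character count as the third argument of memcpy would copy only a fraction of a wide character string, which is exactly the failure described above.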
The following snippet, taken from
FB_ReportErrorViaStaticBuffer (the routine that motivated me to gather these routines into a library) illustrates use of
memcpy to copy out the new string generated by
FB_Replace, which is called several times in a loop, to replace the tokens embedded in the error message template.
for ( intSrchIndx = ARRAY_FIRST_ELEMENT_P6C ; intSrchIndx < sizeof ( m_aszTokens ) / sizeof ( m_aszTokens [ ARRAY_FIRST_ELEMENT_P6C ] ) ; intSrchIndx++ )
lpChanged = FB_Replace ( lpErrMsgResStr ,
m_aszTokens [ intSrchIndx ] ,
alpReplacements [ intSrchIndx ] ,
&uintNewLen ) ;
if ( StringIsNullOrEmptyWW ( lpChanged ) )
_stprintf ( m_lpFBReplaceBuff ,
FB_XlateFBStatusCode ( GetLastError ( ) ) ) ;
lpFinalMessage = m_lpFBReplaceBuff ;
if ( IsLastLoopLT_WW ( intSrchIndx , ( sizeof ( m_aszTokens ) / sizeof ( m_aszTokens [ ARRAY_FIRST_ELEMENT_P6C ] ) ) ) )
lpFinalMessage = lpChanged ;
memcpy ( lpErrMsgResStr ,
TcharsMinBufSizeP6C ( uintNewLen ) ) ;
} } }
Since I have "broken the ice" by displaying a code snippet, I shall shift gears, and call attention to a few aspects of the above example, and the ones to follow, that are significant, but not immediately obvious.
Points of Interest
The loop shown above illustrates quite a few things that you will see throughout my code.
- The initialization clause of the for statement uses
ARRAY_FIRST_ELEMENT_P6C, a macro that expands to a numeric value of zero. I use such macros to document magic numbers, which is what I perceive the lower bound of an array to be.
- Although the limit clause is an expression in the source code, even a debug build of the code converts the expression to an immediate (hard coded) constant, which can be seen in the following snippet from the disassembly of the
for statement shown above. This feat is possible because the required values are all known at compile time, so the compiler computes the value, and bakes it into the code.
81: for ( intSrchIndx = ARRAY_FIRST_ELEMENT_P6C
10002DBF mov dword ptr [ebp-24h],0
10002DC6 jmp FB_ReportErrorViaStaticBufferW+1D1h (10002dd1)
10002DC8 mov ecx,dword ptr [ebp-24h]
10002DCB add ecx,1
10002DCE mov dword ptr [ebp-24h],ecx
10002DD1 cmp dword ptr [ebp-24h],3
10002DD5 jae FB_ReportErrorViaStaticBufferW+286h (10002e86)
The limit test evaluation is the
cmp instruction at machine address 10002DD1; the MASM style comment is lifted verbatim from my work notes, from which I took the above snippet.
- For the same reason, I didn't waste space in the executable file to evaluate and store the expression for use in the last iteration test that begins "
if ( IsLastLoopLT_WW " that suppresses the memory copy on the last iteration, since the final string can be used where it sits.
- The new length is captured on each iteration into
uintNewLen, which is allocated at the top of the routine, and fed to
memcpy to copy out the string between iterations, so that the output buffer can be reused. (As I write this, I realize that the copying could be eliminated by allocating a second buffer, and alternating between them on each iteration. I leave that as an exercise for ambitious readers, or for the next version of the library.)
- Though it looks like a function call,
TcharsMinBufSizeP6C is a parameterized macro that hides the multiplication by
sizeof ( TCHAR ) that I described above, and accounts for the trailing null.
- Copying the trailing null every time makes it safe to reuse buffers without initializing them.
- The wide character calculation requires just one machine instruction, since
push don't count, because both are required to get the number into the argument list.
10002E66 mov edx,dword ptr [ebp-20h]
10002E69 lea eax,[edx+edx+2]
10002E6D push eax
The middle instruction at machine address
10002E69 accounts for both
sizeof ( TCHAR ) for a wide character and the trailing null (
UNICODE is undefined, that instruction becomes
add edx, 1, and
edx goes onto the stack.
- The last iteration test is another parameterized macro; this macro generates an expression that evaluates to true only on the last iteration of the loop.
#define IsLastLoopLT_WW(pintLoopIndex,pintLoopLimit) ( ( pintLoopIndex + 1 ) == pintLoopLimit )
Since the limit test of this loop is that the index is less than the upper limit, the last iteration runs when the index is one short of the limit. Why this expression works is left as an exercise for the interested reader, as is the disassembly.
- The final function style macro in this block,
StringIsNullOrEmptyWW, is inspired by the static
string.IsNullOrEmpty method in the Microsoft .NET Framework, and it behaves in exactly the same way. The macro is straightforward, as is the generated machine code.
#define StringIsNullOrEmptyWW(plpString) ( ( BOOL ) ( plpString == NULL || StringIsEmptyWW ( plpString ) ) )
The machine code generated to implement the macro in the snippet shown above is as follows.
85: if ( StringIsNullOrEmptyWW ( lpChanged ) )
003B37EE cmp dword ptr [ebp-8],0
003B37F2 je FB_ReportErrorViaStaticBufferA+20Eh (003b37fe)
003B37F4 mov edx,dword ptr [ebp-8]
003B37F7 movsx eax,byte ptr [edx]
003B37FA test eax,eax
003B37FC jne FB_ReportErrorViaStaticBufferA+24Bh (003b383b)
The example above, also taken from my working notes, shows the register values from a test for a string that is neither null, nor empty.
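For readers who would like to experiment with the macros discussed in this list, the following sketch defines portable stand-ins. The DEMO_ names and their definitions are my assumptions for illustration; they are not copied from the library headers, and wchar_t stands in for a wide TCHAR. The token array has three elements, to match the immediate value of 3 in the disassembled limit test.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-ins for the macros discussed above. */
#define DEMO_ARRAY_COUNT(a)        (sizeof(a) / sizeof((a)[0]))  /* compile-time constant */
#define DEMO_IS_LAST_LOOP(i, lim)  (((i) + 1) == (lim))          /* true on final pass only */
#define DEMO_IS_NULL_OR_EMPTY(s)   ((s) == NULL || (s)[0] == L'\0')
#define DEMO_MIN_BUF_BYTES(n)      (((n) + 1) * sizeof(wchar_t)) /* chars + null, in bytes */

static const wchar_t *const s_tokens[] = { L"%A%", L"%B%", L"%C%" };

/* Count the iterations on which a copy-out between passes would be needed,
   that is, every iteration except the last. */
static unsigned int demo_copies_needed(void)
{
    unsigned int i, copies = 0;

    assert(DEMO_ARRAY_COUNT(s_tokens) > 0);
    for (i = 0; i < DEMO_ARRAY_COUNT(s_tokens); i++) {
        if (!DEMO_IS_LAST_LOOP(i, DEMO_ARRAY_COUNT(s_tokens)))
            copies++;               /* the final string can be used where it sits */
    }
    return copies;
}
```

Wrapping each macro parameter in parentheses, as done here, guards against surprises when an argument is itself an expression. Like any function style macro, these evaluate their arguments more than once, so arguments with side effects should be avoided.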
Using Two or More Buffers at Once
The last major point that I think deserves some attention is a demonstration of a case in which it helps to have access to more than one (three, to be exact) static buffers. The example is the
StagingOrbits routine, defined in
FixedStringBuffersTestStand.C, most of which is reproduced below.
#define FB_BUFFER_INDEX_TOSEARCH ( FB_GUARANTEED_BUFFER + ARRAY_NEXT_ELEMENT_P6C )
#define FB_BUFFER_INDEX_TOFIND ( FB_BUFFER_INDEX_TOSEARCH + ARRAY_NEXT_ELEMENT_P6C )
#define FB_BUFFER_INDEX_REPLACEMENT ( FB_BUFFER_INDEX_TOFIND + ARRAY_NEXT_ELEMENT_P6C )
for ( uintStrData = STRDATA_FIRST ;
uintStrData <= STRDATA_LAST ;
lpStrData = FB_LoadTestString ( uintStrData ,
FB_BUFFER_INDEX_TOSEARCH ) ;
for ( uintStrFind = TOFIND_FIRST ;
uintStrFind <= TOFIND_LAST ;
uintStrFind ++ )
lpStrFind = FB_LoadTestString ( uintStrFind ,
FB_BUFFER_INDEX_TOFIND ) ;
for ( uintStrRepl = TOREPLACE_FIRST ;
uintStrRepl <= TOREPLACE_LAST ;
* plpOrbit += 1 ;
lpStrRepl = FB_LoadTestString ( uintStrRepl ,
FB_BUFFER_INDEX_REPLACEMENT ) ;
lpReplaced = FB_Replace ( lpStrData ,
&uintLength ) ,
lpReplaced4lOG = StrReplace_P6C ( ( lpReplaced
: FB_XlateFBStatusCode ( GetLastError ( ) ) ) ,
_T ( "\n" ) ,
_T ( "[NEWLINE]" ) ) ;
lpstrData4Log = StrReplace_P6C ( lpStrData , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;
lplpStrFind4Log = StrReplace_P6C ( lpStrFind , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;
lpStrRepl4Log = StrReplace_P6C ( lpStrRepl , _T ( "\n" ) , _T ( "[NEWLINE]" ) ) ;
_tprintf ( lpMsgTpl ,
* plpOrbit ,
uintLength ) ;
FreeBuffer_WW ( lpstrData4Log ) ;
FreeBuffer_WW ( lplpStrFind4Log ) ;
FreeBuffer_WW ( lpStrRepl4Log ) ;
FreeBuffer_WW ( lpReplaced4lOG ) ;
} } }
The objective of this routine is to thoroughly exercise the
FB_Replace library routine, which takes three string arguments, all of which are inputs, and a fourth argument, which is a pointer to a
UINT variable that receives the length of the new string.
StagingOrbits is implemented as a nested
for loop, with a loop corresponding to each of the three inputs. Since all three inputs must be present when the innermost loop calls the
FB_Replace routine, it uses three of the five resource string buffers, designated by the three symbolic constants defined at the top of the listing.
Since they aren't really part of the test, but are used to format the output so that it can be read into Microsoft Excel for analysis, strings
lpStrRepl4Log are constructed in dynamically allocated buffers, using
StrReplace_P6C, the predecessor of
FB_Replace that allocates memory as needed from the heap, and can, therefore, handle strings of arbitrary length. Unlike its successor,
StrReplace_P6C has no provision for returning the length of its finished string, although it can be derived by dividing the value returned from
sizeof ( TCHAR ) and subtracting one from the quotient. Why this is much faster than passing the string to
_tcslen is left as an exercise for you mental gymnasts. The only additional hint I shall offer is that this method works because the returned buffer is exactly big enough to hold the returned string.
There are a number of differences between the algorithms used by
FB_Replace, I'll just say that I think the algorithm implemented in
FB_Replace is more robust in several respects, and there is a good chance of it being adapted to work with dynamic memory, to become an improved version of
Lessons Learned or Reinforced
- Reinforced: Compute offsets into character strings in characters, and let the compiler convert them to bytes. You may as well, since you can't make it do otherwise without more work than it's worth.
- Reinforced: Test the Unicode version first, and the ANSI version will probably take care of itself.
- Learned: The CRT string routines fail badly when fed a null reference. My solution to this issue is
TcsLenEvenIfNull, a function style macro that wraps my
StringIsNullOrEmptyWW macro, discussed above, in a ternary expression that calls
_tcslen only if the string pointer is not null and the string has a length greater than zero. This saves the function call for when you really need it, and avoids the badly handled null reference exception.
TcsLenEvenIfNull is defined in
FixedStringBuffers_Pvt.H, which is part of the main DLL project; its dissection is left as a lab exercise.
- Learned: The easiest and best way to avoid string ID number collisions is to group strings into satellite DLLs. This lesson culminated in the creation of library function
FB_LoadStringFromDLL and the VBA macro that makes
FB_Replace_Test_Strings.XLSM magic. Collision Proof Shared String Resources is all about the Excel workbook and its magic, and includes an improved version of the workbook, along with Visual Studio template projects from which to create your own string DLLs. Meanwhile, I left a copy in the
NOTES directory of the project for you to explore. [New Version] As of 2 June 2015, the download package contains a copy of the improved workbook that I published with the article, and exports of the embedded VBA source code. Along with some bug fixes, the new version sports a hot key that starts the macro, Ctrl-Shift-G. The macro project is locked but unsigned (I lock my VBA projects to prevent accidental changes), and the critical formulas in the worksheets are protected against accidental changes, as are the lookup worksheets from which the resource script and its header are generated. If you downloaded the archive for this article last month, you may want to download it again to get the updated
NOTES directory. Better yet, use the hyperlink above to pop the other article open in a new browser window, read it, and get its demonstration package.
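As a starting point for the lab exercise on the null-safe length macro mentioned in the third lesson above, here is one way such a macro could be written. This is a guess at its general shape, not the definition in FixedStringBuffers_Pvt.H; the DEMO_ names are hypothetical, and narrow characters with strlen stand in for TCHAR and _tcslen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the library's real definitions may differ. */
#define DEMO_IS_NULL_OR_EMPTY(s)  ((s) == NULL || (s)[0] == '\0')

/* Call strlen only when the pointer is not null and the string is not empty. */
#define DEMO_LEN_EVEN_IF_NULL(s)  (DEMO_IS_NULL_OR_EMPTY(s) ? (size_t)0 : strlen(s))

/* Sum the lengths of `count` strings, any of which may be NULL or empty. */
static size_t demo_total_length(const char *const *items, size_t count)
{
    size_t i, total = 0;

    assert(items != NULL);
    for (i = 0; i < count; i++)
        total += DEMO_LEN_EVEN_IF_NULL(items[i]);
    return total;
}
```

The ternary expression short-circuits, so a null pointer never reaches strlen, and an empty string costs one comparison instead of a function call.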
08 April 2015 - Article published.
02 June 2015 - Added new version of FB_Replace_Test_Strings.XLSM to both download packages, revised the article to cover the new package and include a link to the article about the workbook and the associated C/C++ code, reworded a sentence here and there, and made a few cosmetic changes.