
A C++ StringBuilder Class

14 Sep 2007
An article introducing a StringBuilder class written in C++.

Introduction

This article introduces a C++ class that facilitates the composition of arguments from varying types into a string. Currently, the C++ standard library does not have a counterpart to the .NET System.Text.StringBuilder. Still, the need to embed the values of parameters of varying types inside a string is very common, e.g. when logging information to a file or when presenting the user with a message.

The StringBuilder class presented in this article addresses this need. It relies on features inherent to C++ and provides the user with a neat and compact way of creating strings on the basis of parameters of varying types.

Background

Lately, while implementing a thread-safe cyclic buffer, I came across the need for a convenient mechanism for logging the state of the buffer to a file. This meant being able to create a string composed of the values of parameters of different types (which together govern the state of the buffer) and pass it to a logging function, which would then write it to a file. One way of achieving this functionality is through sprintf:

C++
int index = 0;
... // cyclic buffer undergoes changes
char string_buf[100];
sprintf(string_buf, "Current index is: %d.", index);
std::string message(string_buf);
log(message); // log() is implemented elsewhere

The downside of this approach is that it is very cumbersome. Our need to create a formatted string that contains the value of a single parameter is translated into four lines of C++ code. Moreover, we need to allocate a buffer to be used by sprintf and the decision of how large it should be cannot (in most cases) be made in advance. We are thus exposed to the risk of buffer overflow.
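
For comparison, the buffer-size guess can be avoided while staying with the printf family by asking snprintf for the required length first. The sketch below is only an illustration: format_index is a hypothetical helper that is not part of the article's code, and it assumes a C++11 compiler, where std::snprintf is available.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not from the article's download): formats one int
// into a std::string without guessing the buffer size in advance.
std::string format_index(int index)
{
    // With a null buffer and size 0, snprintf writes nothing and returns
    // the number of characters the complete output would need.
    int needed = std::snprintf(nullptr, 0, "Current index is: %d.", index);
    assert(needed >= 0); // a negative value signals an encoding error

    // One extra byte for the terminating '\0'.
    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), "Current index is: %d.", index);

    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

This removes the overflow risk, but it is even more verbose than the sprintf version and the format string is still not checked against the argument types.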

What comes to mind at this point is the idea of wrapping the fprintf function. fprintf's (somewhat simplified) signature is as follows:

C++
int fprintf(FILE *_file, const char *_format, ...);

The basic idea behind fprintf is that it uses its second argument (_format), which is of type const char *, to interpret the arguments that come after it. The ... argument makes fprintf a variadic function (i.e. a function that accepts a variable number of arguments). It thus seems that if we expose a logging function with the following signature...

C++
int log(const char *format, ...);

...then we can pass its received arguments to fprintf and thus achieve our goal with the following code:

C++
int log(const char *format, ...) 
{
    va_list argumentsList;
    va_start(argumentsList, format);
    
    FILE *pLogFile = fopen("log.txt","w");
    // Problem: the va_list is handed to fprintf as one ordinary argument.
    int retVal = fprintf(pLogFile, format, argumentsList);
    fclose(pLogFile);
    
    va_end(argumentsList);
    
    return retVal;
}

Unfortunately, although this code may compile, it is unlikely to do what we want. This is because our call to fprintf is seen -- from the compiler's point of view -- as a procedure call with a fixed number of arguments: the file, the format string and a single va_list object. fprintf does not unpack that va_list into the arguments originally passed to log, which clearly violates our intent to support an arbitrary number of arguments.
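
For completeness, the C standard library does provide va_list counterparts of the printf family (vprintf, vfprintf, vsnprintf), and a forwarding wrapper can be built on them. The sketch below assumes a hypothetical log_to function that takes the destination stream as a parameter instead of opening "log.txt" itself. It forwards the arguments correctly, but it inherits printf's lack of type checking, which is the concern the rest of this article addresses.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical variant of log(): the caller supplies the stream, so the
// sketch does not depend on a particular log file.
int log_to(std::FILE *stream, const char *format, ...)
{
    assert(stream != nullptr && format != nullptr);

    va_list arguments;
    va_start(arguments, format);

    // vfprintf is the va_list counterpart of fprintf: it walks the
    // argument list itself instead of treating it as a single argument.
    int written = std::vfprintf(stream, format, arguments);

    va_end(arguments);
    return written; // number of characters written, negative on error
}
```

A call such as log_to(stdout, "index=%d name=%s\n", 3, "buf") would print the formatted line, yet passing a mismatched argument (say, an int for %s) would still only fail at runtime.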

Back to the drawing board... It now seems that we have two options available:

  • Implement the mechanism used by the printf family (which consists of a format string and an arbitrarily long list of arguments that comes after) all over again.
  • Do something way more cool, which relies on conversion operators, operator overloading and implicit constructors. This innovative approach should also rely on the fact that C++ is statically typed to make sure that problems with the resultant string will be caught by the compiler, rather than at runtime (this guarantee cannot be made by the printf family).

Naturally, I went for the second option.

Using the Code

As you may have guessed already, the code I came up with uses all the treasures I named above. Fundamentally, it's divided into two classes, StringElement and StringBuilder. StringElement objects can be constructed using a variety of types. None of StringElement's constructors is explicit, which implies that it can be implicitly constructed from a long (and extensible) list of types.
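
To make the idea concrete, here is a minimal sketch of what such a class might look like. It is not the implementation shipped with the article: the member name _text, the ToText helper and the particular set of constructors are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Sketch only: names and the list of supported types are assumptions.
class StringElement
{
public:
    // None of these constructors is explicit, so each one doubles as an
    // implicit conversion from its parameter type.
    StringElement(const char *text)
    {
        assert(text != nullptr); // std::string cannot be built from null
        _text = text;
    }
    StringElement(const std::string &text) : _text(text) {}
    StringElement(char c) : _text(1, c) {}
    StringElement(int value) : _text(ToText(value)) {}
    StringElement(long value) : _text(ToText(value)) {}
    StringElement(double value) : _text(ToText(value)) {}

    // Lets a StringElement be used wherever a std::string is expected.
    operator std::string() const { return _text; }

private:
    // Formats any streamable value through an output string stream.
    template <typename T>
    static std::string ToText(const T &value)
    {
        std::ostringstream stream;
        stream << value;
        return stream.str();
    }

    std::string _text;
};
```

Supporting a further type means adding one more constructor. In this sketch, a type such as unsigned int would be ambiguous between several constructors and would therefore need an overload of its own.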

StringBuilder's operator<< makes use of this fact:

C++
StringBuilder &operator<<(StringElement se) 
{
    Append(se);
    return *this;
}

It accepts an argument of type StringElement and appends it to its _value member -- which is of type std::string -- using the following code:

C++
void Append(StringElement element) 
{
    _value.append(element);
}

For this code to compile, StringElement must be convertible to std::string, and indeed it is. Since the return value of operator<< is StringBuilder & (i.e. a StringBuilder reference) and since operator<< is left-associative, it supports cascading calls. Last but not least, StringBuilder is convertible to both char * and std::string, which allows for the following code to compile:

C++
printf(StringBuilder() << "x=" << x << " and y=" << y);

The only bit that's left is to hide from the user the fact that a StringBuilder object is actually created as part of the call to printf. This is done using a simple macro. The final result is as follows:

C++
printf(SB << "x=" << x << " and y=" << y);
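
The pieces described in this section can be put together roughly as follows. This is a sketch under stated assumptions rather than the downloadable source: StringElement is reduced to two constructors, the conversion operators are written the way the text suggests, and the macro is assumed to expand to a temporary StringBuilder.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reduced stand-in for the article's StringElement (assumed layout).
class StringElement
{
public:
    StringElement(const char *text)
    {
        assert(text != nullptr);
        _text = text;
    }
    StringElement(int value)
    {
        std::ostringstream stream;
        stream << value;
        _text = stream.str();
    }
    operator std::string() const { return _text; }

private:
    std::string _text;
};

class StringBuilder
{
public:
    // Returning *this is what allows calls to be chained.
    StringBuilder &operator<<(const StringElement &element)
    {
        _value.append(element); // relies on StringElement -> std::string
        return *this;
    }

    // Conversions that let the builder stand in for a string.
    operator std::string() const { return _value; }
    operator const char *() const { return _value.c_str(); }

private:
    std::string _value;
};

// Assumed definition of the macro: every use creates a fresh temporary.
#define SB StringBuilder()
```

With these definitions, std::string s = SB << "x=" << 3 << " and y=" << 4; yields "x=3 and y=4". Two caveats apply to this sketch: the const char * obtained from a temporary builder is only valid until the end of the full expression, and when the result is passed as printf's first argument it is still treated as a format string, so a '%' in the text would be read as a conversion specifier.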

As I mentioned above, a major advantage of the approach taken here compared to the mechanism used by printf is that it moves a whole class of formatting errors from runtime to compile time. There are no format specifiers that could disagree with the arguments: each argument we put in is converted into a StringElement object (using the implicit constructors defined on StringElement), and an argument of a type for which no such constructor exists is rejected by the compiler. From there on things should proceed smoothly, as the implementation of StringElement should make clear. Considering the number of crashes I've experienced due to typos in my format strings, I view this as a significant advantage of the approach I have taken.
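
The compile-time guarantee can be made visible with a type trait. In the sketch below, StringElement is again a reduced stand-in with assumed members, Point is a hypothetical type that has no matching constructor, and std::is_convertible (C++11) reports what the compiler would accept as an operand of operator<<.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Reduced stand-in for StringElement (assumed members).
struct StringElement
{
    StringElement(const char *s)
    {
        assert(s != nullptr);
        text = s;
    }
    StringElement(int value) : text(std::to_string(value)) {}

    std::string text;
};

// Hypothetical user type with no StringElement constructor.
struct Point
{
    int x;
    int y;
};

// Supported types convert implicitly...
static_assert(std::is_convertible<int, StringElement>::value,
              "int is accepted");
static_assert(std::is_convertible<const char *, StringElement>::value,
              "string literals are accepted");

// ...while an unsupported type is rejected during compilation.
static_assert(!std::is_convertible<Point, StringElement>::value,
              "Point is rejected by the compiler, not at runtime");
```

An expression that tries to stream a Point into a builder would therefore fail to compile, whereas printf("%d", somePoint) compiles and misbehaves at runtime.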

History

  • 14 September, 2007 -- Original version posted

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
