Yet another CString replacement (only much more practical)

1 Jul 2003
MFC/non-MFC usage, UNICODE support, numerics operations support

Introduction

It's been almost five months since I got involved in a very large Visual C++ project. The thing I missed the most was a really practical string class.

<Getting political>

A good programmer is one who gets things done on time. Getting things done on time depends on the language you use and the available libraries. The language we use is obviously C++, more exactly Visual C++. C++ is probably the most difficult language to master these days, but we can't do anything about it: that's the standard and we have to live with it. So we start with a minus on "gets things done on time", which is almost always how management sees it. The only thing that can save us, besides years of experience, is good library support. The specific library support I will talk about in this article is, of course, string related, and the keyword is practical.

On the MFC side I have CString. It's a decent string class implementation, but I'm not very happy with it, mainly for three reasons:

  1. I can use it only in MFC projects
  2. I cannot chain operations on a CString instance. Example:
    CString str( "  05 09 1990 ");    
    
    // I want to convert to "05/09/1990"
    str.TrimLeft();
    str.TrimRight();
    str.Replace( " ", "/" );

    By chaining operations I mean the following:

    str.TrimLeft().TrimRight().Replace( " ", "/" );

    See the difference? I don't know about you but I definitely do and I don't want to go back.

  3. Numeric operations support. This is something I have used even in the smallest projects I've done. There's almost no way to escape these conversion operations, and again CString doesn't do a good job of helping me out. Of course CString provides Format and FormatV methods, but they're not at all nice to me.

On the non-MFC side I have (if you're an STL lover, stop reading here) well... almost nothing. Yes, I know there is std::string, but I can't do much with it. It suffers from the same drawbacks (maybe more) as all of the STL: too much academicism as opposed to practicality. Everything is so in place and standards-aligned that I am nothing but delighted when I look at it. Unfortunately, that's all there is to it.

</Getting political>

Using the code

The class name is CStr. I'll start directly with examples; I think the class interface is self-explanatory:

Chaining operations: The above CString example using the CStr class:

CStr str( "  05 09 1990 " );    
str.Trim().Replace( " ", "/" ); 
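
Chaining works because each mutating method returns a reference to the object it was called on. The following is only a minimal sketch of that idea: ChainStr is a hypothetical stand-in built on std::string, not the actual CStr implementation, and its Trim and Replace are assumptions about how such methods could behave.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for CStr, built on std::string for brevity.
class ChainStr {
public:
    explicit ChainStr(const std::string& s) : data_(s) {}

    // Each mutator returns *this by reference so calls can be chained.
    ChainStr& Trim() {
        const char* ws = " \t\r\n";
        const std::string::size_type first = data_.find_first_not_of(ws);
        if (first == std::string::npos) {  // string is all whitespace
            data_.clear();
            return *this;
        }
        const std::string::size_type last = data_.find_last_not_of(ws);
        data_ = data_.substr(first, last - first + 1);
        return *this;
    }

    ChainStr& Replace(const std::string& from, const std::string& to) {
        if (from.empty()) return *this;  // nothing to search for
        std::string::size_type pos = 0;
        while ((pos = data_.find(from, pos)) != std::string::npos) {
            data_.replace(pos, from.size(), to);
            pos += to.size();  // continue after the inserted text
        }
        return *this;
    }

    const std::string& str() const { return data_; }

private:
    std::string data_;
};
```

With this sketch, ChainStr("  05 09 1990 ").Trim().Replace(" ", "/") leaves the object holding "05/09/1990".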

Numerical operations:

CStr str;
long lValue;
str = 123; 
lValue = str; // lValue -> 123;
        
// or if you don't like automatic conversion operators
double dValue;
str.SetDouble( 123.45 );
dValue = str.GetDouble(); // dValue -> 123.45;
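
As a rough illustration of what such get/set conversions involve, here is a sketch using only the standard C library. DoubleToText, TextToDouble and TextToLong are hypothetical helper names, and the narrow-character calls stand in for the TCHAR-based routines the class itself would use.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helpers; the real CStr internals may differ.

// Format a double with "%g", which keeps up to 6 significant digits
// and drops trailing zeros.
std::string DoubleToText(double value) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%g", value);
    return buf;
}

// Parse a double; yields 0.0 when no number can be read.
double TextToDouble(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}

// Parse a base-10 long; yields 0 when no number can be read.
long TextToLong(const std::string& text) {
    return std::strtol(text.c_str(), nullptr, 10);
}
```

One consequence of "%g" worth knowing: a value such as 1234567.89 is written as "1.23457e+06", so a round trip through text can lose digits.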

Append operator ( denoted by << ) : Using this operator you can chain concatenation like this:

CStr str;
str << "Value=" << 12.5 << " seconds"; // str -> "Value=12.5 seconds";
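
A chained append operator of this kind can be sketched as a set of free functions that each return the left-hand operand by reference. Text is a hypothetical stand-in type, and formatting doubles with "%g" is an assumption carried over from the observations about the class.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for CStr.
struct Text {
    std::string value;
};

// Each overload appends its argument and returns the same object,
// which is what lets several << calls be written in one expression.
Text& operator<<(Text& t, const char* s) {
    t.value += s;
    return t;
}

Text& operator<<(Text& t, int i) {
    t.value += std::to_string(i);
    return t;
}

Text& operator<<(Text& t, double d) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%g", d);  // assumed format
    t.value += buf;
    return t;
}
```

Note that each number is formatted on its own: in this sketch, appending 12 and then .5 produces "120.5", because .5 is written as "0.5".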

Observations:

  • It is not the perfect string class, but I like it. There will surely be bugs in this implementation, so please let me know. If you think there are methods that must be added or changed, let me know as well, and if I like the proposals I'll change the interface.
  • Many operations modify the string length, and thus the size of the internal buffer. To avoid some of the memory reallocation cycles, each new allocation adds extra buffer space of CSTR_EXTRA_SIZE (defined as a static member). The default value is 16; if you don't want to use this feature and want to keep the buffer as small as possible, set CSTR_EXTRA_SIZE to 0.
  • Numerical conversion of double values is done using _vsntprintf with the type field set to "%g".
  • Though I wrote this class with full Unicode support, I didn't have a chance to test it using Unicode, so it would be great if someone could test it before I use it in real projects ;).
  • The demo project is an empty MFC project that is full of ASSERTs that wrap CStr methods for testing, something like unit testing.
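
The extra-size idea from the observations above can be sketched as follows. PaddedBuffer and its members are hypothetical names, and the growth rule (requested size plus a fixed padding) is an assumption about how CSTR_EXTRA_SIZE is applied, not a copy of the CStr code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical buffer that over-allocates by a fixed padding, in the
// spirit of CSTR_EXTRA_SIZE. This is not CStr's actual code.
class PaddedBuffer {
public:
    static std::size_t ExtraSize;  // padding added to every new allocation

    PaddedBuffer() : data_(nullptr), capacity_(0) {}
    ~PaddedBuffer() { std::free(data_); }
    PaddedBuffer(const PaddedBuffer&) = delete;
    PaddedBuffer& operator=(const PaddedBuffer&) = delete;

    // Ensure room for at least `needed` characters. A reallocation happens
    // only when the request exceeds what earlier padding already covers.
    void Reserve(std::size_t needed) {
        if (needed <= capacity_) return;
        const std::size_t newCapacity = needed + ExtraSize;
        void* p = std::realloc(data_, newCapacity);
        if (p == nullptr) throw std::bad_alloc();  // old block stays valid
        data_ = static_cast<char*>(p);
        capacity_ = newCapacity;
    }

    std::size_t Capacity() const { return capacity_; }

private:
    char* data_;
    std::size_t capacity_;
};

std::size_t PaddedBuffer::ExtraSize = 16;
```

With the default padding, reserving 10 characters allocates 26, so a later request for 20 needs no reallocation. Setting ExtraSize to 0 makes every growth allocate exactly what was asked for.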

CStr public Interface

                static int CSTR_EXTRA_SIZE;

                CStr() throw();
                ~CStr();
                CStr( const CStr &str ) throw();
                CStr( LPCTSTR pszStr ) throw();
                CStr( int allocSize ) throw();

CStr&           operator = ( const CStr& str ) throw();
CStr&           operator = ( LPCTSTR pszStr ) throw();
CStr&           operator = ( int iValue ) throw();
CStr&           operator = ( long lValue ) throw();
CStr&           operator = ( unsigned long ulValue ) throw();
CStr&           operator = ( double dValue ) throw();
            
// internal buffer access
TCHAR*          Buffer() const;
TCHAR*          Buffer();
int             BufferSize() const;
BOOL            Realloc( int size ) throw();
void            Compact() throw();

// attributes
int             Length() const;
BOOL            IsEmpty() const;

// comparison
BOOL            operator == ( LPCTSTR pszStr ) const;
BOOL            operator != ( const LPCTSTR& str ) const;
int             Compare( LPCTSTR pszStr ) const;
int             CompareNoCase( LPCTSTR pszStr ) const;

// accessors
               operator LPCTSTR () const;
               operator int () const;
               operator long () const;
               operator unsigned long () const;
               operator double () const;

TCHAR&         operator [] ( int pos );
const TCHAR&   operator [] ( int pos ) const;
    
CStr           Left( int size ) const throw();
CStr           Right( int size ) const throw();
CStr           Mid( int start, int size ) const throw();
    
               // number of occurrences
int            FindCount( LPCTSTR pszStr ) const; 
               // same as above only starting at startPos
int            FindCount( int startPos, LPCTSTR pszStr ) const; 
               // zero-based index of the 1st occurrence
int            Find( LPCTSTR pszStr ) const;
               // zero-based index of the nth occurrence
int            FindNth( int nth, LPCTSTR pszStr ) const;
               // zero-based index of the 1st occurrence starting
               // from the end of the string
int            ReverseFind( LPCTSTR pszStr ) const;
               // zero-based index of the nth occurrence starting
               // from the end of the string
int            ReverseFindNth( int nth, LPCTSTR pszStr ) const;

// operations
CStr&          Empty();
CStr&          Fill( const TCHAR chr );
CStr&          Trim();
CStr&          TrimLeft();
CStr&          TrimRight();
CStr&          Lower();
CStr&          Upper();
CStr&          Insert( int pos, LPCTSTR pszStr ) throw();
CStr&          Prepend( LPCTSTR pszStr ) throw();
CStr&          Append( LPCTSTR pszStr ) throw();
CStr&          Remove( int pos, int len ) throw();
CStr&          Trunc( int pos );
CStr&          Replace( LPCTSTR pszOld, LPCTSTR pszNew ) throw();
CStr&          Replace( int startPos, LPCTSTR pszOld, LPCTSTR pszNew ) throw();
CStr&          Format( LPCTSTR pszFormat, ... ) throw();
CStr&          FormatV( LPCTSTR pszFormat, va_list args ) throw();
    
// numerics ( get/set )
int            GetInt() const;
long           GetLong() const;
unsigned long  GetULong() const;
double         GetDouble() const;
CStr&          SetInt( const int iValue ) throw();
CStr&          SetLong( const long lValue ) throw();
CStr&          SetULong( const unsigned long ulValue ) throw();
CStr&          SetDouble( const double dValue ) throw();

// << concatenation operator
friend CStr&   operator << ( CStr &str, LPCTSTR pszStr ) throw();
friend CStr&   operator << ( CStr &str, int iValue ) throw();
friend CStr&   operator << ( CStr &str, long lValue ) throw();
friend CStr&   operator << ( CStr &str, unsigned long ulValue ) throw();
friend CStr&   operator << ( CStr &str, double dValue ) throw();

That's it. Have fun !

History

June 21, 2003

  • Replaced new/delete operators with malloc/free for internal buffer allocation/deallocation. It seems malloc is faster for large arrays.
  • If memory allocation fails the std::bad_alloc exception will be thrown. The methods that may throw this exception are marked with throw() in the class declaration.
  • As requested, two new functions were added: FindNth( int nth, LPCTSTR pszStr ) const, which returns the zero-based index of the nth occurrence of pszStr, and ReverseFindNth( int nth, LPCTSTR pszStr ) const, which does the same thing but searches in reverse, starting from the end of the string.
  • The Contains method was renamed to the more suggestive FindCount.
  • All #defines were replaced with static members, including CSTR_EXTRA_SIZE.
  • The CStr implementation was backed up by _ASSERTs.
  • Other small cleanups.
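
The nth-occurrence searches mentioned in the history can be sketched on top of std::string. Several details here are assumptions rather than documented CStr behavior: nth is taken to be 1-based, overlapping matches are counted, and -1 signals "not found".

```cpp
#include <cassert>
#include <string>

// Zero-based index of the nth (assumed 1-based) occurrence of needle,
// or -1 if there are fewer than nth occurrences.
int FindNth(const std::string& text, int nth, const std::string& needle) {
    if (nth < 1 || needle.empty()) return -1;
    std::string::size_type pos = 0;
    for (int i = 0; i < nth; ++i) {
        // After the first hit, resume one character past the last match.
        pos = text.find(needle, i == 0 ? 0 : pos + 1);
        if (pos == std::string::npos) return -1;
    }
    return static_cast<int>(pos);
}

// Same idea, but counting occurrences from the end of the string.
int ReverseFindNth(const std::string& text, int nth, const std::string& needle) {
    if (nth < 1 || needle.empty()) return -1;
    std::string::size_type pos = std::string::npos;  // start at the end
    for (int i = 0; i < nth; ++i) {
        if (i > 0) {
            if (pos == 0) return -1;  // nothing left before the last match
            --pos;
        }
        pos = text.rfind(needle, pos);
        if (pos == std::string::npos) return -1;
    }
    return static_cast<int>(pos);
}
```

For the text "a-b-c-d", the second "-" from the front and the second "-" from the back are both at index 3.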

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Web Developer
Romania
Programmer

Comments and Discussions

 
Re: Issues
.:floyd:. 23-Jun-03 23:53
soso_pub wrote:
And I don't really care about out-of-memory situations, if memory is low there's not much to do anyway.

Maybe not much, but having the application silently crash is generally not accepted as the best choice ;)

soso_pub wrote:
<<<<<<<<<
* [Str.cpp: line 38]: applying operator delete on an array results in undefined behaviour.;
>>>>>>>>>>
Not on MS's C++ implementation, no plans for portability anyway.


You might be surprised, but it is undefined even with MS's C++ implementation. Let me shed some light on what undefined behaviour means: The ISO Standard doesn't enforce any sort of restrictions on the code creation, i.e. any given implementation can choose freely as to what they do about this erroneous statement. MS happened to choose what you expected. But there are 2 more pieces of information that should make you feel uneasy about it: a) an implementation doesn't have to document the behaviour of choice. b) an implementation can choose to change the behaviour whenever the vendor decides to -- again, no need to document the change.

In addition, you may have more plans for portability than you expected, if you think about what portability actually means: changing the platform, which could be as subtle a change as installing a service pack for your current compiler.

soso_pub wrote:
<<<<<<<<<
* Illegal symbol names: _Alloc, _Free, _Init, _PaddedSize, _CheckSize, _StrCopy, and __STR__EX_H. According to the ISO standard of c++ all symbol names containing a double underscore or starting with an underscore immediately followed by a capital letter are reserved for use by the implementation. [...]
>>>>>>>>>>
It compiles, it works, I see no problem with it. How many C++ compilers are standards compliant? And take a look at the header files of the STL implementation from SGI.


It may just compile currently, on your machine, with your code, with your choice of libraries to use. Now if you say that this is all that is needed to make it compile in any scenario you have quite a bit of the road ahead of you.

How many C++ compilers are standards compliant? Currently, none. It's still a rather weak excuse for not obeying the rules of the language in your code, though.

I'm not sure what your reference to SGI's STL implementation is supposed to say. I assume that you found those 'illegal' symbol names there, too. That is, well, true. Don't forget though, that the STL is part of the C++ implementation and thus SGI is free to use those symbol names.

soso_pub wrote:
Updated, all methods that throw std::bad_alloc are marked with throw() in their declaration.

Damn, I wish C++ didn't screw up this detail. Declaring a method as throw() tells the client that it doesn't throw any exception. You should use throw( std::bad_alloc ) instead. If there is no exception specification explicitly declared a method can throw any exception. This one has upset a number of developers around the globe...
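
To make the distinction concrete, here is a minimal sketch in modern C++, where noexcept plays the role of throw(). The function names are hypothetical. Dynamic specifications such as throw( std::bad_alloc ) were deprecated in C++11 and removed in C++17, so current code would leave a throwing function unannotated and mark only the non-throwing ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// No exception specification: this function is allowed to throw.
void* AllocOrThrow(std::size_t bytes) {
    void* p = std::malloc(bytes);
    if (p == nullptr) throw std::bad_alloc();
    return p;
}

// noexcept (like the older throw()) promises that nothing escapes.
// If an exception did escape, the program would be terminated.
void Release(void* p) noexcept {
    std::free(p);
}

// The noexcept operator lets the compiler confirm each promise.
static_assert(!noexcept(AllocOrThrow(1)), "allocation may throw");
static_assert(noexcept(Release(nullptr)), "release promises not to throw");
```

Under this reading, marking a method that can raise std::bad_alloc with throw() states the opposite of what is intended.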

soso_pub wrote:
Updated, though I'm not really into using assertions.

Why? Seriously, I cannot find a valid explanation for not using assertions heavily. They don't slow down your code at all, but present vital information during debugging. Use them or shoot your own leg -- it's up to you.

soso_pub wrote:
<<<<<<<<<<
* #define's: are BAD! [...]
>>>>>>>>>>
They are not always bad, without #defines how does MFC handle message maps?


That is true. I can't think of an alternative for include guards. But in this specific case, I also cannot think of any valid reason that would justify using a macro instead of a true constant. If you can, please elaborate.

MFC's message maps use macros, granted. The reason isn't so much that there aren't better alternatives but rather the weak compiler support for templates. MFC has been around for quite some time and back when it was designed template support was too weak to be even considered.

My statement that macros are BAD! was rather blunt, agreed. Keeping this advice in the back of my head, though, has forced me a number of times to think about alternatives, that usually turned out to be a lot more powerful and even more importantly way less dangerous. At any rate, the way you are using it could easily be called a mistake.


Anyway, I'm glad that you could use some of those comments to improve your code. Best of luck to you

.f
