|
The Remove() function is part of the CString facade that I grafted on to this class. If you look up in the MSDN help section for CString::Remove, you'll see that there is no version of Remove() which takes a null terminated string. It only takes a single character.
There is a way to do what you want, though. Just use the Replace() function:
CStdString strTest = "Test String";
strTest.Replace("Test", "");
I suppose it would be easy enough to write an overload of Remove() that does just this:
int Remove(const CT* sz)
{
    return Replace(sz, "");
}
but that seems unnecessary.
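For anyone who wants the same effect on a plain std::string, here is a minimal sketch of a remove-by-substring helper. The name RemoveAll and its return value (a count of removed occurrences) are assumptions made for illustration; they are not part of CStdString.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of CStdString): erases every occurrence of
// `what` from `text` and returns how many occurrences were removed.
inline int RemoveAll(std::string& text, const std::string& what)
{
    assert(!what.empty());  // an empty pattern would match at every position
    int removed = 0;
    std::string::size_type pos = 0;
    while ((pos = text.find(what, pos)) != std::string::npos)
    {
        text.erase(pos, what.size());  // keep pos: the next match may start here
        ++removed;
    }
    return removed;
}
```

Calling RemoveAll(str, "Test") on "Test String" leaves " String" behind, which is the same result the Replace() call above gives.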
By the way, please be sure you have the latest version of the code. You can always get it at this link:
http://home.earthlink.net/~jmoleary/code/StdString.zip
I update the code from time to time as I discover platform inconsistencies, bugs, etc.
-Joe
|
|
|
|
|
|
The problem lies in VC6. VC6 has a number of problems with statement completion. To put it bluntly, it's not very good, especially with instances of templates. I suppose there must be a way to make it work but I have never been able to figure it out.
I cannot even get it to work for instances of the base template of CStdString either. In other words, if I declare a std::string object or a std::wstring object, I don't get statement completion either.
The only way I could get it to work was to buy a little Visual Studio add-in called "Visual Assist" (http://www.wholetomato.com). That gives me statement completion for everything. Unfortunately you must pay for this (though it's cheap), but I found it was more than worth the price for all the stuff it does.
Sorry I don't have a better answer for you.
-Joe
|
|
|
|
|
|
Hi!
I just found this great piece of code and started using it and I find it really good.
One funny thing I am experiencing is that when I call _CrtDumpMemoryLeaks at the end of execution, I see all the strings allocated by CStdString reported as memory leaks. When I debug it, I see the delete operators called from destructors. If I debug with Numega BoundsChecker, everything is OK. What could be the problem? Is std using some overridden new/delete, or am I doing something wrong?
Cheers,
Allan
|
|
|
|
|
Hi,
There's nothing in the CStdString code which could be causing this. The CStdString code does not dynamically allocate any memory and adds no member data to the base class. If BoundsChecker is reporting no errors, then I'd say it is quite likely you don't have a leak at all, and the "phantom" leaks you are seeing are simply due to the timing of when you call _CrtDumpMemoryLeaks.
Try using the recommended CRT method that causes _CrtDumpMemoryLeaks to be called automatically for you at shutdown. You can find it in the MSDN documentation for the function. To quote the docs:
"The function can be called automatically at program termination by turning on the _CRTDBG_LEAK_CHECK_DF bit field of the _crtDbgFlag flag using the _CrtSetDbgFlag function."
Either that or you really have a memory leak due to a bug. I doubt you are new-ing CStdString objects, so perhaps you are leaking some object that has CStdStrings as members, or one that holds a container of CStdStrings...
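The timing issue can be sketched in portable C++. TrackedString below is a hypothetical stand-in for the CRT's allocation tracking (it only counts live objects), and the three function names are made up for this illustration. The last function shows the flag from the quoted docs; it only does anything on MSVC, and only in debug builds.

```cpp
#include <cassert>
#include <string>
#ifdef _MSC_VER
#include <crtdbg.h>
#endif

// Hypothetical stand-in for the CRT's allocation tracking: counts live objects.
struct TrackedString
{
    static int live;
    std::string text;
    explicit TrackedString(const char* s) : text(s) { ++live; }
    ~TrackedString() { assert(live > 0); --live; }
};
int TrackedString::live = 0;

// Checking while the object is still in scope reports it as "live", much like
// calling _CrtDumpMemoryLeaks while stack-allocated strings are still alive.
inline int LiveCountInsideScope()
{
    TrackedString s("still in scope");
    return TrackedString::live;
}

// Checking after the scope has closed finds nothing left over.
inline int LiveCountAfterScope()
{
    {
        TrackedString s("already destroyed");
    }
    return TrackedString::live;
}

// On MSVC debug builds, ask the CRT to dump leaks by itself at program exit,
// after the destructors of locals have run.
inline void EnableLeakCheckAtExit()
{
#ifdef _MSC_VER
    _CrtSetDbgFlag(_CrtSetDbgFlag(_CRTDBG_REPORT_FLAG) | _CRTDBG_LEAK_CHECK_DF);
#endif
}
```

Calling EnableLeakCheckAtExit() once near the start of the program replaces the explicit _CrtDumpMemoryLeaks call at the end.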
Also (although this has no bearing on your current issue), be sure you have the very latest version of the code. No bugs that could cause such an issue, but I always say this when I respond. You can always get it here:
http://home.earthlink.net/~jmoleary/code/StdString.zip
-Joe
|
|
|
|
|
Hi,
Thanks for your reply. It was my fault, as I guessed, and the leaks were really phony. I had stack-allocated strings in WinMain, and when I called _CrtDumpMemoryLeaks, the destructors hadn't been called yet. And thanks for the tip - I'll use the _CRTDBG_LEAK_CHECK_DF bit now.
Apologies for the hassle.
Regards,
Allan
|
|
|
|
|
Let me know if you have any problems
-Joe
|
|
|
|
|
Passing a CStdString as a variable parameter to FormatV crashes. The reason for this is that _vsnprintf tries to use the actual class object instead of the string's character pointer as the parameter (it doesn't cast it).
Here is the test:
CStdString strTest = _T("Test");
CStdString strScratch;
strScratch.Format(_T("%s"), strTest); //Crash here
|
|
|
|
|
duh I just noticed the other post and the link to your latest files. I will try them out.
-Thanks
|
|
|
|
|
I still have the problem. I don't see what changed to fix it.
|
|
|
|
|
Hi,
Nothing with FormatV() changed. Only Format() changed. I templatized Format in order to allow passing string objects.
You cannot simply pass a string object to FormatV. It doesn't take variable argument lists. Instead, it takes an arglist object which is BUILT from variable argument lists. However, if you're building your own arglist then you're still stuck with the need to cast. No way to templatize that. Sorry. As I mention in the comment, allowing this practice (passing string objects) with Format was a hack designed to get around a dangerous MS hack people have long relied upon.
This has long been an incompatibility between CString and my code. In short, even Microsoft recommends you NOT pass string objects directly to calls to Format(), sprintf() or other such variadic functions. They recommend you first cast them to LPCTSTR. If you do that with the string objects you're passing, everything should be fine.
So instead of this:
CStdString sName("Joe");
CStdString sVal;
sVal.Format("My name is %s", sName);
you should do this:
sVal.Format("My name is %s", (LPCTSTR)sName);
or alternately this:
sVal.Format("My name is %s", sName.c_str());
Please note that this problem is due to a dangerous hack that the CString designers put in which, in my opinion, they never should have done. The only reason CString lets you get away with this practice is because they carefully laid out the binary pattern of the class to enable it. It's a bad habit to get into.
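The same rule applies to any printf-style function, not just Format(). A minimal sketch with plain std::string and std::snprintf follows; the function name DescribeName and the 128-byte buffer are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical example: formats a name through a variadic C function.
inline std::string DescribeName(const std::string& name)
{
    char buffer[128];
    // %s expects a const char*, so hand over the character pointer explicitly
    // instead of the string object itself.
    int written = std::snprintf(buffer, sizeof buffer, "My name is %s", name.c_str());
    assert(written >= 0);  // a negative result would mean a formatting error
    return std::string(buffer);
}
```

Passing `name` instead of `name.c_str()` would compile on many compilers, but the behavior is undefined, because the variadic call has no way to convert the object to a pointer.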
You can find a much fuller discussion of this topic in this feedback thread as long as you set the date filter to go back indefinitely. It's under the thread entitled "Operator[] and other incompatabilities"
-Joe
|
|
|
|
|
Hi Joe,
I meant Format instead of FormatV. I didn't realize it was unsafe to pass a CString object to CString Format. I was excited when I found your class because it is exactly what I need. I don't use MFC, WTL requires its headers on everyone's machine, and the STL string is a mess. It looks like if I want to use it, though, I will have to change a ton of calls and hope that I get them all. I guess I will just stick to what I have and suffer for now.
Thanks
|
|
|
|
|
You say you DID mean Format()? Well then your code SHOULD work. If it's not working I'd like to see it because it's a bug. I definitely made this workaround for calls to Format().
Please send me some sample code which illustrates the problem and I will check it out. You can find my email address in the StdString.h header file
-Joe
|
|
|
|
|
The sample above crashes for me in every test I try. I really don't see what you have done to fix this. The args are passed down to FormatV and FormatV passes them to _vsnprintf and _vsnprintf doesn't do a cast so it crashes.
-Thanks
|
|
|
|
|
I don't see what sample you are talking about. However if the implementation you see simply does what you describe, then you have an outdated version of the code.
OR... I think I may see it now. There are two versions of Format(). One that takes a string literal for the format string:
void Format(const CT* szFmt, ...)
and another that takes a resource ID:
void Format(UINT nId, ...)
Are you calling the one that takes a resource ID? That might be the problem. I only fixed the version that takes the string literal. Sorry, my bad. Just forgetfulness on my part. I'll do the same "fix" with the other version. You can download it here:
http://home.earthlink.net/~jmoleary/code/StdString.zip
Please make sure you have the very latest drop. If you do and you are still having trouble, please email me the code directly at jmoleary@earthlink.net
Thanks,
|
|
|
|
|
One of the most common uses of strings is to concatenate them using operator +=. In general this operation doesn't seem to be very efficient. Here are some of my test results:
For my testing, I concatenate a string 100 chars long 10000 times.
Using MFC's CString as (just pseudo-code)
CString str, s('0', 100);
for (int i = 0; i < 10000; i++)
{
str += s;
}
It takes about 70 seconds on my machine (
It is almost the same result (actually 68 seconds) if std::string is used (
Interestingly, I was quite surprised that the same thing can be done in VB in 49 seconds - who said VB is slower?
I then investigated some other options.
Using C library functions as
// allocate big enough buffer first
char* str = new char[100*10000+1];
str[0] = '\0';
for (int i = 0; i < 10000; i++)
{
strcat(str, s);
}
This takes about 25 seconds - it's much better, at least better than VB. By the way, it's about the same result if memcpy is used.
Back using std::string if I reserve a big enough buffer first like
std::string str;
// allocate big enough buffer first
str.reserve(100*10000);
for (int i = 0; i < 10000; i++)
{
str += s;
}
it is MUCH MUCH MUCH faster. Actually, it takes no time (within 1 second anyway)
Can anybody explain the difference between C and std?
Anyway, it seems that the memory allocation is the most expensive operation here.
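One way to see the effect of reserve() without a stopwatch is to count how often the string's capacity changes during the loop. This is only a sketch: the helper name CountCapacityChanges is made up, and a capacity change is used as a proxy for a reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends `piece` to an empty string `times` times and
// returns how many times the capacity changed, i.e. how often it reallocated.
inline std::size_t CountCapacityChanges(std::size_t times,
                                        const std::string& piece,
                                        bool reserveFirst)
{
    assert(!piece.empty());
    std::string str;
    if (reserveFirst)
        str.reserve(times * piece.size());  // allocate a big enough buffer first

    std::size_t changes = 0;
    std::size_t lastCapacity = str.capacity();
    for (std::size_t i = 0; i < times; ++i)
    {
        str += piece;
        if (str.capacity() != lastCapacity)
        {
            ++changes;
            lastCapacity = str.capacity();
        }
    }
    return changes;
}
```

With reserveFirst set, the count is zero for the 100-char, 10000-iteration test above, because reserve() guarantees the capacity up front. Without it, the count depends on the growth policy of the library in use.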
While there is a reserve function in std that lets you allocate a big chunk of memory for a big string, I find it inconvenient to use because in many cases you don't know how big a buffer you really need. If the reserved memory turns out to be smaller than needed, all remaining concatenations will suffer for the same reason.
A better solution to this, in my opinion, is to define a grow size. I can then set a bigger grow size if I know the string will be concatenated many times. In an application with many string concatenations, it could save a lot of memory reallocations (and thus time) if the grow size is set properly. It can improve the application's overall performance.
MFC's CArray has such a feature. The SetSize function has an optional second parameter nGrowSize which can be very useful if working with a very large array. Unfortunately, CString doesn't have it. And none of the std containers has such a feature (right?).
Therefore, I extended CStdString to have this wrapper function
MYTYPE& Append(CT ch, int nGrowSize = 128)
{
if (this->capacity() < this->size()+1)
this->reserve(this->capacity() + nGrowSize);
return (*this += ch);
}
MYTYPE& Append(PCMYSTR sz, int nGrowSize = 1024)
{
if (this->capacity() < this->size()+sslen(sz))
this->reserve(this->capacity() + nGrowSize);
return (*this += sz);
}
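The same idea can be sketched as a free function over std::string, which makes it easy to try outside of CStdString. The name AppendWithGrow is hypothetical. One deliberate difference from the wrappers above, and an assumption of this sketch rather than anything from the original code: it reserves at least as much as the append needs, so a piece longer than the grow size does not trigger a second reallocation inside +=.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical free-function variant of the Append wrapper shown above.
inline std::string& AppendWithGrow(std::string& text,
                                   const char* piece,
                                   std::size_t growSize = 1024)
{
    assert(piece != nullptr);
    const std::size_t needed = text.size() + std::strlen(piece);
    if (text.capacity() < needed)
    {
        // Grow by growSize, but never by less than this append needs.
        text.reserve(std::max(needed, text.capacity() + growSize));
    }
    return text += piece;
}
```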
But then I have to use Append instead of +=, which I'm a fan of.
Does anybody have a better idea or any comments?
|
|
|
|
|
Like most overloaded operators, operator += is just syntactic sugar. A nicety to make for less typing. Semantically, it is exactly the same thing as the basic_string::append function.
I guess what you want is a version of the operator that would (like your new Append function) allow you to specify the grow length. I guess that would technically be a ternary operator. It certainly wouldn't be operator +=.
An alternative might be to define a string class with a different std::allocator object that ensured memory would be allocated in the chunk size you specify. The basic_string template takes three arguments: A character type, a traits type, and an allocator type.
typedef basic_string<char, char_traits<char>, CMyAllocator> CMyString;
where 'CMyAllocator' is a class defined by you for allocating characters in specific chunks. It would have to follow all the semantics of std::allocator.
One problem with this approach is that you would need to know that size at compile time, not runtime, unless you designed some changeable chunksize setting into your allocator that could be set at runtime. And furthermore, such a string class would not be interchangeable with std::string or std::wstring -- technically it's a different C++ type.
If you look at the definition of my template ('CStdStr') you'll see I did NOT put these arguments into the definition. The only thing that one can specify to my template is the character type. I then derive from the 'default' implementation of basic_string, given that character type.
I did this for a couple of reasons:
1. It keeps the template name short in the debugging information.
2. I was trying to design something that was interchangeable with the existing specializations of basic_string, std::string and std::wstring
Still, it's easy enough to change. Just alter the template definition for CStdStr to take these extra two arguments and supply defaults for them, just like basic_string does. Your template should derive from basic_string but now supply all three of these arguments. Then write your own allocator with these capabilities and pass it in as the argument you want. Just remember, whatever class you instantiate from this will NOT have an "is-a" relationship with std::string or std::wstring.
Another approach might be to write your own version of operator+= for CStdString. It would check some changeable "grow-size" setting inside the CStdString object being appended to. You would then provide member functions to change this setting. However, this approach would entail either a) adding a new member variable to CStdString to hold this grow size -- very, very bad OR b) adding some global variable/static member to hold the setting -- also very bad.
Seems like an awful lot of work just to avoid calling reserve(), doesn't it? I'm all for syntactical niceties myself, but sometimes, you just gotta do the extra work, I think.
-Joe
|
|
|
|
|
A bunch of things can be slow when appending a string:
1) Allocating new memory (and releasing old one).
2) Copying the data from the old memory location to the new one.
3) Scanning for the end of the string (when the string is long) to compute its length.
The C version will suffer from problem #3.
On the other hand, the STD version will suffer from problems #1 and #2. But both of these problems vanish if reallocation is avoided (by pre-allocating memory).
I'm not sure how memory allocation is done for std::string, but according to your performance analysis, I would guess that it uses a small fixed grow size (probably 8, 16 or 32 characters or something like that).
A better way for performance is to double (or multiply by 1.5) the size instead of adding a fixed amount to it.
In applications I work on, we used to specify a small grow size (to conserve memory) at the beginning, but we found that it was very slow when a lot of data was added to the container.
What you should do is reserve a lot of memory and then copy that item into a new one if you want to conserve memory. For example, if you want to add your string to a std::vector, you could do something similar to:
std::vector<std::string> Container;
std::string Buffer;
Buffer.reserve(100000);
for (int i = 0; i < 10000; i++) Buffer += "More text...";
std::string Copy(Buffer);
Container.push_back(Copy);
You may also use the swap trick: create a copy (low memory overhead) and then swap it with the original (high memory overhead), so that the object that is kept for a long period of time won't waste memory after its initialisation (assuming that strings are seldom changed after initialisation).
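A minimal sketch of the swap trick follows. The helper name ShrinkWithSwap is made up, and the amount of capacity given back is an assumption: the standard does not promise that a copy has a smaller capacity than the original, although common implementations allocate roughly the size of the contents.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: trims excess capacity by swapping with a fresh copy.
inline void ShrinkWithSwap(std::string& text)
{
    const std::string::size_type before = text.size();
    // The temporary copy is sized for the contents only. After the swap, `text`
    // owns the small buffer and the temporary frees the large one.
    std::string(text).swap(text);
    assert(text.size() == before);  // contents are unchanged
}
```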
The reason that reserve is included in std::string and std::vector is exactly because avoiding reallocation and, more importantly, data copying can have a big effect on performance in situations similar to yours. When a lot of data is appended, you should reserve memory for better performance.
For typical uses of std::string, when the resulting string is not so long (generally well under 1k), the overhead won't matter for most applications.
Note that containers like std::vector typically use a multiplicative increment, so they are less affected by reallocation. For 1000000 appends, it would take about 20 memory allocations (and data copies). In many cases, I specify the size for a vector only when I do know it (or know an upper bound and do not bother with wasted space - this typically happens when copying data with some filtering).
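The figure of about 20 allocations can be checked with a small simulation. This sketch does not touch any real container; it only models the two growth policies, and the function names, the starting capacity of 1 for doubling, and the step of 16 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: number of reallocations when capacity doubles on demand.
inline std::size_t GrowthsWithDoubling(std::size_t appends)
{
    std::size_t capacity = 1, size = 0, growths = 0;
    for (std::size_t i = 0; i < appends; ++i)
    {
        if (size == capacity) { capacity *= 2; ++growths; }
        ++size;
    }
    return growths;
}

// Hypothetical model: number of reallocations when capacity grows by a fixed step.
inline std::size_t GrowthsWithFixedStep(std::size_t appends, std::size_t step)
{
    assert(step > 0);
    std::size_t capacity = step, size = 0, growths = 0;
    for (std::size_t i = 0; i < appends; ++i)
    {
        if (size == capacity) { capacity += step; ++growths; }
        ++size;
    }
    return growths;
}
```

For 1000000 single-element appends, the doubling model reallocates 20 times, while a fixed step of 16 reallocates 62499 times, and each of those reallocations also copies everything stored so far.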
Philippe Mori
|
|
|
|
|
The first version, which uses the CString class, is the slowest because it must allocate memory whenever you += characters.
The second version, which uses the C runtime library, is better because it doesn't have to allocate memory. But inside the strcat() function, it must calculate the length of the original string every time you append. This may take some time if the string is very long.
In the third version, the string class keeps the length of the original string inside the class. So when you += a new string, it doesn't need to call strlen() to calculate the original string's length. That is why it is the fastest code.
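The C version can avoid the repeated length scan too, by remembering where the string ends. A minimal sketch follows; the helper name AppendTracked and its parameters are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `piece` at a remembered end offset, so the
// existing text is never rescanned the way strcat() rescans it.
inline void AppendTracked(char* buffer, std::size_t capacity,
                          std::size_t& length, const char* piece)
{
    const std::size_t pieceLength = std::strlen(piece);
    assert(length + pieceLength < capacity);  // leave room for the terminator
    std::memcpy(buffer + length, piece, pieceLength + 1);  // +1 copies the '\0'
    length += pieceLength;
}
```

Each call costs time proportional to the piece being appended, not to the text already in the buffer, which is what the string class gets by storing its length.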
|
|
|
|
|
Hi,
I'm trying to compile this using vc5, and I'm getting errors like the ones listed below. Do you have any idea how I could get this working?
Thanks,
T
...stdstring.h(3236) : error C2908: explicit specialization; 'FmtArg<class CStdStr<char> >' has already been specialized from the primary template
...stdstring.h(3243) : error C2908: explicit specialization; 'FmtArg<class CStdStr<unsigned short> >' has already been specialized from the primary template
...stdstring.h(3251) : error C2242: typedef name cannot follow class/struct/union
...stdstring.h(3251) : error C2908: explicit specialization; 'FmtArg<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >' has already been specialized from the pri
|
|
|
|
|
VC5 ???? VC has always had terrible C++ support, even .NET is missing stuff every other C++ compiler supports. I'd suggest that this code is valid C++ and your compiler cannot understand it, because Microsoft suck at standards support.
Buy a new compiler - VC6 must be easy to get 2nd hand now that .NET is out.....
Christian
We're just observing the seasonal migration from VB to VC. Most of these birds will be killed by predators or will die of hunger. Only the best will survive - Tomasz Sowinski 29-07-2002 ( on the number of newbie posters in the VC forum )
|
|
|
|
|
First, Joe, like most of the people here, I wanted to congratulate you on one great class. CStdString is sharp, and I appreciate you making it available.
However, I have run into one oddity. I'm one of those who like to compile with the warning level set to 4 in MSVC 6.0. Doing so led me to an apparent odd dependency in StdString.h. The following code compiles with a series of minor warnings:
#include <atlbase.h>
#include <yvals.h>
#include "StdString.h"
I directly suppressed the warnings with a pragma block around StdString.h:
#pragma warning (push)
#pragma warning (disable: 4511 4663 4018 4100 4146 4244 4512)
... The body of StdString.h ...
#pragma warning (pop)
and got a clean compile. Then I included StdString.h without first including atlbase.h and yvals.h. I received a series of warnings I felt should have been suppressed by the pragma warning disable above. The warnings appear to be coming from locale, so I played with the SS_NOLOCALE macro. The number and location of warnings changed, but I was unable to create a clean compile without adding atlbase.h and yvals.h before StdString.h.
Admittedly, this situation is strictly cosmetic, but it is a touch mysterious to me. Reading yvals.h did not prove informative to me. Has anyone else seen this behavior? Do you have a solution, an explanation, a recommendation, or at least a good joke?
Thanks again for the great work,
cagey
|
|
|
|
|
Hi,
The fact is yvals.h is one of the oddest headers ever created by MS. It actually enables some warnings explicitly through its own #pragma warning directives. If you search through it, it seems to only enable a few, but I'd swear that it enables more than what appears there. In particular, it somehow seems to enable 4786.
The only way I've ever managed to work cleanly with it is to use the following trick:
1. Disable all warnings I want to disable
2. #include <yvals.h>
3. Re-disable all those same warnings
I have a utility library I use, the first few lines of the StdAfx.h look like this:
#pragma warning(disable: 4786) // symbolic name too long
#pragma warning(disable: 4201) // nonstandard extension used
#pragma warning(disable: 4511) // private copy constructors are good to have
(...etc...)
#include <yvals.h> // now #include the evil yvals.h and do it again
#pragma warning(disable: 4786) // symbolic name too long
#pragma warning(disable: 4201) // nonstandard extension used
#pragma warning(disable: 4511) // private copy constructors are good to have
(...etc...)
That's the only way I've ever found to reliably disable the warnings I want to. They must be disabled before and after the very first yvals.h is included.
There was a big discussion about this recently on the Yahoo newsgroup WinTechOffTopic. Some people use a trick of including one of the iostreams headers first instead of this (but I suspect that only works because those headers end up including yvals.h).
Regardless, <yvals.h> is the culprit. I've been dealing with this problem for years and I still don't understand how it re-enables warnings I've disabled when I can find no #pragmas for them. But it does. And this trick is the only way I've found to make those warnings go away every time.
Give this trick a shot and let me know how it works. If you prefer you can email me directly. My address is in the StdString.h header file.
Also, make sure you're using the very latest version. You can always grab it here:
http://home.earthlink.net/~jmoleary/code/StdString.zip
-Joe
|
|
|
|
|
For years, one huge incompatibility between my CStdString and MFC's CString has been that you could pass CString objects to the CString::Format() function to fill in "%s" format specifiers, but with my class you could not.
In other words, with CString you could do this:
CString name("Joe");
CString val;
val.Format("My name is %s ", name);
But if you used CStdString (my class) in that example, the call to Format() would crash.
Well, I am happy to say I have FINALLY figured out a way to work around this incompatibility. You can now pass a CStdString to Format() with no problems.
Important Note: The previous incompatibility still exists for other variadic functions like sprintf() and the like; this only fixes it for Format(). My previous recommendation about using alternatives to variadic functions still applies. I just figured I'd get as much compatibility as I could, since the compiler doesn't (and can't) warn you either way.
Grab the latest code which fixes this incompatability here:
http://home.earthlink.net/~jmoleary/code/StdString.zip
Previously I couldn't do it because MFC's way to make it work relied upon the binary layout of the class. This was something I had no control over as my class derives from whatever implementation of basic_string is available.
But then I figured out a way to selectively apply strong typing to this function using a simple template trick. It did require me to overload the function based on number of arguments (an AWFUL lot of typing and so the file got a lot bigger). But the good news is that this should not bloat your runtime executables much as the functions are all templates that are inline and merely call through to an underlying format function.
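The general shape of such a trick can be sketched as follows. This is NOT the code in StdString.h: the names FmtArgSketch, FormatRaw and FormatSketch, the use of std::string instead of CStdString, and the fixed 256-byte buffer are all assumptions made for illustration. The idea shown is one template overload per argument count, each passing its arguments through an adapter that turns string objects into character pointers before the variadic call.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical adapter: most types pass through unchanged...
template <typename T>
struct FmtArgSketch
{
    explicit FmtArgSketch(const T& v) : value(v) {}
    const T& operator()() const { return value; }
    const T& value;
};

// ...but a string object is converted to the pointer that %s expects.
template <>
struct FmtArgSketch<std::string>
{
    explicit FmtArgSketch(const std::string& s) : value(s) {}
    const char* operator()() const { return value.c_str(); }
    const std::string& value;
};

// The underlying variadic formatter (output is truncated at 255 characters).
inline std::string FormatRaw(const char* fmt, ...)
{
    assert(fmt != nullptr);
    char buffer[256];
    va_list args;
    va_start(args, fmt);
    std::vsnprintf(buffer, sizeof buffer, fmt, args);
    va_end(args);
    return std::string(buffer);
}

// One strongly typed overload per argument count; each merely calls through.
template <typename A1>
std::string FormatSketch(const char* fmt, const A1& a1)
{
    return FormatRaw(fmt, FmtArgSketch<A1>(a1)());
}

template <typename A1, typename A2>
std::string FormatSketch(const char* fmt, const A1& a1, const A2& a2)
{
    return FormatRaw(fmt, FmtArgSketch<A1>(a1)(), FmtArgSketch<A2>(a2)());
}
```

With this in place, FormatSketch("My name is %s", std::string("Joe")) is safe, because the string never reaches the variadic call as an object. Supporting more arguments means writing more overloads, which matches the point above about the file getting bigger.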
Anyone who has any questions, email me. My email address is at the top of the code header file.
-Joe
|
|
|
|
|