Comments by Jim Horvath (Top 1 by date)

Jim Horvath 22-Jun-11 20:15pm
Don't be too horrified by string.Concat using + to concatenate two items. Using + between _only_ two (2) objects is okay! The performance problem can arise when you use + between _more than 2 items_. Even then, with strings this should rarely be an issue in modern .NET, because the compiler optimizes many long chains of string additions (read to the end for the proof). But first: if you understand a) string immutability (a string object can't be changed once it's created) and b) how chained + expressions are evaluated, then the following should make sense. Given:

string str1 = "1";
string str2 = "2";
string str3 = "3";

then this statement creates one new string and stores it in the variable "combined":

string combined = str1 + str2;

but this next statement, evaluated one + at a time, creates 2 new strings before storing the result in combined, and one of them is immediately garbage that needs to be collected!

string combined = str1 + str2 + str3;

One of them is immediately garbage because:
- first str1 and str2 are concat'd, making a string I'll call TEMPSTRING.
- then TEMPSTRING is concat'd with str3 and the result is stored in combined.

This is what makes + with strings potentially inefficient: the TEMPSTRING needed by a chained + is garbage that has to be collected. It can be avoided if you use string.Concat or StringBuilder. With a fourth string, str4, this example theoretically makes TWO temporary strings that are immediately garbage:

string combined = str1 + str2 + str3 + str4;

Breakdown:
- TEMPSTRING1 = str1 + str2
- TEMPSTRING2 = TEMPSTRING1 + str3
- combined = TEMPSTRING2 + str4
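
The breakdown above can be written out by hand to make the temporaries visible. This is only a sketch of the naive, one-+-at-a-time evaluation, not what the compiler actually emits; the class name ConcatBreakdown and the method names are mine, made up for illustration.

```csharp
using System;

public static class ConcatBreakdown
{
    // Naive evaluation: each + is its own statement, so each one
    // allocates a new string. The two temporaries are garbage as
    // soon as the final result has been built.
    public static string StepByStep(string str1, string str2, string str3, string str4)
    {
        string tempString1 = str1 + str2;        // TEMPSTRING1
        string tempString2 = tempString1 + str3; // TEMPSTRING2
        return tempString2 + str4;               // combined
    }

    // One call that receives all four strings and returns the result
    // directly, with no intermediate strings in this code.
    public static string SingleCall(string str1, string str2, string str3, string str4)
    {
        return string.Concat(str1, str2, str3, str4);
    }
}
```

Both methods return the same text; the difference is only in how many throwaway strings are allocated along the way.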

If you've read this far then you must be interested. :) Besides + between 2 strings being okay, there are many cases where the C# compiler will replace a chain of + expressions between strings with a single call to string.Concat for you! For proof, compile the following code and open it in Reflector, or put it into LINQPad and look at the "IL" tab. You'll see that the last statement, which adds 3 strings, is replaced by one call to string.Concat:

string str1 = "1";
string str2 = "2";
string str3 = "3";
string str1And2And3 = str1 + str2 + str3;
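
In other words, the last statement should behave as if you had written the string.Concat call yourself. Here is a small sketch to compare the two forms side by side; the class name ConcatEquivalent and its method names are hypothetical, and the IL you see may differ between compiler versions, so check it with the tools above rather than taking my word for it.

```csharp
using System;

public static class ConcatEquivalent
{
    // What you write: a chained + between three strings.
    public static string WithPlus(string str1, string str2, string str3)
    {
        return str1 + str2 + str3;
    }

    // What the chained + is expected to compile down to:
    // a single call to the three-argument string.Concat overload.
    public static string WithConcat(string str1, string str2, string str3)
    {
        return string.Concat(str1, str2, str3);
    }
}
```

If you compile both methods and compare their IL, that is a quick way to confirm the optimization for the compiler version you are using.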

string.Concat has overloads that take 2, 3, and 4 strings. If you use + between more than 4 strings, the compiler will simply allocate an array, put all your strings in it, and pass that to the string.Concat overload that takes a string array. As long as you are using + between string objects, the compiler does a lot of work to avoid creating less-than-optimal code. It may even do it if you're using + between strings and non-strings. I haven't checked. Overall, my point is:

- When using a modern .NET compiler, you don't need to avoid + between strings as much as was recommended back in the early days of .NET. In the earlier days of .NET it was a very valid concern, but I know 3.5 and 4.0 do these optimizations of string concatenation for you. When in doubt, make a test assembly with the version you are using and use Reflector or JustDecompile to see the results - in my testing with 3.5 and 4.0, you don't even need to do a "Release" optimized build - it does it in "Debug" builds too.

- Be sensible - if you've got a 100-line function that builds a string, then use StringBuilder. The compiler can/will only optimize obvious cases.

- Use a profiler to find cases where your code is slow, but do it later! First make the code work, and then make it right. In the words of Donald Knuth: "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
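
For the StringBuilder case in the second point above, here is a minimal sketch of building a string from many pieces, which is the kind of code the compiler can't collapse into one string.Concat call. The class StringBuilderSketch and the method JoinLines are hypothetical names for illustration only.

```csharp
using System.Collections.Generic;
using System.Text;

public static class StringBuilderSketch
{
    // Appends every line plus a newline into one growing buffer,
    // then produces the final string once at the end.
    public static string JoinLines(IEnumerable<string> lines)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string line in lines)
        {
            sb.Append(line).Append('\n');
        }
        return sb.ToString();
    }
}
```

Using += inside the loop instead would create a new string on every pass, which is exactly the TEMPSTRING garbage described earlier.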