Over the last couple of days I've seen numerous examples of people posting about how to count words in a sentence. Disturbingly, these postings suggest counting the number of spaces in the sentence and using that as the basis of a word count.
You may be asking why this is a problem. Well, consider the following sentence:
The total number of words \t  in this sentence,is 10.
As you can see, simply counting spaces isn't going to work. There are the special characters (the \t) to take care of, the multiple spaces, and the words separated by a comma without a space. So, if counting spaces doesn't work, what does? The answer is to use a regular expression, and you are going to love how simple it is. A single pattern matches words and takes care of all the guff demonstrated above: \w+, which matches one or more consecutive word characters (letters, digits and the underscore). Each match is one word, so the number of matches is the word count. Here's a quick sample:
// Requires: using System.Text.RegularExpressions;
Regex regex = new Regex(@"\w+"); // one or more word characters; no options are needed for this pattern
string input = "The total number of words \t  in this sentence,is 10.";
MatchCollection matches = regex.Matches(input);
int wordCount = matches.Count; // one match per word
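To see the difference side by side, here is a minimal sketch that runs both approaches over the example sentence. The class and method names (WordCountSketch, CountBySpaces, CountByRegex) are made up for illustration, and "count the spaces and add one" is only an assumption about how the space-counting postings turn spaces into a word count.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class WordCountSketch
{
    // Naive approach (assumed form): count the space characters and add one.
    public static int CountBySpaces(string text) =>
        text.Count(c => c == ' ') + 1;

    // Regex approach: every run of word characters (\w+) is one word.
    public static int CountByRegex(string text) =>
        Regex.Matches(text, @"\w+").Count;

    public static void Main()
    {
        // The example sentence: a tab, doubled spaces, and a comma with no space after it.
        string input = "The total number of words \t  in this sentence,is 10.";

        // Thrown off by the extra spaces and the missing space after the comma.
        Console.WriteLine("Spaces: " + CountBySpaces(input));

        // Counts the ten words regardless of how they are separated.
        Console.WriteLine("Regex:  " + CountByRegex(input));
    }
}
```

The static Regex.Matches(input, pattern) overload is used here only to keep the sketch short; constructing a Regex instance once, as in the sample above, is the better fit if you are counting words in many strings.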