Hello,

I want a much faster way to filter all emails that end with a particular character in a big text file (more than 200 MB) using C#.

What I have tried:

var lines = File.ReadAllLines(filePath);

foreach (var line in lines)
{
    if (line.EndsWith(myWord))
    {
        outputEmails.Text += line + Environment.NewLine;
    }
}
Posted
Updated 25-Apr-17 16:10pm

1 solution

Well, you can make it a bit faster by replacing that string concatenation with a StringBuilder. Why? Because strings in .NET are immutable. Once created, you cannot change them. What that concatenation actually does is create a new string object every time you append something, copying everything accumulated so far into it.
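A minimal sketch of the StringBuilder suggestion. The class and method names (EmailFilter, FilterLines) are made up for illustration. The ordinal comparison is an extra assumption on my part: it skips the culture-sensitive rules that EndsWith applies by default, which should not matter for email addresses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EmailFilter
{
    // Collects every line ending with `suffix` into one string,
    // appending to a StringBuilder instead of concatenating strings.
    public static string FilterLines(IEnumerable<string> lines, string suffix)
    {
        var sb = new StringBuilder();
        foreach (var line in lines)
        {
            // Assumption: an ordinal (character-by-character) match is what is wanted.
            if (line.EndsWith(suffix, StringComparison.Ordinal))
            {
                sb.AppendLine(line);
            }
        }
        return sb.ToString();
    }
}
```

With this you would assign the result to outputEmails.Text once, after the loop, rather than updating the control on every match.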

You could also probably pick up a very minor speed improvement by replacing .ReadAllLines with a StreamReader and processing the file one line at a time. You'll also avoid holding the whole 200MB file in memory, and possibly some page swapping.
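Reading line by line could look roughly like this. StreamingEmailFilter and FilterFile are hypothetical names, and the sketch assumes the file's encoding is one StreamReader handles by default (UTF-8).

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamingEmailFilter
{
    // Reads the file one line at a time, so only the current line
    // and the accumulated matches are held in memory.
    public static string FilterFile(string path, string suffix)
    {
        var sb = new StringBuilder();
        using (var reader = new StreamReader(path))
        {
            string line;
            // ReadLine returns null at end of file.
            while ((line = reader.ReadLine()) != null)
            {
                if (line.EndsWith(suffix, StringComparison.Ordinal))
                {
                    sb.AppendLine(line);
                }
            }
        }
        return sb.ToString();
    }
}
```

File.ReadLines(path) is a shorter alternative that also enumerates the file lazily instead of loading it all at once.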

If you're looking for a single character at the end of each line, you might be able to eke out another boost by treating each line, which is a string, as an array of char and looking only at the last index for that character. This avoids some of the overhead of the .EndsWith method: you're writing something that specializes in checking the last character instead of using the more general method that matches an entire string of characters.
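The last-character check might be sketched as below. LastCharCheck and EndsWithChar are names invented here. The string indexer is used directly, so no char array copy is made, and the length test guards against empty lines.

```csharp
public static class LastCharCheck
{
    // True when `line` is non-empty and its final character equals `c`.
    public static bool EndsWithChar(string line, char c)
    {
        return line.Length > 0 && line[line.Length - 1] == c;
    }
}
```

In the loop, the test would then be if (LastCharCheck.EndsWithChar(line, myChar)), where myChar is the single character being searched for.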

Other methods to make it faster involve taking the time to pre-process the text file and create an index. Building the index will take about as long as the scan you're doing now. After that, it's easy and much faster to get the lines you need by using the index. But I don't think that's what you're talking about.
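The answer does not say what form such an index would take, so the following is only one possible shape, under several assumptions: the file is ASCII (or UTF-8 with no byte order mark and single-byte final characters), lines end in \n or \r\n, and the index fits in memory. LineIndex, Build, and ReadLines are hypothetical names. The index maps the last byte of each line to the byte offsets where those lines start.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class LineIndex
{
    // One pass over the file: for each line, record its start offset
    // under the line's last byte (ignoring the line terminator).
    public static Dictionary<byte, List<long>> Build(string path)
    {
        var index = new Dictionary<byte, List<long>>();
        using (var stream = new BufferedStream(File.OpenRead(path)))
        {
            long position = 0;   // offset of the next byte to read
            long lineStart = 0;  // offset where the current line begins
            int last = -1;       // last non-CR byte on the current line, -1 if none
            int b;
            while ((b = stream.ReadByte()) != -1)
            {
                position++;
                if (b == '\n')
                {
                    Record(index, last, lineStart);
                    lineStart = position;
                    last = -1;
                }
                else if (b != '\r')
                {
                    last = b;
                }
            }
            Record(index, last, lineStart); // final line without a trailing newline
        }
        return index;
    }

    private static void Record(Dictionary<byte, List<long>> index, int last, long lineStart)
    {
        if (last < 0) return; // empty line: nothing to index
        List<long> offsets;
        if (!index.TryGetValue((byte)last, out offsets))
        {
            offsets = new List<long>();
            index[(byte)last] = offsets;
        }
        offsets.Add(lineStart);
    }

    // Seeks to each recorded offset and reads that line back (ASCII assumed).
    public static List<string> ReadLines(string path, List<long> offsets)
    {
        var result = new List<string>();
        using (var stream = File.OpenRead(path))
        {
            foreach (long offset in offsets)
            {
                stream.Seek(offset, SeekOrigin.Begin);
                var bytes = new List<byte>();
                int b;
                while ((b = stream.ReadByte()) != -1 && b != '\n')
                {
                    if (b != '\r') bytes.Add((byte)b);
                }
                result.Add(Encoding.ASCII.GetString(bytes.ToArray()));
            }
        }
        return result;
    }
}
```

This only pays off if the same file is queried repeatedly; for a single pass, the streaming approach does the same amount of reading with less machinery.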
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


