Hi,
Our system has multiple services, and each service has its own log folder.
The logs are very large; a single file can end up anywhere from 40 MB to 2 GB.
Each row in the log starts with a timestamp in this format: 21.07.2020-16.40.22
We wrote an app that searches across all folders for the relevant rows,
given a specific date and time range.
Let's say I want to extract all log rows from October 12, 2020 between 10:00 and 13:00.
The system first filters the log files in all folders by creation time and last write time.
Then, for each log file that passes the filter, we look for the relevant lines within the time frame.
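The file-level filtering step could be sketched roughly as follows. This is only an illustration under assumptions, not the actual application code: the LogFileFilter class, its method names, and the "*.log" search pattern are hypothetical, and it assumes a file is a candidate when the span between its creation time and last write time overlaps the requested window.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LogFileFilter
{
    // Hypothetical helper: true when the file's lifetime [created, lastWrite]
    // overlaps the requested window [from, to].
    public static bool Overlaps(DateTime created, DateTime lastWrite, DateTime from, DateTime to)
    {
        return created <= to && lastWrite >= from;
    }

    // Hypothetical helper: walks every sub-folder of rootFolder and keeps only
    // the files whose timestamps overlap the window. "*.log" is an assumed pattern.
    public static List<string> FindCandidateLogs(string rootFolder, DateTime from, DateTime to)
    {
        return Directory
            .EnumerateFiles(rootFolder, "*.log", SearchOption.AllDirectories)
            .Where(path => Overlaps(
                File.GetCreationTime(path),
                File.GetLastWriteTime(path),
                from,
                to))
            .ToList();
    }
}
```

With a window of October 12, 2020 10:00 to 13:00, a file created on October 11 and last written on October 13 would be kept, while a file last written on October 11 would be skipped without being opened.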
The line date is verified using the following method:
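// Requires: System, System.Globalization, System.Text.RegularExpressions
// Built once instead of on every call. The dots are escaped and the
// minutes/seconds are limited to 00-59.
private static readonly Regex TimestampRegex =
    new Regex(@"\b\d{2}\.\d{2}\.\d{4}\b-[012]?[0-9]\.[0-5][0-9]\.[0-5][0-9]");

private bool IsLineInTimeFrame(long lineNumber, string filePath)
{
    string line = ReadSpecificLine(filePath, lineNumber);
    if (line == null)
    {
        // The line does not exist or the file could not be read.
        return false;
    }

    // Convert the bounds once; an invalid value now throws instead of being silently swallowed.
    DateTime from = Convert.ToDateTime(SelectedDateFrom);
    DateTime to = Convert.ToDateTime(SelectedDateTo);

    // The timestamp contains no spaces, so the whole line can be matched without splitting it.
    foreach (Match m in TimestampRegex.Matches(line))
    {
        DateTime dt;
        if (DateTime.TryParseExact(m.Value, "dd.MM.yyyy-HH.mm.ss", CultureInfo.InvariantCulture, DateTimeStyles.None, out dt))
        {
            Logger.LogWriter.LogInstance.LogWrite($"{m.Value}");
            if (dt >= from && dt <= to)
            {
                return true;
            }
        }
    }
    return false;
}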
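// Requires: System.IO
// Note: this reads from the start of the file on every call, so the cost grows with lineNumber.
string ReadSpecificLine(string filePath, long lineNumber)
{
    string content = null;
    try
    {
        using (StreamReader file = new StreamReader(filePath))
        {
            // Skip the lines before the requested one (long, to match lineNumber's type).
            for (long i = 1; i < lineNumber; i++)
            {
                file.ReadLine();
                if (file.EndOfStream)
                {
                    Logger.LogWriter.LogInstance.LogWarning($"End of file. The file only contains {i} lines.");
                    break;
                }
            }
            // Returns null if the end of the file was already reached.
            content = file.ReadLine();
        }
    }
    catch (IOException ex)
    {
        Logger.LogWriter.LogInstance.LogError(ex);
    }
    return content;
}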
Now here is the question:
Assuming a file has 150K lines (or even more), what is the best way to find the first line starting with a given date and time without iterating through the whole file?
What I have tried:
...................................................................