I have written a program to extract the following data from a log file. Each line has a date and a key value. For each key, I want to extract the oldest date.
2017-03-18 , INBIOS_ABZ824
2017-03-19 , INBIOS_ABZ824
2017-03-12 , INDROS_MSR656
2017-03-17 , INDROS_MSR656
2017-04-12 , INOS_GSN848
2017-04-19 , INOS_GSN848
What would be the best approach? Could you please suggest one?
The final output needs to look like this:
2017-03-18 , INBIOS_ABZ824
2017-03-12 , INDROS_MSR656
2017-04-12 , INOS_GSN848
Please share your thoughts.
What I have tried:
import os
import re

# Matches keys such as INBIOS_ABZ824: letters ending in "OS", an underscore, letters, then digits.
line_regex = re.compile(r"[A-Z]+OS_[A-Z]+[0-9]+", re.IGNORECASE)

output_filename = os.path.normpath("output.log")

# Mode "w" truncates the file on open, so no separate write("") pass is needed.
with open(output_filename, "w") as out_file:
    with open("ServerError.txt", "r") as in_file:
        for line in in_file:
            if line_regex.search(line):
                # The key is expected to be the last space-separated token on the line.
                last_ndx = line.rfind(' ')
                matchObj = line_regex.match(line[last_ndx + 1:])
                if matchObj:
                    # Keep the first 23 characters of the line plus the trailing key.
                    print(line[:23] + line[last_ndx:])
                    out_file.write(line[:23] + line[last_ndx:])
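
One possible way to get the oldest date per key is to keep a dictionary that maps each key to the smallest date seen so far. The sketch below assumes the extracted lines look exactly like the sample above ("YYYY-MM-DD , KEY"), so the dates can be compared as plain strings. The names oldest_date_per_key and sample are made up for illustration and are not part of the program above.

```python
def oldest_date_per_key(lines):
    """Return {key: oldest date string} for lines shaped like 'YYYY-MM-DD , KEY'."""
    oldest = {}
    for line in lines:
        if "," not in line:
            continue  # skip blank or malformed lines
        date_str, key = (part.strip() for part in line.split(",", 1))
        # Assumption: dates are zero-padded ISO (YYYY-MM-DD), so string
        # comparison gives the same order as chronological comparison.
        if key not in oldest or date_str < oldest[key]:
            oldest[key] = date_str
    return oldest


# Hypothetical sample data, copied from the extracted lines in the question.
sample = [
    "2017-03-18 , INBIOS_ABZ824",
    "2017-03-19 , INBIOS_ABZ824",
    "2017-03-12 , INDROS_MSR656",
    "2017-03-17 , INDROS_MSR656",
    "2017-04-12 , INOS_GSN848",
    "2017-04-19 , INOS_GSN848",
]

for key, date_str in oldest_date_per_key(sample).items():
    print(f"{date_str} , {key}")
```

If the dates in the real log are in some other format, parsing them with datetime before comparing would probably be safer than comparing strings.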