I have written a program to extract the following data from a log file. Each line has a date and a key value. For each key, I want to extract the oldest date.

2017-03-18 , INBIOS_ABZ824
2017-03-19 , INBIOS_ABZ824
2017-03-12 , INDROS_MSR656
2017-03-17 , INDROS_MSR656
2017-04-12 , INOS_GSN848
2017-04-19 , INOS_GSN848

What would be the best approach? Could you please suggest one?

The final output needs to look like this:

2017-03-18 , INBIOS_ABZ824
2017-03-12 , INDROS_MSR656
2017-04-12 , INOS_GSN848

Please share your thoughts.

What I have tried:

Python
import os
import re

# Regex used to match relevant log lines (in this case, a request ID such as INBIOS_ABZ824)
line_regex = re.compile(r"[A-Z]+OS_[A-Z]+[0-9]+", re.IGNORECASE)


# Output file, where the matched log lines will be copied to
output_filename = os.path.normpath("output.log")

# Open output file in 'write' mode, which truncates any existing content
with open(output_filename, "w") as out_file:
    # Open input file in 'read' mode
    with open("ServerError.txt", "r") as in_file:
        # Loop over each log line
        for line in in_file:
            # If log line matches our regex, print to console, and output file
            if (line_regex.search(line)):

                # Get index of last space
                last_ndx = line.rfind(' ')
                # line[:23]: The time stamp (first 23 characters)
                # line[last_ndx:]: Last space and following characters

                # Match against the text after the last space, so that lines where
                # the pattern only appears earlier in the line are skipped;
                # the request ID must be the last field
                matchObj = line_regex.match(line[last_ndx+1:])
                # Only keep the line if the last field is a request ID
                if matchObj:
                    print(line[:23] + line[last_ndx:])
                    out_file.write(line[:23] + line[last_ndx:])
Comments
Richard MacCutchan 19-Jun-18 14:08pm    
I would read the data into some form of list or dictionary, using the key value as the key. You can then parse the date for each key and compare it with the value currently stored. If the new date is earlier than the existing one, replace the value in the entry. Then, when you have processed all the entries, you can print them.
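The dictionary approach described above could be sketched roughly as follows. This is only a minimal illustration, assuming each extracted line has the shape "date , key" with an ISO date (YYYY-MM-DD), as in the sample data from the question. The function name oldest_per_key and the variable names are made up for this example, and date.fromisoformat needs Python 3.7 or later.

```python
from datetime import date


def oldest_per_key(lines):
    """Return {key: earliest date} for lines shaped like '2017-03-18 , INBIOS_ABZ824'.

    oldest_per_key is a hypothetical helper name used only for this sketch.
    """
    oldest = {}
    for line in lines:
        # Skip blank or malformed lines that have no comma separator
        if "," not in line:
            continue
        date_text, key = (part.strip() for part in line.split(",", 1))
        try:
            current = date.fromisoformat(date_text)
        except ValueError:
            # Assumption: lines whose first field is not an ISO date are ignored
            continue
        # Store the date if the key is new or this date is earlier than the stored one
        if key not in oldest or current < oldest[key]:
            oldest[key] = current
    return oldest


# Sample data copied from the question
sample = [
    "2017-03-18 , INBIOS_ABZ824",
    "2017-03-19 , INBIOS_ABZ824",
    "2017-03-12 , INDROS_MSR656",
    "2017-03-17 , INDROS_MSR656",
    "2017-04-12 , INOS_GSN848",
    "2017-04-19 , INOS_GSN848",
]

result = oldest_per_key(sample)
for key, day in result.items():
    print(f"{day.isoformat()} , {key}")
```

With the sample data this prints one line per key, in the order each key was first seen, which matches the desired output in the question. To run it on a file instead, an open file object can be passed in place of the sample list, since the function only iterates over lines.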

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


