Complete newbie here. I'm trying to parse a fairly long simulation output with Python into a pandas DataFrame and write it to an Excel sheet. I only want to parse certain entries, not the whole thing (see my code below).

The output I am trying to parse:

F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T,1
... 
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T,1
... and so on


How it should look in the end:

    | F100               | F200           |
----|--------------------|----------------| 
 T  | 557.9567856878748  | 100            |
 Tv | 557.9567856878748  | 5.550847203    |
 T1 |-106382.60618934222 | 3.798721561    |


... and so on.

What I have tried:

Python
import itertools
import pandas as pd


def read_lines(file_object) -> list:
    return [
        parse_line(line) for line in file_object.readlines() if line.strip()
    ]


def parse_line(line: str) -> list:
    return [
        i.split(",")[1]
        for i in line.strip().split()
        if i.startswith(("F100", "F200"))
    ]


def flatten(parsed_lines: list) -> list:
    return list(itertools.chain.from_iterable(parsed_lines))


def cut_into_pieces(flattened_lines: list, piece_size: int = 2) -> list:
    return [
        flattened_lines[i:i + piece_size] for i
        in range(0, len(flattened_lines), piece_size)
    ]


with open("sim.txt") as data:
    df = pd.DataFrame(
        cut_into_pieces(flatten(read_lines(data))),
        columns=["F100", "F200"],
    )
    print(df)
    df.to_excel("table.xlsx", index=False)


But it looks like this:

                    F100                 F200
1      557.9567856878748    100
2      557.9567856878748    5.550847203
3    -106382.60618934222    3.798721561
..                   ...                  ...


As you can see, the rows are not named (T, Tv, T1 etc.). I'm hitting a wall and don't know how to continue from here.
Thanks in advance
Posted
Updated 8-Nov-21 5:23am
Comments
Richard MacCutchan 8-Nov-21 4:13am    
It is impossible to see which parts are being missed in your code, as you have tried to parse with compound list comprehensions. You should split the parsing task into steps and check which field you have extracted at each step.
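One way to follow that advice, as a minimal sketch rather than a definitive fix: parse each line in its own small step into a (unit, variable, value) triple, check it, and only then reshape. This assumes every relevant line has the form `unit.variable,value,...` as in the sample output above. The helper names `parse_line` and `build_table` and the `SAMPLE` string are made up for illustration.

```python
import pandas as pd

# Sample lines copied from the question; a real run would read them from sim.txt.
SAMPLE = """\
F100.T,557.9567856878748,F,F,F,F,
F100.Tv,557.9567856878748,F,F,F,F,
F100.Tl,557.9567856878748,F,F,F,F,
F100.Duty,-106382.60618934222,F,F,F,T,1
...
F200.T,557.9567856878748,F,F,F,F,
F200.Tv,557.9567856878748,F,F,F,F,
F200.Tl,557.9567856878748,F,F,F,F,
F200.Duty,-37798.28473117316,F,F,F,T,1
"""


def parse_line(line: str):
    """Step 1: turn one line into (unit, variable, value), or None if it does not fit."""
    fields = line.strip().split(",")
    if len(fields) < 2 or "." not in fields[0]:
        return None  # blank lines, "..." separators, anything unexpected
    unit, variable = fields[0].split(".", 1)  # "F100.Tv" -> ("F100", "Tv")
    try:
        value = float(fields[1])
    except ValueError:
        return None  # second field is not a number
    return unit, variable, value


def build_table(lines, units=("F100", "F200")) -> pd.DataFrame:
    """Step 2: keep only the wanted units. Step 3: reshape to variables x units."""
    records = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None or parsed[0] not in units:
            continue
        records.append(parsed)  # a print(parsed) here shows what each step extracted

    long_df = pd.DataFrame(records, columns=["unit", "variable", "value"])
    # pivot raises ValueError if the same unit/variable pair occurs twice.
    table = long_df.pivot(index="variable", columns="unit", values="value")
    # pivot sorts the row labels; restore the order in which they first appeared.
    return table.reindex(long_df["variable"].unique())


table = build_table(SAMPLE.splitlines())
print(table)
```

Because the variable names become the index, writing the result with `table.to_excel("table.xlsx")` (without `index=False`) should keep the row labels in the sheet. For the real file, `build_table` can be given the open file object directly, since iterating over it yields lines.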

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


