I wanted to combine every three lines in a text file into one line. It worked; however, all the new lines come out in paragraph form, not as a list, line by line. I wanted to change this:

Python
T	2009-06-26 16:20:35
U	http://twitter.com/mujiang
W	No Post Title


to make it look like this:

Python
2009-06-26 16:20:35 http://twitter.com/mujiang No Post Title


I found that the only way to do that is by writing the output to a separate file, which is not ideal, but it's the only way I know.

I wrote this code:
with open('tweets2009-10.txt', "r") as infile:
    for line in infile:
        if 'Apple'in line or 'apple' in line or 'Obama' in line or 'obama' in line:
            fout = open('newdata.txt', 'w')

            line_order = {'T': 'U', 'U': 'W', 'W': 'T'}

            with open('tweets2009-10.txt') as fin:
              prev_head = None
              new_line = ""
              for line in fin:
                cur_head = line[0]
                if prev_head is None or cur_head == line_order.get(prev_head):
                  new_line += line.strip()[1:]
                  if cur_head == 'W':
                    new_line += "\n"
                    fout.write(new_line)
                    new_line = ""
                  else:
                    new_line += ","
                  prev_head = cur_head
                else:
                  pass 


However, the new file, newdata.txt, has nothing written to it. I ran the code for 5 minutes and the file is still empty.

I don't know what might be wrong in the code.

Any help, please?

What I have tried:

changing the method
Updated 27-May-18 14:22pm
Comments
Richard MacCutchan 24-May-18 4:09am    
There is obviously more text in your file than in your example. You need to look at the actual content and filter out the lines that are not needed.
Member 13647869 24-May-18 4:13am    
Let me try to explain further. Originally the file looks like this:
T 2009-06-25 21:15:29
U http://twitter.com/mujiang
W No Post Title

T 2009-06-26 16:20:35
U http://twitter.com/mujiang
W No Post Title

T 2009-06-26 21:04:20
U http://twitter.com/mujiang
W No Post Title

T 2009-06-27 17:50:59
U http://twitter.com/mujiang
W No Post Title

T 2009-06-27 17:56:09
U http://twitter.com/mujiang
W No Post Title

T 2009-06-28 00:00:10
U http://twitter.com/mujiang
W No Post Title



And so on. But because I need to do some sentiment analysis on this file, and every three lines are linked, I need to group them, so that every three lines become one line.
Richard MacCutchan 24-May-18 5:00am    
But your code is not doing that. It is just concatenating every line into one long string. You need to check each line to see what control character it starts with and process it accordingly. You also need to find out where all that extra text is coming from.
Richard MacCutchan 27-May-18 3:21am    
Why do you need separate files? All you need to do is to check the prefix character on each input line and start a new line when you read an item that starts with 'T'.
Member 13647869 27-May-18 5:31am    
But how can I combine the lines? Do I use .join?
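
On the .join question: a minimal sketch of the prefix-based approach suggested in the comment above, assuming every record starts with a line whose first token is 'T' and that blank lines only separate records. The function name `group_by_prefix` is made up for illustration; `" ".join(...)` is what merges the collected pieces into one line.

```python
def group_by_prefix(lines):
    """Yield one combined string per record; a record starts at a 'T' line."""
    parts = []
    for line in lines:
        tokens = line.split(None, 1)  # prefix, then the rest of the line
        if not tokens:
            continue  # blank separator line
        prefix = tokens[0]
        value = tokens[1].strip() if len(tokens) > 1 else ""
        # A 'T' line closes the previous record, if there is one.
        if prefix == "T" and parts:
            yield " ".join(parts)
            parts = []
        parts.append(value)
    # Emit the last record, which has no following 'T' line.
    if parts:
        yield " ".join(parts)
```

Because it is a generator, it can be fed an open file object directly, for example `for combined in group_by_prefix(infile): print(combined)`.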

1 solution

You opened the same file twice. The second open is inside a loop.
You have a condition. According to your data, only a U line or a W line can match your if condition.

Next issue:
What happens when your if condition matches? You open a file named newdata.txt with 'w'. In other words, every match creates an empty file, destroying the previous one. So even if there is more than one match, you will get only one output. But, according to your next commands, that is not going to happen either.
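
The truncating effect of mode 'w' can be seen in a small sketch. The record strings and the temporary path here are placeholders for illustration, not part of the original code.

```python
import os
import tempfile

records = ["first", "second", "third"]  # placeholder data
path = os.path.join(tempfile.mkdtemp(), "newdata.txt")

# Reopening with "w" inside the loop empties the file on every pass,
# so only the last record survives.
for record in records:
    fout = open(path, "w")
    fout.write(record + "\n")
    fout.close()

with open(path) as f:
    reopened_result = f.read()

# Opening once, before the loop, keeps every record.
with open(path, "w") as fout:
    for record in records:
        fout.write(record + "\n")

with open(path) as f:
    opened_once_result = f.read()

print(repr(reopened_result))
print(repr(opened_once_result))
```

Note also that the code in the question never closes fout, so whatever is written may stay in the buffer instead of reaching the file while the program is still running.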

Next:
You are reading tweets2009-10.txt as fin, but you are already reading the same file, tweets2009-10.txt, in the outer loop.

I will stop here.

Your whole logic is wrong.

The suggested logic would be:
open output to write
while(more lines in file) 
  read next 3 lines;
  if line 2 or line 3 contains preferred words; then
    format 3 lines as a single line;
    write single line to output file
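
The suggested logic could be sketched in Python roughly as follows. This is only one possible reading of it, assuming every non-blank line belongs to a complete T/U/W triple in that order. The name `combine_records` is made up for illustration, and the keywords are the ones from the question's if condition, matched case-insensitively.

```python
import io

# Keywords taken from the question's filter, lowercased for matching.
KEYWORDS = ("apple", "obama")


def combine_records(infile, outfile, keywords=KEYWORDS):
    """Write each T/U/W triple as one line when U or W mentions a keyword.

    Returns the number of lines written.
    """
    # Skip blank separator lines while streaming, so a large file is
    # never loaded into memory at once.
    non_blank = (line.strip() for line in infile if line.strip())
    written = 0
    # Passing the same iterator three times makes zip yield consecutive
    # triples; an incomplete trailing group is dropped.
    for triple in zip(non_blank, non_blank, non_blank):
        # Drop the one-letter prefix and the whitespace that follows it.
        fields = [line[1:].strip() for line in triple]
        searchable = " ".join(fields[1:]).lower()
        if any(word in searchable for word in keywords):
            outfile.write(" ".join(fields) + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # The second record is made up for illustration so that one line matches.
    sample = io.StringIO(
        "T\t2009-06-26 16:20:35\n"
        "U\thttp://twitter.com/mujiang\n"
        "W\tNo Post Title\n"
        "\n"
        "T\t2009-06-27 17:50:59\n"
        "U\thttp://twitter.com/example_user\n"
        "W\tObama speaks today\n"
    )
    result = io.StringIO()
    combine_records(sample, result)
    print(result.getvalue(), end="")
```

With real files, both would be opened once, outside any loop: `with open('tweets2009-10.txt') as fin, open('newdata.txt', 'w') as fout: combine_records(fin, fout)`.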
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


