I have the following code that reads lines from a database and writes the matching results to a csv file. The problem I am running into is there are occasionally carriage returns / line feeds in some of the fields in different rows which cause the csv file to be unusable because there are bogus rows.
For example, here is a sample of what happens when there are carriage returns / line feeds in the SQL data and the affect it has on the file. ... Sample content of messed up file:
field1|field2|field3|field4|field5
value 1|value 2|value 3|value 4|value 5
value 1|value 2|value 3|value 4|value 5
value 1|value 2|val
ue 3|value 4|value 5
value 1|value 2|value 3|va
lue 4|value 5
Here is the code that writes the results of a SQL query to the output file. What I am trying to do is strip any results that have carriage returns / line feeds.
<pre lang="vb">'''
While loop to read each row. if compares row[2] (updated) against the last record processed
'''
latest = params
while row:
if row[2] > latest:
latest = row[2]
logger.debug("[%s] - Writing row %s", correlationId, row)
writer.writerow(row)
row = cursor.fetchone()
logger.info("[%s] - last letter date %s " % (correlationId, lastProcessed))
lastProcessedLog = open(LAST_PROCESSED_LOGFILE , 'wt')
lastProcessedString = str(latest)
lastProcessedString = lastProcessedString[0:19]
lastProcessedLog.write(lastProcessedString)
lastProcessedLog.close()
conn.close()
ofile.close()
logger.info("[%s] - Copying %s to root for loadBackflow as %s", correlationId, writeFile, outfile)
shutil.copyfile(writeFile, outfile)
logger.info("[%s] - Moving %s to completion folder %s", correlationId, writeFile, completionFolder)
shutil.move(writeFile, completionFolder)
I have tried changing the writer.write(row) line to include a replace but I get an error. Similarly, I get errors when trying to use replace with row = row.replace("\r\n", "") ... I have pasted my attempts and the corresponding errors below.
Any insights on how I can strip carriage returns / line feeds at the time they are being read from the SQL query results into the data file are much appreciated.
Thanks in advance! :)
writer.writerow(row).replace("\r\n", "")
Unexpected error: 'NoneType' object has no attribute 'replace'
row = row.replace("\r\n", "")
Unexpected error: 'tuple' object has no attribute 'replace'
row = row.replace("\r", "")
row = row.replace("\n", "")
Unexpected error: 'tuple' object has no attribute 'replace'
'''
While loop to read each row. if compares row[2] (updated) against the last record processed
'''
latest = params
def preprocess_item(item):
if isinstance(item, str):
return item.replace("\r\n", "")
return item
def preprocess_row(row):
return tuple(preprocess_item(item) for item in row)
while row:
if row[2] > latest:
latest = row[2]
logger.debug("[%s] - Writing row %s", correlationId, row)
row = cursor.fetchone()
row = preprocess_row(row)
writer.writerow(row)