CSV Python empty row

How to skip blank lines while reading a CSV file using Python

Why do you open your file in binary mode? Regardless of that, you should iterate over the lines in your data variable. The assignment to empty_lines is incorrect without declaring it first, and you have a typo in the for statement: an extra closing parenthesis.

Janne, yes, you are right, it does not print anything, but I actually tried to print individual cells; that's why I stored them in an array.

@Birei: "Why do you open your file in binary mode?" Because that’s the right portable way to open files to pass to csv.reader in Python 2, as mentioned in the docs.

You may use a filter or generator approach similar to this answer: stackoverflow.com/a/14158869/1317713. I recommend refactoring some of the logic into def is_empty_line(line): ... for readability, say if you want to skip a line containing only whitespace. It can also be a good idea to skip both comments and empty lines, which is more reason to refactor into separate functions.

8 Answers

If you want to skip all whitespace lines, you should use this test: ' '.isspace().

Since you may want to do something more complicated than just printing the non-blank lines to the console (no need to use the csv module for that), here is an example that involves a DictReader:
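A minimal sketch of that test in Python 3, assuming the CSV text is available as an iterable of lines. An in-memory io.StringIO stands in for a real file here, and the helper name skip_blank_lines and the sample data are made up for illustration:

```python
import csv
import io


def skip_blank_lines(lines):
    """Yield only lines that contain something other than whitespace."""
    for line in lines:
        # Empty strings and whitespace-only lines such as '   \n' are dropped.
        if line and not line.isspace():
            yield line


# Hypothetical sample data: two blank lines mixed in with three records.
sample = io.StringIO("a,b,c\n\n1,2,3\n   \n4,5,6\n")
rows = list(csv.reader(skip_blank_lines(sample)))
print(rows)
```

Because csv.reader accepts any iterable of strings, the generator can be passed to it directly without building a list of lines first.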

    #!/usr/bin/env python
    # Tested with Python 2.7

    # I prefer this style of importing - hides the csv module
    # in case you do from this_file.py import * inside of __init__.py
    import csv as _csv


    # Real comments are more complicated ...
    def is_comment(line):
        return line.startswith('#')


    # Kind of silly wrapper
    def is_whitespace(line):
        return line.isspace()


    def iter_filtered(in_file, *filters):
        for line in in_file:
            if not any(fltr(line) for fltr in filters):
                yield line


    # A disadvantage of this approach is that it requires storing rows in RAM
    # However, the largest CSV files I worked with were all under 100 MB
    def read_and_filter_csv(csv_path, *filters):
        with open(csv_path, 'rb') as fin:
            iter_clean_lines = iter_filtered(fin, *filters)
            reader = _csv.DictReader(iter_clean_lines, delimiter=';')
            return [row for row in reader]


    # Stores all processed lines in RAM
    def main_v1(csv_path):
        for row in read_and_filter_csv(csv_path, is_comment, is_whitespace):
            print(row)  # Or do something else with it


    # Simpler, less refactored version, does not use with
    def main_v2(csv_path):
        # Open before the try block so fin is always defined in finally
        fin = open(csv_path, 'rb')
        try:
            reader = _csv.DictReader((line for line in fin if not
                                      line.startswith('#') and not line.isspace()),
                                     delimiter=';')
            for row in reader:
                print(row)  # Or do something else with it
        finally:
            fin.close()


    if __name__ == '__main__':
        # Raw string so the backslashes in the Windows path are kept literally
        csv_path = r"C:\Users\BKA4ABT\Desktop\Test_Specification\RDBI.csv"
        main_v1(csv_path)
        print('\n'*3)
        main_v2(csv_path)


CSV file written with Python has blank lines between each row

This code reads thefile.csv , makes changes, and writes results to thefile_subset1 . However, when I open the resulting csv in Microsoft Excel, there is an extra blank line after each record! Is there a way to make it not put an extra blank line?

Wouldn’t setting lineterminator='\n' as the default parameter when initializing csv.writer solve the problem? Does somebody want to do a Python 3.10 PR for this?

11 Answers

The csv.writer object controls line endings itself and writes \r\n into the file directly. In Python 3 the file must be opened in untranslated text mode with the parameters 'w', newline='' (empty string), or it will write \r\r\n on Windows, where the default text mode translates each \n into \r\n.

    #!python3
    import csv

    with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
        writer = csv.writer(outfile)

If using pathlib.Path:

    from pathlib import Path
    import csv

    with Path('/pythonwork/thefile_subset11.csv').open('w', newline='') as outfile:
        writer = csv.writer(outfile)
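The doubled carriage return can be reproduced on any platform, which may help when debugging. The sketch below assumes Python 3 and passes newline='\r\n' to imitate the translation that Windows text mode applies by default. The helper name written_bytes and the temporary file are illustrative only:

```python
import csv
import os
import tempfile


def written_bytes(newline):
    """Write one CSV row with the given newline setting and return the raw bytes."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", newline=newline) as outfile:
            csv.writer(outfile).writerow([1, 2, 3])
        with open(path, "rb") as infile:
            return infile.read()
    finally:
        os.remove(path)


# newline='\r\n' translates every '\n' to '\r\n', like default text mode on Windows.
translated = written_bytes("\r\n")
# newline='' turns translation off, so the writer's own '\r\n' is stored unchanged.
untranslated = written_bytes("")

print(translated)    # b'1,2,3\r\r\n'
print(untranslated)  # b'1,2,3\r\n'
```

Excel displays the stray \r in the first result as an extra blank row, which matches the symptom described in the question.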

If using io.StringIO to build an in-memory result, the result string will contain the \r\n line terminator written by csv.writer:

    from io import StringIO
    import csv

    s = StringIO()
    writer = csv.writer(s)
    writer.writerow([1,2,3])
    print(repr(s.getvalue()))  # '1,2,3\r\n'   (Windows-style line terminator)

If writing that string to a file later, remember to use newline='':

    # built-in open()
    with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as f:
        f.write(s.getvalue())

    # Path's open()
    with Path('/pythonwork/thefile_subset11.csv').open('w', newline='') as f:
        f.write(s.getvalue())

    # Path's write_text() gained the newline parameter in Python 3.10.
    Path('/pythonwork/thefile_subset11.csv').write_text(s.getvalue(), newline='')

In Python 2, use binary mode to open outfile with mode 'wb' instead of 'w' to prevent Windows newline translation. Python 2 also has problems with Unicode and requires other workarounds to write non-ASCII text. See the Python 2 link below and the UnicodeReader and UnicodeWriter examples at the end of the page if you have to deal with writing Unicode strings to CSVs on Python 2, or look into the 3rd party unicodecsv module:

    #!python2
    import csv

    with open('/pythonwork/thefile_subset11.csv', 'wb') as outfile:
        writer = csv.writer(outfile)

Anyway, @Mark Tolonen’s answer resolved many questions related to the extra line(s) added when saving a standard text file (no csv used).

For compatibility between 2.6/2.7 and 3, you can use io.open with the newline argument. If you’re still writing 2.x code, that seems like a better choice anyway since it’s forward compatible.

@jpmc26 Normally that’s good advice, but the csv module doesn’t work properly with io.open . There is a unicodecsv 3rd party module for Python 2.7 that works better.

My ultimate point is that if you use csv with pathlib.Path instead of open, the current answer results in \r\r\n newlines, even if you pass newline='' to the StringIO, and the solution is nonobvious. Now people can read these comments, find an answer, and learn more about the nuance. Overriding lineterminator works, though it overrides the flavor settings, goes against csv's encoding intentions, and muddies encoding across modules. Strangely, csv.writer() in Python 3 does not work with BytesIO, which I would expect it to, since it uses \r\n line endings by default.

Opening the file in binary mode 'wb' will not work in Python 3+. Or rather, you’d have to convert your data to binary before writing it. That’s just a hassle.

Instead, you should keep it in text mode, but override the newline as empty. Like so:

    with open('/pythonwork/thefile_subset11.csv', 'w', newline='') as outfile:
        writer = csv.writer(outfile)

Note: It seems this is not the preferred solution because of how the extra line was being added on a Windows system. As stated in the Python 2 documentation for the csv module:

If csvfile is a file object, it must be opened with the 'b' flag on platforms where that makes a difference.

Windows is one such platform where that makes a difference. While changing the line terminator as I described below may have fixed the problem, the problem could be avoided altogether by opening the file in binary mode. One might say this solution is more "elegant". "Fiddling" with the line terminator would likely have resulted in code that is not portable between systems, whereas opening a file in binary mode on a Unix system has no effect, i.e. it results in cross-system compatible code.

On Windows, 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows makes a distinction between text and binary files; the end-of-line characters in text files are automatically altered slightly when data is read or written. This behind-the-scenes modification to file data is fine for ASCII text files, but it’ll corrupt binary data like that in JPEG or EXE files. Be very careful to use binary mode when reading and writing such files. On Unix, it doesn’t hurt to append a 'b' to the mode, so you can use it platform-independently for all binary files.

As part of the optional parameters for csv.writer, if you are getting extra blank lines you may have to change the lineterminator (info here). The example below is adapted from the Python csv docs. Change it from '\n' to whatever it should be. As this is just a stab in the dark at the problem, this may or may not work, but it’s my best guess.

    >>> import csv
    >>> spamWriter = csv.writer(open('eggs.csv', 'w'), lineterminator='\n')
    >>> spamWriter.writerow(['Spam'] * 5 + ['Baked Beans'])
    >>> spamWriter.writerow(['Spam', 'Lovely Spam', 'Wonderful Spam'])
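To see what lineterminator actually changes, here is a small Python 3 sketch that writes to an in-memory buffer so the terminators are visible. The buffer and variable names are assumptions made for illustration; the rows echo the example above:

```python
import csv
import io

# Default dialect: each row ends with '\r\n'.
default_buffer = io.StringIO()
csv.writer(default_buffer).writerow(["Spam", "Lovely Spam"])
default_text = default_buffer.getvalue()

# Overriding lineterminator: each row ends with '\n' only.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Spam"] * 2 + ["Baked Beans"])
writer.writerow(["Spam", "Lovely Spam"])
text = buffer.getvalue()

print(repr(default_text))
print(repr(text))
```

With '\n' as the terminator, a file opened in default text mode on Windows would have each '\n' translated to a single '\r\n', so no doubled carriage return appears. Opening the file with newline='' and keeping the default terminator gives the same bytes without overriding the dialect.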


How to ignore blank rows in a csv file

I’m using DictReader to open some CSV files, adding them to one big list of dictionaries, and then using DictWriter to write the list of dictionaries out to one CSV file. The problem I’m having is that the resulting CSV file has a bunch of blank rows between rows with data. I guess when the CSV files are being read, it’s not ignoring blank rows. Could someone please point me in the right direction to find out how to ignore the blank rows? I’ve tried finding this in the csv module but no joy. Any help would be much appreciated.

Hi! Thanks for replying! What I want is for DictReader to read rows if there is anything of interest in them, and to ignore a row only if it’s totally blank. Eg if I had

    for dictionary in csv.DictReader(open(filename)):
        if any(x != '' for x in dictionary.itervalues()):
            pass  # process the non-blank row here
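The snippet above is Python 2 (itervalues() no longer exists in Python 3). A rough Python 3 equivalent might look like the following; the helper name has_data and the sample data are hypothetical, and the sketch assumes no row has more fields than the header:

```python
import csv
import io


def has_data(row):
    """Return True if at least one cell in the row dict holds non-whitespace text."""
    return any(value.strip() for value in row.values() if isinstance(value, str))


# Hypothetical sample: the ',,' line parses to a row whose cells are all empty.
sample = io.StringIO("name,qty,note\napple,3,\n,,\npear,,ripe\n")
rows = [row for row in csv.DictReader(sample) if has_data(row)]
print([row["name"] for row in rows])
```

Rows with some empty cells, such as the pear row, are kept; only the row in which every cell is empty is dropped.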

3 Answers

You can read a fake file object that skips the blank lines in the real file. I’m not familiar with exactly what you’re doing, but this will work better than mac’s answer if the blank lines are making your reading process crash, or you really don’t want the blank lines ever in there.

    class BlankLineSkipper(object):
        def __init__(self, file):
            self.file = file

        def __iter__(self):
            return (line for line in self.file if line.strip())

        def read(self):
            return ''.join(self)

    >>> print open('lol.csv').read()
    5,7,8

    1,2,3

    abc,lol,haha

    >>> list(csv.reader(open('lol.csv')))
    [['5', '7', '8'], [], ['1', '2', '3'], [], ['abc', 'lol', 'haha'], []]
    >>> list(csv.reader(BlankLineSkipper(open('lol.csv'))))
    [['5', '7', '8'], ['1', '2', '3'], ['abc', 'lol', 'haha']]

(You might need to implement readline() or something else to make your code work, depending on how it uses the file object.)
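One caveat worth checking against your own data: dropping blank lines before parsing also removes blank lines that sit inside a quoted field. Since csv.reader returns an empty list for a blank line (as the session above shows), filtering the parsed rows avoids that. A Python 3 sketch with made-up sample data:

```python
import csv
import io

# Hypothetical data: one blank line between records, one inside a quoted field.
data = 'id,comment\n\n1,"first\n\nsecond"\n'

# Filtering lines before parsing also removes the blank line inside the quotes.
pre_filtered = list(csv.reader(line for line in io.StringIO(data) if line.strip()))

# Filtering rows after parsing drops only the empty record.
post_filtered = [row for row in csv.reader(io.StringIO(data)) if row]

print(pre_filtered)
print(post_filtered)
```

If the files never contain multi-line quoted fields, both approaches give the same rows and the choice is a matter of taste.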
