Asked By – 384X21
I want to iterate over each line of an entire file. One way to do this is to read the entire file, save it to a list, and then go over the lines of interest. This method uses a lot of memory, so I am looking for an alternative.
My code so far:
for each_line in fileinput.input(input_file):
    do_something(each_line)

    for each_line_again in fileinput.input(input_file):
        do_something(each_line_again)
Executing this code gives an error message:
The purpose is to calculate pair-wise string similarity, meaning that for each line in the file, I want to calculate the Levenshtein distance to every other line.
Now we will see the solution for the issue: How to read a large file – line by line?
The correct, fully Pythonic way to read a file is the following:
with open(...) as f:
    for line in f:
        # Do something with 'line'
The with statement handles opening and closing the file, including if an exception is raised in the inner block. The for line in f loop treats the file object f as an iterable, which automatically uses buffered I/O and memory management so you don’t have to worry about large files.
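For the pair-wise comparison described in the question, one possible sketch is to nest two independent open() calls, so that each loop has its own file object and its own read position. The helper name iter_line_pairs and the UTF-8 encoding are assumptions for illustration, not part of the original question or answer. Because Levenshtein distance is symmetric, the sketch yields each pair only once (i < j).

```python
def iter_line_pairs(path):
    """Yield (i, j, line_i, line_j) for every pair of lines with i < j.

    Hypothetical helper: only two lines are held in memory at a time,
    at the cost of re-reading the file once per outer line.
    """
    with open(path, encoding="utf-8") as outer:
        for i, line_i in enumerate(outer):
            # A second, independent file object: it has its own position,
            # so it does not disturb the outer iteration.
            with open(path, encoding="utf-8") as inner:
                for j, line_j in enumerate(inner):
                    if j > i:
                        yield i, j, line_i.rstrip("\n"), line_j.rstrip("\n")
```

Each yielded pair can then be passed to whatever distance function is in use. Note that this trades memory for I/O: the file is read roughly once for every line it contains, so the running time grows quadratically with the number of lines.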
There should be one — and preferably only one — obvious way to do it.