Read Only a Part of a Text File Python
The Best Practice of Reading Text Files in Python
Combine multiple files into a single stream with richer metadata
Reading text files in Python is relatively easy compared with most other programming languages. Commonly, we just use the open() function in read or write mode and then loop over the text file line by line.
This is already the best practice, and it hardly gets any easier. Nevertheless, when we want to read content from multiple files, there is definitely a better way: the fileinput module that is built into Python. It combines the content of multiple files so that we can process everything in a single for-loop, and it offers plenty of other benefits.
In this article, I'll demonstrate this module with examples.
0. Without the FileInput Module
Let's have a look at the "ordinary" way of reading multiple text files using the open() function. But before that, we need to create two sample files for demonstration purposes.
with open('my_file1.txt', mode='w') as f:
    f.write('This is line 1-1\n')
    f.write('This is line 1-2\n')

with open('my_file2.txt', mode='w') as f:
    f.write('This is line 2-1\n')
    f.write('This is line 2-2\n')
In the above code, we open a file with the mode w, which means "write". Then, we write two lines to the file. Please note that we need to add the newline character \n. Otherwise, the two sentences will be written on a single line.
After that, we should have 2 text files in the current working directory.
Now, let's say we want to read from both text files and print the content line by line. Of course, we can still do that using the open() function.
# Iterate through all files
for file in ['my_file1.txt', 'my_file2.txt']:
    with open(file, 'r') as f:
        for line in f:
            print(line)
Here we have to use two nested for-loops. The outer loop is for the files, while the inner one is for the lines within each file.
1. Using the FileInput Module
Well, nothing prevents us from using the open() function. However, the fileinput module simply provides us with a neater way of reading multiple text files into a single stream.
First of all, we need to import the module. This is a Python built-in module, so we don't need to download anything.
import fileinput as fi
Then, we can use it for reading from the two files.
with fi.input(files=['my_file1.txt', 'my_file2.txt']) as f:
    for line in f:
        print(line)
Because the fileinput module is designed for reading from multiple files, we don't need to loop over the file names anymore. Instead, the input() function takes an iterable collection type such as a list as a parameter. Also, the great thing is that all the lines from both files are accessible in a single for-loop.
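One detail worth noting: every line that comes out of the stream still ends with its newline character, so print(line) leaves a blank line between rows. Here is a minimal sketch of one way to handle that. The temporary directory, the sample file names and the `lines` list are only assumptions for illustration.

```python
import fileinput as fi
import os
import tempfile

# Create two throwaway sample files (hypothetical names) in a temp directory.
tmp_dir = tempfile.mkdtemp()
paths = []
for n in (1, 2):
    path = os.path.join(tmp_dir, f'sample{n}.txt')
    with open(path, mode='w') as f:
        f.write(f'This is line {n}-1\n')
        f.write(f'This is line {n}-2\n')
    paths.append(path)

lines = []
# files= accepts any iterable of names; a tuple works as well as a list.
with fi.input(files=tuple(paths)) as f:
    for line in f:
        # Strip the trailing newline so print() does not add a blank line.
        clean = line.rstrip('\n')
        lines.append(clean)
        print(clean)
```

Alternatively, print(line, end='') keeps the original newline and stops print() from adding a second one.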
2. Use the FileInput Module with Glob
Sometimes, it may not be practical to maintain such a list with all the file names typed manually. It is quite common to read all the files from a directory. Besides, we might be interested only in certain types of files.
In this case, we can use the glob module, which is another Python built-in module, together with the fileinput module.
We can do a simple experiment before that. The os module can help us to list all the files in the current working directory.
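The listing step might look like the sketch below. It uses os.listdir() with no argument, which lists the current working directory. The variable names are my own, and the suffix filter is only a rough manual equivalent of what glob does for us.

```python
import os

# List every entry (files and sub-directories) in the current working directory.
entries = os.listdir()
print(entries)

# A manual filter on the .txt suffix, shown only for comparison with glob.
txt_files = [name for name in entries if name.endswith('.txt')]
print(txt_files)
```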
The listing will usually contain many files other than the two text files. Therefore, we want to filter the file names because we want to read the text files only. We can use the glob module as follows.
from glob import glob
glob('*.txt')
Now, we can pass the result of the glob() function to the fileinput.input() function as the parameter. Then, only these two text files will be read.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        print(line)
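The glob() function makes no promise about the order of its results, so if the order of the combined stream matters, sorting the matches first is a simple safeguard. The sketch below assumes a temporary directory with made-up file names. It is one possible approach, not the only one.

```python
import fileinput as fi
import os
import tempfile
from glob import glob

# Build a small sandbox directory with two text files and one unrelated file.
tmp_dir = tempfile.mkdtemp()
for name in ('b.txt', 'a.txt', 'notes.md'):
    with open(os.path.join(tmp_dir, name), mode='w') as f:
        f.write(f'content of {name}\n')

# Sort the matches so the files are always read in the same order.
txt_paths = sorted(glob(os.path.join(tmp_dir, '*.txt')))

collected = []
with fi.input(files=txt_paths) as f:
    for line in f:
        collected.append(line.rstrip('\n'))
print(collected)
```

Also be aware that if the pattern matches nothing, fileinput receives an empty list and falls back to reading from standard input, so it may be worth checking that the list is not empty before opening the stream.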
3. Get the Metadata of Files
You may ask how we can know exactly which file a line is from when we are reading from a stream that actually combines multiple files.
Indeed, using the open() function with nested loops seems to make it very easy to get such information, because we can access the current file name from the outer loop. Nonetheless, this is in fact much easier with the fileinput module.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        print(f'File Name: {f.filename()} | Line No: {f.lineno()} | {line}')
See, in the above code, we use filename() to access the current file that the line comes from and lineno() to access the cumulative number of the line we have just read, counted across the whole stream.
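Because lineno() keeps counting across files, the module also offers filelineno(), which restarts at 1 for each file. A small sketch to compare the two follows. The sample files and the `records` list are assumptions made for the demonstration.

```python
import fileinput as fi
import os
import tempfile

# Two throwaway files (hypothetical names) with two lines each.
tmp_dir = tempfile.mkdtemp()
paths = []
for n in (1, 2):
    path = os.path.join(tmp_dir, f'sample{n}.txt')
    with open(path, mode='w') as f:
        f.write('first\n')
        f.write('second\n')
    paths.append(path)

records = []
with fi.input(files=paths) as f:
    for line in f:
        # lineno() counts across the whole stream; filelineno() restarts per file.
        records.append((os.path.basename(f.filename()), f.lineno(), f.filelineno()))

for record in records:
    print(record)
```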
4. When the Cursor Reaches a New File
Apart from that, there are more functions in the fileinput module that we can make use of. For example, what if we want to do something when we reach a new file?
The function isfirstline() helps us to determine whether we're reading the first line of a new file.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        if f.isfirstline():
            print(f'> Start to read {f.filename()}...')
        print(line)
This could be very useful for logging purposes, since it lets us indicate the current progress.
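Logging is not the only use. As an illustration, isfirstline() can also be used to skip a header row that is repeated at the top of every file. The CSV-like content, the file names and the `rows` list below are all invented for this sketch.

```python
import fileinput as fi
import os
import tempfile

# Two small files (hypothetical names) that each start with the same header row.
tmp_dir = tempfile.mkdtemp()
contents = {
    'part1.csv': 'name,score\nalice,1\n',
    'part2.csv': 'name,score\nbob,2\n',
}
paths = []
for name, text in contents.items():
    path = os.path.join(tmp_dir, name)
    with open(path, mode='w') as f:
        f.write(text)
    paths.append(path)

rows = []
with fi.input(files=paths) as f:
    for line in f:
        if f.isfirstline():
            # The first line of each file is the header, so skip it.
            continue
        rows.append(line.rstrip('\n'))
print(rows)
```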
5. Jump to the Next File
We can also easily stop reading the current file and jump to the next one. The function nextfile() allows us to do so.
Before we can demo this feature, please let me rewrite the two sample files.
with open('my_file1.txt', mode='w') as f:
    f.write('This is line 1-1\n')
    f.write('stop reading\n')
    f.write('This is line 1-2\n')

with open('my_file2.txt', mode='w') as f:
    f.write('This is line 2-1\n')
    f.write('This is line 2-2\n')
The only difference from the original files is that I added a line of text, stop reading, to the first text file. Let's say that we want the fileinput module to stop reading the first file and jump to the second when it sees such content.
with fi.input(files=glob('*.txt')) as f:
    for line in f:
        if f.isfirstline():
            print(f'> Start to read {f.filename()}...')
        if line == 'stop reading\n':
            f.nextfile()
        else:
            print(line)
In the above code, another if-condition is added. When the line text is stop reading, it will jump to the next file. Therefore, we can see that the line "1-2" was not read and output.
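The same function can be used to read only the first few lines of every file. The sketch below combines nextfile() with filelineno(). The MAX_LINES limit, the file names and the `head` list are assumptions of mine.

```python
import fileinput as fi
import os
import tempfile

MAX_LINES = 2  # hypothetical limit: how many lines to keep from each file

# Two throwaway files (hypothetical names) with four lines each.
tmp_dir = tempfile.mkdtemp()
paths = []
for n in (1, 2):
    path = os.path.join(tmp_dir, f'sample{n}.txt')
    with open(path, mode='w') as f:
        for i in range(1, 5):
            f.write(f'This is line {n}-{i}\n')
    paths.append(path)

head = []
with fi.input(files=paths) as f:
    for line in f:
        head.append(line.rstrip('\n'))
        # Once enough lines have been taken from this file, move to the next one.
        if f.filelineno() >= MAX_LINES:
            f.nextfile()
print(head)
```

Lines that are skipped this way are not counted by lineno(), so the cumulative line number only reflects what was actually read.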
6. Read Compressed Files Without Extracting
Sometimes we may have compressed files to read. Usually, we have to uncompress them before we can read the content. However, with the fileinput module, we may not have to extract the content from the compressed files before we can read it.
Let's make up a compressed text file using gzip. This file will be used for demonstration purposes later.
import gzip
import shutil

with open('my_file1.txt', 'rb') as f_in:
    with gzip.open('my_file.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
In the above code, we added the file my_file1.txt into a compressed file using gzip. Now, let's see how fileinput can read it without extra steps for uncompressing.
with fi.input(files='my_file.gz', openhook=fi.hook_compressed) as f:
    for line in f:
        print(line)
By using the parameter openhook and the flag fi.hook_compressed, the gzip file will be uncompressed on the fly.
The fileinput module currently supports gzip and bzip2, which it recognises by the .gz and .bz2 file extensions. Unfortunately, other formats are not supported.
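The hook can also handle a mix of compressed and plain files in the same stream. Depending on the Python version, the lines from a compressed file may come back as bytes rather than str, so the sketch below opens everything in binary mode and decodes explicitly to avoid the difference. The file names, the UTF-8 encoding and the `decoded` list are assumptions for illustration.

```python
import bz2
import fileinput as fi
import gzip
import os
import tempfile

# Create one gzip file, one bzip2 file and one plain file (hypothetical names).
tmp_dir = tempfile.mkdtemp()
gz_path = os.path.join(tmp_dir, 'sample.gz')
bz2_path = os.path.join(tmp_dir, 'sample.bz2')
txt_path = os.path.join(tmp_dir, 'sample.txt')

with gzip.open(gz_path, 'wb') as f:
    f.write(b'from gzip\n')
with bz2.open(bz2_path, 'wb') as f:
    f.write(b'from bzip2\n')
with open(txt_path, 'wb') as f:
    f.write(b'from plain text\n')

decoded = []
# mode='rb' makes every file yield bytes, whatever its compression.
with fi.input(files=[gz_path, bz2_path, txt_path], mode='rb',
              openhook=fi.hook_compressed) as f:
    for raw_line in f:
        # Decode explicitly; UTF-8 is assumed here.
        decoded.append(raw_line.decode('utf-8').rstrip('\n'))
print(decoded)
```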
Summary
In this article, I have introduced the Python built-in module fileinput and how to use it to read multiple text files. Of course, it will never replace the open() function, but in terms of reading multiple files into a single stream, I believe it is the best practice.
Source: https://towardsdatascience.com/the-best-practice-of-reading-text-files-in-python-509b1d4f5a4