17.5. Alternative File Reading Methods¶
Again, recall the contents of the qbdata.txt file.
Colt McCoy QB CLE 135 222 1576 6 9 60.8% 74.5 Josh Freeman QB TB 291 474 3451 25 6 61.4% 95.9 Michael Vick QB PHI 233 372 3018 21 6 62.6% 100.2 Matt Schaub QB HOU 365 574 4370 24 12 63.6% 92.0 Philip Rivers QB SD 357 541 4710 30 13 66.0% 101.8 Matt Hasselbeck QB SEA 266 444 3001 12 17 59.9% 73.2 Jimmy Clausen QB CAR 157 299 1558 3 9 52.5% 58.4 Joe Flacco QB BAL 306 489 3622 25 10 62.6% 93.6 Kyle Orton QB DEN 293 498 3653 20 9 58.8% 87.5 Jason Campbell QB OAK 194 329 2387 13 8 59.0% 84.5 Peyton Manning QB IND 450 679 4700 33 17 66.3% 91.9 Drew Brees QB NO 448 658 4620 33 22 68.1% 90.9 Matt Ryan QB ATL 357 571 3705 28 9 62.5% 91.0 Matt Cassel QB KC 262 450 3116 27 7 58.2% 93.0 Mark Sanchez QB NYJ 278 507 3291 17 13 54.8% 75.3 Brett Favre QB MIN 217 358 2509 11 19 60.6% 69.9 David Garrard QB JAC 236 366 2734 23 15 64.5% 90.8 Eli Manning QB NYG 339 539 4002 31 25 62.9% 85.3 Carson Palmer QB CIN 362 586 3970 26 20 61.8% 82.4 Alex Smith QB SF 204 342 2370 14 10 59.6% 82.1 Chad Henne QB MIA 301 490 3301 15 19 61.4% 75.4 Tony Romo QB DAL 148 213 1605 11 7 69.5% 94.9 Jay Cutler QB CHI 261 432 3274 23 16 60.4% 86.3 Jon Kitna QB DAL 209 318 2365 16 12 65.7% 88.9 Tom Brady QB NE 324 492 3900 36 4 65.9% 111.0 Ben Roethlisberger QB PIT 240 389 3200 17 5 61.7% 97.0 Kerry Collins QB TEN 160 278 1823 14 8 57.6% 82.2 Derek Anderson QB ARI 169 327 2065 7 10 51.7% 65.9 Ryan Fitzpatrick QB BUF 255 441 3000 23 15 57.8% 81.8 Donovan McNabb QB WAS 275 472 3377 14 15 58.3% 77.1 Kevin Kolb QB PHI 115 189 1197 7 7 60.8% 76.1 Aaron Rodgers QB GB 312 475 3922 28 11 65.7% 101.2 Sam Bradford QB STL 354 590 3512 18 15 60.0% 76.5 Shaun Hill QB DET 257 416 2686 16 12 61.8% 81.3
In addition to the for
loop, Python provides three methods to read data
from the input file. The readline
method reads one line from the file and
returns it as a string. The string returned by readline
will contain the
newline character at the end. This method returns the empty string when it
reaches the end of the file. The readlines
method returns the contents of
the entire file as a list of strings, where each item in the list represents
one line of the file. It is also possible to read the entire file into a
single string with read
. Table 2 summarizes these methods
and the following session shows them in action.
Note that we need to reopen the file before each read so that we start from
the beginning. Each file has a marker that denotes the current read position
in the file. Any time one of the read methods is called the marker is moved to
the character immediately following the last character returned. In the case
of readline
this moves the marker to the first character of the next line
in the file. In the case of read
or readlines
the marker is moved to
the end of the file.
>>> infile = open("qbdata.txt", "r")
>>> aline = infile.readline()
>>> aline
'Colt McCoy QB, CLE\t135\t222\t1576\t6\t9\t60.8%\t74.5\n'
>>>
>>> infile = open("qbdata.txt", "r")
>>> linelist = infile.readlines()
>>> print(len(linelist))
34
>>> print(linelist[0:4])
['Colt McCoy QB CLE\t135\t222\t1576\t6\t9\t60.8%\t74.5\n',
'Josh Freeman QB TB\t291\t474\t3451\t25\t6\t61.4%\t95.9\n',
'Michael Vick QB PHI\t233\t372\t3018\t21\t6\t62.6%\t100.2\n',
'Matt Schaub QB HOU\t365\t574\t4370\t24\t12\t63.6%\t92.0\n']
>>>
>>> infile = open("qbdata.txt", "r")
>>> filestring = infile.read()
>>> print(len(filestring))
1708
>>> print(filestring[:256])
Colt McCoy QB CLE 135 222 1576 6 9 60.8% 74.5
Josh Freeman QB TB 291 474 3451 25 6 61.4% 95.9
Michael Vick QB PHI 233 372 3018 21 6 62.6% 100.2
Matt Schaub QB HOU 365 574 4370 24 12 63.6% 92.0
Philip Rivers QB SD 357 541 4710 30 13 66.0% 101.8
Matt Ha
>>>
Method Name | Use | Explanation |
---|---|---|
write |
filevar.write(astring) |
Add astring to the end of the file. filevar must refer to a file that has been opened for writing. |
read(n) |
filevar.read() |
Reads and returns a string of n
characters, or the entire file as a
single string if n is not provided. |
readline(n) |
filevar.readline() |
Returns the next line of the file with
all text up to and including the
newline character. If n is provided as
a parameter than only n characters
will be returned if the line is longer
than n . |
readlines(n) |
filevar.readlines() |
Returns a list of strings, each representing a single line of the file. If n is not provided then all lines of the file are returned. If n is provided then n characters are read but n is rounded up so that an entire line is returned. |
Now let’s look at another method of reading our file using a while
loop. This is important because many other programming languages do not support the for
loop style for reading files but they do support the pattern we’ll show you here.
There are several important things to notice in this code:
On line 2 we have the statement line = infile.readline()
.
We call this initial read the priming read.
It is very important because the while condition needs to have a value for the line
variable.
The readline
method will return the
empty string if there is no more data in the file.
An empty string is an empty sequence of characters.
When Python is looking for a Boolean condition, as in while line:
,
it treats an empty sequence type as False
, and a non-empty sequence as True
.
Remember that a
blank line in the file actually has a single character, the \n
character (newline).
So, the only way that a line of data from the
file can be empty is if you are reading at the end of the file, and the while
condition becomes False
.
Finally, notice that the last line of the body of the while
loop performs another readline
. This statement will reassign the variable line
to the next line of the file. It represents the change of state that is necessary for the iteration to
function correctly. Without it, there would be an infinite loop processing the same line of data over and over.