Tuesday, October 20, 2009

Common File Methods






The Python file object supports the methods shown in the following list:



  • read()
    read data from the file

  • readline()
    read one line of data from the file

  • readlines()
    read all the lines in the file and return them as a list of strings

  • write()
    write data to the file

  • writelines()
    write a sequence of strings (for example, a list of lines) to the file

  • seek()
    move to a certain point in the file

  • tell()
    determine the current location in the file




This is only a partial list. You can read about other file object methods in the Python documentation.
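Since writelines() does not get its own example later in this section, here is a minimal sketch of it paired with readlines(). It assumes Python 3 syntax rather than the older interpreter used in the sessions in this section, and it writes to a temporary file (the name demo.txt is made up for illustration) instead of c:\dat.

```python
import os
import tempfile

# Hypothetical scratch file; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

lines_out = ["line1\n", "line2\n", "line3\n"]

f = open(path, "w")
f.writelines(lines_out)   # writelines() does not add newlines for us
f.close()

f = open(path, "r")
lines_in = f.readlines()  # returns a list of strings, newlines included
f.close()

print(lines_in)
```

Note that each string already ends in "\n"; writelines() writes the strings exactly as given.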



A Little Review


You can get a list of the methods that a file object supports by using the dir() function like this:



>>> file = open("\\dat\\hello.txt")
>>> dir (file)
['close', 'closed', 'fileno', 'flush', 'isatty', 'mode', 'name', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell',
'truncate', 'write', 'writelines']

dir() lists the names of an object's attributes, which include its methods.
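In newer Python versions, dir() also reports many names that begin with an underscore. The following is a small sketch, assuming Python 3 and a made-up temporary file, of how you might filter those out to get a list closer to the one shown in the session.

```python
import os
import tempfile

# Hypothetical scratch file used only so we have a file object to inspect.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
f = open(path, "w")

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(f) if not name.startswith("_")]
f.close()

print(public_names)
```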


write() and readline()


In our c:\dat\hello.txt example, we wrote three strings. Then we used a text editor to read that file and saw that all three strings were on the same line. Of course, this is a very short file, but if all of the items we wrote to a standard-size file were written on the same line, the file would be hard to read and parse. To avoid this, we can write a newline character (\n) at the end of each line to denote that the text that follows begins a new line.


Let's continue with some examples that demonstrate writing data on separate lines. Follow along in the interactive interpreter.


First we write three lines to a file like this:



>>> fname = "c:\\dat\\data.txt"
>>> f = open (fname, "w") #open the file in write mode
>>> f.write("line1 \n") #write out a line of text.
>>> # the \n signifies newline
>>> f.write("line2 ") #write out some text.
>>> f.write(" still line2") #write out some more text
>>> f.write(" \n") #write out a newline character
>>> f.write("line3 \n") #write out a line of text
>>> f.close() #close the file
>>> f = None #set the ref to none

Without the newline character, all of the text is on the same line, which you can see by opening the file (c:\dat\data.txt) and comparing each line of text with the code that created it.



f.write("line2 ") #write out some text.
f.write(" still line2") #write out some more text
f.write(" \n") #write out a newline character

Now let's reopen our file in read mode and read each line individually with the readline() method.



>>> f = open (fname, "r") # open the file in read mode
>>> line = f.readline() # read one line in and store it in line
>>> line # line contains the first line we
>>> # wrote. Note "\n" = "\12"
'line1 \12'
>>> print line # print the line. note that the newline
>>> # is still attached to the line
line1

>>> f.readline() #read the second line.
'line2  still line2 \12'
>>> f.readline() #read the third line
'line3 \12'
>>> f.close() #close the file
>>>

Notice that we made fewer readline() calls than write() calls. This is because readline() keeps reading text until it reaches a newline character, which it treats as the last character of the line it returns. Several write() calls can therefore contribute to a single line.
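To make the counting concrete, here is a sketch that repeats the five write() calls and then calls readline() in a loop until it returns an empty string, which is how readline() signals the end of the file. It assumes Python 3 and uses a temporary file rather than c:\dat\data.txt.

```python
import os
import tempfile

# Hypothetical scratch file standing in for c:\dat\data.txt.
path = os.path.join(tempfile.mkdtemp(), "data.txt")

f = open(path, "w")
f.write("line1 \n")
f.write("line2 ")
f.write(" still line2")
f.write(" \n")
f.write("line3 \n")       # five write() calls in total
f.close()

f = open(path, "r")
lines = []
while True:
    line = f.readline()
    if line == "":        # an empty string means end of file
        break
    lines.append(line)
f.close()

print(len(lines))         # number of readline() calls that returned data
```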



readlines()


The readlines() method (note plural) reads all of the lines in the file and puts them in a list of strings. To illustrate, we'll read all of the lines in the file at once with the following interactive session:



>>> f = open(fname, "r") #reopen the file in read mode
>>> lines = f.readlines() #read in all the lines at once
>>> lines #display the list
['line1 \12', 'line2  still line2 \12', 'line3 \12']
>>> for line in lines: #print out each line
...     print line
...
line1

line2  still line2

line3

>>>
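Because each string from readlines() still ends in a newline and print adds one of its own, printing the strings directly double-spaces the output. One way around that, sketched here under the assumption of Python 3 and a temporary file, is to write each string to sys.stdout, which adds nothing.

```python
import os
import sys
import tempfile

# Hypothetical scratch file standing in for c:\dat\data.txt.
path = os.path.join(tempfile.mkdtemp(), "data.txt")

f = open(path, "w")
f.write("line1 \n")
f.write("line2  still line2 \n")
f.write("line3 \n")
f.close()

f = open(path, "r")
lines = f.readlines()       # every string keeps its trailing newline
f.close()

for line in lines:
    sys.stdout.write(line)  # write() adds no newline, so no blank lines
```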


Getting Rid of \n


You may want to dispose of the newline character when you read in a line. Here's the not very Python way of doing this:



>>> f = open("\\dat\\data.txt") #reopen the file again
>>> line = f.readline() #read in the line
>>> line #before removing the newline character
'line1 \12'

>>> line = line[0:len(line)-1] #chop off the newline character
>>> line #After the newline character is gone.
'line1 '
>>> f.close() #close the file

Here's the more Python way:



>>> f = open("\\dat\\data.txt") # reopen the file
>>> line = (f.readline())[:-1] # read the line and chop
>>> # the newline character off.
>>> line
'line1 '

We're doing two things at once with this call. We're putting readline() in parentheses and then using the [] operator with slice notation on the string it returns.



line = (f.readline())[:-1] # read the line and chop
# the newline character off.

In case you didn't catch what we just did, here's the same thing in slow motion, reading the second line, with a few more code steps for clarity.



>>> line2 = f.readline() # Read in line2.
>>> line2 = line2[:-1] # Using slice notation assign
>>> # line2 to line2 from the first
>>> # character up to but not including
>>> # the last character.
>>> line2 # Display line2
'line2  still line2 '

For a review of slice notation, go back to Chapters 1, 2 and 3.
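One caution about the [:-1] slice: it removes the last character whether or not that character is a newline, and the last line of a file does not always end with one. Here is a sketch of a slightly more careful version. The helper name chop_newline is made up for illustration, and the code assumes Python 3.

```python
def chop_newline(line):
    """Remove one trailing newline, if there is one (hypothetical helper)."""
    if line.endswith("\n"):
        return line[:-1]
    return line

print(repr(chop_newline("line1 \n")))   # 'line1 '
print(repr(chop_newline("line3")))      # 'line3'
print(repr("line3"[:-1]))               # 'line' (the plain slice drops a real character)
```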




read()


Let's start a new example to show how to use read(). First we'll open a new file (c:\dat\read.txt) in write mode and write the hex digits 0 through F to it. (Don't worry about hex for now; just follow along.)



>>> f = open("\\dat\\read.txt", "w")
>>> f.write("0123456789ABCDEF")
>>> f.close()

That's the setup. Now we'll demonstrate the different ways to use read().



>>> f = open("\\dat\\read.txt", "r")
>>> f.read(3) # Read the first three characters.
'012'

>>> f.read(3) # Read the next three characters in the file.
'345'

>>> f.read(4) # Read the next four characters in the file.
'6789'

>>> f.read() # Read the rest of the file.
'ABCDEF'

We can see that read(size) reads the specified number of characters from the file. Note that each read() call starts reading where the previous one left off and that calling read() with no arguments reads the rest of the file.
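The same idea can be turned into a loop that reads a file in fixed-size pieces. This is only a sketch, assuming Python 3 and a temporary file in place of c:\dat\read.txt; like readline(), read() returns an empty string once the end of the file is reached.

```python
import os
import tempfile

# Hypothetical scratch file standing in for c:\dat\read.txt.
path = os.path.join(tempfile.mkdtemp(), "read.txt")

f = open(path, "w")
f.write("0123456789ABCDEF")
f.close()

chunks = []
f = open(path, "r")
while True:
    chunk = f.read(4)     # read up to four characters
    if chunk == "":       # an empty string means end of file
        break
    chunks.append(chunk)
f.close()

print(chunks)             # ['0123', '4567', '89AB', 'CDEF']
```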



tell() and seek()


We just saw that the file object keeps track of where we left off reading in a file. What if we want to move to a previous location? For that we need the tell() and seek() methods. Here's an example that continues our read() example:



>>> f.seek(0) #reset the file pointer to 0
>>> f.read() #read in the whole file
'0123456789ABCDEF'

>>> f.tell() #see where the file pointer is
16

>>> f.seek(8) #move to the middle of the file
>>> f.tell() #see where the file pointer is
8

>>> f.read() #read from the middle of the file to the end
'89ABCDEF'

The first line moves the file pointer back to the start of the file. The second line reads in the whole file, which leaves the file pointer at the end. Next we use tell() to report where the file pointer is, and then seek() moves the pointer to the middle of the file. Again, tell() reports that location. Finally, to demonstrate that read() picks up from the file pointer's location, we read the rest of the file and display it.
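A common use of this pair is to remember a position with tell() and return to it later with seek(). The sketch below assumes Python 3 and a temporary file. It opens the file in binary mode ("rb") so that positions are plain byte offsets, which means read() returns bytes values such as b'345' rather than ordinary strings.

```python
import os
import tempfile

# Hypothetical scratch file standing in for c:\dat\read.txt.
path = os.path.join(tempfile.mkdtemp(), "read.txt")

f = open(path, "w")
f.write("0123456789ABCDEF")
f.close()

f = open(path, "rb")      # binary mode: positions are byte offsets
f.read(3)                 # skip the first three bytes
mark = f.tell()           # remember where we are
rest = f.read()           # read to the end of the file
f.seek(mark)              # go back to the remembered position
again = f.read()          # read the same bytes a second time
f.close()

print(mark, rest, again)
```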



For Beginners: Try It Out


If you're sitting there staring at the book and wondering what I'm talking about, you need to do three things:







  1. Enter the last example into the interactive interpreter, and experiment with tell(), seek(), and read().



  2. Open the file with your favorite text editor, and count the characters in it.



  3. Move around the file, and read various characters. To read a single character, call the read() method with 1 as its argument.



If you still don't get it, don't worry; we'll cover this more in the next section.









