Opening HTML Documents
The urllib and urllib2 modules included with Python provide the functionality to open and fetch data from URLs, including HTML documents.

To use the urllib module to open an HTML document, pass the URL of the document, including the filename, to the urlopen(url [,data]) function. The urlopen function opens the location and returns a file-like object that can be used to read data from the HTML document.

Once you have opened the HTML document, you can read it with the read([nbytes]), readline(), and readlines() functions, just as you would a normal file. To read the entire contents of the HTML document, call read() with no argument; it returns the contents as a string.

After you open a location, you can retrieve its URL using the geturl() function. The geturl() function returns the URL as a string, taking into account any redirection that might have taken place when accessing the HTML file.

Note: Another helpful function included in the file-like object returned from urlopen is the info() function. The info() function returns the available metadata about the URL location, including content length, content type, and so on.

The accompanying example, html_open.py, begins with import urllib. Its output includes response metadata such as:

Date: Tue, 18 Jul 2006 18:28:19 GMT
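As a minimal sketch of the calls described above (this is not the original html_open.py listing), the following opens a document with urlopen, reads it with read(), and then calls geturl() and info(). The helper name fetch_document, the sample HTML, and the temporary file are assumptions made for illustration; a local file: URL is used so the example runs without network access. The text describes Python 2, where urlopen lives in urllib; the fallback import assumes Python 3, where the same function is found in urllib.request.

```python
import os
import tempfile

try:
    # Python 2, the version the text describes
    from urllib import urlopen, pathname2url
except ImportError:
    # Assumption: on Python 3 the same functions live in urllib.request
    from urllib.request import urlopen, pathname2url


def fetch_document(url):
    """Open url and return (final_url, headers, body).

    fetch_document is a hypothetical helper name, not part of urllib.
    """
    page = urlopen(url)           # file-like object
    try:
        body = page.read()        # entire contents of the document
        final_url = page.geturl() # URL after any redirection
        headers = page.info()     # metadata: content type, length, ...
    finally:
        page.close()
    return final_url, headers, body


# Hypothetical sample document, written to a temporary file
sample_html = b"<html><body><h1>Hello</h1></body></html>"
handle, sample_path = tempfile.mkstemp(suffix=".html")
os.write(handle, sample_html)
os.close(handle)

# Build a file: URL for the temporary file and fetch it
sample_url = "file:" + pathname2url(os.path.abspath(sample_path))
final_url, headers, body = fetch_document(sample_url)
os.remove(sample_path)

print(final_url)
print(headers.get("Content-length"))
print(body)
```

To fetch a remote page instead, pass an http URL to fetch_document. For large documents, read(nbytes) or readline() on the object returned by urlopen avoids loading everything into memory at once.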
Wednesday, November 4, 2009