Urllib2 Python Library Overview and HOWTOs - ilya-khadykin/notes-outdated GitHub Wiki

urllib2 was split into two parts in Python 3.x:

  • urllib.request
  • urllib.error

urllib.request is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen function, which can fetch URLs using a variety of protocols. It also offers a slightly more complex interface for handling common situations such as basic authentication, cookies, and proxies. These are provided by objects called handlers and openers.
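As a rough illustration of handlers and openers, the sketch below combines a basic-authentication handler and a cookie handler into a single opener. The function name build_example_opener and the username, password, and URL passed to it are hypothetical placeholders, not part of the original notes.

```python
import http.cookiejar
import urllib.request


def build_example_opener(username, password, top_level_url):
    """Build an opener that sends basic-auth credentials and keeps cookies.

    The function name and its arguments are illustrative assumptions.
    """
    # The password manager maps URLs to credentials; a realm of None
    # means "use these credentials for any realm under this URL".
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, top_level_url, username, password)
    auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)

    # The cookie jar collects cookies set by responses fetched via the opener.
    cookie_jar = http.cookiejar.CookieJar()
    cookie_handler = urllib.request.HTTPCookieProcessor(cookie_jar)

    # build_opener chains the given handlers with the default ones.
    opener = urllib.request.build_opener(auth_handler, cookie_handler)
    return opener, cookie_jar
```

The returned opener can be used directly with opener.open(url), or made the default for urlopen by passing it to urllib.request.install_opener.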

urllib.request supports fetching URLs for many “URL schemes” (identified by the string before the “:” in the URL; for example, “ftp” is the URL scheme of “ftp://python.org/”) using their associated network protocols (e.g. FTP, HTTP).
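To make the notion of a URL scheme concrete, here is a small sketch that extracts it with urllib.parse.urlsplit from the standard library. The helper name url_scheme is a hypothetical one chosen for this example.

```python
from urllib.parse import urlsplit


def url_scheme(url):
    """Return the scheme of a URL (the part before the first ':').

    urlsplit normalizes the scheme to lowercase.
    """
    return urlsplit(url).scheme


# Print the scheme for a couple of sample URLs.
for url in ("ftp://python.org/", "http://python.org/"):
    print(url, "->", url_scheme(url))
```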

Fetching URLs

import urllib.request
with urllib.request.urlopen('http://python.org/') as response:
   html = response.read()
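Since urllib.error is the other half of the old urllib2, a fetch like the one above is often wrapped in error handling. The following is a minimal sketch, assuming a hypothetical helper named fetch and an arbitrary 10-second default timeout. HTTPError is caught first because it is a subclass of URLError.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10):
    """Return (body_bytes, None) on success or (None, error_message) on failure.

    The helper name, return shape, and default timeout are assumptions
    made for this example.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return None, f"HTTP error {exc.code}: {exc.reason}"
    except urllib.error.URLError as exc:
        # No usable response: unknown scheme, DNS failure, refused connection, etc.
        return None, f"URL error: {exc.reason}"
```

A call such as fetch('http://python.org/') would then return either the page body or a short error description instead of raising.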

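import urllib.request
# urlretrieve saves the resource to a temporary file and returns its path and the response headers
local_filename, headers = urllib.request.urlretrieve('http://python.org/')
with open(local_filename, 'rb') as f:
   html = f.read()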

import urllib.request

req = urllib.request.Request('http://www.voidspace.org.uk')
with urllib.request.urlopen(req) as response:
   the_page = response.read()
# The same Request class also works for other URL schemes, such as FTP
req = urllib.request.Request('ftp://example.com/')
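A Request object can also carry form data and headers, which goes slightly beyond the snippet above. The sketch below builds such a request without sending it. The helper name build_post_request, the form fields, and the User-Agent string are all hypothetical.

```python
import urllib.parse
import urllib.request


def build_post_request(url, fields, user_agent):
    """Build (but do not send) a Request with form data and a custom header.

    The helper name and its arguments are illustrative assumptions.
    """
    # Form fields must be URL-encoded and then converted to bytes.
    data = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=data, headers={"User-Agent": user_agent})


req = build_post_request(
    "http://www.voidspace.org.uk",
    {"name": "Ada", "lang": "Python"},
    "example-client/0.1",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it, exactly as with the plain Request shown earlier.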
