Normally, if I have an issue that is answered in the first 2-3 results of a Google search, I won’t create a post. On the other hand, when I spend 2-3 hours trying to solve something that should be simple, I like to take the opportunity to describe the issue and its resolution in the hope that someone will find it quickly in the future.
So the task here was to find a way to specify the source IP address (and with it the network interface) when making an HTTP request using Python’s urllib2. Why would you want to do this, you ask? Well, for many web APIs the request rate limit is tied to a whitelisted IP address; such is the case with Twitter. If you want to use the same machine (with multiple network interfaces) to run jobs in parallel, you need to be able to specify which address the requests go out from.
The problem is that Python’s urllib2 is built on the httplib library, which doesn’t let you specify which address to bind to. One person tried to get around the problem in 2005 without any luck, another created a patch for httplib in 2008 which hasn’t been accepted, and finally someone else created a subclass for httplib, which unfortunately I couldn’t get hooked up to urllib2.
The best solution I found was this “monkey patch” from Alex Martelli over on Stack Overflow. In his example he attacks the problem at the level of the socket library instead of httplib: every new socket is bound to the chosen local address as soon as it is created. By his own admission stuff like this is not ideal, but the solution is actually very simple and elegant. I like it.
I wrapped the snippet up in a function that can be called from a Python script any time before you make a urllib2 request.
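import socket
true_socket = socket.socket  # keep a reference to the real constructor
def bind_source_ip(source_ip):  # function and argument names are illustrative
    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((source_ip, 0))  # port 0 lets the OS pick a free port
        return sock
    socket.socket = bound_socket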
Hope this can be of help to someone in the future who’s searching for the same thing I was.