[Disclaimer: In the Python world, I am just a newbie, sharing my experience as I learn. I am more than happy to hear from you.]
I just completed a content downloader which grabs links from a given RSS feed, and then downloads all the linked content simultaneously. “Bandwidth is your only limit” 😉
It DOESN’T work on Windows. Yet.
For the downloading part, it primarily depends on the URLGrabber library, which in turn depends on the PycURL library. For parallelism and other stuff, it depends on Python’s built-in libraries, e.g. multiprocessing, XML parsing, urllib2.
You can even use proxy settings, bandwidth control, an FTP username/password, etc.
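To give a rough idea of the approach described above, here is a minimal sketch, not the actual downloader. It assumes Python 3 and uses only the standard library, so `urllib.request` stands in for URLGrabber and urllib2, and none of the proxy or bandwidth options are covered. The function names (`extract_links`, `local_name`, `download`, `download_all`, `main`) and the `downloads` folder are made up for illustration.

```python
import os
import shutil
import urllib.request
import xml.etree.ElementTree as ET
from multiprocessing import Pool
from urllib.parse import urlparse


def extract_links(rss_xml):
    """Collect one URL per RSS <item>: the enclosure URL if present, else the <link> text."""
    root = ET.fromstring(rss_xml)
    links = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            links.append(enclosure.get("url"))
            continue
        link_text = (item.findtext("link") or "").strip()
        if link_text:
            links.append(link_text)
    return links


def local_name(url):
    """Derive a local file name from the URL path (hypothetical naming rule)."""
    return os.path.basename(urlparse(url).path) or "index.html"


def download(job):
    """Fetch a single URL into dest_dir; runs inside a worker process."""
    url, dest_dir = job
    target = os.path.join(dest_dir, local_name(url))
    with urllib.request.urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


def download_all(urls, dest_dir, workers=4):
    """Download every URL in parallel using a pool of worker processes."""
    os.makedirs(dest_dir, exist_ok=True)
    with Pool(processes=workers) as pool:
        return pool.map(download, [(url, dest_dir) for url in urls])


def main(feed_url, dest_dir="downloads"):
    """Fetch the feed, pull out the links, and download them all."""
    with urllib.request.urlopen(feed_url) as response:
        feed = response.read()
    for path in download_all(extract_links(feed), dest_dir):
        print("saved", path)
```

To try it, call `main()` with your own feed URL from inside an `if __name__ == "__main__":` block, which multiprocessing needs on some platforms. Files that share the same name will overwrite each other in this sketch, so a real tool would need something smarter there.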
Here it goes:
As WordPress.com doesn’t allow GitHub code embedding, you have to click the link to check it out.