The following is a quick and dirty way of pulling a lot of URLs out of a given page’s source code, using two commands in vim, my new favourite text editor. So, right to the point!
Try it out right now on the source code of Imgur’s /r/ScarlettJohansson page.
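If you don’t have vim handy, here’s a rough Python analogue of the same idea. To be clear, this is not the two vim commands from the post, just a minimal sketch: it assumes the page source is already in a string, and the `URL_PATTERN` regex and `extract_urls` helper are names I made up for illustration. The pattern is deliberately crude and will miss relative links.

```python
import re

# Hypothetical pattern: match http(s) URLs up to the next quote,
# whitespace, or angle bracket. Crude, but fine for a quick scrape.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(source):
    """Return the unique URLs found in source, in order of first appearance."""
    seen = set()
    urls = []
    for url in URL_PATTERN.findall(source):
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Made-up sample markup standing in for a saved page source.
sample = (
    '<a href="https://example.com/a.jpg">a</a> '
    '<img src="http://example.com/b.png"> '
    '<a href="https://example.com/a.jpg">again</a>'
)

for url in extract_urls(sample):
    print(url)
```

Swap `sample` for the contents of a saved page and you get one URL per line, duplicates removed.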
Right off the bat, I want to show you the results of this scraping, to give you a bit of motivation. Anyway, thanks to requests and BeautifulSoup, this is trivially easy. Enough talking, let’s get down to the code! Don’t forget that, as usual, I’ll include the full source code at the bottom of the post.
This was originally a comment on reddit, but I figure it’d help other folks out, so I’m going to put it up here too.
So, git is a great tool for backing up your code projects, allowing you to easily save and manage different versions of your project among your friends and coworkers. But despite all the friendly packaging on GitHub, it was really intimidating. I didn’t know what a branch was, let alone a fetch, a merge, or anything more complex like trunks and bisecting. Eventually, though, I got the basics sorted out. It was enough to send (push) a few of my repositories over to GitHub. This was my basic experience: