Latest Project: A new website!

What’s up dudes! It’s been a while, but I’ve been busy. If you’ve been following me on Twitter, you’ll have seen that I’ve been working on this new website. It’s basically just like 5000best.com/movies or easyqueue or whatever it’s called. I’ve never used them, since I only use Netflix on the Xbox, and the interface there is pretty bad!

But anyways, the project has been underway for a few weeks now, and while it’s my first website that I’m building as close to from scratch as you can get with C#’s ASP.NET MVC4, it’s coming along nicely. So far, I’ve managed to pull down the Netflix API’s Index, so that’s about 55k titles that I’ve got to play around with. The dataset is pretty complete, so there’s plenty to work with.

Once I got MVC4 set up and working (which was not easy, even with sec_goat’s help, by the way), the next hurdle was getting a local database server working. I don’t want to get into that too much, because I’m still sick of the idea, but basically MVC4 uses 2012 SQL databases, and I had SQL Server 2008, so I needed to update to 2012 to match the server I had created. That meant I needed to update to ’08, ’08 SP1, ’08 R2, ’08 R2 SP1, ’08 R2 SP3, then finally ’12. I think I even needed SP1 on the ’12. But the order really isn’t clear to me, since at that point I was just installing everything I could and trying to get it to work. It was frustrating.

Another issue I am currently dealing with is the idea of Database Normalization. Essentially, that means splitting up a database so that it’s as non-redundant as it can possibly be, getting rid of columns that work out to the same thing. Say you had an ingredient list, and you added ketchup and mustard and pickles. But EVERY single time you didn’t use ketchup, you took away mustard too, so you might as well have one column called toppings, yes or no, rather than a ketchup column and a mustard column. Well, that’s actually a really poor example. As usual, Stack Overflow has a better example, as does About.com, actually.

So back to the issue. I used to have ALL the relevant data from the Netflix API in one massive, ugly table, but I wasn’t happy with how long it was taking to get data out of it (and apparently DB Normalization doesn’t really help with that, but maybe it does, I really don’t know at this point), so I endeavored to break it up. Now most of the movie data, like year released, duration, rating etc., is in one table, the list of all the genres is in another, and a cross table links each genre to a movie ID. And with Box Art, I’ve got a table of movie IDs linked with the box art in the relevant sizes. That way, whenever I need the box art for a movie ID, I can just select it from the BoxArt table, rather than look through the whole movies table. Dunno if it helps my speed case at all, but I hope so.
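Here’s a rough sketch of that kind of layout, just to make the idea concrete. It is not the real schema: it uses Python’s built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, and every table name, column name, and sample row is made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movies (
    movie_id         INTEGER PRIMARY KEY,
    title            TEXT NOT NULL,
    year_released    INTEGER,
    duration_minutes INTEGER,
    rating           TEXT
);
CREATE TABLE genres (
    genre_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
-- Cross table: one row per (movie, genre) pair.
CREATE TABLE movie_genres (
    movie_id INTEGER NOT NULL REFERENCES movies(movie_id),
    genre_id INTEGER NOT NULL REFERENCES genres(genre_id),
    PRIMARY KEY (movie_id, genre_id)
);
-- One row per (movie, box art size) pair.
CREATE TABLE box_art (
    movie_id INTEGER NOT NULL REFERENCES movies(movie_id),
    size     TEXT NOT NULL,
    url      TEXT NOT NULL,
    PRIMARY KEY (movie_id, size)
);
""")

# Made-up sample data.
conn.execute("INSERT INTO movies VALUES (1, 'Example Movie', 2010, 95, 'PG')")
conn.executemany("INSERT INTO genres VALUES (?, ?)", [(1, "Comedy"), (2, "Drama")])
conn.executemany("INSERT INTO movie_genres VALUES (?, ?)", [(1, 1), (1, 2)])
conn.executemany(
    "INSERT INTO box_art VALUES (?, ?, ?)",
    [(1, "small", "http://example.com/1_small.jpg"),
     (1, "large", "http://example.com/1_large.jpg")],
)


def box_art_for(movie_id, size):
    """Look up one box art URL without touching the movies table."""
    row = conn.execute(
        "SELECT url FROM box_art WHERE movie_id = ? AND size = ?",
        (movie_id, size),
    ).fetchone()
    return row[0] if row else None


def genres_for(movie_id):
    """Follow the cross table from a movie ID to its genre names."""
    rows = conn.execute(
        """SELECT g.name
           FROM genres AS g
           JOIN movie_genres AS mg ON mg.genre_id = g.genre_id
           WHERE mg.movie_id = ?
           ORDER BY g.name""",
        (movie_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(box_art_for(1, "large"))
print(genres_for(1))
```

The box art lookup only ever reads the small box_art table, and the composite primary keys give the database something to index on, which is probably where any speed difference would come from.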

Right this minute, I’m building the Movie-to-Genre table. It’s been about 20 minutes since I started writing this, and it’s at movie 909 out of 55k. I think it’ll take about 10 hours to create the table in full, but I’m told that isn’t a terrible amount of time. But again, I don’t know for sure.
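For a back-of-the-envelope check on that estimate, here’s a tiny sketch that extrapolates from the progress so far. It assumes the rate stays constant, and the elapsed time is a guess: the 20 minutes is how long the writing has taken, which may not be how long the job has actually been running. The function name is made up.

```python
def hours_remaining(done, total, minutes_elapsed):
    """Extrapolate the remaining run time, assuming a constant rate."""
    rate_per_minute = done / minutes_elapsed
    return (total - done) / rate_per_minute / 60


# If the job has been running for the full 20 minutes of writing:
print(round(hours_remaining(909, 55_000, 20), 1))  # 19.8

# If it has really only been running for about 10 minutes:
print(round(hours_remaining(909, 55_000, 10), 1))  # 9.9
```

So the total could land anywhere from roughly 10 to 20 hours, depending on when the job actually kicked off.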

So there you have it, good news and bad news. A lot of frustration but a lot of new information, so you take what you can get. I definitely don’t like knowing absolutely nothing about DBM and SQL, and asking really basic questions, but I guess that’s just part of the job. Er, hobby.

Pull out images from a Minus Gallery, using vim.

I was recently told via reddit that it was hard to pull images from Minus.com, as they’ve hidden them behind JavaScript. I took a look to see what I could do, and lo and behold, vim comes to our rescue. In the spirit of Derek Wyatt, I wanted to make a quick tutorial on how to pull those images in maybe a dozen vim commands. So just check out that reddit link and enjoy.

Turns a page source into a bunch of valid image links.

 

(more…)

Vim Search and Replace: Grabbing Image URLs from HTML source code

The following is a quick and dirty way of pulling a lot of URLs out of a given page’s source code, using two commands in vim, my new favourite text editor. So, right to the point!

:v/jpg/d
:%s/^.*src="\(http:.\{-}jpg\)".*/\1/g
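If it helps to see what those two commands do step by step, here’s a rough Python equivalent. It is only a sketch: the function name and the sample HTML are made up, and the regex is the same pattern translated from vim’s syntax (`\{-}` becomes `*?`). The first command deletes every line that doesn’t contain "jpg"; the second replaces each remaining line with just the captured URL.

```python
import re

# Same pattern as the vim substitute, written in Python regex syntax.
SRC_JPG = re.compile(r'^.*src="(http:.*?jpg)".*')


def extract_jpg_urls(source):
    """Mimic :v/jpg/d followed by the :%s substitute on each line."""
    results = []
    for line in source.splitlines():
        if "jpg" not in line:
            continue  # :v/jpg/d drops lines without "jpg"
        # Like vim, a line that contains "jpg" but doesn't match is left as-is.
        results.append(SRC_JPG.sub(r"\1", line, count=1))
    return results


# Made-up page source for illustration.
sample = """<html>
<img src="http://example.com/a.jpg" alt="a">
<p>no images here</p>
<a href="#"><img class="x" src="http://example.com/b.jpg"></a>
"""

print(extract_jpg_urls(sample))
```

Because the leading `.*` is greedy, a line with several images on it only gives up the last matching URL, in vim and in this sketch alike.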

Try it out right now on the source code of Imgur’s /r/ScarlettJohansson page.

(more…)

How to scrape an ImageBam gallery for images with 30 lines of Python

Right off the bat, I want to show you the results of this scraping, to give you a bit of motivation. Anyways, thanks to requests and BeautifulSoup, this is made trivially easy. Enough talking, let’s get down to the code! Don’t forget that as usual I’ll include the full source code at the bottom of the post.

(more…)