Another Visualization of the Netflix Prize Dataset

Here’s a recent visualization I did of the dataset used in the Netflix Prize competition. The dataset covers 17,700 movies and 31 gigs of user ratings. The visualization places similar movies close to one another, with similarity determined by a formula based on ratings.

What I found most interesting is a cluster of movies (in blue) that I’d say are generally acclaimed. The cluster contains movies across all genres, such as Schindler’s List, Braveheart, and Super Size Me. Beyond that, there are a number of clusters that are mostly defined by a genre, such as music, sports, documentary, IMAX, children’s films, or bonus material. The big blob in the center is mostly what I’d call junk movies.


An Interactive Visualization of the Netflix Prize Dataset

The visualization activated below (click the button) shows all 17,700 movies that are part of the Netflix Prize competition. The movies are laid out so that similar movies are close to one another. Similarity between two movies is based on whether users who like one also like the other and whether users who dislike one also dislike the other. Alternatively, take a look at a colorful, static version.
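
To make the idea of ratings-based similarity concrete, here is a minimal sketch in Python. It is not the formula behind the visualization, which isn't spelled out here; it assumes a plain Pearson correlation over the users who rated both movies, and the function name `movie_similarity` and the dict-of-ratings input are made up for illustration. A score near 1 means people who liked one movie tended to like the other (and those who disliked one disliked the other); a score near -1 means the opposite.

```python
from math import sqrt


def movie_similarity(ratings_a, ratings_b):
    """Pearson correlation between two movies over their shared raters.

    ratings_a, ratings_b: dicts mapping user id -> rating (e.g. 1 to 5).
    This is an illustrative stand-in, not the formula used for the
    visualization. Returns 0.0 when there is too little overlap or no
    variation in the shared ratings.
    """
    # Only users who rated both movies can tell us anything.
    common = sorted(ratings_a.keys() & ratings_b.keys())
    if len(common) < 2:
        return 0.0

    xs = [ratings_a[u] for u in common]
    ys = [ratings_b[u] for u in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Co-variation of the two movies' ratings around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0

    return cov / sqrt(var_x * var_y)


# Toy data: three users rate two movies in the same order of preference.
movie_a = {"u1": 1, "u2": 3, "u3": 5}
movie_b = {"u1": 2, "u2": 3, "u3": 4, "u4": 5}
print(movie_similarity(movie_a, movie_b))
```

A layout step would then try to place movies with high scores near each other; that part is not sketched here.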
