netflixAllMoviesSmall Here’s a recent visualization I did of the dataset used in the Netflix Prize Competition. The dataset is 17,700 movies and 31 gigs of user ratings. This viz shows similar movies close to one another, with the similarities determined by a formula based on ratings.

I found most interesting a cluster of movies (in blue) that I’d say are generally acclaimed. The cluster contains movies of across all genres, such as Schindler’s List, BraveHeart, and Super Size Me. Beyond that, there’s a bunch of clusters which are mostly defined by a genre such as music, sports, documentary, Imax, children’s films, or bonus material. The big blob in the center is mostly what I’d call junk movies.

I’ve labeled some movies just to give some sense of what the clusters contain. There’s an interactive version of the viz as well, so you can explore the movies for yourself…

Share