KDD 2011: Recap

I often write some kind of summary when I get back from a conference, but today I ran into Justin Donaldson, a fellow Indiana alum, who said he still occasionally finds the content on my blog useful, or at least the visualizations. So I felt motivated to write up my notes as a blog post.

Overall, many of the talks I attended used latent factor models and stochastic gradient descent, with a few mentions of gradient boosted trees. Good stuff. My favorite talks…

  • Charles Elkan’s “A Log-Linear Model with Latent Features for Dyadic Prediction”. A scalable method for collaborative filtering that uses both latent features and “side information” (user/item content features). Can’t wait to try it out! Here’s a link.
  • Peter Norvig’s Keynote about data mining at Google. Some tidbits:
    • Google mostly uses unsupervised or semi-supervised learning, both because of the cost of labeling and because labels themselves can be an impediment to higher accuracy.
    • He had this great graph of the accuracy of several algorithms on the word sense disambiguation task, plotted against the amount of training data. At any given data size, the best-performing algorithm was beaten by the worst one once the worst one was given an order of magnitude more data. A great argument for simple learning algorithms at very large scale.
    • They are very interested in transfer learning.
  • KDD Cup. Topic was music recommendation using Yahoo’s data.
    • Many of the same ideas and observations as in the Netflix Prize. Neighbor models and latent factor models trained with stochastic gradient descent seemed pervasive.
    • Ensembles were necessary to win, but the accuracy improvement wasn’t huge over the best individual models. Quite the argument for not using ensembles in industry.
    • The Yahoo organizers made some really interesting comments about the data. One was that the mean rating is quite different for power users, which makes sense. Another, if I understood correctly, was that the ratings are collected through different UI mechanisms, which affects their distributions.
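
Since latent factor models trained with stochastic gradient descent came up in so many talks, here is a minimal sketch of the general idea in Python. It is not any particular speaker's or team's method: the function names, the toy ratings, and the hyperparameters (`lr`, `reg`, `n_factors`, `n_epochs`) are all my own assumptions for illustration. The per-user bias term is one simple way to absorb the kind of user-level differences in mean rating that the Yahoo organizers mentioned.

```python
import random


def train_latent_factors(ratings, n_users, n_items, n_factors=2,
                         lr=0.02, reg=0.02, n_epochs=1000, seed=0):
    """Fit r_hat(u, i) = mu + b_u[u] + b_i[i] + dot(P[u], Q[i]) with SGD.

    ratings is a list of (user, item, rating) triples. All defaults are
    illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    b_u = [0.0] * n_users                               # per-user bias
    b_i = [0.0] * n_items                               # per-item bias
    # Small random initial factors so the gradients are not exactly zero.
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]

    def predict(u, i):
        return mu + b_u[u] + b_i[i] + sum(p * q for p, q in zip(P[u], Q[i]))

    data = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(data)  # visit the ratings in a new random order each epoch
        for u, i, r in data:
            err = r - predict(u, i)
            # Gradient steps on squared error with L2 regularization.
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return predict


def rmse(predict, ratings):
    """Root mean squared error of a predict(u, i) function on ratings."""
    total = sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Made-up toy data: (user, item, rating) on a 1-5 scale.
toy_ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]

model = train_latent_factors(toy_ratings, n_users=4, n_items=4)
global_mean = sum(r for _, _, r in toy_ratings) / len(toy_ratings)
baseline_rmse = rmse(lambda u, i: global_mean, toy_ratings)
model_rmse = rmse(model, toy_ratings)
print("baseline RMSE:", baseline_rmse, "model RMSE:", model_rmse)
```

The training RMSE here is only a sanity check that the updates move in the right direction. On real data you would measure error on held-out ratings instead.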

Looking forward to tomorrow!