Guide to Getting Started in Machine Learning
October 11th, 2009
Someone at work recently asked how he should go about studying machine learning on his own, so I’m putting together a little guide. This post will be a living document. I’ll keep adding to it, so please suggest additions and make comments.
Fortunately, there are a ton of great resources available for free on the web. The very best way to get started that I can think of is to read chapter one of The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2009 edition). The PDF is available online. Or buy the book on Amazon here, if you prefer.
Once you’ve read the first chapter, download R. R is an open-source statistics package/language that’s quite popular. Never heard of it? Check out this post (How Google and Facebook are using R).
Once you’ve installed R and maybe played around a little, check out this page, which describes the major machine learning packages in R. If you’re already familiar with some of the techniques, then dive in and start playing around with them in R. On the other hand, if it looks really complicated, don’t worry about it yet.
Oh, by the way, if you want to start playing around with machine learning in R, you’ll need data. Check out the UCI Machine Learning Repository. They have both real and toy datasets. The iris dataset, for example, is famous for showing up in many research publications.
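If you just want something to poke at right away, here is a minimal sketch of a first experiment. It assumes a standard R installation, where the iris data ships in the built-in datasets package and rpart (a decision tree package) is included as a recommended package. The seed and the 100/50 train/test split are arbitrary choices of mine, not a recommendation.

```r
# iris ships with base R (datasets package), so nothing needs downloading.
data(iris)

# Hold out 50 of the 150 flowers for testing.
# The seed and split size are arbitrary, chosen only so the run is repeatable.
set.seed(42)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# rpart is a "recommended" package bundled with standard R distributions.
library(rpart)

# Fit a classification tree that predicts Species from the four measurements.
fit <- rpart(Species ~ ., data = train)

# Predict the species of the held-out flowers and compare with the truth.
pred <- predict(fit, newdata = test, type = "class")
confusion <- table(predicted = pred, actual = test$Species)
accuracy <- mean(pred == test$Species)

print(confusion)
print(accuracy)
```

Try changing the seed or swapping in a different package from the list above. The exact accuracy will move around a bit from split to split, which is a useful first lesson in itself.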
I’d suggest next reading more of The Elements of Statistical Learning. It’s an excellent book. Try doing some of the programming exercises using R. If you don’t like this book, there are plenty of others. Bishop’s Pattern Recognition and Machine Learning is a famous one. It can be a little difficult depending on your math background. Tom Mitchell’s Machine Learning is another that’s often used to teach the topic.
If you’re looking for perhaps a more passive experience, or want the feel of a classroom, Andrew Ng of Stanford has posted all of his lectures online. He starts by saying that he thinks machine learning is the most exciting field in all of computer science. Hear, hear!
Another great resource is the machine learning course MIT has posted on their OpenCourseWare site. It has the lecture notes, assignments, and more.
I’ll stop here now. More later.
October 12th, 2009 at 1:44 pm
Great guide. There’s also a bunch of us at http://machine-learning.eggsprout.com who are eager to learn ML. You’re all welcome to join us.
October 15th, 2009 at 11:39 am
The whole of Elements of Statistical Learning is now available for free.
October 20th, 2009 at 10:23 am
Hei,
This is great! Just what I needed.
Thanks!
Adi
October 20th, 2009 at 11:57 am
Very good guide! Reminds me of the Turing test, somehow.
October 20th, 2009 at 12:50 pm
I found the book “Programming Collective Intelligence” (O’Reilly) to be a great resource if you like to dive into the subject in a practical manner…
October 20th, 2009 at 7:52 pm
More, more! Don’t stop here. This is the kind of stuff I should have been looking at instead of trying to learn yet another language.
October 21st, 2009 at 2:53 am
Thanks for the resources! I must wholeheartedly second Programming Collective Intelligence (O’Reilly) – simple and useful applications of machine learning utilizing “web 2.0” apps.
October 21st, 2009 at 1:54 pm
Thanks a lot.
I have been trying to learn this field.
October 22nd, 2009 at 2:08 am
Thanks for the great guide.
October 22nd, 2009 at 12:29 pm
I should also mention WEKA for Java folks here. It’s an ML framework with a nice UI for experimenting with small datasets.
May 20th, 2010 at 10:26 am
Hi there,
If you are into R and graphics, you might consider writing about (or taking part in) the following competition:
http://www.r-statistics.com/2010/05/user-2010-is-looking-for-a-t-shirt-design/
(I am not connected to the competition, I simply love using R)
Best,
Tal
April 24th, 2012 at 11:20 am
Another great source of machine learning libraries can be found on the HPCC Systems web site here:
http://hpccsystems.com/ml
They are free to download and come with complete documentation.