Rating: -
Most people have shared their thoughts on the good of this book. I'd like to point out some of the bad as I read through:
- First, too many typos. Both the author and O'Reilly should do a better job of proofreading the material. There are so many typos that they can easily wreck otherwise good material.
- Second, arcane solutions and coding style. Often the first step in solving a machine learning problem is to represent the problem at hand well. The author's brain is apparently wired differently from mine, so this opinion is personal. For example, in chapter 5 on "optimization for preference", he chose to represent a solution in vector form like [0,0,0,0,0,0,0,0,0,0]. There is no way I can relate this solution to its real meaning (you want to allocate 10 students into 5 rooms, each with two slots). If there is an easy explanation, the book didn't say so.
Thus the 3 stars. I believe a second edition is warranted and should be much better.
Just my 2c.
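One plausible reading of the vector form this review questions, offered as a sketch rather than as the book's exact code: each entry is an index into the list of slots that are still free when that student is placed. Under that assumption, [0,0,0,0,0,0,0,0,0,0] means "every student takes the first remaining slot". The function name `decode_assignment` and the room names are hypothetical.

```python
def decode_assignment(vec, rooms, slots_per_room=2):
    """Decode a slot-index vector into (student_index, room) pairs.

    Assumption: vec[i] indexes into the slots still free when student i
    is placed, so valid values for vec[i] are 0 .. (total_slots - i - 1).
    """
    # One entry per free slot, e.g. [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
    free_slots = [r for r in range(len(rooms)) for _ in range(slots_per_room)]
    assignment = []
    for student, choice in enumerate(vec):
        room_index = free_slots.pop(choice)  # remove the chosen slot from the pool
        assignment.append((student, rooms[room_index]))
    return assignment


rooms = ["A", "B", "C", "D", "E"]  # hypothetical room names
for student, room in decode_assignment([0] * 10, rooms):
    print(student, room)
```

With this encoding every vector within the allowed ranges decodes to a valid assignment (no room is ever overfilled), which is convenient for optimizers that mutate vectors at random.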
Rating: -
Programming Collective Intelligence is a book about applying data mining techniques to analyse collections of data. There is submerged information in eBay prices, in Facebook profile networks, in collections of movie reviews, in news sites, in the stock market; this book by Toby Segaran shows ways to extract, visualise, understand, and predict that information.
Each chapter explains and explores a different data mining algorithm, and builds up a working example in Python, while presenting different methods and parameters of the implementation. I hadn't really worked with Python before, but found the code easy to follow, and picked up some interesting Python idioms that I haven't seen in other languages before. Chapters end with a set of exercises to follow that build your understanding.
As you follow the examples you build up a reasonably generic code base that allows you to swap in and out different implementations, and reuse previous code to add to new applications.
The examples use live data from the web: sites like eBay, Facebook, and Yahoo Finance. This makes the book more interesting and the results more visceral than some other books on the subject, which use more contrived or obscure examples. Even though there is a strong web (or web 2.0) focus in the examples, the methods and the understanding are useful for a whole range of applications.
Some of the topics covered:
* Bayesian classifiers to detect spam, or to file news articles into site sections
* Hierarchical and k-means clustering to discover groups of similar items in massive sets
* Euclidean distance, Pearson Correlation Coefficient, Tanimoto Coefficient: ways to measure the distance (or difference) between items
* Neural networks to predict user behaviour and improve search result ordering
* Optimisation methods like hill climbing, simulated annealing, and genetic algorithms
* Non-negative matrix factorization
* Support vector machines and kernel methods to go where linear regression can't
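As a rough illustration of the three measures named in the list above, here is a minimal sketch using textbook definitions. It is not taken from the book, and the function names are hypothetical. Note that Pearson and Tanimoto are similarity scores (higher means more alike), while Euclidean distance grows as items differ.

```python
from math import sqrt


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson_correlation(a, b):
    """Pearson correlation in [-1, 1]; returns 0.0 if either vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    denom = sqrt(var_a * var_b)
    return cov / denom if denom else 0.0


def tanimoto_coefficient(a, b):
    """Size of the intersection over size of the union of two item sets."""
    a, b = set(a), set(b)
    union = a | b
    # Convention chosen here: two empty sets score 0.0
    return len(a & b) / len(union) if union else 0.0


print(euclidean_distance([0, 0], [3, 4]))
print(pearson_correlation([1, 2, 3], [2, 4, 6]))
print(tanimoto_coefficient({1, 2, 3}, {2, 3, 4}))
```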
I found it exciting to read -- it's one of those books that give you a whole bunch of new ideas for things to build as you read it. The presentation is very good: no background is assumed, and it doesn't talk down to those more experienced.
Recommended.
Rating: -
Once I got past the initial shock of finding several glaring grammar and spelling errors in the introduction, I was pleased with this purchase.
The author gives a good overview of the many different approaches to machine learning (with great examples in Python). However, it's just that: an overview. While the explanations are very clear and the concepts are presented in a very accessible manner, I found myself having to look elsewhere for more detail on the various algorithms. Yes, with the level of understanding presented in this book you should be able to create functional code for your particular data set. However, I felt that to get the best results from the algorithms I needed to study them further in order to apply them properly to my data.
As a recent CS graduate, I would certainly recommend this book to anyone looking for a basic understanding of machine learning and data mining techniques.
Rating: -
I have just about finished reading this book, and I'm really enjoying it. It's loaded with great information and examples. I like how the author gives the reader tips on when certain techniques are better than others. The Python examples are clear and easy to read. I'd love to see more books follow this one's style and structure.
Rating: -
Toby is a genius! He demystifies "Collective Intelligence" and provides some great examples using the open source Python language. This is one of the best technical books I've ever read, as it makes the sometimes difficult connection between theory and practicality.
This is an excellent place to begin for those who have identified the need for a collective intelligence application but are not sure upon which path to take their first step.
Toby's numerous examples are insightful and amusing - and free! It's now required reading for my growing staff of coders. Looking forward to Toby's follow-up (hint, hint).