I am often asked which books are good for learning data mining. In this blog post, I answer this question by discussing some of the **top data mining books** for **learning data mining** and **data science** from a computer science perspective. These books are especially recommended for those who want to learn how to design data mining algorithms, understand the main algorithms, and explore some more advanced topics.

1. **Introduction to Data Mining** by Tan, Steinbach & Kumar (2006)

This is a very good introductory book on data mining that I enjoyed reading. It discusses all the main topics of data mining: clustering, classification, pattern mining and outlier detection. Moreover, it contains two very good chapters on clustering by Tan & Kumar, who are specialists in this domain. What I like about this book is that the chapters explain the techniques in enough detail to give a good understanding of the techniques and their drawbacks, unlike some other books that do not go into details. Some free sample chapters of the book can be found here. Before buying this book, note that a 3rd edition has been announced for release soon, although it has been delayed for more than a year.

2. **Data Mining: Concepts and Techniques, Third Edition** by Han, Kamber & Pei (2013)

This is another great book that I like, and I have also used it for teaching data mining. Like the previous book, it covers all the main topics that a good data mining course should cover. However, this book is more like an encyclopedia: it covers a lot of topics and gives a very broad view of the field, but does not cover each topic in much detail. It is also designed for a computer science audience and is written by some top data mining researchers (Han & Pei).

3. **Data Mining and Analysis: Fundamental Concepts and Algorithms** by Zaki & Meira (2014)

This is another great data mining book written by a leading researcher in the field (Zaki). It also targets computer scientists. This book covers all the main topics of data mining but also has some very interesting chapters on advanced topics such as graph mining. A version of the book for personal use only is offered freely here. The algorithms in this book are described in enough detail that it is possible to implement them by reading the book. In general, a few algorithms are presented in each chapter. They are not always the best algorithms but are often the most popular ones (the classical algorithms).

4. **Data Mining: The Textbook** by Aggarwal (2015)

This is probably one of the top data mining books for computer scientists that I have read recently. It covers the basic topics of data mining as well as many recent and advanced topics such as time series, graph mining and social network mining, which are not covered in several other books. Moreover, being a very recent book, it is very up to date. It is also written by a top data mining researcher (C. Aggarwal).

5. **The Elements of Statistical Learning** by Friedman et al. (2009)

This is a quite popular book that is a little more focused on statistics. It covers many data mining techniques such as neural networks, association rule mining, SVM, regression and clustering, among other topics. What is interesting about this book is that, like the others, it is a top book used in many university courses, and it can also be downloaded for free here.

**Conclusion**

In this blog post, I have discussed some of the top books for learning data mining algorithms for computer scientists. I have tried to discuss general books that give a good foundation for learning data mining and that can also be interesting for advanced topics. However, note that if one is interested in specific topics such as recommender systems and text mining, there are also specialized books that cover only these topics in detail and that may also be interesting.

==

That is all I wanted to write for now. If you like this blog, you can tweet about it and/or subscribe to my twitter account **@philfv** to get notified about new posts.

**Philippe Fournier-Viger** is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 52 data mining algorithms.

Hello sir!

How can I improve the efficiency of the Apriori algorithm?

Which algorithms can be used to reduce its scanning time?

There are different ways to improve Apriori, such as implementing it with a hash tree or using transaction ID lists (tid lists) stored as bit vectors. However, if you care about efficiency, you should use an algorithm such as FP-Growth, which is much faster than Apriori. Apriori is an old algorithm that is very slow, so it is better to use another algorithm than to optimize Apriori.
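To make the tid-list idea more concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: the function names `build_bitsets` and `frequent_itemsets` are hypothetical, each bit vector is stored as a plain Python integer, and the search is a simplified level-wise join that omits Apriori's subset-pruning step. The point it shows is that, once each item has a bit vector of the transactions containing it, the support of an itemset can be obtained by ANDing bit vectors and counting the set bits, without scanning the database again.

```python
from itertools import combinations


def build_bitsets(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            bitsets[item] = bitsets.get(item, 0) | (1 << tid)
    return bitsets


def support(bits):
    """Number of transactions in a bit vector (count of set bits)."""
    return bin(bits).count("1")


def frequent_itemsets(transactions, minsup):
    """Level-wise search where support is computed by ANDing bit vectors.

    Simplified sketch: candidates are built by joining two frequent
    k-itemsets that share their first k-1 items; no subset pruning is done.
    """
    bitsets = build_bitsets(transactions)
    # Level 1: frequent single items, stored as 1-tuples.
    level = {(item,): bits for item, bits in bitsets.items()
             if support(bits) >= minsup}
    result = {}
    while level:
        for itemset, bits in level.items():
            result[itemset] = support(bits)
        next_level = {}
        for a, b in combinations(sorted(level), 2):
            if a[:-1] == b[:-1]:  # same prefix, different last item
                bits = level[a] & level[b]  # tids containing all items
                if support(bits) >= minsup:
                    next_level[a + (b[-1],)] = bits
        level = next_level
    return result


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, sup in sorted(frequent_itemsets(db, minsup=3).items()):
        print(itemset, sup)
```

The database is read only once, in `build_bitsets`. After that, every support count is a bitwise AND followed by a bit count, which is the main reason this representation reduces scanning time compared with re-reading all transactions for each candidate.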

Thank you for sharing your opinions on data mining books. I totally agree that they are really useful and enjoyable to read. I read the book by Han, Kamber & Pei when I started doing my research on data mining. This book provides a good introduction and covers many data mining topics. Recently, I read the book by Zaki, and I really love it because it is easy to follow and I can learn the mathematical background used by the algorithms.

I hope that you will continue writing many more posts about data mining and machine learning.

Hello Sir, I want a data mining book best suited for civil engineering applications, for example forecasting, prediction and modelling.

hello sir

I’m a research scholar and my topic is the impact of fast food, studied using classification algorithms. Can you please suggest some data mining tools that are relevant to my research?

Top 5 Data Mining Books for Computer Scientists

I wonder if these are still the top 5 books, or is there a new book that should be considered?

Hi, that is a good question. The books that I recommended give a good overall foundation in data mining, and I still use them for teaching. Older books like the one by Tan & Kumar may now be starting to feel a bit outdated. But I still like some explanations in that book for clustering and pattern mining, and thus use it in conjunction with the newer books that cover these areas in less depth.

But if one wants to go deeper into some topics, it is good to look at the newer specialized books. For example, although the above books talk about classification and various techniques like SVM and neural networks, they may not talk about some hot topics such as deep neural networks. For such newer topics, there are some good specialized books like “Deep Learning” by Ian Goodfellow et al. Similarly, someone who is more interested in natural language processing should look for books on that specialized topic.

So overall, I think the above books are a good start for a computer scientist, and I would complement them with some newer books for specialized topics!

Best regards,

Philippe

Thanks so much for your informative response.