Category Archives: Data Mining

On the Completeness of the CloSpan and IncSpan algorithms

In this blog post, I will briefly discuss why the popular CloSpan algorithm for frequent sequential pattern mining is incomplete. This means that in some special situations, CloSpan does not produce the results that it was designed to produce, and …

Posted in Data Mining, Data science

On the correctness of the FSMS algorithm for frequent subgraph mining

In this blog post, I will explain why the FSMS algorithm for frequent subgraph mining is incorrect. I am publishing this post because I found that the algorithm is incorrect after spending a few days to …

Posted in Big data, Data Mining, Graph mining | 5 Comments

Postdoctoral positions in data mining in Shenzhen, China (apply now)

The CIID research center of the Harbin Institute of Technology (Shenzhen campus, China) is looking to hire two postdoctoral researchers to carry out research on data mining / big data. The applicant must have: a Ph.D. in Computer Science, a strong research background in data mining/big …

Posted in artificial intelligence, Big data, Data Mining, Research | 6 Comments

How to discover interesting patterns in data?

Discovering interesting patterns in data is often referred to as data mining, data science or big data. In the last few years, I have written several blog posts introducing data mining and its key topics: An Introduction to …

Posted in Big data, Data Mining, Data science, Research

Call for chapters: High Utility Pattern Mining, the book

Call for chapters. High-Utility Pattern Mining: Theory, Algorithms and Applications. Editors: Philippe Fournier-Viger, Chun-Wei Lin, Roger Nkambou, Bay Vo. An edited book to be published by Springer in 2018. Introduction: This book will provide an introduction to high utility mining, reviews state-of-the-art …

Posted in Big data, Data Mining, Data science, Research, Utility Mining

Introduction to the Apriori algorithm (with Java code)

This blog post provides an introduction to the Apriori algorithm, a classic data mining algorithm for the problem of frequent itemset mining. Although it was introduced in 1993, more than 20 years ago, Apriori remains one of the most important data mining algorithms, not …

Posted in Big data, Data Mining, Data science, Open-source | 12 Comments

How to publish in top conferences/journals? (Part 2) – The opportunity cost of research

Many researchers wish to produce high-quality papers and have a great research impact. But how? In a previous blog post, I discussed how the “blue ocean strategy” can be applied to publish in top conferences and journals. In this blog …

Posted in Academia, Data Mining, Research | 1 Comment

The PAKDD 2017 conference (a brief report)

This week, I attended the PAKDD 2017 conference in Jeju Island, South Korea, from the 23rd to the 26th of May. PAKDD is the top data mining conference for the Asia-Pacific region. It is held every year in a …

Posted in Academia, Big data, Conference, Data Mining, Data science | 2 Comments

This is why you should visualize your data!

In the data science and data mining communities, several practitioners apply various algorithms to data without attempting to visualize it. This is a big mistake because visualizing the data can greatly help to understand it. Some phenomena are obvious …

Posted in Big data, Data Mining, Data science | 3 Comments

An Introduction to Sequential Pattern Mining

In this blog post, I will give an introduction to sequential pattern mining, an important data mining task with a wide range of applications, from text analysis to market basket analysis. This blog post aims to be a short …

Posted in Big data, Data Mining, Data science | 46 Comments