25 years of pattern mining

It is 2019, and 25 years have already passed since Agrawal wrote his seminal papers on frequent itemset mining and association rule mining in 1994. Since then, thousands of papers have been published on this topic: some about algorithm design, some about new pattern mining problems, and others about applications in a multitude of fields. And there are still many research issues to work on!

After all these years, it is a good time to look back at what has been achieved and gain a new perspective. This is what I did recently with colleagues in a survey paper called “Frequent Itemset Mining: a 25 Years Review”. If you are interested in frequent pattern mining, I encourage you to read the paper, as it makes some interesting observations. For example, it shows that some ideas used in recent algorithms for mining patterns in big data can be traced back to some of the early algorithms. Here is a picture from the paper showing a timeline of key algorithms and events in frequent pattern mining:

What will be the future of pattern mining? You can read my blog post about the future of pattern mining to learn more!

That is all I wanted to write for today!

Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 150 data mining algorithms.



25 years of pattern mining — 2 Comments

  1. Thanks for introducing this review paper. As an expert on pattern mining, what do you think is the next phase of this field, and what about its future?

    • Hi Dang, Thanks for your comment! 🙂

      I think that there are many challenges. Currently, I think that one of the biggest problems in many papers is that the pattern mining problems they study are too simple, and some are unrealistic (they have no applications and are not even applied to real data in the papers). Besides, too many researchers focus only on performance. While performance is important, I think the most important thing is to focus on what the user needs. Thus, I see the future as:
      – Treating more complex types of data (e.g. dynamic attributed graphs), that may be more suitable to real applications
      – Finding more complex types of patterns (by considering time, etc.)
      – Having more constraints (because the user may need constraints in practical applications to filter patterns)
      – Integrating the concept of statistical significance in pattern mining to avoid finding spurious patterns that only appear by chance (this is a good topic, and there have been some good papers about it in recent years)
      – Designing more interactive systems for exploring and visualizing the data

      Personally, my view when starting a new pattern mining problem is to always think about applications first. Would it be useful to learn something about the data? Can I have some real data to show that my new problem is useful?


