Tag Archives: data science

An Introduction to Data Mining

In this blog post, I will introduce the topic of data mining. The goal is to give a general overview of what data mining is. What is data mining? Data mining is a field of research that emerged in the 1990s, … Continue reading

Posted in Big data, Data Mining, Data science | 23 Comments

Five recent books on pattern mining

In this blog post, I will list a few interesting and recent books on the topic of pattern mining (discovering interesting patterns in data). The list mainly covers books from the last five years. High-Utility Pattern Mining: Theory, Algorithms and Applications (2019). This … Continue reading

Posted in Big data, Data Mining, Pattern Mining, Utility Mining

The PAKDD 2020 conference (a brief report)

In this report, I will talk about the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2020), held from the 11th to the 14th of May 2020. The PAKDD conference: PAKDD is a top international conference on data mining and big data in the Pacific-Asia region. … Continue reading

Posted in Conference, Data Mining, Data science | 3 Comments

(Video) Sequence prediction with the CPT and CPT+ Models

Today, I present the CPT and CPT+ sequence prediction models in a video. Sequence prediction is an important task in data mining that consists of predicting the next symbols of a sequence. It can be used, for example, to predict the next word that someone will … Continue reading

Posted in artificial intelligence, Big data, Data Mining, Data science

Correlation does not imply causation

There is a well-known principle in statistics that correlation does not imply causation. It means that even if we observe that two variables behave in the same way, we should not conclude that the behavior of one of those variables … Continue reading

Posted in Big data, Data Mining, Data science

(video) Mining Frequent Itemsets with the Apriori algorithm

This is a video presentation of the Apriori algorithm for discovering frequent itemsets in data. Frequent itemset mining is one of the most popular data mining tasks. VIDEO LINK: https://www.philippe-fournier-viger.com/spmf/videos/apriori.mp4 The Java source code of the Apriori algorithm and datasets for evaluating its performance are available in the SPMF software. If you want … Continue reading

Posted in Data Mining, Data science, Pattern Mining, Video

Interview with Prof. Rage Uday Kiran about Data Mining

Today, I have the pleasure of interviewing Rage Uday Kiran, a researcher at the National Institute of Informatics in Tokyo, Japan. R. Uday Kiran is an Indian researcher who has been working in Japan for several years. He has been active mainly in the field of data mining, and … Continue reading

Posted in Data Mining, Data science, Interview, Pattern Mining

The best data mining mailing lists (for researchers)

Today, I will list a few useful mailing lists related to data mining and big data. Subscribing to these mailing lists is useful for PhD students and researchers, as many jobs, conferences, special issues and other opportunities are advertised on them. It is … Continue reading

Posted in Big data, Data Mining, Data science

Analyzing the source code of SPMF (5 years later)

Five years ago, I analyzed the source code of the SPMF data mining software using an open-source tool called CodeAnalyzer ( http://sourceforge.net/projects/codeanalyze-gpl/ ). This provided some interesting insights into the structure of the project, especially in terms of lines of code and code to … Continue reading

Posted in Data Mining, Data science, open-source, spmf

(video) Mining Sequential Rules with RuleGrowth

This is a video presentation of the paper “Mining Partially-Ordered Sequential Rules Common to Multiple Sequences” about discovering sequential rules in sequences using the RuleGrowth algorithm. VIDEO LINK: https://www.philippe-fournier-viger.com/spmf/videos/rulegrowth.mp4 More information about the RuleGrowth algorithm is provided in this research paper: Fournier-Viger, P., Wu, C.-W., Tseng, V.S., Cao, L., Nkambou, R. (2015). Mining Partially-Ordered Sequential Rules Common to Multiple … Continue reading

Posted in Big data, Data Mining, Pattern Mining