Datasets of 30 English novels for pattern mining and text mining

Today, I want to announce that I have just made public datasets of 30 English novels by 10 authors of the XIX century. These datasets can be used for testing algorithms for sequential pattern mining, sequential rule mining, … Continue reading

(video) Identifying Stable Periodic Frequent Patterns using SPP-Growth

Today, I present a video about finding stable periodic patterns in data, and discuss a new algorithm named SPP-Growth for this task. The SPP-Growth algorithm and datasets for evaluating its performance are available in the SPMF software, which is open-source and programmed in … Continue reading

The SPMF data mining library v.2.40 is released!

Hi all, I am pleased to announce that a new version of SPMF has just been published (v 2.40). It contains 9 novel algorithms: the HUIM-ABC algorithm for mining high utility itemsets using Artificial Bee Colony Optimization (thanks to Wei Song and Chaoming Huang for … Continue reading

(video) Mining Frequent Itemsets with the Apriori algorithm

This is a video presentation of the Apriori algorithm for discovering frequent itemsets in data. Frequent itemset mining is one of the most popular data mining tasks. The Java source code of the Apriori algorithm and datasets for evaluating … Continue reading

Introduction to clustering: the K-Means algorithm (with Java code)

In this blog post, I will introduce the popular data mining task of clustering (also called cluster analysis). I will explain what the goal of clustering is, and then introduce the popular K-Means algorithm with an example. Moreover, I will briefly explain how an open-source Java implementation of … Continue reading

Introduction to time series mining with SPMF

This blog post briefly explains how time series data mining can be performed with the Java open-source data mining library SPMF (v.2.06). It first explains what a time series is and then discusses how data mining can be performed on time series. What is … Continue reading