(video) Mining Frequent Itemsets with the Apriori algorithm

This is a video presentation of the Apriori algorithm for discovering frequent itemsets in data.  The Java source code of the Apriori algorithm and datasets for evaluating its performance are available in the SPMF software. If you want to … Continue reading

Introduction to clustering: the K-Means algorithm (with Java code)

In this blog post, I will introduce the popular data mining task of clustering (also called cluster analysis).  I will explain what is the goal of clustering, and then introduce the popular K-Means algorithm with an example. Moreover, I will briefly explain how an open-source Java implementation of … Continue reading

Introduction to time series mining with SPMF

This blog post briefly explain how time series data mining can be performed with the Java open-source data mining library SPMF (v.2.06).  It first explain what is a time series and then discuss how data mining can be performed on time series. What is … Continue reading

An Introduction to High-Utility Itemset Mining

In this blog post, I will give an introduction about a popular problem in data mining, which is called “high-utility itemset mining” or more generally utility mining. I  will give an overview of this problem, explains why it is interesting, and provide source code of … Continue reading

Discovering and visualizing sequential patterns in web log data using SPMF and GraphViz

Today, I will show how to use the open-source SPMF data mining software to discover sequential patterns in web log data. Then, I will show to how visualize the frequent sequential patterns found, using GraphViz. Step 1 :  getting the … Continue reading