Category Archives: Programming

The SPMF data mining library: a brief history and what’s next?

In this blog post, I will talk about the well-known open-source library of data mining algorithms implemented in Java, which I am the founder of. I will give a brief overview of its history, discuss some lessons learned from the development of … Continue reading

Posted in Data Mining, Programming, Research | Leave a comment

How to test if a data mining algorithm implementation is correct?

In this blog post, I will discuss how to check if a data mining algorithm implementation is correct and complete. This is a very important topic for researchers who are implementing data mining algorithms since an incorrect implementation may generate unexpected results. … Continue reading

Posted in Data Mining, Programming, Research, Uncategorized | 3 Comments

Drawing a set-enumeration tree using Java and GraphViz

In this blog post, I will explain and provide source code to automatically draw the set-enumeration tree of a set using Java and GraphViz. Drawing a set-enumeration tree is useful in computer science, for example in frequent itemset mining, a subfield of data … Continue reading

Posted in Data Mining, Mathematics, Programming, Research | Leave a comment

Big Problems only found in Big Data?

Today, I will discuss Big Data, which is a very popular topic nowadays. Its popularity can be seen, for example, in universities. Many universities are currently searching for professors who do research on “big data”. Moreover, … Continue reading

Posted in artificial intelligence, Data Mining, General, Programming | Leave a comment

Discovering and visualizing sequential patterns in web log data using SPMF and GraphViz

Today, I will show how to use the open-source SPMF data mining software to discover sequential patterns in web log data. Then, I will show how to visualize the frequent sequential patterns found using GraphViz. Step 1: getting the … Continue reading

Posted in Data Mining, Programming | 8 Comments

Why data mining researchers should evaluate their algorithms against state-of-the-art algorithms?

A common problem in research on data mining is that researchers proposing new data mining algorithms often do not compare the performance of their new algorithm with the current state-of-the-art data mining algorithms. For example, let me illustrate this … Continue reading

Posted in Data Mining, Programming, Research | 4 Comments

How to measure the memory usage of data mining algorithms in Java?

Today, I will discuss the topic of accurately evaluating the memory usage of data mining algorithms in Java. I will share several problems that I have discovered with memory measurements in Java for data miners and strategies to avoid these … Continue reading

Posted in Data Mining, Programming, Research | 1 Comment

What are the steps to implement a data mining algorithm?

In this post, I will discuss the steps that I follow to implement a data mining algorithm. The subject of this post comes from a question that I received by e-mail recently, and I think that it … Continue reading

Posted in Data Mining, Programming | 40 Comments

Choosing data structures according to what you want to do

Today, I am writing a post about programming. I want to share a simple but important idea for writing optimized code. The idea is to choose data structures according to what you want to do instead of what you want to … Continue reading

Posted in Data Mining, Programming | Leave a comment

Analyzing the source code of the SPMF data mining software

Hi everyone, in this blog post, I will discuss how I have applied an open-source tool named Code Analyzer (http://sourceforge.net/projects/codeanalyze-gpl/) to analyze the source code of my open-source data mining software named SPMF. I have applied … Continue reading

Posted in Data Mining, Programming | 1 Comment