Author Archives: Philippe Fournier-Viger

How to run SPMF without installing Java?

SPMF is a popular open-source data mining software for discovering patterns in data and for performing other data mining tasks. Typically, Java must be installed on a computer to run SPMF. However, it is possible to run SPMF on a computer that does not have Java installed. For example, … Continue reading

Posted in Data Mining, Data science, open-source, Pattern Mining, Research, spmf

How to compare two LaTeX documents? (using LatexDiff)

Many researchers use LaTeX instead of Microsoft Word to write research papers. I previously wrote a blog post about the reasons for using LaTeX to write research papers. Today, I will go into more detail about LaTeX and discuss a nice tool for comparing two LaTeX files to see the differences. The goal … Continue reading

Posted in Academia, Latex, Research | 1 Comment

Subgraph mining datasets

In this post, I will provide links to standard benchmark datasets that can be used for frequent subgraph mining. Moreover, I will provide a set of small graph datasets that can be used for debugging subgraph mining algorithms. The format of graph datasets: A graph dataset is a text … Continue reading

Posted in Big data, Data Mining

On the Completeness of the CloSpan and IncSpan algorithms

In this blog post, I will briefly discuss the fact that the popular CloSpan algorithm for frequent sequential pattern mining is an incomplete algorithm. This means that in some special situations, CloSpan does not produce the results that it was designed to produce, and in particular some patterns are … Continue reading

Posted in Data Mining, Pattern Mining

10 ways of becoming more efficient at doing research

Today, I will discuss 10 ways of becoming more efficient at doing research. This is an important topic for any researcher who wishes to be more productive. For example, one may want to be able to publish … Continue reading

Posted in Academia, General, Research

On the correctness of the FSMS algorithm for frequent subgraph mining

In this blog post, I will explain why the FSMS algorithm for frequent subgraph mining is an incorrect algorithm. I am publishing this blog post because I found that the algorithm is incorrect after spending a few days implementing it in 2017, and I wish to save time for other researchers … Continue reading

Posted in Big data, Data Mining, Pattern Mining | 2 Comments

IEEE and its language polishing service

Many researchers are not native English speakers but need to write research papers in English, as it is the common language for sharing ideas with other researchers worldwide. Some papers are very well-written, others are not so well-written but are still readable, … Continue reading

Posted in Academia, Research

Brief report about the WICON 2017 conference

This weekend, I attended the WICON 2017 conference in Tianjin, China, to present a research paper about applying data mining to analyze data from water meters installed in the City of Moncton, Canada. In this post, I will give a brief overview of the WICON 2017 conference. About the conference: This … Continue reading

Posted in Conference

Introduction to the Apriori algorithm (with Java code)

This blog post provides an introduction to the Apriori algorithm, a classic data mining algorithm for the problem of frequent itemset mining. Although Apriori was introduced in 1993, more than 20 years ago, it remains one of the most important data mining algorithms, not because it is the fastest, but because it has … Continue reading

Posted in Big data, Data Mining, Pattern Mining, Programming | 12 Comments

Do not link to impact factors, they will censor you!

On July 20, 2017, I received an e-mail from a company called Clarivate Analytics Trademark Enforcement (legal@ip-clarivateanalytics.com) about copyright infringement for the Journal Citation Reports, a product by Thomson Reuters. They contacted me because a few years ago I created a webpage that … Continue reading

Posted in Academia, Research