Author Archives: Philippe Fournier-Viger
Subgraph mining datasets
In this post, I will provide links to standard benchmark datasets that can be used for frequent subgraph mining. Moreover, I will provide a set of small graph datasets that can be used for debugging subgraph mining algorithms. The format of graph datasets: A graph dataset is a text … Continue reading
Posted in Big data, Data Mining
Tagged data mining, dataset, frequent subgraph, graph, subgraph
On the Completeness of the CloSpan and IncSpan algorithms
In this blog post, I will briefly discuss the fact that the popular CloSpan algorithm for frequent sequential pattern mining is an incomplete algorithm. This means that in some special situations, CloSpan does not produce the results that it was designed to produce, and in particular some patterns are … Continue reading
Posted in Data Mining, Pattern Mining
Tagged clospan, frequent pattern, incspan, pattern mining, sequential pattern
10 ways of becoming more efficient at doing research
Today, I will discuss 10 ways of becoming more efficient at doing research. This is an important topic for any researcher who wishes to be more productive. For example, one may want to be able to publish … Continue reading
On the correctness of the FSMS algorithm for frequent subgraph mining
In this blog post, I will explain why the FSMS algorithm for frequent subgraph mining is an incorrect algorithm. I am publishing this blog post because I found that the algorithm is incorrect after spending a few days implementing it in 2017, and I wish to save other researchers time … Continue reading
Posted in Big data, Data Mining, Pattern Mining
Tagged algorithm, correctness, data mining, pattern mining, subgraph mining
2 Comments
IEEE and its language polishing service
Many researchers are not native English speakers but need to write research papers in English, as it is the common language for sharing ideas with other researchers worldwide. Some papers are very well-written, others are not so well-written but are still readable, … Continue reading
Brief report about the WICON 2017 conference
This weekend, I attended the WICON 2017 conference in Tianjin, China, to present a research paper about the application of data mining to analyze data from water meters installed in the City of Moncton, Canada. In this post, I will give a brief overview of the WICON 2017 conference. About the conference: This … Continue reading
Introduction to the Apriori algorithm (with Java code)
This blog post provides an introduction to the Apriori algorithm, a classic data mining algorithm for the problem of frequent itemset mining. Although Apriori was introduced in 1993, more than 20 years ago, it remains one of the most important data mining algorithms, not because it is the fastest, but because it has … Continue reading
Posted in Big data, Data Mining, Pattern Mining, Programming
Tagged apriori, code, frequent itemset, frequent pattern, itemset, java, pattern mining
12 Comments
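As a rough illustration of the frequent itemset mining problem that the Apriori post above introduces, here is a minimal Java sketch of a level-wise Apriori search. It is not the code from the post or from SPMF. The class name AprioriSketch, the method frequentItemsets, the minSupport parameter (an absolute transaction count) and the toy grocery transactions are all assumptions made for this example, and the sketch favors readability over speed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class; not the implementation from the post or from SPMF.
public class AprioriSketch {

    /** Returns each itemset found in at least minSupport transactions, with its support count. */
    public static Map<Set<String>, Integer> frequentItemsets(
            List<Set<String>> transactions, int minSupport) {
        Map<Set<String>, Integer> result = new HashMap<>();

        // Level 1: every distinct item is a candidate itemset of size 1.
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> transaction : transactions) {
            for (String item : transaction) {
                Set<String> single = new HashSet<>();
                single.add(item);
                candidates.add(single);
            }
        }

        while (!candidates.isEmpty()) {
            // Count the support of each candidate and keep the frequent ones.
            Map<Set<String>, Integer> frequent = new HashMap<>();
            for (Set<String> candidate : candidates) {
                int support = 0;
                for (Set<String> transaction : transactions) {
                    if (transaction.containsAll(candidate)) {
                        support++;
                    }
                }
                if (support >= minSupport) {
                    frequent.put(candidate, support);
                }
            }
            result.putAll(frequent);

            // Join pairs of frequent k-itemsets into candidates of size k + 1.
            List<Set<String>> level = new ArrayList<>(frequent.keySet());
            Set<Set<String>> next = new HashSet<>();
            for (int i = 0; i < level.size(); i++) {
                for (int j = i + 1; j < level.size(); j++) {
                    Set<String> union = new HashSet<>(level.get(i));
                    union.addAll(level.get(j));
                    if (union.size() == level.get(i).size() + 1
                            && allSubsetsFrequent(union, frequent.keySet())) {
                        next.add(union);
                    }
                }
            }
            candidates = next;
        }
        return result;
    }

    // Apriori pruning: a candidate can only be frequent if all of its
    // subsets with one item removed are frequent.
    private static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> frequent) {
        for (String item : candidate) {
            Set<String> subset = new HashSet<>(candidate);
            subset.remove(item);
            if (!frequent.contains(subset)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Toy transaction database, invented for this example.
        List<Set<String>> transactions = List.of(
                Set.of("bread", "milk"),
                Set.of("bread", "butter"),
                Set.of("bread", "milk", "butter"),
                Set.of("milk"));
        frequentItemsets(transactions, 2).forEach(
                (itemset, support) -> System.out.println(itemset + " support=" + support));
    }
}
```

The pruning step relies on the property that every subset of a frequent itemset must itself be frequent, which is what lets Apriori discard many candidates without scanning the transactions for them.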
Do not link to impact factors, they will censor you!
On July 20, 2017, I received an e-mail from a company called Clarivate Analytics Trademark Enforcement ( legal@ip-clarivateanalytics.com ) about copyright infringement for the Journal Citation Reports, a product by Thomson Reuters. They contacted me because, a few years ago, I created a webpage that … Continue reading
How to publish in top conferences/journals? (Part 2) – The opportunity cost of research
Many researchers wish to produce high-quality papers and have a great research impact. But how? In a previous blog post, I discussed how the “blue ocean strategy” can be applied to publish in top conferences and journals. In this blog post, I will discuss another important concept for producing … Continue reading
Posted in Academia, Research
Tagged academia, articles, papers, publications, Research, researcher
The PAKDD 2017 conference (a brief report)
This week, I attended the PAKDD 2017 conference on Jeju Island, South Korea, from May 23 to 26. PAKDD is the top data mining conference for the Asia-Pacific region. It is held every year in a different Asia-Pacific country. In this blog post, I will write a brief report about … Continue reading
Posted in Academia, Conference, Data Mining, Data science
Tagged asia, big data, conference, data mining, data science, korea, pakdd
4 Comments