Tag Archives: data science

This is why you should visualize your data!

In the data science and data mining communities, several practitioners apply various algorithms to data without first attempting to visualize it. This is a big mistake, because visualizing the data can greatly help in understanding it. Some phenomena are obvious … Continue reading

Posted in Big data, Data Mining, Data science | 2 Comments

An Introduction to Sequential Pattern Mining

In this blog post, I will give an introduction to sequential pattern mining, an important data mining task with a wide range of applications, from text analysis to market basket analysis. This blog post aims to be a short … Continue reading

Posted in Big data, Data Mining, Data science | 7 Comments

An introduction to frequent subgraph mining

In this blog post, I will give an introduction to an interesting data mining task called frequent subgraph mining, which consists of discovering interesting patterns in graphs. This task is important since data is naturally represented as a graph in many domains (e.g. … Continue reading

Posted in Big data, Data Mining, Data science | 9 Comments

We are launching a new data mining journal

In this blog post, I will discuss one of my current projects. I have recently been working with my colleague Chun-Wei Lin on launching a new journal, titled “Data Science and Pattern Recognition”. This is a new open-access journal, … Continue reading

Posted in Big data, Data Mining, Data science, Research | 2 Comments

Introduction to clustering: the K-Means algorithm (with Java code)

In this blog post, I will introduce the popular data mining task of clustering (also called cluster analysis). I will explain what the goal of clustering is, and then introduce the popular K-Means algorithm with an example. Moreover, I will briefly explain how an open-source Java implementation of … Continue reading

Posted in Big data, Data Mining, Data science, Open-source | 2 Comments

Introduction to time series mining with SPMF

This blog post briefly explains how time series data mining can be performed with the open-source Java data mining library SPMF (v.2.06). It first explains what a time series is and then discusses how data mining can be performed on time series. What is … Continue reading

Posted in Big data, Data Mining, Open-source, Time series | 2 Comments

Report of the PAKDD 2014 conference (part 3)

This post continues my report on PAKDD 2014 in Tainan (Taiwan). The panel about big data: on Friday, there was a great panel about big data with 7 top researchers from the field of data mining. I will try to faithfully report some … Continue reading

Posted in Academia, Big data, Conference, Data Mining, Data science | 1 Comment

Report of the PAKDD 2014 conference (part 2)

This post will continue my report on PAKDD 2014 in Tainan (Taiwan). About big data: another interesting talk at this conference was given by Jian Pei. The topic was Big Data. Some key ideas in this talk were that to make … Continue reading

Posted in Academia, Big data, Conference, Data Mining, Data science | 2 Comments

Report of the PAKDD 2014 conference (part 1)

I am currently at the PAKDD 2014 conference in Tainan. In this post, I will report interesting information about the conference and the talks that I have attended. Importance of Succinct Data Structures for Data Mining: I have attended a very nice … Continue reading

Posted in Big data, Conference, Data Mining, Data science | 2 Comments