Tag Archives: data

Correlation does not imply causation

There is a well-known principle in statistics that correlation does not imply causation. It means that even if we observe that two variables behave in the same way, we should not conclude that the behavior of one of those variables … Continue reading

Posted in Big data, Data Mining, Data science

The best data mining mailing lists (for researchers)

Today, I will list a few useful mailing lists related to data mining and big data. Subscribing to these mailing lists is useful for PhD students and researchers, as many jobs, conferences, special issues and other opportunities are advertised on them. It is … Continue reading

Posted in Big data, Data Mining, Data science

(video) Minimal Correlated High Utility Itemsets with FCHM

This is a video presentation of the paper “Mining Correlated High-Utility Itemsets Using the bond Measure” about correlated high utility pattern mining using FCHM. VIDEO LINK: http://www.philippe-fournier-viger.com/spmf/videos/FCHM_correlated_itemsets.mp4 More information about the FCHM algorithm is provided in this research paper: Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., … Continue reading

Posted in Big data, Data Mining, Data science, Video

Introduction to frequent subranking mining

Rankings are made in many fields, as we naturally tend to rank objects, persons or things in different contexts. For example, in a singing or sports competition, some judges will rank participants from worst to best and give prizes to … Continue reading

Posted in Big data, Data Mining, Data science, Pattern Mining

Skills needed for a data scientist? (comments on the HBR article)

Recently, I read an article on the Harvard Business Review (HBR) website about data science skills for businesses. This article proposes to categorize skills related to data on a 2×2 matrix where skills are labelled as useful vs. not useful, and … Continue reading

Posted in Big data, Data science

Periodic patterns in Web log time series

Recently, I analysed visitor trends on this blog and made two observations. First, there are about 500 to 1000 visitors per day. For this, I want to thank you all for reading and commenting on the blog. Second, if we … Continue reading

Posted in Big data, Data Mining, Data science, Time series

Report about the DEXA 2018 and DAWAK 2018 conferences

This week, I am attending the DEXA 2018 (29th International Conference on Database and Expert Systems Applications) and the DAWAK 2018 (20th International Conference on Data Warehousing and Knowledge Discovery) conferences from the 3rd to the 6th of September in Regensburg, Germany. These two conferences are well-established European conferences dedicated mainly to research on databases and data mining. They are always co-located. … Continue reading

Posted in Big data, Conference

A Model for Football Pass Prediction (source code + dataset)

In this blog post, I will discuss the data challenge of the Machine Learning for Sport Analytics workshop (MLSA 2018) at PKDD 2018. The challenge consisted of predicting the receivers of football passes (pass prediction). I will first briefly describe the data and then … Continue reading

Posted in Data Mining, Data science, Video

This is why you should visualize your data!

In the data science and data mining communities, several practitioners apply various algorithms to data without attempting to visualize it. This is a big mistake because visualizing the data sometimes greatly helps to understand it. Some phenomena are obvious when visualizing the data. In this blog post, I will give a few … Continue reading

Posted in Big data, Data Mining, Data science