A Model for Football Pass Prediction (source code + dataset)

In this blog post, I will discuss the data challenge of the Machine Learning for Sport Analytics workshop (MLSA 2018) at PKDD 2018. The challenge consisted of predicting the receivers of football passes (pass prediction). I will first briefly describe … Continue reading

PAKDD 2018 Conference (a brief report)

In this blog post, I will discuss the PAKDD 2018 conference (Pacific-Asia Conference on Knowledge Discovery and Data Mining), held in Melbourne, Australia, from June 3 to June 6, 2018. About the PAKDD conference: PAKDD is an important conference … Continue reading


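The Industrial Design Research Center of Harbin Institute of Technology (Shenzhen) is recruiting two postdoctoral researchers to work on data mining / big data. Requirements: a PhD in computer science; a strong research background in data mining or artificial intelligence; papers published in excellent conferences or journals in data mining or artificial intelligence; a strong interest in the development and application of data mining algorithms; a PhD from a 211/985 university or an excellent foreign university is preferred. The successful applicants will: work on topics related to time series and spatial sequences, or on other theory or industrial applications related to data mining (the exact topic will be decided after discussion, according to the applicant's strengths); join an excellent research team led by Prof. Philippe Fournier-Viger, the founder of the popular SPMF data mining library, who collaborates closely with excellent researchers in other fields; work in a well-equipped laboratory (high-end workstations, a server cluster for big data research, GPU servers, virtual reality equipment, body sensors, etc.); be hired for two years at an annual salary of 176,000 RMB (of which 51,600 comes from the university and 120,000 from the Shenzhen government). Please note that postdoctoral researchers do not pay any tax on this salary, and the university provides a low-priced rental apartment (about 1,500/month, which greatly reduces housing costs). Work at one of the top 50 universities worldwide in computer science, and one of the top 10 universities in China. Work in Shenzhen, one of the fastest-growing cities in southeast China, with low pollution, a warm climate all year round, and close proximity to Hong Kong. If you are interested in this position, please send your detailed CV (including lists of publications and references) as soon as possible to Prof. Philippe Fournier-Viger (philfv8@yahoo.com). Applications are accepted for postdoctoral positions in 2018 or 2019. … Continue reading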

On the Completeness of the CloSpan and IncSpan algorithms

In this blog post, I will briefly discuss the fact that the popular CloSpan algorithm for frequent sequential pattern mining is an incomplete algorithm.  This means that in some special situations, CloSpan does not produce the expected results that it has been designed for, and … Continue reading

On the correctness of the FSMS algorithm for frequent subgraph mining

In this blog post, I will explain why the FSMS algorithm for frequent subgraph mining is an incorrect algorithm. I am publishing this blog post because I found that the algorithm is incorrect after spending a few days to … Continue reading

How to discover interesting patterns in data?

Discovering interesting patterns in data is often referred to as data mining, data science or big data. In the last few years, I have written several blog posts providing an introduction to data mining and to key topics in data mining: An Introduction to … Continue reading

This is why you should visualize your data!

In the data science and data mining communities, many practitioners apply various algorithms to data without attempting to visualize it. This is a big mistake because visualizing the data often greatly helps to understand it. Some phenomena are obvious … Continue reading