Tag Archives: dataset

Visualizing the item frequency distribution of pattern mining datasets

In this blog post, I will explain a quick and easy way of visualizing the frequency distribution of items in a dataset in SPMF format for pattern mining. To do this, we will use a new online tool that I … Continue reading
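As a rough illustration of what an item frequency distribution is, here is a minimal Python sketch. It is not the online tool mentioned above. It assumes the plain SPMF transaction format (one transaction per line, items separated by single spaces) and treats lines starting with `#`, `%` or `@` as comments or metadata. The function names and the sample data are hypothetical, and other formats (for example, databases with utility values) are not handled.

```python
from collections import Counter


def item_frequencies(lines):
    """Count in how many transactions each item appears."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and lines starting with #, % or @
        # carry no transaction data and are skipped.
        if not line or line[0] in "#%@":
            continue
        # Use a set so that an item is counted once per transaction.
        counts.update(set(line.split()))
    return counts


def print_histogram(counts, width=40):
    """Print items from most to least frequent with a text bar."""
    if not counts:
        return
    top = max(counts.values())
    for item, freq in counts.most_common():
        bar = "*" * max(1, round(width * freq / top))
        print(f"{item:>6} {freq:>6} {bar}")


# Hypothetical sample data standing in for the lines of a dataset file.
sample = ["1 2 3", "1 3", "2 3 4", "# a comment line", "3"]
frequencies = item_frequencies(sample)
print_histogram(frequencies)
```

To try this on a real file, the lines could be read with `open(path)` and passed to `item_frequencies` in place of `sample`.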

Posted in Pattern Mining, spmf

New version of SPMF (2.44): 4 new algorithms, datasets and features

Today, I am happy to announce that a new version of the SPMF open-source data mining software has been released (v. 2.44). This is the download page. This new version was made possible thanks to several contributors. What is new? New … Continue reading

Posted in Data Mining, Data science, open-source, Pattern Mining, spmf, Utility Mining

Datasets of 30 English novels for pattern mining and text mining

Today, I want to announce that I have just made public datasets of 30 English novels by 10 authors of the 19th century. These datasets can be used for testing algorithms for sequential pattern mining, sequential rule mining, as well as for some text … Continue reading

Posted in Data Mining, Data science

Subgraph mining datasets

In this post, I will provide links to standard benchmark datasets that can be used for frequent subgraph mining. Moreover, I will provide a set of small graph datasets that can be used for debugging subgraph mining algorithms. The format of graph datasets: A graph dataset is a text … Continue reading

Posted in Big data, Data Mining

How to encourage data mining researchers to share their source code and datasets?

A few months ago, I wrote a popular post on this blog about why it is important for researchers to publish source code and datasets. I explained several advantages that researchers can get by sharing the source code of … Continue reading

Posted in Data Mining, Research