Tag Archives: dataset
Visualizing the item frequency distribution of pattern mining datasets
In this blog post, I will explain a quick and easy way of visualizing the frequency distribution of items in a dataset in SPMF format for pattern mining. To do this, we will use a new online tool that I … Continue reading
Posted in Pattern Mining, spmf
Tagged association rule, association rule mining, data, dataset, density, frequency, frequency distribution, itemset, pattern mining, spmf format, support
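As a rough offline illustration of the idea in the post above (not the online tool it describes), here is a minimal Python sketch that counts item frequencies and prints a text histogram. It assumes the plain transaction layout, where each line is one transaction of space-separated item identifiers and lines starting with `#`, `%` or `@` are comments or metadata. The function names `item_frequencies` and `text_histogram` and the sample data are hypothetical.

```python
from collections import Counter


def item_frequencies(lines):
    """Count in how many transactions each item appears (its support)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Assumption: blank lines and lines starting with '#', '%' or '@'
        # carry no transaction data and are skipped.
        if not line or line[0] in "#%@":
            continue
        # A set ensures an item repeated within a transaction counts once.
        counts.update(set(line.split()))
    return counts


def text_histogram(counts, width=40):
    """Return one text bar per item, most frequent item first."""
    if not counts:
        return []
    top = max(counts.values())
    rows = []
    for item, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "*" * max(1, round(width * n / top))
        rows.append(f"{item:>6} {n:>5} {bar}")
    return rows


# Hypothetical toy dataset: four transactions over items 1 to 5.
sample = ["1 3 4", "2 3 5", "1 2 3 5", "2 5"]
freq = item_frequencies(sample)
for row in text_histogram(freq):
    print(row)
```

To try it on a real file, the lines of the file can be passed in place of `sample`, for example `item_frequencies(open(path))` where `path` is the location of the dataset.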
New version of SPMF (2.44): 4 new algorithms, datasets and features
Today, I am happy to announce that a new version of the SPMF open-source data mining software has been released (v. 2.44). This is the download page. This new version was made possible thanks to several contributors. What is new? New … Continue reading
Posted in Data Mining, Data science, open-source, Pattern Mining, spmf, Utility Mining
Tagged algorithm, data mining, data science, dataset, open source, open-source, pattern mining, spmf
Datasets of 30 English novels for pattern mining and text mining
Today, I want to announce that I have just made public datasets of 30 English novels by 10 authors of the 19th century. These datasets can be used for testing algorithms for sequential pattern mining, sequential rule mining, as well as for some text … Continue reading
Subgraph mining datasets
In this post, I will provide links to standard benchmark datasets that can be used for frequent subgraph mining. Moreover, I will provide a set of small graph datasets that can be used for debugging subgraph mining algorithms. The format of graph datasets A graph dataset is a text … Continue reading
Posted in Big data, Data Mining
Tagged data mining, dataset, frequent subgraph, graph, subgraph
How to encourage data mining researchers to share their source code and datasets?
A few months ago, I wrote a popular post on this blog about "why it is important to publish source code and datasets for researchers". I explained several advantages that researchers can get by sharing the source code of … Continue reading
Posted in Data Mining, Research
Tagged data mining, dataset, open-source, source code