
Today, I am announcing the release of version 2.65 of the SPMF data mining library and software. This version brings several improvements, including eight new algorithms, several optimizations, new user interface tools for data analysis, and some tools for data processing. The details of this new version can be found on the download page of SPMF. Here is a brief overview.
Eight new algorithms:
- The LinearTable algorithm for mining frequent itemsets, which can work especially well when the number of items is relatively small. This algorithm has very low memory usage in some cases (Lu et al., 2023)
- The SAM algorithm for mining frequent itemsets (Borgelt et al., 2009)
- The TM algorithm for mining frequent itemsets (Song et al., 2006)
- The NEWCHARM algorithm for mining frequent closed itemsets (Ye et al., 2015)
- The DBVMiner algorithm for mining frequent closed itemsets (Vo et al., 2012)
- The FTARM algorithm for top-k association rule mining, which is a variation of ETARM with additional strategies (Liu et al., 2019)
- The ETARM algorithm for top-k association rule mining, which is a variation of TopKRules with additional pruning strategies (Nguyen et al., 2017)
- The AprioriTID_HD algorithm, a modification of AprioriTID for better performance (thanks to Harshil Damania for proposing this improvement)
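
For readers new to these tasks, here is a small Java sketch of what "frequent" and "closed" itemsets mean. It is not the code of any algorithm listed above and it does not use the SPMF API: the class `FrequentItemsetSketch` and its methods are hypothetical names chosen for illustration. It assumes a tiny in-memory database, explores itemsets depth-first by intersecting sets of transaction ids (the general idea behind vertical algorithms such as Eclat), and checks closedness naively.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: hypothetical class, not part of SPMF.
final class FrequentItemsetSketch {

    /** Returns each itemset contained in at least minSupport transactions, with its support. */
    static Map<List<Integer>, Integer> mine(int[][] transactions, int minSupport) {
        // Vertical representation: item -> ids of the transactions that contain it.
        TreeMap<Integer, BitSet> tidsets = new TreeMap<>();
        for (int tid = 0; tid < transactions.length; tid++) {
            for (int item : transactions[tid]) {
                tidsets.computeIfAbsent(item, k -> new BitSet()).set(tid);
            }
        }
        // Keep only the frequent single items, in ascending item order.
        List<Integer> items = new ArrayList<>();
        List<BitSet> sets = new ArrayList<>();
        for (Map.Entry<Integer, BitSet> e : tidsets.entrySet()) {
            if (e.getValue().cardinality() >= minSupport) {
                items.add(e.getKey());
                sets.add(e.getValue());
            }
        }
        Map<List<Integer>, Integer> result = new LinkedHashMap<>();
        extend(new ArrayList<>(), null, items, sets, 0, minSupport, result);
        return result;
    }

    // Depth-first extension of the current prefix with each later item.
    private static void extend(List<Integer> prefix, BitSet prefixTids,
                               List<Integer> items, List<BitSet> sets, int start,
                               int minSupport, Map<List<Integer>, Integer> result) {
        for (int i = start; i < items.size(); i++) {
            BitSet tids = (BitSet) sets.get(i).clone();
            if (prefixTids != null) {
                tids.and(prefixTids); // transactions containing prefix AND the new item
            }
            int support = tids.cardinality();
            if (support < minSupport) {
                continue; // no superset of an infrequent itemset can be frequent
            }
            prefix.add(items.get(i));
            result.put(new ArrayList<>(prefix), support);
            extend(prefix, tids, items, sets, i + 1, minSupport, result);
            prefix.remove(prefix.size() - 1);
        }
    }

    /** A frequent itemset is closed if no proper superset has the same support. */
    static boolean isClosed(List<Integer> itemset, Map<List<Integer>, Integer> frequent) {
        int support = frequent.get(itemset);
        for (Map.Entry<List<Integer>, Integer> e : frequent.entrySet()) {
            if (e.getValue() == support
                    && e.getKey().size() > itemset.size()
                    && e.getKey().containsAll(itemset)) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `FrequentItemsetSketch.mine(db, 2)` on a database of four transactions such as `{1, 2, 3}`, `{1, 2}`, `{2, 3}`, `{1, 2, 3}` returns seven frequent itemsets, of which only some are closed. The quadratic closedness check is fine for a demonstration but far too slow for real datasets, which is the kind of cost that dedicated closed itemset mining algorithms are designed to avoid.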
Performance improvements
I have added several optimizations to improve the performance of algorithms such as Apriori, AprioriClose, Eclat, Relim, AprioriInverse, AprioriRare, AprioriTopK, dEclat, Charm, dCharm, TopKRules, TopKClassRules, etc. In some cases, algorithms run several times faster and use considerably less memory.
New user interface tools
One new user interface tool is the Itemset-Item Matrix Viewer, which lets users visualize the relationships and similarities between itemsets discovered by itemset mining algorithms. Here is a screenshot:

There is also a new Item Co-Occurrence HeatMap Viewer to visualize co-occurrences between items in transaction databases. For example, here is a visualization of the co-occurrences of the top 20 most frequent items in the Chess dataset:

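The numbers behind such a heat map are simply pairwise counts: for each pair of items, how many transactions contain both. The Java sketch below shows one way to compute them. It is not the viewer's actual code and does not call the SPMF API: `CoOccurrenceSketch` and its methods are hypothetical. It assumes input lines in the usual SPMF transaction format, that is, integer items separated by spaces with one transaction per line, and it assumes that lines starting with `#`, `%` or `@` are comments or metadata to be skipped.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only: hypothetical class, not part of SPMF.
final class CoOccurrenceSketch {

    /** Builds a symmetric table: counts.get(a).get(b) = number of transactions containing both a and b. */
    static Map<Integer, Map<Integer, Integer>> count(List<String> lines) {
        Map<Integer, Map<Integer, Integer>> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            char first = trimmed.charAt(0);
            if (first == '#' || first == '%' || first == '@') {
                continue; // assumed comment or metadata line
            }
            // A set ensures that a repeated item is counted once per transaction.
            TreeSet<Integer> items = new TreeSet<>();
            for (String token : trimmed.split("\\s+")) {
                items.add(Integer.parseInt(token));
            }
            for (int a : items) {
                for (int b : items) {
                    if (a != b) {
                        counts.computeIfAbsent(a, k -> new TreeMap<>())
                              .merge(b, 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    /** Looks up one cell of the table, returning 0 for pairs that never co-occur. */
    static int get(Map<Integer, Map<Integer, Integer>> counts, int a, int b) {
        Map<Integer, Integer> row = counts.get(a);
        if (row == null) {
            return 0;
        }
        return row.getOrDefault(b, 0);
    }
}
```

Each cell of a heat map would then be colored according to `get(counts, a, b)`. For a dense dataset with many items, the table grows quadratically with the number of items, so restricting it to the most frequent items, as in the example above, keeps it readable.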
I have also added panels in the dataset viewers to provide interesting statistics about datasets. For example, for the Transaction dataset viewer:

Bug fixes
I have also fixed various small bugs.
Conclusion
This is just a quick overview of this new version of the SPMF pattern mining software, version 2.65. Thanks again to all users of SPMF and contributors for your support!




