If you are not familiar with FP-trees, the FP-tree is a data structure used in the field of data mining by the classical FP-Growth algorithm to discover frequent itemsets. The FP-tree is a type of prefix-tree (trie) structure, in which transactions are inserted.
The new webpage allows drawing FP-trees by entering a set of transactions in a text area, choosing a minimum support threshold and then clicking a button to draw the tree. There are also some other options to customize the result such as some checkboxes to decide whether to draw the header table and node links of the fp-tree or not.
Now, let met me show you the user interface of this tool:
The interface is very straightforward. To use this tool, we must first input a set of transactions using the SPMF format. Thus, we must write one transaction per line, where each transactions is a list of integers (items) separated by spaces. For example, lets say that I enter the following three transactions:
1 2 3 5
2 3 4
1 4 5
and set minsup = 1 transaction, and then if I click the “Draw FP-Tree” button, the result is:
The meaning is as follows:
This type of picture displays the header table, node-links and fp-tree. The webpage also provides the option of only drawing the tree:
Or just drawing the tree and header table:
Let me now show you are more complex example, for another set of transactions:
1 2 3 5
2 3 4
1 4 5
5 6 7 8 9
6 7 8 9
If I set minsup = 1 transaction, the following fp-tree is drawn:
Then, if we raise the minsup threshold to 3 transactions, a smaller fp-tree is obtained:
And if we increase to minsup = 4 transactions, we obtain this fp-tree:
If you want to see more possibility, you can try it! And if you find some bugs, you may send me an e-mail to let me know. It is possible that there is still some bugs for some special cases.
By the way, in another blog post, I have also shown how to draw FP-trees using Latex.
Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 250 data mining algorithms.