This is a video presentation of the paper “**Mining Correlated High-Utility Itemsets Using the bond Measure**” about **correlated high utility pattern mining** using **FCHM**.

More information about the **FCHM algorithm** is provided in this research paper:

Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., Dinh, T., Le, B. (2018). Mining Correlated High-Utility Itemsets Using Various Correlation Measures. Logic Journal of the IGPL, Oxford Academic, to appear.

The source code of FCHM and datasets are available in the SPMF software.

I will post videos about other high utility itemset mining algorithms in the near future, so stay tuned!

**Philippe Fournier-Viger** is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 150 algorithms for pattern mining.

Hello,

Thank you for your informative explanation, I have a question :

How can I calculate the accuracy of a high utility pattern mining algorithm after generating the high utility patterns?

Hi,

thanks for reading, and welcome to the blog!

In high utility itemset mining, there is no concept of accuracy. A high utility itemset mining algorithm takes as input a database and a minimum utility threshold, and outputs the set of all high utility itemsets. In other words, there is only one correct solution for each high utility itemset mining task, and most algorithms for high utility itemset mining will always find that solution because they are “exact algorithms”. An exact algorithm is one that always finds exactly the desired result. Because of that, there is no point in measuring the accuracy. In fact, the goal of high utility itemset mining is to discover some knowledge to understand the data. The user sets the minimum utility threshold and finds the patterns. That’s all.
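To make the input and output concrete, here is a minimal sketch in Python. It is a naive enumeration for illustration only, not how FCHM or the SPMF implementations work, and the toy database, the function names and the threshold are all made up for this example.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (for example quantity x unit profit).
DATABASE = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 2},
    {"a": 5, "b": 2, "d": 8},
]


def utility(itemset, database):
    """Sum the utility of `itemset` over the transactions containing all its items."""
    return sum(
        sum(transaction[item] for item in itemset)
        for transaction in database
        if all(item in transaction for item in itemset)
    )


def mine_high_utility_itemsets(database, min_utility):
    """Try every possible itemset and keep those with utility >= min_utility."""
    items = sorted({item for transaction in database for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, database)
            if u >= min_utility:
                result[itemset] = u
    return result


high_utility_itemsets = mine_high_utility_itemsets(DATABASE, min_utility=20)
print(high_utility_itemsets)  # {('a',): 20, ('a', 'c'): 22}
```

For a given database and threshold, any exact algorithm must return this same set. The enumeration above tries every combination of items, so it only works on tiny examples.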

This is different from other data mining tasks like classification. In classification, one can use a model like a neural network to make some predictions. Then you can calculate the percentage of correct predictions to get the accuracy. But in high utility itemset mining, the goal is not to make predictions.

If you used the high utility itemsets to make some predictions, such as product recommendations, then in that case you could calculate some accuracy. It is possible to do that, but it would be another step AFTER doing high utility itemset mining, and then you would have to define how to make these predictions.

By the way, there also exist some high utility itemset mining algorithms that are “approximate” (they do not guarantee to find the correct result). This is the case, for example, of algorithms based on genetic algorithms, particle swarm optimization (PSO), etc. For those algorithms, you could calculate how many patterns are missing from their output to calculate some accuracy.
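One simple way to do this, sketched below, is to compute the fraction of the exact result that the approximate algorithm found (often called recall). The function name and the two small pattern lists are hypothetical and only serve as an illustration.

```python
def recall(exact_patterns, approximate_patterns):
    """Fraction of the exact patterns that also appear in the approximate output."""
    exact = {frozenset(p) for p in exact_patterns}
    found = {frozenset(p) for p in approximate_patterns}
    if not exact:
        return 1.0  # nothing to find, so nothing is missing
    return len(exact & found) / len(exact)


# Hypothetical outputs: the exact algorithm finds 4 patterns,
# the approximate one misses {"b", "d"}.
exact_output = [{"a"}, {"a", "c"}, {"b", "d"}, {"c"}]
approximate_output = [{"a"}, {"c", "a"}, {"c"}]

score = recall(exact_output, approximate_output)
print(score)  # 0.75
```

Itemsets are converted to `frozenset` so that the order of items inside a pattern does not matter when comparing the two outputs.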

Hope that this is clear.

Best regards,

Philippe

OK, I have another question :

Is there a difference between a high utility pattern and a high utility association rule? Or do we just generate the rules from HUPs?

Hi again,

I first need to say that “pattern” is a general word that means something that you can find in the data. A “pattern” can be a high utility itemset, a high utility sequential pattern, a high utility association rule, etc. In other words, there are many types of patterns.

Now, if you want to know the difference between an itemset and an association rule, then yes, there is a difference. An itemset {A,B,C} means that A, B and C appear together (e.g. are purchased together by a customer). An association rule is a bit different. It has the form A –> B, which means, for example, that if you buy A then you are likely to also buy B. Usually, the strength of a rule is evaluated using some special measures like the confidence and the lift.
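As a small illustration, the sketch below computes the confidence and the lift of a rule A –> B using their usual definitions. The five transactions and the function names are made up for the example.

```python
# Hypothetical transactions (each one is a set of purchased items).
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "D"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confidence(antecedent, consequent, transactions):
    """Among the transactions containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent, transactions) / count(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the relative support of the consequent."""
    n = len(transactions)
    both = count(antecedent | consequent, transactions)
    return both * n / (count(antecedent, transactions) * count(consequent, transactions))


conf_ab = confidence({"A"}, {"B"}, TRANSACTIONS)
lift_ab = lift({"A"}, {"B"}, TRANSACTIONS)
print(conf_ab, lift_ab)  # 0.75 0.9375
```

Here A appears in 4 transactions and A with B in 3, so the confidence is 3/4. The lift is slightly below 1, meaning that in this toy data buying A does not make B more likely than it already is overall.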

And generally, association rules are generated from the itemsets. Typically, an algorithm to find association rules will first find the itemsets, and then use these itemsets to generate the rules. So this is usually a two step process (finding the itemsets, and then using the itemsets to generate the rules).
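The two steps can be sketched as follows for ordinary frequent association rules (not high utility ones). This is a naive illustration with hypothetical function names, data and thresholds, and it is not the code of any specific algorithm.

```python
from itertools import combinations

# Hypothetical transactions (each one is a set of purchased items).
TRANSACTIONS = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "D"},
]


def find_frequent_itemsets(transactions, min_count):
    """Step 1: count every itemset and keep those appearing in at least min_count transactions."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            c = sum(1 for t in transactions if candidate <= t)
            if c >= min_count:
                frequent[candidate] = c
    return frequent


def generate_rules(frequent, min_confidence):
    """Step 2: split each itemset into antecedent -> consequent and keep the confident rules."""
    rules = []
    for itemset, c in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is also frequent,
                # so its count is available in `frequent`.
                conf = c / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules


frequent_itemsets = find_frequent_itemsets(TRANSACTIONS, min_count=2)
rules = generate_rules(frequent_itemsets, min_confidence=0.7)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), conf)
```

With these thresholds, only A –> B and B –> A are kept, each with a confidence of 0.75.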

But there are also some algorithms for association rule mining that will directly generate the rules without generating the itemsets. This is also possible but less common.

Most of the studies on association rule mining are designed to find frequent association rules rather than high utility association rules. I have only seen a few papers on high utility association rules. If you also want to consider time, there is also a variation called high utility sequential rules.

Best regards,

So, is high utility association rule mining a better research topic because it has fewer papers and therefore more opportunities for research, or is it wrong to see it that way?

Yes, you can see it that way. I mean, to choose a topic, the most important thing is to consider whether it is useful, and then whether you can do something new. High utility association rule mining is useful, I think. No problem about that. And since there are not so many papers, you can certainly do something in that area. So yes, I think it is a good topic.

Hello,

Where can I find the presentation please ?

Hi,

This is the direct link to watch the video: http://www.philippe-fournier-viger.com/spmf/videos/FCHM_correlated_itemsets.mp4

This is the PDF of the article presented in that video : http://www.philippe-fournier-viger.com/spmf/FCHM_long.pdf

And you can find the software, source code and datasets in the SPMF software: http://www.philippe-fournier-viger.com/spmf/

Best regards,

Philippe

Sorry but this link is not working:

http://www.philippe-fournier viger.com/spmf/HAIS201GV9Jm2u7rmsCe65wKzPTw5jtS38n2tVEGi_13pages.pdf

I see, thanks for letting me know. The correct link is:

http://www.philippe-fournier-viger.com/spmf/FCHM_long.pdf

Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., Dinh, T., Le, B. (2018) Mining Correlated High-Utility Itemsets Using Various Correlation Measures. Logic Journal of the IGPL, Oxford Academic, to appear.

DOI: 10.1093/jigpal/jzz068