How to find cost-effective patterns in data?

Have you ever wondered how to find patterns in data that are not only frequent but also profitable and cost-effective? For example, if you are an online retailer, you may want to know what products are often bought together by customers and generate high profits but require low costs (such as shipping fees or discounts). Or if you are an educator, you may want to know what learning activities are frequently performed by students and result in high grades but require low efforts (such as time or resources).

In this blog post, I will briefly introduce a new problem called low-cost high utility itemset mining (LCHUIM), which aims to find such patterns. LCHUIM is a generalization of the well-known problem of high utility itemset mining (HUIM), which focuses on finding patterns that have high utilities (benefits) according to a user-defined utility function. However, HUIM ignores the cost associated with these patterns, such as time, money or other resources that are consumed. The novel problem of LCHUIM addresses this limitation by considering both the utility and the cost of patterns.

To be more precise, a low-cost high utility itemset is an itemset (a set of values) that must satisfy three criteria:

  • Its average utility must be no less than a min_utility threshold set by the user.
  • Its average cost must be no greater than a max_cost threshold set by the user.
  • Its support (occurrence frequency) must be no less than a minsup threshold set by the user.

For example, suppose we have a transaction database that contains information about customers’ purchases and their profits and costs. A possible low-cost high utility itemset could be {bread, cheese}, which means that customers often buy bread and cheese together, and this combination generates high profits but requires low costs.
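To make the three criteria concrete, here is a minimal Python sketch that checks them on a toy database. It assumes (this is an illustration, not the LCIM implementation) that each transaction stores a cost for each item and a single utility for the whole transaction. The data, the function names and the threshold values are all hypothetical.

```python
# Hypothetical toy database: each transaction is (item -> cost, utility).
transactions = [
    ({"bread": 1, "cheese": 2, "wine": 6}, 20),
    ({"bread": 1, "cheese": 3}, 12),
    ({"bread": 2, "wine": 7}, 9),
    ({"cheese": 2}, 3),
]

def evaluate(itemset, database):
    """Return (support, average utility, average cost) of an itemset."""
    matches = [(costs, utility) for costs, utility in database
               if set(itemset).issubset(costs)]
    support = len(matches)
    if support == 0:
        return 0, 0.0, 0.0
    # Average utility: mean utility of the transactions containing the itemset.
    avg_utility = sum(utility for _, utility in matches) / support
    # Average cost: mean of the summed item costs in those transactions.
    avg_cost = sum(sum(costs[i] for i in itemset) for costs, _ in matches) / support
    return support, avg_utility, avg_cost

def is_low_cost_high_utility(itemset, database, minsup, min_utility, max_cost):
    """Check the three criteria: support, average utility and average cost."""
    support, avg_utility, avg_cost = evaluate(itemset, database)
    return support >= minsup and avg_utility >= min_utility and avg_cost <= max_cost

print(evaluate({"bread", "cheese"}, transactions))
print(is_low_cost_high_utility({"bread", "cheese"}, transactions,
                               minsup=2, min_utility=15, max_cost=4))
```

With these made-up numbers, {bread, cheese} passes the three tests while {bread, wine} fails on both utility and cost. A real algorithm such as LCIM avoids evaluating every itemset one by one, which is what the pruning techniques described below are for.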

Finding low-cost high utility itemsets can reveal interesting insights for various domains and applications. For instance, online retailers could use LCHUIM to design effective marketing strategies or recommend products to customers. Educators can use LCHUIM to analyze students’ learning behaviors and provide personalized feedback or guidance.

To solve the problem of LCHUIM, an algorithm named LCIM (Low Cost Itemset Miner) was proposed. The algorithm uses a novel lower bound on the average cost called Average Cost Bound (ACB) to reduce the search space of possible patterns. The algorithm also employs several techniques such as prefix-based partitioning, depth-first search and subtree pruning to speed up the mining process.

The LCIM algorithm was published in this paper:

Nawaz, M. S., Fournier-Viger, P., Alhusaini, N., He, Y., Wu, Y. and Bhattacharya, D. (2022). LCIM: Mining Low Cost High Utility Itemsets. Proc. of the 15th Multi-disciplinary International Conference on Artificial Intelligence (MIWAI 2022), pp. 73-85, Springer LNAI.

The code and datasets of LCIM can be found in the SPMF open-source data mining library. SPMF is an efficient open-source data mining library that contains more than 250 algorithms for various pattern mining tasks such as frequent itemset mining, sequential pattern mining, association rule mining and many more.

By the way, the concept of finding cost-effective patterns was also studied for the case of discovering sequential patterns with some algorithms called CorCEPB, CEPN, and CEPB (see this other blog post for more details).

I hope you enjoyed this short blog post and learned something new about low-cost high utility itemset mining. If you have any questions or comments, feel free to leave them below. Thank you for reading!

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 250 algorithms for pattern mining.

Posted in Data Mining, Data science, Pattern Mining, spmf, Utility Mining

What is a Closed Itemset and Why is it Useful?

In this blog post, I will explain in simple terms what a closed itemset is and give some examples. I will also mention a few algorithms for finding closed itemsets, which are available in the SPMF data mining library in Java.

Frequent itemset mining is a popular technique for discovering interesting relationships between items in data represented as a table. The classical application of frequent itemset mining is to analyze data from a transactional database to find correlations between the items that customers purchase.

An itemset is a set of items that appear together in a transaction or a record. For example, if we have a database of customer transactions, an itemset could be {bread, butter, jam}, which means that some customers have bought bread, butter and jam together.

A frequent itemset is an itemset that appears frequently in a database, that is, its support (occurrence frequency) is no less than a minimum threshold (minsup) set by the user. For example, if minsup is set to 10%, then a frequent itemset is an itemset that appears in at least 10% of the transactions made by customers. Frequent itemsets have many applications. In the context of analyzing customer transactions, they can be used for marketing, recommendation systems, or cross-selling.

Though frequent itemset mining has many applications, a problem is that it can generate a very large number of frequent itemsets, especially when the database contains many items and transactions. This can make it difficult for users to analyze and interpret the results, and also consume a lot of memory and computational resources.

To address this problem, one can use the concept of a frequent closed itemset. A frequent closed itemset is a frequent itemset that has no superset with the same support (that is, no superset appearing in the same number of transactions). For example, if {bread, milk} has a support of 15% and {bread, milk, cheese} has a support of 15% as well, then {bread, milk} is not a frequent closed itemset because it has a superset with the same support.

Let’s take an example. Suppose we have a database with four customer transactions, denoted as T1, T2, T3 and T4:

T1: {a,b,c,d}
T2: {a,b,c}
T3: {a,b,d}
T4: {a,b}

where the letters a, b, c, d indicate the purchase of the items apple, bread, cake and dates.

If we set the minimum support threshold to 50% (which means that we want to find itemsets appearing in at least two transactions), a frequent itemset mining algorithm will output the following frequent itemsets:

{a}, {b}, {c}, {d}, {a,b}, {a,c}, {a,d}, {b,c}, {b,d}, {a,b,c}, {a,b,d}

But among these frequent itemsets, only three are frequent closed itemsets:

{a,b}, {a,b,c}, {a,b,d}

The reason is that these itemsets have no supersets with the same support. For example, {a,b} has a support of 100%, and none of its supersets (such as {a,b,c}, {a,b,d} and {a,b,c,d}) has the same support. On the other hand, {a,c} is not closed because it has a superset ({a,b,c}) with the same support (50%). Note that {a,b,c,d} appears only in T1 (a support of 25%), so it is not frequent for this threshold.
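The example can be checked with a few lines of Python. The following is a brute-force sketch written only for illustration: it enumerates every candidate itemset, which is not how efficient algorithms proceed. The variable names are my own.

```python
from itertools import combinations

# The four transactions T1..T4 of the example.
database = [set("abcd"), set("abc"), set("abd"), set("ab")]
min_count = 2  # minsup of 50% on four transactions

def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for transaction in database if itemset <= transaction)

# Enumerate all candidate itemsets and keep the frequent ones.
items = sorted(set().union(*database))
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        count = support(frozenset(combo))
        if count >= min_count:
            frequent[frozenset(combo)] = count

# A frequent itemset is closed if no proper superset has the same support.
# Checking frequent supersets is enough: a superset with the same support
# as a frequent itemset is itself frequent.
closed = {x: s for x, s in frequent.items()
          if not any(x < y and s == sy for y, sy in frequent.items())}

for itemset, count in sorted(closed.items(), key=lambda e: sorted(e[0])):
    print(sorted(itemset), count)
```

Running it lists the frequent closed itemsets of the example together with their support counts.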

Why are closed itemsets useful?

Closed itemsets are important because they can reduce the number of frequent itemsets presented to the user without losing any information. The set of frequent itemsets can be very large and redundant, especially when minsup is low or the database is dense. Mining closed itemsets generally yields a much smaller set of itemsets, from which all the other frequent itemsets can still be derived. In other words, any frequent itemset can be obtained from a closed itemset by removing some items. For example, we can obtain {a} from {a,b} by removing b, or we can obtain {b,d} from {a,b,d} by removing a.
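The support of each frequent itemset can be recovered as well: it is the largest support among the closed itemsets that contain it. Here is a small Python sketch of this idea, using the frequent closed itemsets obtained from the four transactions above at a minsup of 50% (support given as a number of transactions). The function name is hypothetical.

```python
# Frequent closed itemsets of the four-transaction example (minsup = 50%),
# with their support counts.
closed = {frozenset("ab"): 4, frozenset("abc"): 2, frozenset("abd"): 2}

def derived_support(itemset, closed_itemsets):
    """Support of an itemset, derived from the closed itemsets only.

    Returns None when no closed itemset contains the itemset, which means
    that the itemset is not frequent.
    """
    supports = [s for c, s in closed_itemsets.items() if itemset <= c]
    return max(supports) if supports else None

print(derived_support(frozenset("a"), closed))
print(derived_support(frozenset("bd"), closed))
print(derived_support(frozenset("cd"), closed))
```

The last call returns None because {c,d} is contained in no frequent closed itemset, so it is not frequent.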

How to find closed itemsets?

There are several algorithms that have been proposed for this task, such as Apriori-Close, CHARM, FP-Close and LCM. These algorithms are based on different strategies, such as pruning, merging, prefix trees or bitmap representations, to efficiently find all closed itemsets without missing any or generating any false positives. Some of these algorithms have also been adapted to mine maximal itemsets, which are closed itemsets that have no frequent supersets at all.

If you want to try these algorithms on your own data, you can use the SPMF data mining library in Java. SPMF is an open-source library that provides very efficient implementations of various data mining algorithms, including those for closed itemset mining. You can download SPMF from its website and follow the documentation and examples to run the algorithms on your data. You can also use SPMF as a command line tool or integrate it with other Java applications, or with programs written in other programming languages such as C#, Python and R (see the website).

A video

If you are interested in this topic, you can also watch a 50-minute video where I explain in more detail what closed itemsets are and their properties, and also cover maximal itemsets and generator itemsets. You can watch it here: Maximal, Closed and Generator itemsets

Hope that this blog post has been interesting 🙂

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 150 algorithms for pattern mining.

Posted in Data Mining, Data science, Pattern Mining

Introducing PSAC-PDB: A Novel Approach to Protein Structure Analysis and Classification

In this blog post, I will give a short overview of our recent research paper about protein structure analysis and classification by my team (M. S. Nawaz, P. Fournier-Viger, Y. He and Q. Zhang).

Proteins are essential molecules for life, as they perform a variety of functions in living organisms, such as catalysis, transport, signaling, defense, and regulation. The structure of proteins determines their function and interactions with other molecules. Therefore, understanding the structure of proteins is crucial for advancing biological and medical research. In biophysics and computational biology, predicting the 3D structure of a protein from its amino acid sequence is an important research problem.

However, protein structure analysis and classification is a challenging task, as proteins have complex and diverse shapes that can be affected by various factors, such as environment, mutations, and interactions. Moreover, the number of protein structures available in databases such as PDB and EMDB is increasing rapidly, making it difficult to manually annotate and compare them.

Thus, we recently published a research paper, which proposes a novel method for protein structure analysis and classification. The paper is titled “PSAC-PDB: Analysis and Classification of Protein Structures” and it will appear in the journal Computers in Biology and Medicine, published by Elsevier. The reference of the paper is:

Nawaz, M. S., Fournier-Viger, P., He, Y., Zhang, Q. (2023). PSAC-PDB: Analysis and Classification of Protein Structures. Computers in Biology and Medicine, Elsevier, Volume 158, May 2023, 106814

The main contribution of the paper is to propose a new framework called PSAC-PDB (Protein Structure Analysis and Classification using Protein Data Bank). A schema of the overall method can be found below:

Briefly, PSAC-PDB is a computational method that uses a protein structure comparison tool (DALI) to find similar protein structures to a query structure in PDB, and then uses amino acid sequences, aligned amino acids, aligned secondary structure elements, and frequent amino acid patterns to perform classification. PSAC-PDB applies eleven classifiers and compares their performance using six evaluation metrics.

PSAC-PDB also uses sequential pattern mining (SPM) to discover frequent amino acid patterns that can improve the classification accuracy. SPM is a data mining technique that finds subsequences that appear frequently in a set of sequences. SPM can capture the sequential ordering and co-occurrence of amino acids in protein sequences, which can reflect their structural and functional properties. Some examples of frequent sequential patterns of amino acids (AA) extracted by the TKS and CM-SPAM algorithms are shown in the table below for three families: the S protein structures of SARS-CoV-2 (SSC2), the S protein structures of other viruses and organisms (SO), and protein (enzyme) structures for others (O).
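To illustrate what "appearing frequently in a set of sequences" means, here is a minimal Python sketch that counts the support of a sequential pattern, that is, the number of sequences in which the symbols of the pattern occur in the same order, possibly with gaps. The three amino acid strings are made up for this illustration and are not taken from the paper's datasets. Algorithms such as TKS and CM-SPAM search for the patterns themselves, which this sketch does not do.

```python
def contains(sequence, pattern):
    """True if pattern is a subsequence of sequence (gaps are allowed)."""
    remaining = iter(sequence)
    # Each membership test consumes the iterator up to the matching symbol,
    # so the symbols of the pattern must be found in order.
    return all(symbol in remaining for symbol in pattern)

def support(pattern, sequences):
    """Number of sequences that contain the pattern."""
    return sum(1 for sequence in sequences if contains(sequence, pattern))

# Hypothetical amino acid sequences (one letter per amino acid).
sequences = ["MFVFLVLLPLVS", "MKVLLALPLV", "MFIFLLFLTL"]

print(support("MFL", sequences))
print(support("LPL", sequences))
```

A pattern is then called frequent if its support is no less than a threshold chosen by the user.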

PSAC-PDB was tested on the case study of SARS-CoV-2 spike protein structures, which are responsible for the entry of the virus into host cells. PSAC-PDB finds 388 similar protein structures to the query structure in PDB, and divides them into three families: S protein structures of SARS-CoV-2, S protein structures of other viruses and organisms, and other protein structures. PSAC-PDB uses four datasets based on amino acid sequences, aligned amino acids, aligned secondary structure elements, and frequent amino acid patterns for classification.

Results have shown that PSAC-PDB achieves higher accuracy, precision, recall, F1-score, MCC and AUC values for all three families of protein structures than state-of-the-art approaches for genome sequence classification. Here are some of the results showing this:

But what is also interesting is that PSAC-PDB shows that using frequent amino acid patterns or aligned amino acids can improve the classification performance compared to using only amino acid sequences or aligned secondary structure elements. Thus, PSAC-PDB can benefit the research community in structural biology and bioinformatics.

For more information, please see the paper. Besides, the datasets from this paper can also be found on GitHub, and implementations of the sequential pattern mining algorithms used in the paper can be found in the SPMF data mining software.

Posted in Uncategorized

DSSBA 2023, 2nd Special Session on Data Science for Social and Behavioral Analytics @DSAA

I am excited to announce that the 2nd Special Session on Data Science for Social and Behavioral Analytics (DSSBA 2023) will be held at the 10th IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA 2023) from October 9-13, 2023.

DSSBA 2023 is a forum for researchers and practitioners to present and discuss the latest advances on the design of efficient, scalable and effective solutions for analyzing social and behavioral data. Social and behavioral data are collected from various sources such as social networks, e-learning systems, e-commerce platforms, health care systems, etc. Analyzing these data can help us gain a better understanding of human behavior and social interactions, which can support decision making, personalized recommendation, behavior prediction, behavior change, etc.

The topics of interest for DSSBA 2023 include, but are not limited to:

  • Efficient and scalable algorithms for behavioral and social analytics
  • Evaluation of behavioral analytic models
  • Privacy-preserving techniques for behavioral and social analytics
  • Cognitive and social aspects of behavior analysis
  • Intelligent systems and services powered by behavioral and social data models
  • Applications of behavioral and social analytics in various domains such as education, health care, business, etc.
  • Pattern mining and machine learning models, etc.

We invite you to submit your original research papers to DSSBA 2023 by June 1, 2023. The submission guidelines and more information can be found on the DSSBA 2023 website. The accepted papers will be published in the DSAA 2023 proceedings.

We look forward to receiving your submissions and seeing you at DSSBA 2023!

Posted in cfp, Conference, Data Mining, Data science

Discovering the Top-K Stable Periodic Patterns in a Sequence of Events

In this blog post, I will give a brief introduction to the TSPIN paper about how to find stable periodic patterns in a sequence of events. This algorithm was presented in this research paper:

Fournier-Viger, P., Wang, Y., Yang, P., Lin, J. C.-W., Yun, U. (2021). TSPIN: Mining Top-k Stable Periodic Patterns. Applied Intelligence.

What is a periodic pattern? Periodic patterns are sets of values or symbols that appear repeatedly in a sequence of events at more or less regular intervals. For example, in a customer transaction database, a periodic pattern may indicate that some customers buy milk every week or bread every two days. Discovering periodic patterns can thus help to understand customer behavior, optimize inventory management, forecast sales, and so on. For example, here is an illustration of a periodic pattern {a,c} that appears every two hours in a sequence of events, where a is for apple and c is for cake:

The idea of finding periodic patterns is interesting as it can provide insights about the data. However, traditional algorithms for periodic pattern mining have two important limitations:

First, they use a very strict definition of periodicity that requires a pattern to always appear again within a fixed time limit. As a result, some meaningful patterns may be discarded if they are slightly irregular. For example, if a user wants to find periodic patterns that appear weekly, a traditional algorithm will discard a pattern if it does not appear for a week even though it may appear weekly for the rest of the year.

Second, several algorithms use a minimum support threshold to filter out infrequent patterns. This parameter is useful but generally users don’t know how to set it to find patterns and thus adjust it by trial and error to find a suitable value.

To overcome these limitations, the TSPIN (Top-k Stable Periodic pattern mINer) algorithm was designed. It combines two key ideas:

1) Stability: A measure of stability is used in TSPIN to evaluate how consistent a pattern is in terms of its periodicity. A stable pattern has relatively small variations in its occurrence intervals, while an unstable pattern has large variations. TSPIN finds stable periodic patterns but it is designed to be less strict than traditional algorithms so as to also find patterns that are slightly irregular.

2) Top-k: A way of ranking patterns based on their frequency and selecting only the k most frequent ones. This avoids the need to specify a minimum support threshold.
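These two ideas can be illustrated with a short Python sketch. It is a simplified illustration and not the TSPIN algorithm: it assumes that the timestamps of each pattern are already known, and it measures instability by accumulating how much each period exceeds a maximum period max_per (a value I call "lability" here, never allowed to go below zero). The exact definitions used by TSPIN are in the paper. The data, function names and parameter values are hypothetical.

```python
import heapq

def periods(timestamps, sequence_length):
    """Gaps between consecutive occurrences, counting the start and end of the sequence."""
    points = [0] + sorted(timestamps) + [sequence_length]
    return [later - earlier for earlier, later in zip(points, points[1:])]

def max_lability(timestamps, sequence_length, max_per):
    """Largest accumulated excess of the periods over max_per."""
    lability = worst = 0
    for period in periods(timestamps, sequence_length):
        # The excess accumulates, but a regular period can bring it back down.
        lability = max(0, lability + period - max_per)
        worst = max(worst, lability)
    return worst

def top_k_stable(occurrences, sequence_length, max_per, max_la, k):
    """The k most frequent patterns whose maximum lability is at most max_la."""
    stable = [(len(ts), pattern) for pattern, ts in occurrences.items()
              if max_lability(ts, sequence_length, max_per) <= max_la]
    return heapq.nlargest(k, stable)

# Hypothetical occurrence timestamps (in hours) in a sequence of 14 hours.
occurrences = {
    "a,c": [2, 4, 6, 9, 11, 13],  # every two hours, with one small delay
    "b": [1, 8, 14],              # very irregular
    "d": [3, 5, 7, 9, 11],        # regular, but starts and ends late
}

print(top_k_stable(occurrences, sequence_length=14, max_per=2, max_la=2, k=2))
```

In this toy example, the pattern {a,c} is kept although one of its periods exceeds two hours, while the irregular pattern b is discarded. The user only chooses k instead of a minimum support threshold.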

The TSPIN algorithm is an efficient algorithm that integrates several optimizations, which are explained in the TSPIN research paper. The paper also describes several experiments on synthetic and real datasets. The results show that TSPIN has good performance and can reveal interesting patterns in real-life data.

There are several potential applications of TSPIN. For example, applying TSPIN on a dataset of web clickstream data may reveal that some users visit certain websites periodically with high stability, such as news portals or social media platforms. This information could then be used for personalized recommendation or advertising.

To try TSPIN, the code and datasets are available in the SPMF pattern mining library, which is a popular Java software for pattern mining. That library also offers implementations of a dozen other algorithms for discovering periodic patterns and hundreds of algorithms to find other types of patterns.


In conclusion, TSPIN is an innovative algorithm for mining top-k stable periodic patterns in discrete sequences. It addresses some limitations of traditional algorithms by using a more flexible definition of periodicity and avoiding parameter tuning. It can discover meaningful patterns that reflect regular behaviors or habits in various domains.

Hope this blog post has been interesting. If you like this topic, you may also check my list of key papers on periodic pattern mining on this blog.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 150 algorithms for pattern mining.

Posted in Big data, Data Mining, Pattern Mining, spmf

Data disasters and more…

Today, I will not talk about a serious topic. Recently, I have been experimenting a little bit with generative AI tools for pictures during my free time. Some services can generate some pretty nice pictures based on text prompts. For fun, I have combined the concept of “data” with different types of disasters such as “volcano”, “tsunami”, “thunderstorm” and other concepts. Here are a few generated pictures.

Data tsunami

Data volcano

Data thunderstorm

Data tornado

Data mountain

Data vortex

Data waterfall


Actually, it did not take much time to generate these pictures and it is possible to generate many other variations by using the same prompts. But if you use the above pictures as they are, please give credit to this blog and link to this page.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 250 algorithms for pattern mining.

Posted in Data Mining, Data science, General, Machine Learning

UDML 2023 workshop!

I’m excited to announce that I am co-organizing a new edition of the Utility-Driven Mining and Learning (UDML) workshop, which will be colocated with the IEEE International Conference on Data Mining (ICDM) 2023 in Shanghai, China.

UDML 2023 at ICDM 2023

The UDML workshop aims to bring together researchers and practitioners who are interested in developing and applying utility-driven methods for data mining and machine learning. Utility-driven methods are those that consider not only the accuracy or interestingness of the results, but also their usefulness or value for specific applications or users. For example, utility-driven methods can take into account user preferences, constraints, costs, benefits, risks, or trade-offs when mining or learning from data.

The UDML workshop will feature an invited talk by a leading expert in the field, as well as contributed papers that showcase novel research ideas, methods, systems, applications, and challenges related to utility-driven mining and learning. The workshop will also provide a platform for networking and discussion among researchers and practitioners who share a common interest in this topic.

If you are working on or interested in utility-driven mining and learning, I invite you to submit your paper or poster to the UDML workshop by September 1st 2023. The submission guidelines and topics of interest can be found on the workshop website.

I also encourage you to attend the UDML workshop on December 4th 2023 at ICDM 2023 in Shanghai. You will have the opportunity to learn from the presentation, interact with the authors of the accepted papers and posters, and network with other participants.

I hope to see you at UDML 2023!

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 250 algorithms for pattern mining.

Posted in Big data, Conference, Data Mining, Data science, Pattern Mining

How to Detect and Classify Metamorphic Malware with Sequential Pattern Mining (MalSPM)

Malware are malicious software that can harm computers and networks by stealing data, encrypting files, or damaging devices. Malware are a serious threat to cybersecurity, especially when they can change their appearance to evade detection by antivirus software. This is called metamorphic malware, and it is a challenging problem for malware analysis and classification.

In this blog post, I will describe a new method called MalSPM (Metamorphic Malware Behavior Analysis and Classification using Sequential Pattern Mining) that can detect and classify metamorphic malware based on their behavior during execution. The MalSPM method was presented in a research paper that you can read for more details:

Nawaz, M. S., Fournier-Viger, P., Nawaz, M. Z., Chen, G., Wu, Y. (2022). MalSPM: Metamorphic Malware Behavior Analysis and Classification using Sequential Pattern Mining. Computers & Security, Elsevier, to appear.

I will now explain what are the main features of metamorphic malware, how MalSPM analyzes them using sequential pattern mining (SPM), and what are the advantages of using MalSPM.

What are metamorphic malware?

Metamorphic malware are malware that can modify their code or structure without changing their functionality. This means that they can produce different variants of themselves that look different but behave the same. For example, a metamorphic virus can change its encryption algorithm or insert junk code into its body to avoid being recognized by signature-based antivirus software.

Metamorphic malware pose a serious challenge for malware detection and classification because they can bypass static analysis techniques that rely on code similarity or predefined patterns. Therefore, dynamic analysis techniques that monitor the behavior of malware during execution are more suitable for dealing with metamorphic malware.

How does MalSPM analyze metamorphic malware?

MalSPM is a method that uses sequential pattern mining to analyze and classify metamorphic malware based on their behavior during execution. SPM is a data mining task that consists of finding frequent subsequences in a dataset of sequences. In the case of MalSPM, SPM was applied to a dataset that contains sequences of API calls made by different malware on the Windows operating system (OS). This makes it possible to extract patterns representing the characteristics of different families of malware. API calls are functions provided by the OS that allow applications to perform various tasks such as accessing files, creating processes, or sending network packets. API calls are an attractive and distinguishable feature for malware analysis and detection because they can reflect the actions of executable files.

MalSPM first applies SPM algorithms to find patterns indicating frequent API calls in the dataset. These patterns can be of different types such as sequential rules between API calls as well as maximal and closed sequences of API calls. These patterns represent common behaviors of different types of malware such as ransomware, trojan, and worm. For example, here are a few sequential patterns that were extracted by MalSPM from the dataset of malware API calls:

Each line in this table is a pattern. For example, the first line indicates that a frequent pattern for a malware is to call the API NtClose, followed by NtQueryvalueKey, and then followed by NtClose, and that this pattern appears in 919 malware sequences.

Then, after extracting the patterns, MalSPM uses them for the classification of different malware. This is done by using the discovered patterns as features to train classifiers. In this paper, the performance of seven classifiers was compared using various metrics. Moreover, the performance of MalSPM was compared with state-of-the-art malware detection methods and it was found that MalSPM outperformed these methods.
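The idea of using patterns as features can be sketched in Python as follows. This is only an illustration under my own assumptions and not the code of MalSPM: each malware sample is turned into a binary vector that indicates which patterns occur in its sequence of API calls. The first pattern is the one quoted above, while the second pattern and the three traces are hypothetical.

```python
def contains(trace, pattern):
    """True if the API calls of the pattern appear in the trace in the same order."""
    calls = iter(trace)
    return all(call in calls for call in pattern)

def to_features(trace, patterns):
    """Binary feature vector: one entry per pattern (1 = the pattern occurs)."""
    return [int(contains(trace, pattern)) for pattern in patterns]

patterns = [
    ("NtClose", "NtQueryvalueKey", "NtClose"),
    ("NtCreateFile", "NtWriteFile"),  # hypothetical pattern
]

# Hypothetical API call traces of three samples.
traces = {
    "sample_1": ["NtOpenKey", "NtClose", "NtQueryvalueKey", "NtClose"],
    "sample_2": ["NtCreateFile", "NtWriteFile", "NtClose"],
    "sample_3": ["NtClose", "NtCreateFile", "NtQueryvalueKey", "NtWriteFile", "NtClose"],
}

features = {name: to_features(trace, patterns) for name, trace in traces.items()}
for name, vector in features.items():
    print(name, vector)
```

Such vectors, together with the known family of each sample, can then be given to a standard classifier for training.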

Here is a picture that illustrates the overall process of malware detection using MalSPM.

What are the benefits of using MalSPM?

MalSPM has several benefits for malware detection and classification.

First, it can handle metamorphic malware that can change their appearance by focusing on their behavior rather than their code.

Second, it can discover common and specific behaviors of different types of malware by using SPM techniques.

Third, it can achieve high accuracy and efficiency by using effective pruning strategies and database projection.

Fourth, it can provide interpretable results by using sequential rules and patterns that can explain the logic behind the classification.

Code and datasets

For more details, please see the research paper of MalSPM. The datasets can be found here. And if you want to try the algorithms for extracting sequential patterns from that paper, please see the SPMF data mining software, which offers very fast implementations of those algorithms.


In conclusion, MalSPM is a novel method that uses sequential pattern mining to analyze and classify metamorphic malware based on their behavior during execution. It can deal with the challenges posed by metamorphic malware and provide useful insights for cybersecurity researchers and practitioners. For more details, please see the research paper.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 150 algorithms for pattern mining.

Posted in Data Mining, Industry, Pattern Mining, spmf

How to propose a special issue for a journal?

Today, I will talk about how to propose a special issue for a journal.

What is a special issue?

A special issue is a collection of articles on a specific topic or theme that is published in a journal. Special issues can provide an opportunity to present the latest research, highlight emerging trends, or address gaps in the literature. If you have an idea for a special issue, you may want to apply to be a guest editor and organize it.

How to organize a special issue?

You can follow these steps:

1. Select a journal that is relevant to your topic and has a good reputation or is good according to other criteria that matter to you (e.g., a high impact factor, high citation rate, etc.). You should choose a journal that is within the scope of your field and reaches a large audience. You can consult the websites of big publishers to search for journals in your field and compare their metrics and scope.

2. Consult the journal’s website for the guidelines and requirements for special issue proposals. Some journals may have a specific format, template, or online submission system for proposals. Others may ask you to email your proposal to the editor-in-chief or a designated contact person.

3. Prepare a proposal that outlines the rationale, objectives, scope, and expected outcomes of your special issue. You should also include a tentative title, a list of potential topics and keywords, a timeline for submission and publication, and a brief introduction of yourself and any co-editors. You should also demonstrate the significance, relevance, and timeliness of your special issue, and how it will benefit the readers and the field. You may also need to provide some sample papers or abstracts to demonstrate the quality and relevance of your special issue.

4. Submit your proposal to the journal and wait for their response. The journal editors will review your proposal and decide whether to accept, reject, or request revisions. They may also ask you to provide more information, such as the names and affiliations of potential authors and reviewers, or a detailed budget and funding sources.

5. If your proposal is accepted, you will be responsible for managing the editorial process of your special issue. This includes inviting authors, soliciting submissions, coordinating peer review, editing and proofreading manuscripts, and ensuring that the special issue meets the journal’s standards and deadlines. You will also need to communicate with the journal editors and staff, and follow their instructions and policies.

Applying for a special issue can be a rewarding and challenging experience. It can help you showcase your expertise, expand your network, and contribute to the advancement of your field. However, it also requires a good amount of time, effort, and commitment. Therefore, you should carefully consider your motivation, resources, and expectations before applying for a special issue.


That is all for today. It was just a short blog post about how to propose a special issue for a journal.

Philippe Fournier-Viger is a full professor and the founder of the open-source data mining software SPMF, offering more than 250 data mining algorithms.

Posted in Academia


Today, just for fun, I generated some ASCII art for the SPMF data mining library using an ASCII art generator.

I think the result is not bad but I am not sure if I would use it:

   _____ _____  __  __ ______ 
  / ____|  __ \|  \/  |  ____|
 | (___ | |__) | \  / | |__   
  \___ \|  ___/| |\/| |  __|  
  ____) | |    | |  | | |     
 |_____/|_|    |_|  |_|_|     

 ___    ___           ___   
(  _`\ (  _`\ /'\_/`\(  _`\ 
| (_(_)| |_) )|     || (_(_)
`\__ \ | ,__/'| (_) ||  _)  
( )_) || |    | | | || |    
`\____)(_)    (_) (_)(_)    

  ____  ____  __  __ _____ 
 / ___||  _ \|  \/  |  ___|
 \___ \| |_) | |\/| | |__   
  ___) |  __/| |  | |  __|  
 |____/|_|   |_|  |_|_|   

   ______..______  .___  ___.  ______ 
  /      ||   _  \ |   \/   | |   ___|
  |  (---`|  |_)  ||  \  /  | |  |__   
   \   \  |   ___/ |  |\/|  | |   __|  
.---)   | |  |     |  |  |  | |  |     
|______/  | _|     |__|  |__| |__|     

Which of the above do you prefer? Let me know in the comment section below!
As for me, I think I prefer the third one.

Posted in spmf