Academic misconduct by Sandeep Kautish from LBEF / APU

In this blog post, I will talk about a recent case of serious academic misconduct by Sandeep Kautish from LBEF / APU that I experienced when submitting a book proposal to CRC Press. In that book proposal, I am a collaborator (co-editor). The full story is below.


On the morning of June 4th, 2020, we submitted a book proposal to Prof. Sandeep Kautish, editor of a new book series called “Advances in Informatics and Information Systems Engineering” for CRC Press, to propose a book related to artificial intelligence. We submitted it to him because he had previously made a call for book proposals.

Then, in the afternoon, we received the following e-mail from Sandeep Kautish:

FROM: CRC Editor-AIISE <crceditor.aiise@gmail.com>
TO: +++++, +++++, +++++, +++++
4th June, 14 h 46
Dear All,
Congratulations on the nicely drafted proposal.
Also, I wish to get the consent of you all to add myself (Prof. Dr. Sandeep Kautish, Series Editor CRC Press) as 5th Editor in the proposal. I am Series Editor of 3 (three) book series of CRC Press with over 30 books in production and have been the editor of more than five Elsevier, Springer, and IGI Global books (one Elsevier and one Springer already going on).
My brief biography is given below –

Dr. Sandeep Kautish is working as Professor & Dean-Academics with LBEF Campus, Kathmandu Nepal running in academic collaboration with Asia Pacific University of Technology & Innovation Malaysia.

(…)

Series Editor
Advances in Informatics and Information Systems Engineering
CRC Press (Taylor & Francis Group)

Thus, the book series editor Sandeep Kautish acknowledged receiving our proposal and said that it is a good proposal. But he told us that he wants to add himself as a co-editor of our book (!). This is totally unacceptable and inappropriate, as he did not write a single word of our proposal. And it is a clear conflict of interest.

We don’t have any reason to add him as co-editor. We don’t know him, and he directly asks to put his name on a proposal that he did not write. And obviously, the purpose of this message is to make us feel that if we do not accept, he will reject the proposal and not transfer it to CRC Press. And if there was any doubt about that, it was confirmed by the next e-mail and phone call.

Now, since we cannot accept such behavior, one of the members of our proposal told him on the phone that we would not add him to the book proposal. Then, because of this, he wrote another e-mail a few hours later to reject the proposal that he had previously said was a good proposal:

FROM: CRC Editor-AIISE <crceditor.aiise@gmail.com>
TO: +++++, +++++, +++++, +++++
4th June, 19 h 27
Dear all,
Based on my discussion with _________ over a phone call, I have decided not to process and accept the said Proposal under my series. 

It was later confirmed to me that he was very angry on the phone that we did not agree to put his name on our proposal. This is really unprofessional and unethical.

A book series editor should never ask to be added as co-editor of a book that is proposed in his series and that he did not write, let alone as a condition to process the proposal. It is a very serious case of academic misconduct. And I am sure that this is not the policy of CRC Press, either. Thus, I will also file a complaint with CRC Press about this so that he does not try to bully other researchers who are in weak positions into putting his name on their books.

I have previously published a few books with Springer and never had to face such bad behavior from a book series editor. In fact, I would never have imagined that this could happen when submitting to a publisher like CRC Press, which is a decent publisher.

Who is Sandeep Kautish?

So you may now wonder who Sandeep Kautish is. He is an Indian researcher who is professor and dean of a small department called LBEF Campus in Kathmandu, Nepal, for the Asia Pacific University of Technology & Innovation (APU).
This is his webpage: http://apiitmalaysia.academia.edu/DrSandeepKautish
and his other webpage: https://www.lbef.org/profile/dr-sandeep-kautish/
and this is his official e-mail: sandeep.kautish@lbef.edu.np


As I see from the webpage of Sandeep Kautish, he does not seem to be a strong researcher. He has about 100 citations in Google Scholar. Thus, I think that CRC Press may have made a mistake when appointing him to the position of book series editor, and as we discovered, he decided to take advantage of this to try to bully people into putting his name on their books. Why? I guess the reason must be to obtain a promotion or something similar.

Update 1: A second case of academic extortion by Sandeep Kautish

2020-6-4 1:00 PM. About two hours after publishing this post, someone else privately contacted me to inform me that Sandeep Kautish has done the same thing to them for another book proposal with CRC Press. They also did not give in to the bullying tactics and refused to add him as co-editor of their book.

But this raises the question of how many other people Sandeep Kautish has tried to bully in the same way.

Update 2: A third and fourth case

2020-6-5 3:00 PM. Two more researchers have come forward to talk with me privately on LinkedIn about some bad experiences that they also had with Sandeep Kautish related to bullying for book proposals. The first one told me that such things happened about 10 more times to people that he knows. The other, a respected Indian researcher, told me that he had more or less the same experience as me with Sandeep Kautish for a book proposal more than a year ago.

Conclusion

In this blog post, I have shared a case of highly unethical behavior in academia by an Indian researcher named Sandeep Kautish who works at APU / LBEF. As always, in such cases, the best solution is to file a complaint and make the story public, otherwise such things will continue to happen. I have previously reported some other cases of academic misconduct that you may be interested to read:

If you have other information that may be of interest, you can share it in the comment section below or send me a private message.

Datasets of 30 English novels for pattern mining and text mining

Today, I want to announce that I have just made public datasets of 30 English novels by 10 authors of the 19th century. These datasets can be used for testing algorithms for sequential pattern mining and sequential rule mining, as well as for some text mining applications such as authorship attribution (guessing the author of an anonymous text) and sequence prediction.

All the datasets are public domain texts that have been prepared and converted to a suitable format for text analysis by Jean-Marc Pokou et al. (2016) so that they can be used with the SPMF library.

Each dataset has two versions: (1) sequences of words and (2) sequences of part-of-speech (POS) tags.
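To give an idea of how such files can be read outside of SPMF, here is a minimal Python sketch. It assumes the usual SPMF sequence database conventions: one sequence per line, items written as positive integers, `-1` closing an itemset and `-2` closing the sequence, with optional `@ITEM=id=name` lines mapping integers back to words or POS tags in the "item names" version. The sample content and the function name `parse_spmf_sequences` are hypothetical and only for illustration; the real files are much larger.

```python
from typing import Dict, List, Tuple

# Hypothetical toy content (NOT taken from the actual datasets).
SAMPLE = """\
@CONVERTED_FROM_TEXT
@ITEM=1=the
@ITEM=2=cat
@ITEM=3=sat
1 -1 2 -1 3 -1 -2
1 -1 3 -1 -2
"""


def parse_spmf_sequences(text: str) -> Tuple[Dict[int, str], List[List[List[int]]]]:
    """Return (item names, sequences); each sequence is a list of itemsets."""
    names: Dict[int, str] = {}
    sequences: List[List[List[int]]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("%"):
            continue  # skip blank lines and comment/metadata lines
        if line.startswith("@ITEM="):
            # Assumed layout: @ITEM=<integer id>=<item name>
            _, item_id, name = line.split("=", 2)
            names[int(item_id)] = name
            continue
        if line.startswith("@"):
            continue  # other header lines such as @CONVERTED_FROM_TEXT
        sequence: List[List[int]] = []
        itemset: List[int] = []
        for token in line.split():
            value = int(token)
            if value == -1:      # end of the current itemset
                sequence.append(itemset)
                itemset = []
            elif value == -2:    # end of the sequence
                break
            else:
                itemset.append(value)
        sequences.append(sequence)
    return names, sequences


names, sequences = parse_spmf_sequences(SAMPLE)
# Rebuild the first sentence as words using the item-name mapping.
first_sentence = [names[item] for itemset in sequences[0] for item in itemset]
```

For text data, each itemset typically holds a single item (one word or one tag), so a sentence becomes a simple list of integers.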

The authors and the total number of words/sentences in the corpus of each author are as follows: Catharine Traill (276,829/ 6,588), Emerson Hough (295,166/ 15,643), Henry Adams (447,337/ 14,356), Herman Melville (208,662/ 8,203), Jacob Abbott (179,874/ 5,804), Louisa May Alcott (220,775/ 7,769), Lydia Maria Child (369,222/ 15,159), Margaret Fuller (347,303/ 11,254), Stephen Crane (214,368/ 12,177), and Thornton W. Burgess (55,916/ 2,950).

The books are listed below by author. Each book is available in three forms: (1) datasets in SPMF format (words / POS), (2) datasets in SPMF format with item names (words / POS), which can be used with the GUI of SPMF, and (3) the original book as text.

  • Catharine Traill: A Tale of The Rice Lake Plains; Lost in the Backwoods; The Backwoods of Canada
  • Emerson Hough: The Girl at the Halfway House; The Law of the Land; The Man Next Door
  • Henry Adams: Democracy, an American novel; Mont-Saint-Michel and Chartres; The Education of Henry Adams
  • Herman Melville: I and My Chimney; Israel Potter; The Confidence-Man: His Masquerade
  • Jacob Abbott: Alexander the Great; History of Julius Caesar; Queen Elizabeth
  • Louisa May Alcott: Eight Cousins; Rose in Bloom; The Mysterious Key and What It Opened
  • Lydia Maria Child: A Romance of the Republic; Isaac T. Hopper; Philothea
  • Margaret Fuller: Life Without and Life Within; Summer on the Lakes, in 1843; Woman in the Nineteenth Century
  • Stephen Crane: Active Service; Last Words; The Third Violet
  • Thornton W. Burgess: The Adventures of Buster Bear; The Adventures of Chatterer the Red Squirrel; The Adventures of Grandfather Frog
  • All 30 books above, combined: available in the same forms (words / POS)

If you use the above book datasets, you may want to cite this paper:

Pokou, J. M., Fournier-Viger, P., Moghrabi, C. (2016). Authorship Attribution Using Small Sets of Frequent Part-of-Speech Skip-grams. Proc. 29th Intern. Florida Artificial Intelligence Research Society Conference (FLAIRS 29), AAAI Press, pp. 86-91

In that paper, we have discovered skip-grams (sequential patterns) and n-grams (consecutive sequential patterns) of part-of-speech tags to guess the authors of books.
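To make the difference between n-grams and skip-grams concrete, here is a small Python sketch. It uses one common definition of skip-grams (n tags kept in their original order, with at most k tags skipped in total between the first and last one); this is an assumption for illustration and not necessarily the exact definition or algorithm used in the paper. The function names and the tag sequence are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def ngrams(tags: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All runs of n consecutive tags."""
    return [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]


def skipgrams(tags: Sequence[str], n: int, k: int) -> List[Tuple[str, ...]]:
    """All ordered selections of n tags with at most k tags skipped in total."""
    result = []
    for start in range(len(tags)):
        # The last chosen position may be at most n - 1 + k after the start.
        window_end = min(len(tags), start + n + k)
        for rest in combinations(range(start + 1, window_end), n - 1):
            result.append(tuple(tags[j] for j in (start,) + rest))
    return result


# Hypothetical POS tag sequence for a sentence such as "the cat sat on".
tags = ["DT", "NN", "VBD", "IN"]
bigrams = ngrams(tags, 2)
skip_bigrams = skipgrams(tags, 2, 1)
```

With k = 0, the skip-grams are exactly the n-grams. With k = 1, pairs such as ("DT", "VBD") also appear, which is what lets skip-grams capture patterns that are not strictly consecutive.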

More datasets can also be found on the dataset webpage of the SPMF software.


Philippe Fournier-Viger is a computer science professor and founder of the SPMF open-source data mining library, which offers more than 170 algorithms for analyzing data, implemented in Java.

The PAKDD 2020 conference (a brief report)

In this report, I will talk about the 24th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2020), held from the 11th to the 14th of May 2020.


The PAKDD conference

PAKDD is a top international conference on data mining / big data in the Pacific-Asia part of the world. I have attended this conference several times and written reports about several editions of the conference. If you are interested, you can read these reports here: PAKDD 2014, PAKDD 2015, PAKDD 2017, PAKDD 2018 and PAKDD 2019.

PAKDD Proceedings

As usual, the conference proceedings of PAKDD 2020 are published by Springer in the Lecture Notes in Artificial Intelligence (LNAI) series. This ensures that the proceedings are indexed in DBLP and other major indexes, and gives good visibility to papers.


This year, there were 628 submissions to PAKDD 2020. From those, 135 papers were accepted, which means an acceptance rate of 21.5%.

The conference went online

This year, the PAKDD 2020 conference was planned to be held in Singapore. But due to the unforeseen COVID-19 pandemic around the world, the conference was held online instead. Part of the registration fee was reimbursed to the authors because the organizers saved money by holding the conference online. And of course, since the conference was online, all social events like the banquet and the reception were cancelled.

All authors were asked to submit a pre-recorded 13-minute video of their paper in 720p resolution with their slides before the conference. Then, during the conference, authors had to be available to answer questions online after the presentation of their paper. Thus, each paper was allotted a total of 17 minutes. This is somewhat less than in previous years, where long presentations had about 30 minutes, if I remember well.

The conference could be accessed through the Zoom online meeting system. To attend the different sessions, a password was required, which was made available to registered attendees.

Some video etiquette tips were given to authors.

As for the proceedings, since the conference was online, they were made available for download from the conference website in PDF format.

Day 1: Tutorials and workshop day

On the first day, there were 5 workshops and 2 tutorials.

I first went to have a look at the literature-based discovery workshop using Zoom. There were about 22 people in that workshop at 9:26 AM, watching this presentation about using evolutionary algorithms for matching biomedical ontologies.

pakdd workshop on  literature based discovery

Then, I popped into the Data Science for Fake News workshop at 9:40 AM to see how it was. Although it was supposed to start at 9 AM, the workshop had not started. Using the chatroom, I asked and was told that it was delayed until 10 AM (perhaps some technical problem or someone missing due to time zones?).

Thus, I went next to check the Game Intelligence & Informatics workshop at 9:50. There were about 11 people watching the presentations at 9:47 AM. Game intelligence is quite an interesting topic. Here is a screenshot from that workshop, where game strategies were analyzed:

Then, at 9:57 AM, I went to have a quick look at the Tutorial on Deep Explanations in Machine Learning via Interpretable Visual Methods, which was in the fourth parallel session. There were about 44 people watching it, so it seemed to be the most popular session. This topic is interesting as neural networks can be very effective but are mostly black-box models. In that tutorial, they talked about how to interpret such models, and they also discussed some other ways of interpreting knowledge in data mining, such as how to visualize association rules (screenshot below).

So far, all of this was quite interesting. And there were some good questions in the sessions that I attended.

In the afternoon at 2 PM, I attended the 9th Workshop on Biologically Inspired Data Mining (BDM 2020). This is a workshop that has been running for many years at PAKDD, and that I personally like as it covers various topics such as genetic algorithms, particle swarm optimization (PSO), ant colony optimization, and also applications of such algorithms. There were about 18 people attending the workshop at 2:11 PM. First, the organizer Shafiq Alam gave an overview of the motivations for biologically inspired data mining by explaining that optimization algorithms like genetic algorithms can be used to quickly find approximate solutions to hard problems, if we can accept losing a little bit of accuracy. Then, some results were presented about using PSO for clustering and recommendation. Then, there were some paper presentations, and a discussion about current trends.

At the same time in the afternoon, there was a tutorial on deep Bayesian networks that had about 31 attendees at 2:19 PM, and a workshop on Learning Data Representations for Clustering, which had about 14 attendees at 2:21 PM. Overall, it seems that the tutorials were the most popular sessions during this first day.

Day 2

From 8:30 to 9:00 AM, there was the conference opening. There were about 59 people in that session at 8:58 AM. Some awards were announced:

It was followed by a keynote from Prof. Bing Liu about open-world AI and “continual learning”, which discussed the need for software that can continuously learn. Here are a few slides:

This was followed by two industry talks, one by Usama Fayyad and another by Ankur Teredesai. Below are a few slides from the talk of A. Teredesai about AI for health. He discussed how data mining and AI can help in healthcare. In particular, he talked about epidemiological models for diseases such as COVID-19. At 11:18 AM, there were about 27 people in that session. That talk was interesting, but there were some internet connection problems at some point, such that the audio was hard to hear for a few minutes. But then, it was OK.

Then, in the afternoon, there were paper presentations.

Day 3

In the morning, at 8:30 AM, there was a keynote talk by Inderjit S. Dhillon about multi-output prediction. There were about 42 people watching at 8:51 AM. Here is a screenshot of that talk:

In the afternoon, there was a keynote talk by Prof. Samuel Kaski titled “Data Analysis with Humans” about how humans can participate in the machine learning process. There were about 34 people attending the talk at 2:08 PM. He first illustrated that different problems (and methods) require different levels of human intervention.

Generally, the user can participate in different ways in the machine learning or data mining process.

First, the user can be a passive data source. Second, the user can participate more actively in the process of machine learning or data mining to guide the software program.

Here is a slide from approach 1).

And here is a slide from approach 2), where the user guides an AI program towards a solution.

Then, there were more slides and details, but I did not take notes of everything.

After that, there were more paper presentations.

Day 4

On Day 4, there was the most influential paper talk, a keynote talk by Prof. Jure Leskovec in the afternoon, and more paper presentations.

Papers about pattern mining

Now I will talk a little bit about papers related to pattern mining, as it is one of my topics of interest. I presented a paper about a new algorithm named LTHUI-Miner to discover high utility itemsets that are trending in non-predefined time periods in customer transaction databases. This is the work of my master's student:

Fournier-Viger, P., Yang, Y., Lin, J. C.W., Frnda, J. (2020). Mining Locally Trending High Utility Itemsets. Proc. 24th Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD 2020), Springer, LNAI [video]

You can watch the video of my presentation here.

Another paper related to pattern mining that was published at PAKDD this year is about discovering frequent subsequences in a set of sequences using an algorithm called Tree-Miner:

Tree-Miner: Mining sequential patterns from SP-Tree. Redwan Ahmed Rizvee (University of Dhaka), Chowdhury Farhan Ahmed (University of Dhaka), Mohammad Fahim Arefin (University of Dhaka)

Is this online format a success?

Overall, the online format of this conference is fine. But I miss the social activities of an offline conference like the coffee breaks, where we can talk with other researchers to exchange ideas and meet new people. For me, this is one of the most interesting aspects of a conference.

Also, as a suggestion, it would have been nice to have a playback feature to watch presentations that we missed. In my case, I am in the same time zone as Singapore, so it was convenient for me to watch the presentations, but I can imagine that people from some other countries (e.g. some parts of Canada with a 12-hour time difference) would have a harder time watching some presentations.

Special journal issues

Some papers were invited for a special issue in the JDSA journal. It is always interesting to be invited to a special issue. However, although this journal is published by Springer, a problem is that it is still quite new, and as such it is, to my knowledge, not indexed in databases like SCI or EI. In some countries, like the one where I work, this is important, and papers that are not indexed do not have much value. So for this reason, I had to decline the invitation to extend my paper. I would have preferred to be invited to a special issue in a more established journal, like some other conferences do.

In the call for papers, there was also a mention that some papers would be invited for an issue in the KAIS journal. This is a quite good journal, but apparently it was only for the few very best papers.

Next year

It was announced that PAKDD 2021 will be held in Delhi, India.

Conclusion

Overall, it was an interesting conference. Due to the virus situation, the conference was held online. The organizers managed to organize the conference very well in this situation. Looking forward to PAKDD 2021 next year.



“Pattern Mining: Theory and Practice” (textbook in Thai, with SPMF)

Hi all, this is to announce that a new textbook in Thai has been published about pattern mining, which includes many examples using the SPMF software. The textbook, named “Pattern Mining: Theory and Practice”, is written by teacher Panida Songram from Mahasarakham University (Thailand) and can be used for teaching or self-learning, by students or practitioners. I have known the author for many years and I am very happy that she let me host a copy of the book that you can download from this link:
Pattern Mining: Theory and Practice (PDF, 14.2 MB)

The book gives a good coverage of pattern mining. It explains algorithms but also contains many practical examples about how to use SPMF. Some key topics in the book are itemset mining, sequential pattern mining and multi-dimensional sequential pattern mining.

That is all I wanted to share for today. If you can read Thai, I highly recommend downloading this book. 😉



(video) Mining Locally Trending High Utility Itemsets

Today, I want to share with you the video presentation that I have prepared for my paper at PAKDD 2020. It presents a new problem where we want to discover locally trending high utility itemsets (LTHUIs). An LTHUI is a set of items purchased by customers that is trending (generates money that follows an upward or downward trend) during some non-predefined time periods. It is a variation of the popular high utility itemset mining problem.
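For readers who are new to high utility itemset mining, here is a minimal Python sketch of the basic notion of utility that this problem builds on. It uses the standard definition (the utility of an itemset in a transaction is the sum of purchase quantity times unit profit for its items, summed over all transactions containing the whole itemset). It does not implement LTHUI-Miner or the trend detection from the paper, and the items, quantities, profits and the function name `utility` are all made up for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical unit profits (money earned per unit sold).
profits: Dict[str, int] = {"bread": 1, "milk": 2, "wine": 5}

# Hypothetical customer transactions: item -> purchase quantity.
transactions: List[Dict[str, int]] = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "wine": 2},
    {"bread": 3, "milk": 2, "wine": 1},
]


def utility(itemset: Iterable[str],
            transactions: List[Dict[str, int]],
            profits: Dict[str, int]) -> int:
    """Total profit generated by the itemset in transactions that contain all its items."""
    items = list(itemset)
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in items):
            total += sum(transaction[item] * profits[item] for item in items)
    return total


u_bread = utility({"bread"}, transactions, profits)                 # 2 + 1 + 3 = 6
u_bread_milk = utility({"bread", "milk"}, transactions, profits)   # 4 + 7 = 11
u_bread_wine = utility({"bread", "wine"}, transactions, profits)   # 11 + 8 = 19
```

An itemset is called a high utility itemset if its utility is no less than a threshold chosen by the user. Note that {bread, wine} has a higher utility than {bread} alone, so utility is not monotonic with respect to itemset size, which is one reason why this problem is harder than frequent itemset mining.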


Hope you will enjoy this video! If you want more details about this topic, you can read this paper:

Fournier-Viger, P., Yang, Y., Lin, J. C.W., Frnda, J. (2020). Mining Locally Trending High Utility Itemsets. Proc. 24th Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD 2020), Springer, LNAI, 12 pages.

The source code will be released soon in the SPMF data mining software.

==
Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 150 algorithms for pattern mining.

Success and Health for Researchers

Many researchers or students want to be successful researchers in their field. For this, they make many sacrifices, such as working long hours at the lab every day from morning to evening. This is important because, honestly, success comes with hard work. But it is important to still keep a good life balance to stay healthy. In this blog post, I will talk about the importance of having good life and work habits for researchers.

First, let me tell you a bit about my story. Since the start of my graduate studies, I have worked countless hours to improve myself. For example, during my master's degree and Ph.D. studies, I would basically not take any rest during the whole year, and work maybe 12 hours every day. That has allowed me to be successful in my field, receive big grants during my studies, publish many papers, and then land some good jobs in academia. Nowadays, as I have a family, I cannot work as much as when I was a student, but I still work hard, and I am much more efficient than I was before due to the skills that I have gained. For example, I can write a paper much more quickly. I still work very late at night almost every day.

Health is important

Now, what I have learnt over the years is that working is not everything. Health is also very important. Working for long hours at the lab can eventually bring several health problems like pains in the wrist or neck, back problems, and eye problems. Luckily, I do not have any major problems, but it is something to be aware of, as problems will typically appear later down the road.

My advice

First, it is important to eat healthy food.

Second, it is important to have a good posture while working. For example, it is worth finding a good chair for working, adjusting the height of the table and screen, and having an appropriate mouse and keyboard, to be comfortable.

Third, it is important to avoid sitting for too long, and to sometimes rest your eyes. Several studies have shown that sitting for long periods of time may lead to various diseases. Thus, every hour, it is good to stand up and go for a walk for a few minutes, for example.

Fourth, it is equally important to do some exercise every week. Even doing a few hours of exercise every week, like running, swimming or playing badminton, can make you feel better. I personally like to go running for 30 minutes to an hour every day.

Also, if you are tired or are always sitting on a chair, you may consider working in a standing position. I have recently started to do this, and it really feels great. I even wonder why I have not done this before! It is very good for the posture and the back. Here is a picture of my setup at home:

working in a standing position

Some people recommend alternating between a standing and a sitting position to avoid getting tired. But personally, I have no problem working for several hours in a standing position. If you don’t have a support like mine in the picture, you could just as well use some boxes to raise your computer higher.

Another good piece of advice is that if you are working on a laptop, you should consider using an external screen or external keyboard. The reason is that if you put your laptop low, then the keyboard will perhaps be at an appropriate height, but the screen will be too low and you will have to bend your neck. On the other hand, if you put your laptop higher, the screen will be at an appropriate height for your eyes but the keyboard will be too high. Thus, using an external screen or keyboard can solve this problem.

Conclusion

In this blog post, I have discussed the importance of having some good life habits to be a healthy researcher and avoid health problems later in life. If you have some other suggestions related to this, please post them in the comment section below!


A few errors to avoid in research papers

Today, I will write a short blog post just to give a list of some common errors that I observed recently in some journal and conference research papers:

  • Using a reference number as the subject of a verb. For example, “[12] proposed an algorithm” should be written as “Smith et al. [12] proposed an algorithm”.
  • Not using the shorter way of writing something when there is one. For example, “in order to” should be replaced by “to”. Another example: “this new type of algorithm is” can be replaced by “this new algorithm type is”. Similarly, “A is an extension of B” can be replaced by “A extends B”. One should write concisely.
  • The title of a paper is too long. I recommend not having more than 10 words, and preferably fewer. I recently read a paper having a title with more than 20 words!
  • Using the word “we” too much. Generally, it is better to avoid using “we” as much as possible.
  • Using the words “you” or “I”. These words should never appear in a research paper.
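
Some of the errors above can be spotted automatically. The following Python sketch is a rough heuristic checker, given only as an illustration: the function names, the regular expressions and the 10-word limit as a hard threshold are my own assumptions, and such simple patterns will of course miss cases and may raise false alarms.

```python
import re
from typing import List, Tuple


def check_sentence(sentence: str) -> List[str]:
    """Return the names of the heuristic rules that the sentence violates."""
    issues = []
    # A sentence starting with a bracketed reference, e.g. "[12] proposed ..."
    if re.match(r"\[\d+\]", sentence):
        issues.append("citation-as-subject")
    # The wordy phrase "in order to"
    if re.search(r"\bin order to\b", sentence, re.IGNORECASE):
        issues.append("in-order-to")
    # The words "you" or "I" (the pronoun "I" is only matched in upper case)
    if re.search(r"\b(?:[Yy]ou|I)\b", sentence):
        issues.append("you-or-I")
    return issues


def check_text(text: str) -> List[Tuple[str, List[str]]]:
    """Split text into sentences (naively) and report those with issues."""
    # Split after ., ! or ? followed by spaces, but not after "et al."
    sentences = re.split(r"(?<!et al\.)(?<=[.!?])\s+", text.strip())
    report = []
    for sentence in sentences:
        issues = check_sentence(sentence)
        if issues:
            report.append((sentence, issues))
    return report


def title_too_long(title: str, max_words: int = 10) -> bool:
    """True if the title has more words than the recommended maximum."""
    return len(title.split()) > max_words


report = check_text("Smith et al. [12] proposed an algorithm. [7] extends it.")
```

Here, only the second sentence is reported, because it uses the reference number “[7]” as the subject of a verb.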

I could say much more about this. Indeed, you can look at my other blog posts about writing research papers for more information. But my goal was just to remind you about some common errors!


UDML 2020 – Utility Driven Mining and Learning Workshop

Hi all, this is to let you know the good news that the UDML workshop on Utility Driven Mining and Learning will be back this year, at IEEE ICDM 2020, for its third edition (UDML 2020).

This is a good venue to submit your papers about data mining and machine learning, especially given that all accepted papers will be published in the IEEE ICDM workshop proceedings, just like last year! Also, we are planning to have a special issue in a good SCI/EI journal for the best papers of the workshop (to be confirmed).


In particular, if you have some papers about high utility pattern mining (including topics such as high utility itemset mining, high utility episode mining or high utility sequential pattern mining), this is a perfect place to submit your papers 😉

But we are also looking for papers on other more general topics related to the concept of utility, such as analyzing/learning the important factors (e.g., economic factors) in the data mining or machine learning process. Here is a non-exhaustive list of some potential topics:

  • Theory and core methods for utility mining and learning
  • Utility patterns mining in large datasets, e.g., high-utility itemset mining, high-utility sequential patterns/rules mining, high-utility episode mining, and other novel patterns
  • Analysis and learning of novel utility factors in mining and learning process
  • Predictive modeling/learning, clustering and link analysis that incorporate utility factors
  • Incremental utility mining and learning
  • Utility mining and learning in streams
  • Utility mining and learning in uncertain systems
  • Utility mining and learning in big data
  • Knowledge representations for utility patterns
  • Privacy preserving utility mining/learning
  • Visualization techniques for utility mining/learning
  • Open-source software/libraries/platform
  • Innovative applications in interdisciplinary domains, like finance, biomedicine, healthcare, manufacturing, e-commerce, social media, education, etc.
  • New, open, or unsolved problems in utility-driven mining

The website of the UDML 2020 workshop is here:
http://www.philippe-fournier-viger.com/utility_mining_workshop_2020/

Submissions are limited to 10 pages, and must be formatted according to the IEEE 2-column format. Papers will be evaluated based on the evaluation criteria of the main IEEE ICDM 2020 conference for research papers. In particular, papers must present original research that is not under consideration in other journals, conferences and workshops.


(video) Mining Cost-Effective Patterns

In this blog post, I will share another talk that I have recorded recently. This time, I will explain a new paper from my team about discovering cost-effective patterns using some algorithms called CEPB and CEPN. Mining cost-effective patterns is a new topic in pattern mining that combines the concept of utility with that of cost.

Hope you will enjoy this video! If you want more details about this topic, you can read this paper:

Fournier-Viger, P., Li, J., Lin, J. C., Chi, T. T., Kiran, R. U. (2019). Mining Cost-Effective Patterns in Event Logs. Knowledge-Based Systems (KBS), Elsevier

Moreover, you can also download these algorithms, the source code and dataset from the SPMF data mining library.

That is all for today.

(video) Discovering interpretable high utility patterns in databases

Today, I will share a short keynote talk (28 min) about discovering interpretable high utility patterns in data that I have presented at the CCNS 2020 conference. This talk gives an overview of techniques for finding interesting and useful patterns that can help to understand data.

Hope you will enjoy this video! If you want to know more about how to find interesting and useful patterns in data, I have written a series of blog posts on this topic.

I have also published various videos that you can find on this blog. Moreover, to apply this in your projects, you can use the SPMF open-source data mining software (of which I am the founder). It provides more than 150 algorithms for identifying useful patterns in data.
