How does journal paper similarity checking work? (CrossCheck)

In this blog post, I will talk about the recent trend of journal editors rejecting papers because of their similarity with other papers, as detected by the CrossCheck system. I will explain how this system works and discuss its impact, benefits and drawbacks.

similarity checking

What is similarity checking?

Nowadays, when an author submits a paper to a well-known academic journal (from publishers such as Springer, Elsevier and IEEE), the editor will first submit the paper to an online system to check if the paper contains plagiarism. That system compares the paper with documents from a database created by various publishers and websites to check if the paper is similar to some existing documents. A report is then provided to the journal editor indicating whether there is some similarity with existing documents. In the case where the similarity is high or some key parts have clearly been plagiarized from other authors, the editor will typically reject the paper. Otherwise, the editor will send the paper to reviewers and start the normal review process.

Why check the similarity with other papers?

There are two reasons why editors perform this similarity check:

  • to quickly detect plagiarized papers that should clearly not be published.
  • to check if a paper from an author is original (i.e. if it is not too similar to previous papers from the same author).

In the second case, some journal editors will say, for example, that the “similarity score” should be below 20% or 40%, depending on the journal. Thus, under this model, an author is allowed to reuse just a little bit of text from his own papers.

How does it work?

Now you perhaps wonder how that similarity score is calculated. Having access to some similarity reports generated by the CrossCheck system, I will describe what these reports look like and then explain some key aspects of this system.

After the editor submits a paper to CrossCheck, he receives a report. This report contains a summary page that looks like this:

Part of a CrossCheck similarity report

This report gives an overall similarity score of 32%. It can be interpreted as meaning that, overall, 32% of the content of the text matches existing documents. It is furthermore said that 4% is a match with internet sources, 31% with some other publications and 2% with student papers. And as can be observed, 31% + 2% + 4% does not add up to 32%. Why? Actually, the calculation of the similarity score is misleading. Although I do not have access to the formula or the source code of the system, I found some explanation online: the similarity score is computed by matching each part of the text with at most one document. In other words, if some paragraph of a submitted paper matches two existing documents, this paragraph will be counted only once in the overall score of 32%.
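
To make this counting rule concrete, here is a small Python sketch of a scoring scheme in which each text segment is credited to at most one source in the overall score, while the per-source percentages count every match. This is only an illustration of the explanation above, with made-up segment and source names; the actual CrossCheck formula is proprietary and not public.

```python
# Hypothetical illustration of the counting rule described above: per-source
# percentages count every match, but the overall score counts each part of
# the paper at most once, so the numbers need not add up. This is NOT the
# actual CrossCheck/iThenticate formula, which is proprietary.

def similarity_scores(paper_segments, matches):
    """paper_segments: list of (segment_id, word_count) pairs.
    matches: dict mapping a source name to the set of segment_ids it matches."""
    word_count = dict(paper_segments)
    total_words = sum(word_count.values())

    # Per-source score: every segment matching that source is counted.
    per_source = {
        source: sum(word_count[s] for s in segs) / total_words
        for source, segs in matches.items()
    }

    # Overall score: a segment matched by several sources counts only once.
    matched = set().union(*matches.values()) if matches else set()
    overall = sum(word_count[s] for s in matched) / total_words
    return overall, per_source

# Toy paper with four 25-word segments; segment "s1" matches two sources.
segments = [("s1", 25), ("s2", 25), ("s3", 25), ("s4", 25)]
matches = {"internet": {"s1"}, "publications": {"s1", "s2"}}
overall, per_source = similarity_scores(segments, matches)
print(overall)                   # 0.5: s1 and s2, each counted once
print(sum(per_source.values()))  # 0.75: s1 is counted for both sources
```

With this kind of scheme, the sum of the per-source percentages can exceed the overall score whenever a part of the paper matches several sources, which would explain the 4% + 31% + 2% vs. 32% discrepancy in the report.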

An annotated PDF is also provided to the editor, highlighting the parts that match existing documents. For example, I show some pages of such a report below, where I have blurred the text for anonymization:

Detailed similarity comparison (blurred)

Detailed similarity comparison (blurred)

In such a report, matching parts are highlighted in different colors, and numbers indicate which document has been matched to which part of the text.

Limitations of this similarity checking system

I will now describe some problems that I have observed in the reports made by this similarity checking software:

  • In the above report, the countries and affiliations of the authors are considered as matching their previous documents, which increases the similarity score. But obviously, this should not be taken into account. Can we blame an author for using the same affiliation in two papers?
  • Keywords are also considered as matching previous documents. But I don’t think that using some of the same keywords as another paper should be an issue.
  • Some of the matches are very generic expressions or sentences used in many papers, such as “explained in the next section” or “this paper is organized as follows”.
  • Another limitation is that this similarity check completely ignores figures and illustrations. Thus, if an author extends a conference paper into a journal paper and adds many figures for experiments to further differentiate the two papers, these figures will be completely ignored when calculating the similarity score.
  • Actually, the similarity checking system is limited to the text content of the paper. It can check the main text and the text in tables, algorithms, math formulas, biographies and affiliations. But it cannot check the text in figures that are included as bitmaps (pictures) in a paper. For example, if one includes an algorithm in a paper as a bitmap instead of as text, the system will ignore that content. The system will only be able to compare the labels of the figures, not their content. Thus, an author with malicious intent could easily hide content from the matching system by transforming some content of an article into a bitmap.
  • In the report that I have analyzed, I found that the bibliography is also considered when computing the similarity score. Obviously, this seems quite unfair. Citing the same references as some other papers (especially when they are from the same author) is not plagiarism. In the case of the report that I have read, about 90% of the references were considered as matching those of several other documents, which probably increased the similarity score by at least 10%. But I have noticed that the editor can deactivate this function.
  • I have also observed that the system can match the biography of the authors at the end of the paper and the acknowledgements with those of their previous papers. This is also a problem. It is clearly not plagiarism to reuse the same biography or acknowledgements in two papers. But in that system, it increases the similarity score.
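
To illustrate why generic academic phrases get flagged, here is a hypothetical word n-gram matcher (the two example sentences are made up). Real similarity checkers are more sophisticated, but any matcher based on shared word sequences will report boilerplate such as “this paper is organized as follows” as a match between independently written papers.

```python
# A hypothetical word n-gram matcher, to illustrate why generic academic
# phrases inflate similarity scores. Real checkers are more sophisticated,
# but any matcher based on shared word sequences will flag such boilerplate.

def shared_ngrams(text_a, text_b, n=5):
    """Return the word n-grams (as tuples) that appear in both texts."""
    def ngrams(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return ngrams(text_a) & ngrams(text_b)

submitted = "the rest of this paper is organized as follows section two reviews related work"
existing = "this paper is organized as follows section two presents the background"

# The two sentences were written independently, yet they share several
# 5-grams, all coming from the boilerplate run
# "this paper is organized as follows section two".
for gram in sorted(shared_ngrams(submitted, existing)):
    print(" ".join(gram))
```

A checker could maintain a list of such very common academic phrases and exclude them before computing the score, which would address this particular limitation.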

Thus, my opinion is that this system is quite imperfect. And in fact, it is not claimed to be a perfect system.

What is the impact of this system?

The major impact is that many plagiarized papers can be detected early, which is a good thing, as detecting these papers can save editors and reviewers a lot of time.

However, a drawback of this system is that these metrics are clearly imperfect, and there is a real danger that some editors just check the similarity score to make a decision on a paper without reading the report carefully. For example, I have heard that some journals simply apply an arbitrary threshold, such as rejecting all papers with a score >= 30%. In my opinion, this is a problem if the threshold is too low, because in some cases it is justified for an author to reuse text from his own previous papers. For example, an author may want to reuse some basic problem definitions from his own paper in a second paper with different contributions. Or an author may want to extend a conference paper into a journal paper with some new contributions. In such cases, I think that accepting some overlap between papers is reasonable.

A few years ago, when such systems were not in use, it was quite common for some authors to extend a conference paper into a journal paper by adding 50% more content. Today, with this system, this may not be allowed anymore, perhaps forcing authors to avoid publishing early results in conference papers (or otherwise to spend extra time rewriting their paper in a different way to extend it into a journal paper).

Another aspect is that such a system needs a database of all papers. But should the authors have to agree before their papers are put in this database? Probably not, because when a paper is published, the authors typically have to transfer the copyright to the publisher. Thus, I guess that the publisher is free to share the paper with such a similarity checking service. But still, it raises some questions. To make a comparison, there exists a homework plagiarism checking system called TurnItIn. This system has actually been legally challenged in the US and Canada, where some students have won court battles so that their homework is not submitted to or included in the system. Although it is a slightly different situation, we could imagine that some people may also want to challenge journal similarity checking systems.

How to get a similarity checking report for your paper?

Checking the similarity of a paper is not free. However, editors or associate editors of journals have a subscription to use the similarity checking service. Thus, if you know an editor or associate editor who has a subscription, he may perhaps be able to generate a report for your paper for free. Otherwise, one can pay to obtain the service.


In this blog post, I provided an overview of the similarity checking system called CrossCheck, used by several publishers and journals. I also talked about how scores appear to be computed, some limitations of this system, and its impact in the academic world. I hope this has been interesting. Please share your comments in the comment section below.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

Skills needed for a data scientist? (comments on the HBR article)

Recently, I read an article on the Harvard Business Review (HBR) website about data science skills for businesses. This article proposes to categorize skills related to data on a 2×2 matrix where skills are labelled as useful vs. not useful, and time-consuming vs. not time-consuming. The author of that article has drawn such a 2×2 matrix illustrating the needs of his team (see below).

Obtained from Harvard Business Review

This matrix has received many negative comments online in the last few days. These comments have mainly highlighted two problems:

  • Why are mathematics and statistics viewed as useless?
  • Data science is viewed as useful but mathematics and statistics are viewed as useless, which is strange since math and stats are part of data science.

Having said that, I also don’t like this chart. And many people have asked why it was published in the Harvard Business Review (a good magazine). But we should keep in mind that this chart illustrates the needs of one company. Thus, it does not claim that mathematics and statistics are useless for everyone. It is quite possible that this company does not see any benefit in taking mathematics and statistics courses or training. Following the negative comments, the author and editor at HBR have reworded some parts of the article to try to make it clearer that this should be interpreted as a case study.

A part of the problem related to this chart and article is that the term “data science” has always been very ambiguous. Some people with very different backgrounds and doing very different things call themselves data scientists. This is a reason why I usually don’t use this term. And it could be a part of the reason why this chart shows a distinction between data science, math and stats, which I would describe as overlapping.

From a more abstract perspective, this article highlights that some companies are not interested in investing in skills that take too much time to acquire (and have no short-term benefits). For example, I know that some companies prefer to use code from open-source projects or ready-made tools to analyze data rather than spending time developing customized tools to solve problems. This is understandable, as the goal of companies is to earn money, and there are many tools available for data analysis. However, one should not forget that using these tools often requires an appropriate background in mathematics, statistics or computer science to choose an appropriate model given its assumptions and to correctly interpret the results. Thus, having those skills that take more time to acquire is also important.

What is your opinion about this chart and the most important skills for a data scientist? Please share your opinion in the comment section below.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

(video) Minimal High Utility Itemset Mining with MinFHM

This is a video presentation of the paper “Mining Minimal High Utility Itemsets” about high utility itemset mining using MinFHM. It is the first video of a series of videos that will explain various data mining algorithms.

(link to download the video if the player does not work)

More information about the MinFHM algorithm is provided in this research paper:

Fournier-Viger, P., Lin, C.W., Wu, C.-W., Tseng, V. S., Faghihi, U. (2016). Mining Minimal High-Utility Itemsets. Proc. 27th International Conference on Database and Expert Systems Applications (DEXA 2016). Springer, LNCS, 13 pages, to appear

The source code and datasets of the MinFHM algorithm can be downloaded here:

The source code of MinFHM and datasets are available in the SPMF software.

I will post videos like this perhaps once every few weeks. I actually have a lot of PPTs explaining various algorithms on my computer, but I just need to find time to record the videos. In a future blog post, I will also explain which software and equipment can be used to record such videos. This is the first video, so obviously it is not perfect. I will make some improvements in the following videos. If you have any comments, please post them in the comment section!

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

Expensive Academic Conferences – the case of ICDM

I was recently thinking of attending IEEE ICDM 2018 (International Conference on Data Mining) in Singapore next month. It is a top 5 data mining conference. According to my schedule, I could attend it for 2 days, and since Singapore is close to China, it is convenient to go there. However, I was quite surprised by how expensive the registration fee of this conference has become. As of today, the “standard registration fee (by 28 October)” is roughly 1360 USD or 9300 CNY.

icdm registration fee 2018

Registration fees  from ICDM2018 website

This is actually the most expensive conference that I have ever considered attending. Most conferences that I have attended have been in the 300-700 USD range, half the price of ICDM or less. But is it an outlier? To see more clearly, I decided to compare the standard registration fee of ICDM 2018 with those of previous editions of ICDM:

  • ICDM 2018: 1360 USD (11% increase from 2017)
  • ICDM 2017: 1220 USD (12% increase from 2015)
  • ICDM 2015: 1080 USD (28% increase from 2013)
  • ICDM 2013: 844 USD (68% increase from 2011)
  • ICDM 2011: about 500 USD

This is quite interesting. It shows a steady increase in the registration price of the ICDM conference over the years. The registration fee has increased so much that the price is now about 2.7 times what it was 8 years ago!
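
As a quick check of the arithmetic, the increases and the overall ratio can be recomputed from the fees listed above (small differences from the rounded percentages in the list are due to rounding, and the 2011 fee is approximate):

```python
# Recomputing the fee increases from the registration fees listed above (USD).
fees = {2011: 500, 2013: 844, 2015: 1080, 2017: 1220, 2018: 1360}

years = sorted(fees)
for prev, curr in zip(years, years[1:]):
    increase = 100 * (fees[curr] - fees[prev]) / fees[prev]
    print(f"{prev} -> {curr}: +{increase:.0f}%")

# Overall ratio: the 2018 fee divided by the 2011 fee.
print(round(fees[2018] / fees[2011], 2))  # 2.72, i.e. about 2.7 times the 2011 fee
```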

Why is it so expensive?

One could argue that the reason is the location of the conference. But the increase has been steady over the years, no matter where the conference was organized. Moreover, such big conferences often have thousands of attendees and usually many sponsors. I recently attended the KDD 2018 conference, which was also expensive, but less so than ICDM. There were more than 3000 attendees, and if I remember correctly, they received more than 1 million dollars in sponsorship.

Thus, where does all this money go? A good part goes to renting a convention center, publishing the proceedings, and other aspects such as providing scholarships to students. But many conferences also make a considerable profit. Some conferences are not for profit, while some other conferences will pay the local organizers or the association organizing the conference. I am not sure how the money is used in the case of ICDM or IEEE and what they do with the profits, as I could not find the information. But I believe that such big conferences can generate a huge amount of money. From discussing with organizers of smaller conferences (200 attendees) that have much lower registration fees and less sponsorship, I know that some conferences can still make a 20,000 USD profit.

As for IEEE, ICDM is not their only conference in the 1000 USD range. Some other flagship conferences like IEEE ICC (about communications) also have fees greater than 1000 USD. In the field of data mining, the KDD conference is also quite expensive, although currently less so than ICDM. In some ways, many people want to attend these conferences, so they are willing to pay these high fees.

Consequences of high registration fees

The consequence of such high registration fees is that some people may not have enough money to attend, and that a lot of money is spent by researchers. And in many cases, that money comes from research projects funded by the government. Thus, one could argue that this money could be used in better ways.

Personally, I was thinking of attending ICDM, but when I saw that I would have to pay almost 1400 USD for two days of access to the conference, I decided that it was not reasonable to spend that much money. I have enough research funding to pay for this, but I still do not want to waste the money provided by the government to support research. Thus, this year, I will use the money for other things rather than going to ICDM.

Update 2019-03-14: One of the general co-chairs of ICDM 2018 has taken the time to provide his insights and give some explanations about the registration fees of ICDM 2018 in the comment section. You can read the comment. It says that, basically, the increase in price would be partially explained by fluctuations of the exchange rate and the 7% sales tax of Singapore.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

Periodic patterns in Web log time series

Recently, I have analysed trends about visitors to this blog. I have made two observations. First, there are about 500 to 1000 visitors per day. For this, I want to thank you all for reading and commenting on the blog. Second, if we look carefully at the number of visitors per day, it becomes a time series, and we can clearly see a pattern that repeats itself every week. Below is a picture of this time series for January 2018.

periodic visitor accesses

As you can see, there is a clear pattern every week. Toward the beginning of the week, on Monday and Tuesday, the number of visitors increases, while around Friday it starts to decrease. Finally, on Saturday and Sunday, there is a considerable decrease, and then it increases again on Monday. This pattern repeats itself every week. We can see it visually, but such patterns could also be detected using time series analysis techniques such as an autocorrelation plot. Besides, it would be easy to predict this time series using time series forecasting models.
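
As a sketch of how the autocorrelation technique would detect this weekly cycle, here is a small Python example. The visit counts are synthetic (the numbers in weekly_shape are made up to mimic the pattern described above; the real log data is not reproduced here):

```python
# A minimal sketch of detecting a weekly cycle with autocorrelation.
# The visit counts below are synthetic (made up to mimic the pattern
# described in the post); the real log data is not reproduced here.
import numpy as np

rng = np.random.default_rng(42)
weekly_shape = np.array([900, 950, 900, 850, 750, 550, 500], dtype=float)  # Mon..Sun
visits = np.tile(weekly_shape, 8) + rng.normal(0, 30, 56)  # 8 weeks of daily counts

def autocorr(series, lag):
    """Sample autocorrelation of a 1-D series at the given lag."""
    s = series - series.mean()
    return np.dot(s[:-lag], s[lag:]) / np.dot(s, s)

# The correlation peaks at lags that are multiples of 7 (one week),
# which is exactly what a weekly pattern looks like on an autocorrelation plot.
for lag in (1, 3, 7, 14):
    print(lag, round(autocorr(visits, lag), 2))
```

The spike at lag 7 (and again at lag 14) is the signature of a weekly periodic pattern, and it is what a forecasting model would exploit to predict future daily counts.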

We can also see a relationship with the concept of periodic patterns that I have previously discussed on this blog. A periodic pattern is a pattern that repeats itself over time. That is all for today. I just wanted to share this interesting finding.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

Upcoming book: High Utility Itemset Mining: Theory, Algorithms and Applications

I am happy to announce that the draft of the book about high utility pattern mining has been finalized and submitted to the publisher (Springer). It should thus be published in the very near future.

high utility pattern mining

The book contains 12 chapters written by several top researchers from the field of pattern mining, for a total of 350 pages. The title is “High Utility Itemset Mining: Theory, Algorithms and Applications”. It discusses high utility itemset mining and other related topics. Here is the table of contents:

Editors: Philippe Fournier-Viger, Jerry Chun-Wei Lin, Bay Vo, Roger Nkambou, Vincent S. Tseng.

  • Chapter 1: A Survey of High Utility Itemset Mining
    Philippe Fournier-Viger, Jerry Chun-Wei Lin, Tin Truong Chi, Roger Nkambou
    This chapter gives a more than 39-page introduction to high utility pattern mining, designed to give a quick overview of the field and the main results.
  • Chapter 2: A Comparative Study of Top-K High Utility Itemset Mining Methods
    Srikumar Krishnamoorthy
    This chapter gives an in-depth discussion of top-k high utility itemset mining, including a very detailed comparison of the state-of-the-art algorithms.
  • Chapter 3: A Survey of High Utility Pattern Mining Algorithms for Big Data
    Morteza Zihayat, Methdi Kargar, Jaroslaw Szlichta
    This chapter reviews algorithms for mining high utility patterns in big data.
  • Chapter 4: A survey of High Utility Sequential Pattern Mining
    Tin Truong Chi, Philippe Fournier-Viger
    This chapter provides a survey of  high utility sequential pattern mining. It contains several new theoretical results and a very detailed comparison of upper-bounds and algorithms.
  • Chapter 5: Efficient Algorithms for High Utility Itemset Mining without Candidate Generation
    Jun-Feng Qu, Mengchi Liu, Philippe Fournier-Viger
    This chapter presents the HUI-Miner algorithm and a novel extension called HUI-Miner*, which improves its performance in many situations.
  • Chapter 6: High Utility Association Rule Mining
    Loan T.T. Nguyen, Thang Mai, Bay Vo
    This chapter discusses another important topic, discovering high utility associations.
  • Chapter 7: Mining High-utility Irregular Itemsets
    Supachai Laoviboon, Komate Amphawan
    This chapter considers the time dimension in high utility itemset mining to find irregular patterns.
  • Chapter 8: A survey of Privacy Preserving Utility Mining
    Duy-Tai Dinh, Van-Nam Huynh, Bac Le, Philippe Fournier-Viger, Ut Huynh, Quang-Minh Nguyen
    This chapter provides an overview of techniques for hiding high utility patterns for privacy purposes.
  • Chapter 9: Extracting Potentially High Profit Product Feature Groups by Using High Utility Pattern Mining and Aspect based Sentiment Analysis
    Seyfullah Demir, Oznur Alkan, Firat Cekinel, Pinar Karagoz
    This chapter presents an interesting application of high utility pattern mining related to sentiment analysis.
  • Chapter 10: Metaheuristics for Frequent and High-Utility Itemset Mining
    Youcef Djenouri, Philippe Fournier-Viger, Asma Belhadi, Jerry Chun-Wei Lin
    This chapter provides a survey of evolutionary and swarm intelligence algorithms for high utility itemset mining.
  • Chapter 11: Mining Compact High Utility Itemsets without Candidate Generation
    Cheng-Wei Wu, Philippe Fournier-Viger, Jia-Yuan Gu, Vincent S. Tseng
    This chapter presents algorithms for mining closed and maximal high utility itemsets. It includes a novel strategy for identifying maximal patterns when using a depth-first search.
  • Chapter 12: Visualization and Visual Analytic Techniques for Patterns
    Wolfgang Jentner and Daniel A. Keim.
    This chapter discusses the problem of visualizing the patterns found.

This will be a very good book with many great contributions, and I am excited that it will be published soon. I will keep you updated on this blog as we get closer to the release.

Update: the book is now published!

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

What I don’t like about academia

In this blog post, I will talk about academia. There are numerous things that I like about academia, and I really enjoy working in it. But for this blog post, I will try to talk about what I don’t like in academia, to give a different perspective.


Even when we like something very much, there are always some things that we don’t like. So, here we go. Here is a list of some things that I more or less dislike in academia:

  • A sometimes excessive pressure to publish: There is sometimes great pressure on researchers to produce many publications in a given time frame, which may come from various sources. It is in part necessary, as it increases productivity and ensures that researchers do not become lazy. But a drawback is that some researchers may be less willing to take risks or may focus on short-term projects rather than on more difficult but more rewarding projects.
  • Conflicts of interest at various levels. A researcher should avoid conflicts of interest. However, not everyone does, and this is a problem. A few years ago, for example, I was a program committee member of a conference and discovered that a reviewer had reviewed his own paper. I reported this issue to the conference organizers, and that person was kicked out of the program committee. Another example is journal reviewers who always ask in their reviews that we cite their papers, even when they are not relevant to our paper, just to increase their citation count. In my field, there is one reviewer who is especially known for doing this, as several researchers have talked to me about him. This is not good behavior, and I usually report it to the journal editor, but since reviewers work for free, there are typically no consequences for such people. A third example is that some researchers will often give preferential treatment to their friends. For example, I once attended a conference where three of the awards were handed to collaborators of the conference organizer. Although these papers may be good, it remains suspicious. Another example is from when I was applying for jobs in Canada, several years ago. At that time, I was one of the two remaining candidates for a professor position, but finally the other, much less experienced researcher was chosen, due to a likely conflict of interest.
  • Predatory journals and conferences. There are many journals of very low quality that publish only to earn money. These journals usually have a very broad scope, are published by unknown publishers and sometimes appear to not review papers. They also often send spam to promote their journals. This is a problem, and I obviously dislike such journals.
  • Unethical publications by some researchers. I have discovered and reported several journal papers that contained plagiarism. These papers have generally been retracted, as they should be. But in some cases, unethical behavior is not so easy to detect. For example, I have read some papers where I thought that the results were fake, but there was not enough evidence to prove it. It certainly happens that some researchers publish fake results, which is bad for academia.
  • Publishers that are sometimes too greedy. It is well known that some publishers charge very high fees to universities and individuals to publish and/or access research publications. This is somewhat unfortunate, because research is often funded by a government, done by researchers and reviewed for free by reviewers, while publishers are the ones earning money. It would be difficult to change this, as popular publishers are well established and there is pressure to keep this system. On the other hand, this publication system is not that bad. Actually, the good publishers filter out many bad papers and ensure a minimum quality level for papers, which is important.
  • Insufficient funding for research in some countries. Currently, I have a lot of funding, so I cannot complain about insufficient funding. But in some other countries, funding is quite rare and often insufficient for researchers in academia. This was the case when I was working in Canada. To apply for national funding from NSERC, we would have to write a budget requesting a large amount of money, but one was considered lucky to even get a fraction of it. Thus, not much money was available for students, for attending conferences, for publications, and for buying equipment. Besides, there are not enough professors at several universities in countries like Canada.
  • Reviewers that do not do their job well. As researchers, our work is evaluated by other researchers to determine if it should be published in a given conference proceedings or journal. Generally, reviewers do a good job and do it for free, which is very much appreciated. However, in some cases, reviewers don’t do their job correctly. For example, it once happened to me that a reviewer rejected my paper because he thought the problem could be solved in a simpler way. But the solution proposed by the reviewer in his review was wrong. Having said that, a reviewer often misunderstands a paper because it is not well written. Thus, such situations are often to be blamed on authors rather than reviewers. And often, when a paper is rejected, there are multiple problems in the paper.
  • Unprofessional behavior. In some cases, researchers display highly unprofessional behavior. This was for example the case for the ADMA 2015 conference, which was canceled without notifying authors, after papers had been submitted. The website just went offline, and the organizers simply ignored emails.
  • Bad paper presentations. I have attended many international conferences. Sometimes paper presentations are good, but sometimes they are not. There are several easily avoidable mistakes that a presenter should not make, such as turning one’s back to the audience, exceeding the time limit, and not being prepared.

This is all for today! I just wanted to share some things that I don’t like about academia. But actually, I really like academia. You can share your own perspective on academia in the comments below, or perhaps you may want to share solutions on how to improve academia. 😉

Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 145 data mining algorithms.

News about the data mining blog

This data mining blog was created more than five years ago and has had considerable success, with more than 800,000 views. For this, I want to thank all the readers. Today, I will announce some important news related to this blog.

Translation of the blog

The first piece of news is that the blog will be translated to make it more accessible in other languages. Since I work in China and there is a very large Chinese data mining community, I have recently added a Chinese translation of the data mining blog. It can be accessed by clicking the following link in the menu of this website.

chinese blog

In the Chinese version of the data mining blog, not all blog posts will be translated, only the most important ones. Currently, four posts have been translated. I have published two, and the others will be published in the following weeks.

chinese data mining

I am also considering adding a French translation, since I am a native French speaker. Other languages, such as Vietnamese and Spanish, could also be added if volunteers are willing to help me translate.

Video tutorials about data mining and big data

The second piece of news is that I am currently experimenting with software to record lectures and publish them online as HTML5 videos. In the near future, I will start publishing various videos about data mining. This will include some lectures that I have given, as well as some tutorials for my SPMF data mining software. I will also record some video tutorials presenting some classical data mining algorithms. Moreover, in a future blog post, I will discuss why recording videos can be useful to promote research.


In this blog post, I have given some news about future plans for the blog. Thanks again for reading and commenting. I am also looking for contributors. If you would like to contribute as a guest author or translator, just let me know.

Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.

Report about the DEXA 2018 and DAWAK 2018 conferences

This week, I am attending the DEXA 2018 (29th International Conference on Database and Expert Systems Applications) and the DAWAK 2018 (20th Intern. Conf. on Data Warehousing and Knowledge Discovery) conferences from the 3rd to 6th September in Regensburg, Germany.

dexa 2018 dawak 2018

These two conferences are well-established European conferences dedicated mainly to research on databases and data mining, and are always collocated. This is not the first time that I attend them: I previously attended DEXA 2016 and DAWAK 2016 in Portugal.

These conferences are not in the top 5 of their fields but are still quite interesting, usually featuring some good papers. The proceedings are published by Springer in the LNCS (Lecture Notes in Computer Science) series, which ensures that the papers are indexed by various academic databases.

Acceptance rates

For DEXA 2018, 160 papers were submitted; 35 were accepted as full papers (22%) and 40 as short papers (25%).

For DAWAK 2018, 76 papers were submitted; 13 were accepted as full papers (17%) and 16 as short papers (21%).


The conference is held at the University of Regensburg, in Regensburg, a relatively small town with a long history, about one hour from Munich. The town is a UNESCO World Heritage site. The university:

dexa 2018 location

A picture of the old town:

regensburg dexa

Why do I attend these conferences?

This year, my team and collaborators have four papers at these conferences, on topics related to high utility itemset mining, periodic pattern mining and privacy-preserving data mining:

  • Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., Fujita, H., Koh, Y.-S. (2018). Mining Local High Utility Itemsets. Proc. 29th International Conference on Database and Expert Systems Applications (DEXA 2018), Springer, to appear.
  • Fournier-Viger, P., Li, Z., Lin, J. C.-W., Fujita, H., Kiran, U. (2018). Discovering Periodic Patterns Common to Multiple Sequences. 20th Intern. Conf. on Data Warehousing and Knowledge Discovery (DAWAK 2018), Springer, to appear.
  • Lin, J. C.-W., Zhang, Y. Y., Fournier-Viger, P., … (2018) A Heuristic Algorithm for Hiding Sensitive Itemsets. 29th International Conference on Database and Expert Systems Applications (DEXA 2018), Springer, to appear.
  • Lin, J. C.-W., Fournier-Viger, P, Liu, Q., Djenouri, Y., Zhang, J. (2018) Anonymization of Multiple and Personalized Sensitive Attributes. 20th Intern. Conf. on Data Warehousing and Knowledge Discovery (DAWAK 2018), Springer, to appear.

The first two papers are projects of my master's degree students, who will also attend the conference. Besides, I will also chair some sessions at both conferences.

Another reason for attending these conferences is that they are European conferences. Thus, I can meet some European researchers that I usually do not meet at conferences in Asia.

Day 1

I first registered. The process was quick. We received the proceedings of the conference on a USB drive, along with a conference bag.

dexa 2018 proceedings

I attended several talks from both the DEXA 2018 and DAWAK 2018 conferences on the first day. Here is a picture of a lecture room.

dexa 2018 lecture

There was also an interesting keynote talk about database modelling.

dexa keynote

In the evening, a reception was held at the old town hall.

Day 2

The second day featured several more presentations. In the morning, I chaired the session on classification and clustering. A new algorithm that enhances the K-Means clustering algorithm was proposed, which has the ability to handle noise. An interesting presentation by Frans Coenen proposed an approach where data is encrypted and then transmitted to a distant server offering data mining services such as clustering. Thanks to the encryption techniques, privacy can be ensured. In the morning, there was also a keynote about “smart aging”. I did not attend it though, because I instead had a good discussion with collaborators.

Day 3 – Keynote on spatial trajectory analysis

There was a keynote about “Spatial Trajectory Analytics: Past, Present and Future” by Xiaofang Zhou. It is a timely topic as nowadays we have a lot of trajectory data in various applications.

dexa trajectory data keynote

What is trajectory data? It is the traces of moving objects, where each object can be described using time, spatial positions and other attributes. An example of trajectory data is the movement of cars, which can be obtained from their GPS devices. Another example is the trajectories of mobile phones. Trajectory data is not easy to analyze because it only samples the movement of an object. Besides, trajectories are influenced by the environment (e.g. a road may be blocked). Other challenges are that the data may be inaccurate and that some data points may be redundant.

trajectory data
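To make these challenges concrete, here is a minimal sketch (with illustrative names; this is my own toy example, not code from the talk) of how a trajectory could be represented as timestamped points and how redundant points could be filtered out:

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class TrajectoryPoint:
    timestamp: float  # e.g. seconds since the start of the recording
    x: float          # spatial position (e.g. projected longitude)
    y: float          # spatial position (e.g. projected latitude)

def drop_redundant_points(points, min_distance=1.0):
    """Keep a point only if it is at least min_distance away from the
    last kept point: a very simple redundancy filter for a trajectory."""
    if not points:
        return []
    kept = [points[0]]
    for p in points[1:]:
        last = kept[-1]
        if hypot(p.x - last.x, p.y - last.y) >= min_distance:
            kept.append(p)
    return kept

# A car that barely moves between the first two GPS readings:
pts = [TrajectoryPoint(0, 0.0, 0.0),
       TrajectoryPoint(1, 0.1, 0.0),
       TrajectoryPoint(2, 2.0, 0.0)]
print(len(drop_redundant_points(pts)))  # 2 (the middle point is dropped)
```

A real system would of course also need to handle inaccurate readings (e.g. by smoothing), not just redundant ones.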

Trajectory data can be used in many useful ways such as route planning, point of interest recommendation, environment monitoring, urban planning, and resource tracking and scheduling. Trajectory data can also be combined with other types of data.

trajectory data applications

But how is trajectory data processed? Basically, we need to monitor the objects to collect the trajectories, store them in databases (which may provide various views, queries, privacy support, and indexing), and then analyze the data (e.g. using techniques such as clustering, sequential pattern mining or periodic pattern mining). Here is a proposed architecture of a trajectory analysis system:

trajectory data analysis
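As a rough illustration of this collect/store/analyze pipeline (a sketch with hypothetical table and function names, not the architecture presented in the talk), one could store trajectory points in a relational database and run simple analyses on top of it:

```python
import sqlite3

def store_trajectory(conn, object_id, points):
    """Collect step: store (timestamp, x, y) points for one moving object."""
    conn.executemany(
        "INSERT INTO trajectory (object_id, t, x, y) VALUES (?, ?, ?, ?)",
        [(object_id, t, x, y) for (t, x, y) in points],
    )

def average_speed(conn, object_id):
    """Analyze step: a very simple analysis, the average speed of an
    object (total distance traveled divided by total duration)."""
    rows = conn.execute(
        "SELECT t, x, y FROM trajectory WHERE object_id = ? ORDER BY t",
        (object_id,),
    ).fetchall()
    dist = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (t1, x1, y1), (t2, x2, y2) in zip(rows, rows[1:]))
    return dist / (rows[-1][0] - rows[0][0])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trajectory (object_id TEXT, t REAL, x REAL, y REAL)")
store_trajectory(conn, "car1", [(0, 0, 0), (10, 30, 40)])
print(average_speed(conn, "car1"))  # 5.0 (50 distance units over 10 seconds)
```

A production system would instead use a spatial database with proper indexing, and richer analyses such as the clustering and pattern mining techniques mentioned above.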

Here is a first book about spatial trajectory mining, written by the presenter in 2011:

trajectory data book

Here are some important topics in trajectory analysis:

trajectory analysis research topics

Then, the presenter discussed some specific applications of trajectory data analysis. Overall, it was an interesting introduction to the topic.

Day 3 – Banquet

In the evening, attendees were invited to a tour of a palace, and then to a banquet in a German restaurant.

dexa dawak banquet

Day 4

On the last day, there were more paper presentations and another keynote.

Next year

DAWAK 2019 and DEXA 2019 will be hosted in Linz, Austria from the 26th to the 29th August 2019.

Best paper award

The best paper award was given to the paper “Sequence-based Approaches to Course Recommender Systems” by Osmar Zaiane et al. It presents a system to recommend undergraduate courses to students. This system applies algorithms for sequential pattern mining and sequence prediction, among others, to select relevant courses.
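I have not seen the details of their models, but the general idea of sequence prediction for course recommendation can be sketched as follows (a toy first-order model with made-up course names, not the authors' actual method):

```python
from collections import Counter, defaultdict

def train_predictor(sequences):
    """Count, for each course, which course most often follows it in the
    students' enrollment histories (a first-order sequence model)."""
    follows = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            follows[a][b] += 1
    return follows

def recommend_next(follows, last_course):
    """Recommend the most frequent successor of the last taken course."""
    counts = follows.get(last_course)
    return counts.most_common(1)[0][0] if counts else None

# Hypothetical enrollment histories of three past students:
histories = [["CS101", "CS201", "CS301"],
             ["CS101", "CS201"],
             ["CS101", "MATH1"]]
model = train_predictor(histories)
print(recommend_next(model, "CS101"))  # CS201 (taken after CS101 by 2 of 3)
```

Real course recommenders such as the awarded one use richer sequential pattern mining and sequence prediction models, but the input (ordered course histories) has this shape.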


Overall, the quality of the papers was relatively high, and I was able to meet several researchers working on topics related to my research. It was thus a good conference to attend.


Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software.

Postdoctoral positions in data mining in Shenzhen, China (apply now)

The CIID research center of the Harbin Institute of Technology (Shenzhen campus, China) is looking to hire two postdoctoral researchers to carry out research on data mining / big data.

Harbin Institute of Technology (Shenzhen)

An applicant:

  • must have obtained a Ph.D. in Computer Science within the last 3 years,
  • must be less than 36 years old,
  • must have a strong research background in data mining, big data or artificial intelligence,
  • must have demonstrated the ability to publish papers in excellent conferences and/or journals in the field of data mining or artificial intelligence,
  • must have an interest in the development of data mining algorithms and their applications,
  • can come from any country (but if the applicant is Chinese, s/he should hold a Ph.D. from a 211 or 985 university, or from a university abroad).

The successful applicant will:

  • work on a data mining project that could be related to sequences, time series, spatial data, or some other topic related to data mining, with both a theoretical part and an applied part related to industrial design (the exact topic will be open for discussion to take advantage of the applicant’s strengths),
  • join an excellent research team, led by Prof. Philippe Fournier-Viger, the founder of the popular SPMF data mining library, and have the opportunity to collaborate with researchers from other fields,
  • have the opportunity to work in a laboratory equipped with state-of-the-art equipment (e.g. high-end workstations, a cluster of servers for big data research, GPU servers, virtual reality equipment, body sensors, and much more),
  • be hired for 2 years, at a salary of 171,600 RMB / year (51,600 RMB from the university + 120,000 RMB from the city of Shenzhen). Note that the postdoctoral researcher will pay no tax on this salary, and that an apartment can be rented at a very low price through the university (around 1,500 RMB / month, which saves a lot of money),
  • work in one of the top 50 universities in the field of computer science in the world, and one of the top 10 universities in China,
  • work in Shenzhen, one of the fastest-growing cities in the south of China, with low pollution, warm weather all year, and proximity to Hong Kong.

If you are interested in this position, please apply as soon as possible by sending your detailed CV (including a list of publications and references) and a cover letter to Prof. Philippe Fournier-Viger:  It is also possible to apply for the year 2019.