Expensive Academic Conferences – the case of ICDM

I was recently thinking of attending IEEE ICDM 2018 (International Conference on Data Mining) in Singapore next month. It is one of the top 5 data mining conferences. According to my schedule, I could attend it for 2 days, and since Singapore is close to China, it is convenient to go there. However, I was quite surprised by how expensive the registration fee of this conference has become. As of today, the “standard registration fee (by 28 October)” is roughly 1360 USD, or about 9300 CNY.

This is actually the most expensive conference that I have ever considered attending. Most conferences that I have attended have been in the 300-700 USD range, about half the price of ICDM or less. But is it an outlier? To see more clearly, I decided to compare the standard registration fee of ICDM 2018 with those of previous editions of ICDM:

  • ICDM 2018: 1360 USD (11% increase from 2017)
  • ICDM 2017: 1220 USD (13% increase from 2015)
  • ICDM 2015: 1080 USD (28% increase from 2013)
  • ICDM 2013: 844 USD (roughly 69% increase from 2011)
  • ICDM 2011: about 500 USD

This is quite interesting. It shows a steady increase in the registration price of the ICDM conference over the years. The registration fee has increased so much that it is now about 2.7 times what it was in 2011, seven years ago!
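As a quick check of these numbers, here is a minimal Python sketch that recomputes the increase between consecutive editions in the list. The fees are copied from the list above, and the 2011 value is only approximate ("about 500"). The results are printed with one decimal, so they may differ by a point from figures that were rounded or truncated by hand.

```python
# Standard registration fees in USD, copied from the list above.
# The 2011 value is approximate ("about 500").
fees = {2011: 500, 2013: 844, 2015: 1080, 2017: 1220, 2018: 1360}


def pct_increase(old, new):
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100


years = sorted(fees)
# Compare each listed edition with the previous listed edition.
for prev, curr in zip(years, years[1:]):
    change = pct_increase(fees[prev], fees[curr])
    print(f"ICDM {curr}: {fees[curr]} USD ({change:.1f}% increase from {prev})")

# Overall growth between the first and last editions in the list.
ratio = fees[2018] / fees[2011]
print(f"The 2018 fee is {ratio:.2f} times the 2011 fee")
```

Note that the editions are two years apart up to 2017, so the per-edition percentages are not yearly rates.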

Why is it so expensive?

One could argue that the reason is the location of the conference. But the increase has been steady over the years no matter where the conference was organized. Moreover, such big conferences often have thousands of attendees, and usually many sponsors. I recently attended the KDD 2018 conference, which was also expensive, but less than ICDM. There were more than 3000 attendees, and if I remember correctly, they received more than 1 million dollars in sponsorship.

So where does all this money go? A good part goes to renting a convention center, publishing the proceedings and other aspects such as providing scholarships to students. But many conferences also make a considerable profit. Some conferences are not for profit, while other conferences pay the local organizers or the association organizing the conference. I am not sure how the money is used in the case of ICDM or IEEE and what they will do with the profits, as I could not find the information. But I believe that such big conferences can generate a huge amount of money. From discussions with organizers of smaller conferences (200 attendees) that have much lower registration fees and less sponsorship, I know that some conferences can still make a $20,000 profit.

As for IEEE, ICDM is not its only conference in the 1000 USD range. Some other flagship conferences like IEEE ICC (about communication) also have fees greater than 1000 USD. In the field of data mining, the KDD conference is also quite expensive, although currently less than ICDM. In a way, since many people want to attend these conferences, they are willing to pay these high fees.

Consequences of high registration fees

The consequence of such high registration fees is that some people may not have enough money to attend, and that a lot of money is spent by researchers.  And in many cases, that money comes from research projects funded by the government. Thus, one could argue that this money could be used in better ways.

Personally, I was thinking of attending ICDM, but when I saw that I would have to pay almost 1400 USD for two days of access to the conference, I decided that it was not reasonable to spend that much money. I have enough research funding to pay for this, but I still do not want to waste the money provided by the government for supporting research. Thus, this year, I will use the money for other things rather than going to ICDM.

Update 2019-03-14: One of the general co-chairs of ICDM 2018 has taken the time to provide his insights and some explanations about the registration fees of ICDM 2018 in the comment section. You can read the comment. In short, it says that the increase in price is partly explained by fluctuations of the exchange rate and the 7% sales tax of Singapore.

Update 2023:

  • ICDM 2023: 1370 USD (offline)
  • ICDM 2022: 1300 USD (conference was announced to be held in person at the time of registration)
  • ICDM 2021: 325 USD (reduced price due to the COVID pandemic / online conference)
  • ICDM 2020: 600 USD (reduced price due to the COVID pandemic / online conference)
  • ICDM 2019: 1182 USD
  • ICDM 2018: 1360 USD (11% increase from 2017)
  • ICDM 2017: 1220 USD (13% increase from 2015)
  • ICDM 2015: 1080 USD (28% increase from 2013)
  • ICDM 2013: 844 USD (roughly 69% increase from 2011)
  • ICDM 2011: about 500 USD


Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 150 data mining algorithms.

Posted in Big data, Conference, Data Mining

Skills needed for a data scientist? (comments on the HBR article)

Recently, I read an article on the Harvard Business Review (HBR) website about data science skills for businesses. This article proposes to categorize skills related to data on a 2×2 matrix where skills are labelled as useful VS not useful, and time-consuming VS not time-consuming. The author of that article has drawn such a 2×2 matrix illustrating the needs of his team (see below).

Obtained from Harvard Business Review

This matrix has received many negative comments online, in the last few days. These comments have mainly highlighted two problems:

  • Why are mathematics and statistics viewed as useless?
  • Data science is viewed as useful but mathematics and statistics are viewed as useless, which is strange since math and stats are part of data science.

Having said that, I also don’t like this chart. And many people have asked why it was published in Harvard Business Review (a good magazine). But we should keep in mind that this chart illustrates the needs of one company. Thus, it does not claim that mathematics and statistics are useless for everyone. It is quite possible that this company does not see any benefits in taking mathematics and statistics courses or training. Following the negative comments, the author and editor of HBR have reworded some parts of the article to try to make it clearer that this should be interpreted as a case study.

A part of the problem related to this chart and article is that the term “data science” has always been very ambiguous. Some people with very different backgrounds and doing very different things call themselves data scientists. This is a reason why I usually don’t use this term. And it could be a part of the reason why this chart shows a distinction between data science, math and stats, which I would describe as overlapping.

From a more abstract perspective, this article highlights that some companies are not interested in investing in skills that take too much time to acquire (and have no short-term benefits). For example, I know that some companies prefer to use code from open-source projects or ready-made tools to analyze data rather than spending time developing customized tools to solve problems. This is understandable, as the goal of companies is to earn money and there are many tools available for data analysis. However, one should not forget that using these tools often requires an appropriate background in mathematics, statistics or computer science to choose an appropriate model given its assumptions and to correctly interpret the results. Thus, having those skills that take more time to acquire is also important.

What is your opinion about this chart and the most important skills for a data scientist? Please share your opinion in the comment section below.


Posted in Big data, Data science

(video) Minimal High Utility Itemset Mining with MinFHM

This is a video presentation of the paper “Mining Minimal High Utility Itemsets” about high utility itemset mining using MinFHM. It is the first video of a series of videos that will explain various data mining algorithms.

VIDEO LINK : https://www.philippe-fournier-viger.com/spmf/videos/minfhm.mp4

More information about the MinFHM algorithm is provided in this research paper:

Fournier-Viger, P., Lin, C.W., Wu, C.-W., Tseng, V. S., Faghihi, U. (2016). Mining Minimal High-Utility Itemsets. Proc. 27th International Conference on Database and Expert Systems Applications (DEXA 2016). Springer, LNCS, 13 pages, to appear

The source code of MinFHM and its datasets are available in the SPMF software.

I will post videos like that perhaps once every few weeks. I actually have a lot of PPTs to explain various algorithms on my computer, but I just need to find time to record the videos. In a future blog post, I will also explain which software and equipment can be used to record such videos. This is the first video, so obviously it is not perfect. I will make some improvements in the following videos. If you have any comments, please post them in the comment section!


Posted in Data Mining, Data science, Pattern Mining, Video

Periodic patterns in Web log time series

Recently, I analysed trends about visitors on this blog. I have made two observations. First, there are about 500 to 1000 visitors per day. For this, I want to thank you all for reading and commenting on the blog. Second, if we look carefully at the number of visitors per day, it forms a time series, and we can clearly see a pattern that repeats itself every week. Below is a picture of this time series for January 2018.

[Image: periodic visitor accesses]

As you can see, there is a clear pattern every week. Toward the beginning of the week, on Monday and Tuesday, the number of visitors increases, while around Friday it starts to decrease. Finally, on Saturday and Sunday, there is a considerable decrease, and then it increases again on Monday. This pattern repeats itself every week. We can see it visually, but such patterns could also be detected using time series analysis techniques such as an autocorrelation plot. Besides, it would be easy to predict this time series using time series forecasting models.
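To illustrate how autocorrelation can reveal such a weekly cycle, here is a minimal Python sketch. The visitor counts are made up for illustration (they are not the real statistics of this blog), and the function is a plain sample autocorrelation written by hand rather than a call to a statistics library.

```python
# Hypothetical daily visitor counts for four weeks (Monday to Sunday).
# These numbers are invented for illustration only.
week = [900, 950, 920, 880, 800, 550, 500]
visitors = week * 4


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given lag (in days)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# Autocorrelation for lags of 1 to 14 days.
acf = {lag: autocorrelation(visitors, lag) for lag in range(1, 15)}
for lag, value in acf.items():
    print(f"lag {lag:2d}: {value:+.2f}")

best_lag = max(acf, key=acf.get)
print("Strongest autocorrelation at lag", best_lag)
```

For this synthetic series, the strongest autocorrelation is at a lag of 7 days, which corresponds to the weekly pattern. Real data would be noisier, so the peak would be less sharp.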

We can also see a relationship with the concept of periodic patterns that I have previously discussed in this blog. A periodic pattern is a pattern that keeps repeating itself over time. That is all for today. I just wanted to share this interesting finding.


Posted in Big data, Data Mining, Data science, Time series

What I dislike about academia

In this blog post, I will talk about academia. There are numerous things that I like about academia, and I really enjoy working in academia. But for this blog post, I will try to talk  about what I don’t like in academia to give a different perspective.


Even when we like something very much, there are always some things that we don’t like. So, here we go. Here is a list of some things that I more or less dislike in academia:

  • A sometimes excessive pressure to publish. There is sometimes great pressure on researchers to produce many publications in a given time frame, which may come from various sources. It is in part necessary, as it increases productivity and ensures that researchers do not become lazy. But a drawback is that some researchers may be less willing to take risks or may focus on short-term projects rather than on more difficult but more rewarding projects.
  • Conflicts of interest at various levels. A researcher should avoid conflicts of interest. However, not everyone does, and this is a problem. A few years ago, for example, I was a program committee member of a conference and discovered that a reviewer had reviewed his own paper. I reported this issue to the conference organizers and that person was kicked out of the program committee. Another example is journal reviewers who always ask in their reviews that we cite their papers, even if they are not relevant to our paper, just to increase their citation count. In my field, there is one reviewer who is especially known for doing this, as several researchers have talked to me about him. This is not good behavior and I usually report it to the journal editor, but since reviewers work for free, there is typically no consequence for such people. A third example is that some researchers give preferential treatment to their friends. For example, I once attended a conference where three of the awards were handed to collaborators of the conference organizer. Although these papers may be good, it remains suspicious. A fourth example is from when I was applying for jobs in Canada, several years ago. At that time, I was one of the two remaining candidates for a professor position, but in the end the other, much less experienced researcher was chosen, likely due to a conflict of interest.
  • Predatory journals and conferences. There are many journals of very low quality that only publish to earn money. These journals usually have very broad scope, are published by unknown publishers and sometimes appear to not review papers. They also often send spam to promote their journals. This is a problem, and I obviously dislike such journals.
  • Unethical publications by some researchers. I have discovered and reported several journal papers that contained plagiarism. These papers have generally been retracted, as they should be. But in some cases, unethical behavior is not so easy to detect. For example, I have read some papers where I thought that the results were fake, but there was not enough evidence to prove it. It certainly happens that some researchers publish fake results, which is bad for academia.
  • Publishers that are sometimes too greedy. It is well known that some publishers charge very high fees to universities and individuals to publish and/or access research publications. This is somewhat unfortunate because research is often funded by a government, done by researchers and reviewed for free by reviewers, while publishers are the ones earning money. It would be difficult to change this, as popular publishers are well established and there is pressure to keep this system. On the other hand, this publication system is not that bad. Actually, the good publishers filter out many bad papers and ensure minimum quality levels for papers, which is important.
  • Insufficient funding for research in some countries. Currently, I have a lot of funding, so I cannot complain about insufficient funding. But in some other countries, funding is quite rare and often insufficient for researchers in academia. This was the case when I was working in Canada. To apply for the national funding by NSERC, we would have to write a budget requesting large amounts of money, but one was considered lucky to get even a fraction of it. Thus, not much money was available for students, for attending conferences and publishing, and for buying equipment. Besides, there are not enough professors at several universities in countries like Canada.
  • Reviewers that do not do their job well. As researchers, our work is evaluated by other researchers to determine if it should be published in a given conference proceedings or journal. Generally, reviewers do a good job and do it for free, which is very much appreciated. However, in some cases, reviewers don’t do their job correctly. For example, it once happened to me that a reviewer rejected my paper because he thought the problem could be solved in a simpler way. But the solution proposed by the reviewer in his review was wrong. Having said that, a reviewer often misunderstands a paper because it is not well written. Thus, such situations are often to be blamed on authors rather than reviewers. And often, when a paper is rejected, there are multiple problems in the paper.
  • Unprofessional behavior. In some cases, some researchers have highly unprofessional behavior. This was for example the case for the ADMA 2015 conference, which was canceled without notifying authors, after papers had been submitted. The website just went offline and organizers just ignored emails.
  • Bad paper presentations. I have attended many international conferences. Sometimes paper presentations are good. But sometimes they are not good. There are several easily avoidable mistakes that a presenter should not make, such as turning his back to the audience, exceeding the time limit, and not being prepared.

This is all for today! I just wanted to share some things that I don’t like about academia. But actually, I really like academia. You can share your own perspective on academia in the comments below, or perhaps share solutions on how to improve academia. 😉


Posted in Academia, Research

News about the data mining blog

This data mining blog was created more than five years ago and has had considerable success, with more than 800,000 views. For this, I want to thank all the readers. Today, I will announce some important news related to this blog.

Translation of the blog

The first news is that the blog will be translated to make it more accessible in other languages. Since I work in China and there is a very large Chinese data mining community, I have recently added a Chinese translation of the data mining blog. It can be accessed by clicking the following link in the menu of this website.

[Image: chinese blog]

In the Chinese version of the data mining blog, only the most important blog posts will be translated, not all of them. Currently, four posts have been translated. I have published two, and the others will be published in the following weeks.

[Image: chinese data mining]

I am also considering adding a French translation since I am a native French speaker. Other languages such as Vietnamese and Spanish could also be added if volunteers are willing to help me translate.

Video tutorials about data mining and big data

The second news is that I am currently experimenting with software to record lectures and publish them online as HTML5 videos. In the near future, I will start publishing  various videos about data mining. This will include some lectures that I have given, as well as some tutorials for my SPMF data mining software. I will also record some video tutorials to present some classical data mining algorithms. Moreover, I will discuss why recording videos can be useful to promote research, in a future blog post.

Conclusion

In this blog post, I have given some news about future plans for the blog. Thanks again for reading and commenting. I am also looking for contributors. If you would like to contribute as a guest author or translator, just let me know.


Posted in General

Report about the DEXA 2018 and DAWAK 2018 conferences

This week, I am attending the DEXA 2018 (29th International Conference on Database and Expert Systems Applications) and the DAWAK 2018 (20th Intern. Conf. on Data Warehousing and Knowledge Discovery) conferences from the 3rd to 6th September in Regensburg, Germany.

[Image: dexa 2018 dawak 2018]

Those two conferences are well-established European conferences dedicated mainly to research on databases and data mining. These conferences are always collocated. This is not the first time that I have attended these conferences. I previously attended DEXA 2016 and DAWAK 2016 in Portugal.

These conferences are not in the top 5 of their fields but are still quite interesting, usually with some good papers. The proceedings of the conferences are published by Springer in the LNCS (Lecture Notes in Computer Science) series, which ensures that the papers are indexed by various academic databases.

Acceptance rates

For DEXA 2018, 160 papers were submitted, 35 have been accepted as full papers (22%), and 40 as short papers (25%).

For DAWAK 2018, 76 papers were submitted, 13 have been accepted as full papers (17%), and 16 as short papers (21%).
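For readers who want to check these percentages, here is a small Python sketch that recomputes the acceptance rates from the counts given above. The overall rate (full plus short papers) is my own addition and was not reported by the conferences.

```python
# Submission and acceptance counts reported above for the two conferences.
stats = {
    "DEXA 2018": {"submitted": 160, "full": 35, "short": 40},
    "DAWAK 2018": {"submitted": 76, "full": 13, "short": 16},
}


def rate(accepted, submitted):
    """Return the acceptance rate in percent."""
    return accepted / submitted * 100


for name, s in stats.items():
    full = rate(s["full"], s["submitted"])
    short = rate(s["short"], s["submitted"])
    # The overall rate simply adds full and short papers together.
    overall = rate(s["full"] + s["short"], s["submitted"])
    print(f"{name}: full {full:.1f}%, short {short:.1f}%, overall {overall:.1f}%")
```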

Location

The conference is held at the University of Regensburg, in Regensburg, a relatively small town with a long history, about 1 hour from Munich. It is a UNESCO world heritage site. The university:

[Image: dexa 2018 location]

A picture of the old town:

[Image: regensburg dexa]

Why do I attend these conferences?

This year, my team and collaborators have four papers at these conferences, on topics related to high utility itemset mining, periodic pattern mining and privacy-preserving data mining:

  • Fournier-Viger, P., Zhang, Y., Lin, J. C.-W., Fujita, H., Koh, Y.-S. (2018). Mining Local High Utility Itemsets. Proc. 29th International Conference on Database and Expert Systems Applications (DEXA 2018), Springer, to appear.
  • Fournier-Viger, P., Li, Z., Lin, J. C.-W., Fujita, H., Kiran, U. (2018). Discovering Periodic Patterns Common to Multiple Sequences. 20th Intern. Conf. on Data Warehousing and Knowledge Discovery (DAWAK 2018), Springer, to appear.
  • Lin, J. C.-W., Zhang, Y. Y., Fournier-Viger, P., … (2018). A Heuristic Algorithm for Hiding Sensitive Itemsets. 29th International Conference on Database and Expert Systems Applications (DEXA 2018), Springer, to appear.
  • Lin, J. C.-W., Fournier-Viger, P., Liu, Q., Djenouri, Y., Zhang, J. (2018). Anonymization of Multiple and Personalized Sensitive Attributes. 20th Intern. Conf. on Data Warehousing and Knowledge Discovery (DAWAK 2018), Springer, to appear.

The first two papers are projects of my master's degree students, who will also attend the conference. Besides, I will also chair some sessions of both conferences.

Another reason for attending this conference is that it is a European conference. Thus, I can meet some European researchers that I usually do not meet at conferences in Asia.

Day 1

I first registered. The process was quick. We received the proceedings of the conference on a USB drive, and a conference bag.

[Image: dexa 2018 proceedings]

I attended several talks from both the DEXA 2018 and DAWAK 2018 conferences on the first day. Here is a picture of a lecture room.

[Image: dexa 2018 lecture]

There was also an interesting keynote talk about database modelling.

[Image: dexa keynote]

In the evening, a reception was held at the old town hall.

Day 2

The second day had several more presentations. In the morning, I was the chair of the session on classification and clustering. A new algorithm that enhances the K-Means clustering algorithm with the ability to handle noise was proposed. An interesting presentation by Franz Coenen proposed an approach where data is encrypted and then transmitted to a distant server offering data mining services such as clustering. Thanks to the encryption techniques, privacy can then be ensured. In the morning, there was also a keynote about “smart aging”. I did not attend it, though, because I instead had a good discussion with collaborators.

Day 3 – Keynote on spatial trajectory analysis

There was a keynote about “Spatial Trajectory Analytics: Past, Present and Future” by Xiaofang Zhou. It is a timely topic as nowadays we have a lot of trajectory data in various applications.

[Image: dexa trajectory data keynote]

What is trajectory data? It is the traces of moving objects. Each object can be described using time, spatial positions and other attributes. One example of trajectory data is the movement of cars, which can be obtained from the GPS of the cars. Another example is the trajectory of mobile phones. Trajectory data is not easy to analyze because it only samples the movement of an object. Besides, trajectories are influenced by the environment (e.g. a road may be blocked). Other challenges are that data may be inaccurate and that some data points may be redundant.
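To make this more concrete, here is a minimal Python sketch of how a trajectory could be represented and how redundant data points could be removed. This is my own illustration and not something that was shown in the keynote. The names `TrajectoryPoint` and `drop_redundant` are hypothetical, the GPS samples are made up, and treating a point as redundant when it repeats the previous position is only one possible definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrajectoryPoint:
    """One sampled observation of a moving object (hypothetical layout)."""
    timestamp: float  # seconds since the start of the recording
    latitude: float
    longitude: float


def drop_redundant(points):
    """Remove consecutive samples that report the same position."""
    cleaned = []
    for p in points:
        if cleaned and (p.latitude, p.longitude) == (
            cleaned[-1].latitude,
            cleaned[-1].longitude,
        ):
            continue  # the object did not move since the last kept sample
        cleaned.append(p)
    return cleaned


# Made-up GPS samples for a car that waits at a red light.
trajectory = [
    TrajectoryPoint(0, 49.0134, 12.1016),
    TrajectoryPoint(10, 49.0140, 12.1020),
    TrajectoryPoint(20, 49.0140, 12.1020),
    TrajectoryPoint(30, 49.0140, 12.1020),
    TrajectoryPoint(40, 49.0151, 12.1033),
]
cleaned = drop_redundant(trajectory)
print(len(trajectory), "samples ->", len(cleaned), "after removing repeats")
```

A drawback of this simple filter is that it discards how long the object stayed at the same place, which may matter for some applications.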

[Image: trajectory data]

Trajectory data can be used in many useful ways such as route planning, point-of-interest recommendation, environment monitoring, urban planning, and resource tracking and scheduling. Trajectory data can also be combined with other types of data.

[Image: trajectory data applications]

But how is trajectory data processed? Basically, we need to monitor the objects to collect the trajectories, store them in databases (which may provide various views, queries, privacy support, and indexing), and then the data can be analyzed (e.g. using techniques such as clustering, sequential pattern mining or periodic pattern mining). Here is a proposed architecture of a trajectory analysis system:

[Image: trajectory data analysis]

This is a book about spatial trajectory mining written by the presenter in 2011:

[Image: trajectory data book]

Here are some important topics in trajectory analysis:

[Image: trajectory analysis research topics]

Then, the presenter discussed some specific applications of trajectory data analysis. Overall, it was an interesting introduction to the topic.

Day 3 – Banquet

In the evening, attendees were invited to a tour of a palace, and then to a banquet in a German restaurant.

[Image: dexa dawak banquet]

Day 4

On the last day, there were more paper presentations and another keynote.

Next year

DAWAK 2019 and DEXA 2019 will be hosted in Linz, Austria from the 26th to the 29th August 2019.

Best paper award

The best paper award was given to the paper “Sequence-based Approaches to Course Recommender Systems” by Osmar Zaiane et al. It presents a system to recommend undergraduate courses to students. This system applies algorithms for sequential pattern mining and sequence prediction, among others, to select relevant courses.

Conclusion

Overall, the quality of papers was relatively high, and I was able to meet several researchers related to my research. It was thus a good conference to attend.

Update: You may also be interested to read my newer posts about DEXA and DAWAK 2019, and DEXA and DAWAK 2021.


Posted in Big data, Conference

China’s lead in mobile payment and services

In this blog post, I will talk about the wide adoption of mobile payment and mobile services in China. I have been working in China for several years and I am still quite amazed by everything that can be done with a cellphone there.


China’s mobile payment systems

A fundamental difference with many western countries is that mobile payment is widely used in China and that virtually everything can be paid with a cellphone, from buying something from a street vendor to paying a bill in a restaurant, or transferring money to a friend.

There are two main mobile payment systems in China, called WeChat (by Tencent) and Alipay (by Alibaba). To use a mobile payment system, one needs to download an application on a cellphone, validate one's identity, and generally link the application to a bank account for transferring money to the virtual wallet. This can be done in just a few minutes. I will describe the main functions of these applications below.

The core function of WeChat is messaging. It allows users to maintain a list of friends, send messages, and make voice/video calls. But WeChat can also be used for mobile payments. The main payment features are:

  • Transferring money from a bank account to the virtual WeChat wallet to refill it.
  • Sending money to a friend.
  • Sending money to or receiving money from someone else by scanning the QR code on that person's cellphone or letting that person scan your QR code.
  • Paying a bill at a store. This requires scanning the QR code of the store with the cellphone and then entering the amount of money and a password. Then, the store owner receives the money. Another way is to let the store owner scan your QR code to withdraw money from your account.
  • Paying for a wide variety of services, such as:
    • Paying utility bills such as water and electricity
    • Paying your cellphone bills
    • Ordering food to be delivered to your door
    • Ordering food at a restaurant by viewing the menu on the cellphone and selecting items
    • Ordering a taxi or ride
    • Renting a public bike by scanning the QR code of the bike
    • Ordering cinema tickets
    • Reserving airplane tickets, train tickets, or hotel rooms
    • Using your cellphone as a ticket on the bus/subway if the cellphone has NFC technology
    • Buying products from online retail stores
    • Sending money to charity
    • and many others

The other main payment system is Alipay. Unlike WeChat, Alipay is not a messaging application. It is designed for mobile payment and is actually more popular than WeChat. It offers mostly the same functions. Besides, some other functions that I did not mention above are:

  • Paying a credit card bill
  • Splitting the bill between friends at a restaurant
  • Buying games
  • Buying lottery tickets

The WeChat and Alipay mobile payment systems are widely used every day by hundreds of millions of people. I know many people in China who basically use them to pay for everything in their daily life, and don’t use cash anymore. Actually, mobile payment is often the preferred way of paying in several stores. For example, I recently bought some milk tea at a store and the employee asked me to pay with Alipay instead of cash because he did not have change.

This is quite different from many western countries where mobile payment is rarely used. For example, Business Insider (https://www.businessinsider.com/alipay-wechat-pay-chinamobile-payments-street-vendors-musicians-2018-5/ ) revealed in May 2018 that the mobile payment market in China is valued at 16 trillion, while in the US, it is only 112 billion. In other words, the mobile payment market is more than 140 times larger in China than in the US.

What is the reason for the wide adoption of mobile payment in China? 

There are several reasons:

  • Cellphone plans are very cheap. Thus, many people have a cellphone with a data plan.
  • Using these payment systems is very simple. To pay, one scans a QR code or lets someone scan one's own QR code, and then enters a password to authorize the payment. It can be done for any kind of transaction, between individuals or at a store. Anyone can receive or send money.
  • There is almost no fee to pay using these payment systems. For example, the only fee that WeChat charges is 0.1% when transferring money back from a virtual wallet to a bank account (if the amount exceeds 1000 RMB, which is about 150 USD). These fees are almost nothing compared to the processing fees of credit cards or debit cards in many western countries.
  • Creating a mobile payment account is simple and basically just requires linking the account to a bank account. This is much easier than getting a credit card, since mobile payment systems are not used to borrow money.

Impact on innovation and adoption of mobile services

The fact that mobile payments are widely used in China has started to transform many aspects of daily life. For example, at a restaurant, it is possible to scan a QR code on a table to see the menu and then order food, which will then be delivered to the table. Another example is to scan the QR code of a bike on the street to unlock the bike, pay to use it, and then leave it anywhere after using it. A third example is to go to a restaurant with friends and then split the bill or quickly transfer money between phones, or to use the phone to pay on the bus or subway. A fourth example is to pay at a vending machine by scanning a QR code.

The wide usage of mobile payment creates huge opportunities for the development of innovative mobile services in China that cannot be offered on a large scale in other countries. Thus, I believe that this is a key advantage that helps drive innovation in China for mobile services.

Conclusion

In this blog post, I discussed the adoption of mobile payment and mobile services in China. I hope that it has been interesting! If you have any comments, please write them in the comment section below.


Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 150 data mining algorithms.

Why do a Ph.D.?

In this blog post, I will answer the question: why do a Ph.D.? This is an important question for many students who are doing a bachelor's or master's degree. Making a good decision is important, as doing a Ph.D. requires several years of work. But at the same time, obtaining a Ph.D. can be very rewarding. In this blog post, I will first briefly explain the goal of doing a Ph.D., and then discuss the reasons for doing it and for NOT doing it.

What is the goal of doing a Ph.D.?

Put simply, the goal of doing a Ph.D. is to learn how to do research. At the end of a Ph.D., one should be able to do research independently and work as a researcher. Moreover, during the Ph.D., one should make some novel and significant contributions to research in one's field.

What are the reasons for doing a Ph.D?

Some of the main reasons are:

  • Becoming an expert in a field. By doing a Ph.D., one may become an expert on a specific topic, and work with other experts in that field with state-of-the-art equipment on state-of-the-art problems. If you like learning, just like me, this is very satisfying.
  • Contributing to the advancement of knowledge. By doing research, one can contribute to the advancement of knowledge in a specific field. For example, one can make new discoveries that are useful to other people.
  • Working on something that you like. When doing a Ph.D., one can actually choose to work on something that one likes for a few years. This is not always possible when working in a company.
  • Self-achievement. One can see the Ph.D. as a challenge. One of the reasons why I decided to do a Ph.D. was to test myself to see if I could do it.
  • It is a requirement for some jobs. Although not many jobs require a Ph.D., some jobs such as researcher and university professor do. If one wants to do such a job, one definitely needs to get a Ph.D. Working as a researcher can be very motivating, as it requires using problem-solving skills every day.
  • Money (in the long term). A Ph.D. does not guarantee earning more money than someone who does not have a Ph.D. But in many cases, having a Ph.D. can lead to a good salary, especially if one studies science, technology or engineering.
  • The title? You can call yourself a “doctor”. 😉 I am just kidding. It is nice to have that title. But one should NOT see this as a reason for doing a Ph.D.
  • Spending a few more years as a student. Although being a student should not be a reason for doing a Ph.D., many Ph.D. students enjoy the freedom that they have. As a student, there is often more freedom than when working in a company. Thus, living as a student for a few more years to do a Ph.D. can be positive.
  • Travelling. Graduate studies are a good opportunity for travelling, for example to attend international conferences to present your work, or even in some cases to study abroad.

What are the reasons for NOT doing a Ph.D.?

Some of the main reasons for not doing a Ph.D. are:

  • Time! Doing a Ph.D. typically requires spending at least 3 years of your life working on a specific project. For me, that was never an issue, but some people may worry about this. Besides, the end of the Ph.D. typically depends on how fast one can complete the project. Some people who are not good at research or who are working part-time may spend 4 or 5 years.
  • Money (in the short term). It depends on the country, but it is possible that a Ph.D. student does not earn a great amount of money during the studies. This may mean living with a small amount of money for a few more years. Some people do not like to live like students, while others enjoy the student life.
  • Money (in the long term). Although some jobs that require a Ph.D. are very well paid, there are jobs that do not require a Ph.D. and pay better. Thus, one should not just think about money when deciding to do a Ph.D.
  • Some jobs do not require a Ph.D. Although getting a Ph.D. can help someone acquire many skills, especially in research and writing, some people who have a Ph.D. may actually end up doing jobs that do not require one.
  • Pressure from others. One should not do a Ph.D. only because of pressure from parents or family. One should really be interested in doing the Ph.D. for oneself.
  • It requires a lot of work. Obtaining a Ph.D. requires a few years of hard work. Some people can easily cope with this, while others do not like to work hard. Working for a few years on a project can at times be hard, and some people do not feel motivated. However, if you like research like me, it is actually very motivating.

Conclusion

In this blog post, I have discussed the main reasons for doing a Ph.D. If you can think of other reasons or want to share your opinion about this topic, please post in the comments below. Also, you may be interested in reading my related post: what it takes to do a good Ph.D.?

==
Philippe Fournier-Viger is a full professor  and the founder of the open-source data mining software SPMF, offering more than 140 data mining algorithms. If you like this blog, you can tweet about it and/or subscribe to my twitter account @philfv to get notified about new posts.

How to review a research paper?

Today, I will discuss the task of reviewing papers in academia. I will discuss why it is important to review papers, and then give tips about how to review papers and also talk about what a reviewer should do and should not do. This topic is important for young researchers who are invited to review research papers for conferences and journals, and want to do this task well.

Why is reviewing papers important?

From the perspective of authors and publishers, the review process in academia is important  as it ensures that papers meet some quality standard before they are published.  Moreover, the review process is used to filter out papers that do not meet the requirements of the journal in terms of quality or other criteria (e.g. a paper should not contain plagiarism). The review process is also important as reviewers can provide constructive comments to help authors improve their paper, even when the paper is rejected.

From the perspective of reviewers, reviewing papers is also important, for a few reasons. First, it means that the reviewer is recognized as having enough expertise to review papers. For example, if you are invited to review papers for some famous journals, you can mention it in your CV and on your website, as it shows that these journals trust you to do reviews. This kind of experience is valuable in academia, but not so much in industry. Second, by doing reviews, a reviewer can get a glimpse of the latest unpublished research in the field. Of course, a reviewer should always be professional and not take advantage of this information. In fact, a reviewer should only use information about unpublished papers for the purpose of reviewing, and should not share information about reviews with other people. But it can still give the reviewer a broad overview of what other people are working on, for example to know what topics are popular in general. In that sense, it can be interesting. Third, by reviewing papers, you also learn more about how other people will review your papers. In other words, it helps you to think like a reviewer and write better papers, because you will anticipate problems that reviewers could raise about your paper. Fourth, by reviewing papers, the reviewer can feel that he or she is helping the research community.

Is there a drawback to reviewing papers?

Yes, of course, there is. The drawback is that it takes quite a lot of time, and in life, we have a limited amount of time. Personally, I receive a lot of requests to do reviews from journals. When I was a Ph.D. student and early in my career, I accepted all of them. But now, I decline many of them because otherwise it would take too much of my time. So I usually only review papers that are related to my field and for the top journals and conferences. If I receive an offer to review papers that are unrelated to what I am doing, or from journals or conferences that I have never heard of, I decline the invitation. There are of course some exceptions. For example, if a friend asks me to review for his journal, I will usually say yes even if the paper is not closely related to what I am doing. Actually, you can consider the job of a reviewer as free work, as reviewers are usually not paid. In fact, in academia, the publisher typically earns money by selling papers that are written and reviewed for free by researchers (for typical journals that are neither open-access nor free), which is a strange model, but it is how it works.

How to review a paper?

Now that I have explained why reviewing papers is important, I will present some criteria that a reviewer should use in general to evaluate a paper. Of course, depending on the research field, some of these criteria may be more or less relevant.

  • Is the paper easy to understand and well-written? 
  • Does the paper follow the format required by the conference or journal?
  • Is the title of the paper appropriate, and does it describe the content of the paper?
  • Does the abstract accurately describe the content of the paper? In particular, it should explain why the problem addressed in the paper is important, describe the contributions and briefly talk about the results.
  • Does the introduction explain why the problem addressed in the paper is important?
  • Does the introduction discuss the limitations of previous work?
  • Does the paper clearly explain what the new contributions are with respect to previous work? Are some important citations missing? Are the references too old?
  • Does the paper contain plagiarism? Several papers that are submitted to conferences and journals contain plagiarized content. When I review a paper, one of the first things that I do is check for plagiarism. I copy some sentences from the paper and search for these sentences using a Web search engine to see if the paper has already been published or contains a considerable amount of text from another paper. If there is plagiarism, I directly reject the paper.
  • Is the proposed solution described in enough detail? For example, if someone is proposing a new data mining algorithm, are all the details of the algorithm provided, or are some important details missing?
  • Is the paper technically sound? In other words, are there technical errors in the paper (e.g. an incorrect lemma or theorem)? Are some important technical details missing? Do the authors provide formal proofs (if necessary in their field) that their solution is correct?
  • Are the experiments appropriate to evaluate the proposed solution? A proposed solution should ideally be compared with solutions from other researchers, and a fair experiment should be done to evaluate whether the proposed solution is better. Also, the experiments should be designed to evaluate what needs to be evaluated in the paper.
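The plagiarism check mentioned in the list above (copying a few sentences and searching for them) can be partly prepared with a small script. This is only a minimal sketch of one way to do it, not a tool that I actually use: the function name, the naive sentence splitting and the choice of the longest sentences are all assumptions of mine. It only builds exact-phrase queries, which you then paste into a Web search engine yourself.

```python
import re


def exact_phrase_queries(paper_text, n_queries=3, min_words=8):
    """Pick a few long sentences and wrap them in quotes for exact-phrase search."""
    # Naive sentence split: a period, "!" or "?" followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", paper_text.strip())
    # Keep only sentences long enough to be unlikely to match by coincidence.
    candidates = [s.strip() for s in sentences if len(s.split()) >= min_words]
    # Assumption: longer sentences are more distinctive, so try them first.
    candidates.sort(key=lambda s: len(s.split()), reverse=True)
    return ['"' + s.rstrip(".!?") + '"' for s in candidates[:n_queries]]


sample = (
    "Short one. "
    "This sentence has more than eight words in it for sure. "
    "Tiny."
)
for query in exact_phrase_queries(sample):
    print(query)
```

A hit on one sentence is not proof of plagiarism (it may be the authors' own preprint or a properly quoted passage), so the results still need to be read by the reviewer.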

Besides, a reviewer should:

  • Give a fair evaluation of what is good and what is bad in the paper. It is important to review the papers of other people in a fair way, just as we would like other people to review our own papers.
  • Provide constructive comments. When reviewing a paper, it is important to give constructive comments about how the paper can be improved, if it needs to be improved.

And a reviewer should NOT:

  • Perform a review if there is a conflict of interest. A reviewer should not write a biased review. The review process should be neutral. The definition of a conflict of interest depends on the journal or conference. But generally, a conflict of interest can occur when a reviewer is invited to review his own paper or the paper of someone closely related to him, such as a family member or a collaborator during the last few years. Other types of conflicts of interest also exist, such as reviewing the paper of someone who is in competition with the reviewer in terms of research (for example, rejecting the papers of competitors in a biased way). As a reviewer, one should avoid conflicts of interest. But it still happens quite often. For example, a few years ago, as a program committee member of an international conference, I found that one of the reviewers was trying to review his own paper. I informed the organizers of the conference and they then banned that person from the committee of that conference, as it is a very serious case of academic misconduct.
  • Share or use the information about unpublished papers for purposes other than reviewing the paper. A reviewer has access to papers that are unpublished. It is the responsibility of the reviewer to follow the rules of the journal/conference, not to share the unpublished papers with unauthorized persons, and not to use information in the unpublished papers for any purpose other than reviewing them. I will tell you a short story about that. A few years ago, I submitted a paper to PKDD 2012. My paper was rejected, but then I found using a web search engine that my paper was online on the web server of one of the reviewers. I sent an e-mail to PKDD to complain about this, as my paper was unpublished and should not have been shared online by a reviewer. The reviewer then removed the paper and sent me an apology. He explained that he had put my paper on his server with other papers because he was travelling and was using it as storage space.

The Review Generator

You can also try a website that I have developed to help generate a review draft called the paper review generator:

Conclusion

In this blog post, I have discussed why reviewing papers is important, how to review papers, and talked about what a reviewer should do and not do. I tried to give as much information as possible. If you think that I forgot something important or if you have some interesting story, please post it in the comments below. I will be happy to read it.
