Recently, I found that K. Singh, Shashank Sheshar Singh, Ajay Kumar, Harish Kumar Shakya and Bhaskar Biswas from the Indian Institute of Technology (BHU) (India) have plagiarized my papers in a paper published in the journal IEEE TKDE (Transactions on Knowledge and Data Engineering). I will explain this case of plagiarism below.
*** Important notice: Note that “Kuldeep Singh” is a very common name. This article refers to K. Singh working at BHU University in Varanasi, India. This is not about other people with the same name working in Europe or other locations ***
But first, let me explain what plagiarism is. There are two types. First, some people copy text word for word from another paper without using quotation marks and a citation. Journal editors can easily detect this using automatic tools like CrossCheck. Second, some people are more careful. They copy the ideas of another paper without citations and rewrite the text to avoid being detected. They then take the credit for the ideas developed by another researcher. Most of the time, reviewers of top journals detect this, but sometimes it goes undetected. This is what happened in the TKDE paper that I will talk about today. That paper is:
Kuldeep Singh, Shashank Sheshar Singh, Ajay Kumar, Harish Kumar Shakya and Bhaskar Biswas (2018) “CHN: an efficient algorithm for mining closed high utility itemsets with negative utility”, IEEE Transactions on Knowledge and Data Engineering. http://doi.ieeecomputersociety.org/10.1109/TKDE.2018.2882421
What is wrong with that paper?
That paper proposed a new algorithm called CHN for discovering closed high utility itemsets with negative utility values. In that paper, they extended the EFIM-Closed algorithm that I had proposed at the MLDM 2016 conference, but they did not mention it in the paper. Basically, they copied several techniques from my EFIM and EFIM-Closed papers without mentioning that they were reusing these ideas. They even renamed some of these techniques (e.g. the “utility-bin” became the “utility array”) and rewrote the text. Thus, it appears as if Kuldeep Singh et al. proposed several of the techniques of EFIM-Closed, which is unacceptable. Some of the techniques have been adapted in the paper for the different problem, and there is a citation for some upper-bounds, but some techniques are exactly the same and not cited.
What has been plagiarized?
I will list the content that has been plagiarized in the paper and provide screenshots of a side-by-side comparison of the papers.
1) On page 4 of the paper of Kuldeep Singh et al., they copy several definitions such as Property 3.1 and Property 3.2 from our FHN paper in KBS 2016.
2) In Section 4.1, they present two techniques: (1) transaction merging and (2) database projection. But those are the same as in the EFIM-Closed paper. The authors rewrote the text. They mentioned that they could reuse a sorting technique from EFIM-Closed, but failed to explain that basically all the ideas in this section are copied unchanged from our EFIM-Closed paper!
3) In Section 4.2, they pretend to use a new technique called “utility-array” to calculate the utility, support and upper-bounds of patterns. But basically, they just renamed the “utility-bin” technique of EFIM-Closed to “utility array” and rewrote the text. They copied the idea without citation and then used it to calculate the utility and support in the same way, as well as some other upper-bounds.
4) In Section 4.4, the techniques for finding closed patterns are all copied from the EFIM-Closed paper without modifications. EFIM-Closed proposed to use backward/forward extension checking in utility mining, by drawing inspiration from sequential pattern mining. Kuldeep Singh et al. rewrote the text, claimed that they were the first to do that, and just cited the sequential pattern mining paper that we cited in our paper.
5) In Section 4.5, they present their CHN algorithm, which incorporates the copied techniques and also some other modifications. The pseudocode is very similar to that of EFIM-Closed since they extend that algorithm, but they never explain that they use EFIM-Closed as the basis of their algorithm.
6) Does the following figure look quite familiar?
7) Besides, it is interesting that in Section 4.2, the authors claimed to have proposed a new RTWU upper-bound, while in Section 3 they had already acknowledged that it was from another paper! It is actually from our FHN paper.
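For readers who are not familiar with the kind of technique discussed in point 3, here is a simplified Python sketch of array-based counting, where one array slot (a "bin") per item accumulates values during a single scan of the database. This is only an illustration under stated assumptions: the function name `count_with_bins`, the toy database, and the choice of accumulating the TWU (transaction-weighted utilization) and the support are hypothetical, all utilities are assumed to be positive, and this is not the actual code of EFIM-Closed or CHN.

```python
from typing import List, Tuple

# A transaction is a list of (item id, utility of that item in the transaction).
# Item ids are assumed to be integers in the range 0 .. num_items - 1.
Transaction = List[Tuple[int, int]]


def count_with_bins(database: List[Transaction], num_items: int):
    """Scan the database once and fill one bin per item (hypothetical helper)."""
    twu = [0] * num_items      # bin i accumulates the TWU of item i
    support = [0] * num_items  # bin i counts the transactions containing item i
    for transaction in database:
        # Transaction utility: sum of the utilities of all items it contains.
        transaction_utility = sum(utility for _, utility in transaction)
        for item, _ in transaction:
            twu[item] += transaction_utility
            support[item] += 1
    return twu, support


# Toy database made up for this illustration.
database = [
    [(0, 5), (1, 2)],
    [(0, 3), (1, 1), (2, 4)],
    [(2, 6)],
]
twu, support = count_with_bins(database, num_items=3)
```

The point of such a structure is that the arrays are allocated once and indexed directly by item, so the values for all items are obtained in a single pass without building any other data structure.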
So is there any new contribution in that TKDE paper?
To answer that question, I decided to search a little bit more, and I found that the authors had proposed an algorithm for high utility mining with negative utility called EHIN in the Expert Systems journal, also in 2018:
Singh, K., Shakya, H. K., Singh, A., & Biswas, B. (2018). Mining of high-utility itemsets with negative utility. Expert Systems, e12296. doi:10.1111/exsy.12296
So what is the difference between the two papers of Kuldeep Singh, Bhaskar Biswas et al.? The only difference is the technique for checking that an itemset is closed using forward and backward extensions. But as I have shown before, this technique is copied from our EFIM-Closed paper in Section 4.4 without citations. Thus, there is basically nothing new in the TKDE paper.
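For readers unfamiliar with the closedness check mentioned above, here is a simplified Python sketch of the general idea: an itemset is closed if no item outside of it appears in every transaction that contains it. The sketch is an illustration under stated assumptions, not the code of EFIM-Closed or CHN: the names `extension_items` and `is_closed` are hypothetical, items are assumed to be integers ordered by their natural order, the itemset is assumed to be non-empty, and utilities are ignored to keep the example short.

```python
from typing import FrozenSet, List, Set, Tuple


def extension_items(
    itemset: Set[int], database: List[FrozenSet[int]]
) -> Tuple[Set[int], Set[int]]:
    """Return (backward, forward) items occurring in every transaction
    that contains the (non-empty) itemset. Hypothetical helper."""
    covering = [t for t in database if itemset <= t]
    if not covering:
        return set(), set()
    # Items shared by all covering transactions, excluding the itemset itself.
    common = set.intersection(*(set(t) for t in covering)) - itemset
    last = max(itemset)
    backward = {i for i in common if i < last}  # items ordered before the last item
    forward = {i for i in common if i > last}   # items ordered after the last item
    return backward, forward


def is_closed(itemset: Set[int], database: List[FrozenSet[int]]) -> bool:
    """An itemset is closed if it has no backward and no forward extension."""
    backward, forward = extension_items(itemset, database)
    return not backward and not forward


# Toy database made up for this illustration.
database = [
    frozenset({1, 2, 3}),
    frozenset({1, 2}),
    frozenset({2, 3}),
    frozenset({1, 2, 3, 4}),
]
```

In this toy database, `{1}` is not closed because item 2 appears in every transaction containing item 1 (a forward extension), while `{1, 2}` is closed.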
Now another question is whether Kuldeep Singh, Bhaskar Biswas et al. cite their Expert Systems paper correctly. They put a citation (see below), but they do not explain that the TKDE paper is almost the same as their Expert Systems paper.
Who are the authors?
Kuldeep Singh, Shashank Sheshar Singh, Ajay Kumar, Harish Kumar Shakya, and Bhaskar Biswas are working for Computer Science and Engineering at the Indian Institute of Technology (BHU), Varanasi, India 2210
Another paper retracted for plagiarism with Bhaskar Biswas
A reader of this blog pointed out that another paper of Bhaskar Biswas was retracted (in 2011) while he was also affiliated with the Indian Institute of Technology (BHU):
Here, Bhaskar Biswas is the first author, while in the TKDE paper he seems to be the supervisor of some PhD students.
What will happen next?
As usual, when I find a case of plagiarism, I report it to the journal. I have thus sent an e-mail to the editor of TKDE to report this case of plagiarism, and filed a formal complaint with IEEE to ask that they retract the paper as soon as possible.
I also sent an e-mail to the dean of the Indian Institute of Technology (BHU) and the dean of the school of computer science and engineering to let them know about what happened.
The dean of the computer science and engineering school of IIT (BHU) has confirmed receiving my complaint, and told me that they will investigate this. I am waiting for them to tell me which actions they will take.
The editor-in-chief of TKDE has also informed me that action will be taken quickly. Thus, I expect that the paper will be retracted soon.
The first author has communicated with me to tell me that the version on the TKDE website would not be the final version. But normally, it is the accepted version of the paper that the reviewers have read…
But anyway, all I want is for the problem to be fixed in a satisfactory way, as I have already spent a lot of time dealing with this. If the paper is retracted, or fixed on the TKDE website to cite us properly and give credit where credit is due, I will be happy and will also delete this page from the blog. I hope that this issue can be fixed quickly.
What is the lesson to be learned?
In general, there is no problem with a researcher extending the algorithm of another researcher. This is what Kuldeep Singh, Bhaskar Biswas et al. did in that TKDE paper. They extended EFIM-Closed with a few ideas to support negative utility values. That would have been fine if this had been explained. However, the authors instead chose to copy several techniques without citing them and without mentioning that EFIM-Closed was extended.
I hope you have learned something from this blog post. That is all for today.
Philippe Fournier-Viger is a professor, data mining researcher and the founder of the SPMF data mining software, which includes more than 100 algorithms for pattern mining.