A key question for data mining and data science researchers is which journals and conferences are the top venues in the field, since it is generally best to publish in the most popular journals or conferences. In this blog post, I will look at four rankings of data mining journals and conferences from three sources (Google, Microsoft, and the impact factor), and discuss these rankings.
1) The Google Ranking of data mining and analysis journals and conferences
A first ranking is the Google Scholar ranking (https://scholar.google.com/citations?view_op=top_venues&hl=en&vq=eng_datamininganalysis). This ranking is automatically generated based on the h5-index measure, which is described as “the h-index for articles published in the last 5 complete years. It is the largest number h such that h articles published in 2011-2015 have at least h citations each”. The top 20 conferences and journals are the following:
Rank | Publication | h5-index | h5-median | Type |
1 | ACM SIGKDD International Conference on Knowledge discovery and data mining | 67 | 98 | Conference |
2 | IEEE Transactions on Knowledge and Data Engineering | 66 | 111 | Journal |
3 | ACM International Conference on Web Search and Data Mining | 58 | 94 | Conference |
4 | IEEE International Conference on Data Mining (ICDM) | 39 | 64 | Conference |
5 | Knowledge and Information Systems (KAIS) | 38 | 52 | Journal |
6 | ACM Transactions on Intelligent Systems and Technology (TIST) | 37 | 68 | Journal |
7 | ACM Conference on Recommender Systems | 35 | 64 | Conference |
8 | SIAM International Conference on Data Mining (SDM) | 35 | 54 | Conference |
9 | Data Mining and Knowledge Discovery | 33 | 57 | Journal |
10 | Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery | 30 | 56 | Journal |
11 | European Conference on Machine Learning and Knowledge Discovery in Databases (PKDD) | 30 | 36 | Conference |
12 | Social Network Analysis and Mining | 26 | 37 | Journal |
13 | ACM Transactions on Knowledge Discovery from Data (TKDD) | 23 | 39 | Journal |
14 | International Conference on Artificial Intelligence and Statistics | 23 | 29 | Conference |
15 | Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) | 22 | 30 | Conference |
16 | IEEE International Conference on Big Data | 18 | 30 | Conference |
17 | Advances in Data Analysis and Classification | 18 | 25 | Journal |
18 | Statistical Analysis and Data Mining | 17 | 30 | Journal |
19 | BioData Mining | 17 | 25 | Journal |
20 | Intelligent Data Analysis | 16 | 21 | Journal |
Some interesting observations can be made from this ranking:
- It shows that some conferences in the field of data mining actually have a higher impact than some journals. For example, the well-known KDD conference is ranked higher than all journals.
- It appears strange that the KAIS journal is ranked higher than the DMKD/DAMI and TKDD journals, which are often regarded as better journals than KAIS. However, it may be that the field is evolving, and that KAIS has really improved over the years.
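To make the h5-index definition quoted above more concrete, here is a minimal Python sketch of the computation. The function name `h_index` and the citation counts are made up for illustration; Google Scholar computes the h5-index from its own citation data, restricted to articles published in the last five complete years.

```python
def h_index(citations):
    """Return the largest h such that h articles have at least h citations each."""
    # Sort from most cited to least cited, then walk down the list.
    ranked = sorted(citations, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position  # the top `position` articles all have >= position citations
        else:
            break
    return h


# Hypothetical citation counts for the articles of one venue in a five-year window.
example_citations = [10, 8, 5, 4, 3]
print(h_index(example_citations))
```

With these made-up counts, four articles have at least four citations each, but there are not five articles with at least five citations, so the function returns 4.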
2) The Microsoft Ranking of data mining conferences
Another automatically generated ranking is the Microsoft ranking of data mining conferences (http://academic.research.microsoft.com/RankList?entitytype=3&topdomainid=2&subdomainid=7&orderby=1). This ranking is based on the number of publications and citations. Unlike Google, Microsoft has separate rankings for conferences and journals.
Besides, Microsoft offers two metrics for ranking: the number of citations and the field rating. It is not very clear how the “field rating” is calculated by Microsoft. The Microsoft help center describes it as follows: “The Field Rating is similar to h-index in that it calculates the number of publications by an author and the distribution of citations to the publications. Field rating only calculates publications and citations within a specific field and shows the impact of the scholar or journal within that specific field”.
Here is the conference ranking of the top 30 conferences by citations:
Rank | Conference | Publications | Citations | |
1 | KDD – Knowledge Discovery and Data Mining | 2063 | 69270 | |
2 | ICDE – International Conference on Data Engineering | 4012 | 67386 | |
3 | CIKM – International Conference on Information and Knowledge Management | 2636 | 28621 | |
4 | ICDM – IEEE International Conference on Data Mining | 2506 | 18362 | |
5 | SDM – SIAM International Conference on Data Mining | 708 | 9095 | |
6 | PKDD – Principles of Data Mining and Knowledge Discovery | 994 | 8875 | |
7 | PAKDD – Pacific-Asia Conference on Knowledge Discovery and Data Mining | 1255 | 6400 | |
8 | DASFAA – Database Systems for Advanced Applications | 1251 | 4001 | |
9 | RIAO – Recherche d’Information Assistee par Ordinateur | 574 | 3551 | |
10 | DMKD / DAMI – Research Issues on Data Mining and Knowledge Discovery | 103 | 3264 | |
11 | DaWaK – Data Warehousing and Knowledge Discovery | 503 | 2648 | |
12 | DS – Discovery Science | 553 | 2256 | |
13 | Fuzzy Systems and Knowledge Discovery | 4626 | 2171 | |
14 | DOLAP – International Workshop on Data Warehousing and OLAP | 177 | 1830 | |
15 | IDEAL – Intelligent Data Engineering and Automated Learning | 1032 | 1789 | |
16 | WSDM – Web Search and Data Mining | 196 | 1499 | |
17 | GRC – IEEE International Conference on Granular Computing | 1351 | 1434 | |
18 | ICWSM – International Conference on Weblogs and Social Media | 238 | 1142 | |
19 | DMDW – Design and Management of Data Warehouses | 70 | 993 | |
20 | FIMI – Workshop on Frequent Itemset Mining Implementations | 32 | 849 | |
21 | MLDM – Machine Learning and Data Mining in Pattern Recognition | 313 | 822 | |
22 | PJW – Workshop on Persistence and Java | 41 | 712 | |
23 | ADMA – Advanced Data Mining and Applications | 562 | 676 | |
24 | ICETET – International Conference on Emerging Trends in Engineering & Technology | 712 | 376 | |
25 | WKDD – Workshop on Knowledge Discovery and Data Mining | 527 | 342 | |
26 | KDID – International Workshop on Knowledge Discovery in Inductive Databases | 70 | 328 | |
27 | ICDM – Industrial Conference on Data Mining | 304 | 323 | |
28 | DMIN – Int. Conf. on Data Mining | 434 | 278 | |
29 | MineNet – Mining Network Data | 22 | 278 | |
30 | WebMine – Workshop on Web Mining | 15 | 245 |
And here are the top 30 conferences by field rating:
Rank | Conference | Publications | Field Rating |
1 | KDD – Knowledge Discovery and Data Mining | 2063 | 122 |
2 | ICDE – International Conference on Data Engineering | 4012 | 104 |
3 | CIKM – International Conference on Information and Knowledge Management | 2636 | 67 |
4 | ICDM – IEEE International Conference on Data Mining | 2506 | 56 |
5 | SDM – SIAM International Conference on Data Mining | 708 | 45 |
6 | PKDD – Principles of Data Mining and Knowledge Discovery | 994 | 40 |
7 | PAKDD – Pacific-Asia Conference on Knowledge Discovery and Data Mining | 1255 | 33 |
8 | RIAO – Recherche d’Information Assistee par Ordinateur | 574 | 28 |
9 | DMKD / DAMI – Research Issues on Data Mining and Knowledge Discovery | 103 | 27 |
10 | DASFAA – Database Systems for Advanced Applications | 1251 | 26 |
11 | DaWaK – Data Warehousing and Knowledge Discovery | 503 | 22 |
12 | DOLAP – International Workshop on Data Warehousing and OLAP | 177 | 22 |
13 | DS – Discovery Science | 553 | 20 |
14 | ICWSM – International Conference on Weblogs and Social Media | 238 | 19 |
15 | WSDM – Web Search and Data Mining | 196 | 19 |
16 | DMDW – Design and Management of Data Warehouses | 70 | 19 |
17 | PJW – Workshop on Persistence and Java | 41 | 16 |
18 | FIMI – Workshop on Frequent Itemset Mining Implementations | 32 | 14 |
19 | GRC – IEEE International Conference on Granular Computing | 1351 | 13 |
20 | IDEAL – Intelligent Data Engineering and Automated Learning | 1032 | 13 |
21 | MLDM – Machine Learning and Data Mining in Pattern Recognition | 313 | 13 |
22 | Fuzzy Systems and Knowledge Discovery | 4626 | 11 |
23 | ADMA – Advanced Data Mining and Applications | 562 | 10 |
24 | KDID – International Workshop on Knowledge Discovery in Inductive Databases | 70 | 10 |
25 | ICDM – Industrial Conference on Data Mining | 304 | 9 |
26 | MineNet – Mining Network Data | 22 | 9 |
27 | ESF Exploratory Workshops | 17 | 8 |
28 | TSDM – Temporal, Spatial, and Spatio-Temporal Data Mining | 13 | 8 |
29 | ICETET – International Conference on Emerging Trends in Engineering & Technology | 712 | 7 |
30 | WKDD – Workshop on Knowledge Discovery and Data Mining | 527 | 7 |
Some observations:
- The rankings by citations and by field rating are quite similar.
- The KDD conference is still the #1 conference. This makes sense. It also makes sense that the CIKM, ICDM and SDM conferences are among the top conferences in the field.
- PKDD is ranked higher than PAKDD, as in the Google ranking, and both are ranked higher than DASFAA and DaWaK. I agree with this.
- Some conferences, like ICDE, were not in the Google ranking. It may be because Google puts the ICDE conference in a different category.
- Microsoft ranks DMKD / DAMI as a conference, although it is a journal.
- The FIMI workshop is also ranked high, although that workshop was only held in 2003 and 2004. Thus, it seems that Microsoft has no restrictions on time. Since the FIMI workshop has not been held since 2004, it should arguably not be in this ranking. The ranking would probably be better if Microsoft considered only the last five years, for example.
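The observation that the two conference rankings are quite similar can be roughly checked with Spearman's rank correlation. The sketch below is only an illustration and not part of Microsoft's methodology: it is restricted to the ten conferences that occupy the top 10 positions in both tables above, the function name `spearman_rho` is an illustrative choice, and the formula used assumes that there are no ties in the ranks.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings of the same items (no ties)."""
    items = list(rank_a)
    n = len(items)
    # Sum of squared differences between the two ranks of each item.
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in items)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Positions of the top 10 conferences in the ranking by citations.
by_citations = {
    "KDD": 1, "ICDE": 2, "CIKM": 3, "ICDM": 4, "SDM": 5,
    "PKDD": 6, "PAKDD": 7, "DASFAA": 8, "RIAO": 9, "DMKD/DAMI": 10,
}
# Positions of the same conferences in the ranking by field rating.
by_field_rating = {
    "KDD": 1, "ICDE": 2, "CIKM": 3, "ICDM": 4, "SDM": 5,
    "PKDD": 6, "PAKDD": 7, "RIAO": 8, "DMKD/DAMI": 9, "DASFAA": 10,
}

rho = spearman_rho(by_citations, by_field_rating)
print(round(rho, 3))
```

For this subset the correlation is about 0.96, which is close to the maximum of 1: only DASFAA, RIAO and DMKD / DAMI change places between the two rankings.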
3) The Microsoft Ranking of data mining journals
Now let’s look at the top 20 data mining journals according to Microsoft, by citations.
Rank | Journal | Publications | Citations |
1 | IPL – Information Processing Letters | 7044 | 62746 |
2 | TKDE – IEEE Transactions on Knowledge and Data Engineering | 2742 | 60945 |
3 | CS&DA – Computational Statistics & Data Analysis | 4524 | 24716 |
4 | DATAMINE – Data Mining and Knowledge Discovery | 584 | 19727 |
5 | VLDB – The Vldb Journal | 631 | 17785 |
6 | Journal of Knowledge Management | 747 | 9601 |
7 | Sigkdd Explorations | 491 | 9564 |
8 | Journal of Classification | 550 | 8041 |
9 | KAIS – Knowledge and Information Systems | 741 | 7639 |
10 | WWW – World Wide Web | 540 | 7182 |
11 | INFFUS – Information Fusion | 567 | 5617 |
12 | IDA – Intelligent Data Analysis | 477 | 4167 |
13 | Transactions on Rough Sets | 221 | 1653 |
14 | JECR – Journal of Electronic Commerce Research | 122 | 1577 |
15 | TKDD – ACM Transactions on Knowledge Discovery From Data | 110 | 716 |
16 | IJDWM – International Journal of Data Warehousing and Mining | 102 | 366 |
17 | IJDMB – International Journal of Data Mining and Bioinformatics | 132 | 256 |
18 | IJBIDM – International Journal of Business Intelligence and Data Mining | 124 | 251 |
19 | Statistical Analysis and Data Mining | 124 | 169 |
20 | IJICT – International Journal of Information and Communication Technology | 111 | 125 |
And here are the top 20 journals by field rating.
Rank | Journal | Publications | Field Rating | |
1 | TKDE – IEEE Transactions on Knowledge and Data Engineering | 2742 | 109 | |
2 | IPL – Information Processing Letters | 7044 | 80 | |
3 | VLDB – The Vldb Journal | 631 | 61 | |
4 | DATAMINE – Data Mining and Knowledge Discovery | 584 | 57 | |
5 | Sigkdd Explorations | 491 | 50 | |
6 | CS&DA – Computational Statistics & Data Analysis | 4524 | 49 | |
7 | Journal of Knowledge Management | 747 | 46 | |
8 | WWW – World Wide Web | 540 | 42 | |
9 | Journal of Classification | 550 | 37 | |
10 | INFFUS – Information Fusion | 567 | 36 | |
11 | KAIS – Knowledge and Information Systems | 741 | 33 | |
12 | IDA – Intelligent Data Analysis | 477 | 28 | |
13 | JECR – Journal of Electronic Commerce Research | 122 | 21 | |
14 | Transactions on Rough Sets | 221 | 20 | |
15 | TKDD – ACM Transactions on Knowledge Discovery From Data | 110 | 13 | |
16 | IJDMB – International Journal of Data Mining and Bioinformatics | 132 | 8 | |
17 | IJDWM – International Journal of Data Warehousing and Mining | 102 | 8 | |
18 | IJBIDM – International Journal of Business Intelligence and Data Mining | 124 | 7 | |
19 | Statistical Analysis and Data Mining | 124 | 7 | |
20 | IJICT – International Journal of Information and Communication Technology | 111 | 5 |
Some observations:
- The rankings by citations and by field rating are quite similar.
- The TKDE journal is again at the top of the ranking, just like in the Google ranking.
- It makes sense that the VLDB journal is quite high. This journal was not in the Google ranking, probably because it is more a database journal than a data mining journal.
- SIGKDD Explorations is also a good journal, and it makes sense for it to be in the list. However, I’m not sure that it should be higher than TKDD and DMKD / DAMI.
- The KAIS journal is still ranked quite high. This time it is lower than DMKD / DAMI (unlike in the Google ranking) but still higher than TKDD. This is quite strange, as TKDD is arguably a better journal. As explained in the comment section of this blog post, a reason why KAIS is ranked so high may be that, in the past, the journal encouraged authors to cite papers from KAIS. Besides, it appears that the Microsoft ranking has no restriction on time (for example, it does not consider only the last five years).
- It is also quite strange that “Intelligent Data Analysis” is ranked higher than TKDD.
- Some journals like WWW and JECR should perhaps not be in this ranking. Although they publish data mining papers, they do not exclusively focus on data mining. This is probably the reason why they are not in the Google ranking. Overall, the Microsoft ranking seems to be broader than the Google ranking.
4) Impact factor ranking
Another popular way of ranking journals is to use their impact factor (IF). I have taken some of the top data mining journals above and obtained their impact factor for 2014/2015, or for 2013 when I could not find the information for 2015. Here is the result:
Journal | Impact factor |
DMKD/DAMI Data Mining and Knowledge Discovery | 2.714 |
IEEE Transactions on Knowledge and Data Engineering | 2.476 |
Knowledge and Information Systems | 1.78 |
VLDB – The Vldb Journal | 1.57 |
TKDD – ACM Transactions on Knowledge Discovery From Data | 1.14 |
Advances in Data Analysis and Classification | 1.03 |
Intelligent Data Analysis | 0.50 |
Some observations:
- TKDE and DAMI/DMKD are still among the top journals.
- As in the Microsoft ranking, DAMI/DMKD is above KAIS, which is above TKDD.
- As pointed out in the comment section of this blog post, it is strange that KAIS is so high, compared for example to TKDD, or to VLDB, which is a first-tier database journal. This shows that the IF is not a perfect metric.
- Compared to the Microsoft ranking, the IF ranking at least has the “Intelligent Data Analysis” journal much lower than TKDD. This makes sense, as TKDD is a better journal.
Conclusion
In this blog post, we have looked at rankings of data mining journals and conferences from three sources: the Microsoft ranking, the Google ranking, and the impact factor ranking.
None of these rankings is perfect. They are somewhat accurate, but they may not always correspond to the actual reputation of venues in the data mining field. The Google ranking is more focused on the data mining field, while the Microsoft ranking is perhaps too broad, and seems to have no restriction on time. Also, as can be seen by observing these rankings, different measures yield different rankings. However, there are still some clear trends in these rankings, such as TKDE being ranked as one of the top journals and KDD as the top conference. The top journals and conferences are more or less the same in each ranking. But there are also some strange ranks, such as KAIS and Intelligent Data Analysis being ranked higher than TKDD in the Microsoft ranking.
Do you agree with these rankings? Please leave your comments below!
Update 2016-07-19: I have updated the blog post based on the insightful comments made by Jefrey in the comment section. Thanks!
==
Philippe Fournier-Viger is a full professor and the founder of the open-source data mining software SPMF, offering more than 110 data mining algorithms. If you like this blog, you can tweet about it and/or subscribe to my twitter account @philfv to get notified about new posts.
H5 (the Google measure) is not useful to assess the average quality of papers, but it is a good proxy for total impact on the field. The reason for this is that H5 does not punish for having very many publications. This kind of information should not be used to determine which journal to submit your paper to.
Perhaps you are also aware that KAIS engaged in terribly unethical behaviour in the past; they required every publication to cite at least 4-5 papers from the journal, suggesting to authors who had fewer such references that their paper was ‘not sufficiently clearly in the scope of the journal’. They were banned from ISI then, and have since improved their behaviour and were recently indexed again. Whether this is still visible in the numbers above depends on how far back the data included in the scores goes.
For #3, the Microsoft Journal ranking, you copied the ‘field ratings’ not the citation counts. I do not know what this field rating means, maybe you could find out?
The most recent IFs put DMKD (or DAMI, as it is typically called in the field) just ahead again of TKDE: 2.714 vs. 2.476. My subjective view is that they are both top-tier, with DAMI slightly ahead in quality. I would put SIGKDD Explorations in the same group as well, but that really includes very few papers per year (10 or even fewer). Apparently TKDE publishes considerably more papers than DAMI, so its total impact on the field is higher, hence its field rating may be appropriate. Subjectively, I would rank TKDD and KAIS simply in tier two, just like IPL, CS&DA, SADM. VLDB is not data mining, but it is absolutely top tier in databases, so its IF is apparently a bad indicator. As you probably know IF includes only citations from venues that are indexed in Thomson Reuters’ Web of Science, which includes only a fraction of computer science publications. The Microsoft and Google data is much better for analysis of computer science publication venues.
Of course there are computer science journals that also publish data mining that are (much) more prestigious: Pattern Analysis and Machine Intelligence (PAMI), Journal of the ACM (JACM), ACM Computing Surveys (for surveys only of course). Finally, there are occasionally papers in Science and PNAS. Recent evidence: http://science.sciencemag.org/content/353/6295/163.
Thanks for taking the time and effort to write this!
Hi Jefrey,
Thanks a lot for your insightful comments. I have updated the blog post based on your comments.
The blog post now includes both the “citations” and “field rating” rankings by Microsoft. It is not very clear how Microsoft calculates the field rating. According to their Help Center, it is somewhat similar to the h-index, but they don’t give the exact formula. Here is their explanation: “The field rating is similar to h-index in that it calculates the number of publications by an author and the distribution of citations to the publications. Field rating only calculates publications and citations within a specific field and shows the impact of the scholar or journal within that specific field.”
Based on your comments, I have also found that Microsoft has no time restriction in their formula for calculating the ranking. For example, they include the FIMI workshop in their ranking, although it has not been held since 2004. It would mean that they are considering the all-time citations of KAIS rather than just those of the last few years. I was not aware of the past problems of the KAIS journal with citations. Given your explanation, I think that we have found a plausible explanation for the high ranking of KAIS.
Looking again at the ranking, I also found it quite unusual that Intelligent Data Analysis is ranked higher than TKDD! IDA is in my opinion a second or third tier journal. It should clearly not be above TKDD.
Thanks again for your discussion on this blog. It is quite interesting to read your comments.
Philippe
Hi Philippe,
I agree with you regarding IDA and TKDD. I did not even know that the IDA journal published that many papers (>400 according to Microsoft in your tables). Anyway, the Impact Factor for IDA is 0.631 (2016) while for TKDD it is 1.0 (2015, couldn’t find 2016), which aligns with our impressions.
Best wishes,
Jefrey
I am a beginner in the research field, and this has helped me work out which venues to go to for resources and for my own publications.
Thank you for taking the time and effort to compile this.
Best,
Darsh