Interview with Prof. Rage Uday Kiran about Data Mining

Today, I have the pleasure of interviewing Rage Uday Kiran, a researcher at the National Institute of Informatics in Tokyo, Japan. R. Uday Kiran is an Indian researcher who has been working in Japan for several years. He has been active mainly in the field of data mining and is well known for his work on discovering patterns in databases. He took the time to answer several questions for this interview.

1) Could you please give a brief overview of your most important contributions?

Frequent itemset mining is an important model in data mining. Its mining algorithms discover all itemsets in the data that satisfy the user-specified minimum support (minSup) constraint. The minSup controls the minimum number of transactions that an itemset must cover within the data. Since only a single minSup threshold is used for the entire data, the model implicitly assumes that all items within the data have similar frequencies. However, this is seldom the case in real-world applications. In many applications, some items appear very frequently within the data, while others appear rarely. If the frequencies of items vary greatly, then we encounter the following two problems:

  • If minSup is set too high, we miss those itemsets that involve rare items in the data.
  • In order to find the itemsets that involve both frequent and rare items, we have to set minSup very low. However, this may cause a combinatorial explosion, producing too many itemsets, because the frequent items associate with one another in all possible ways, and many of the resulting itemsets are meaningless depending upon the user and/or application requirements.

This dilemma is known as the rare item problem. During my PhD, I tried to address this problem by developing frequent itemset models based on multiple minimum supports.
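The dilemma can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the transactions are made-up toy data, the function names `frequent_itemsets` and `frequent_itemsets_mis` are hypothetical, and the search is a brute-force enumeration rather than any published algorithm. The second function follows one well-known formulation of multiple minimum supports, in which each item has its own minimum item support and an itemset must reach the lowest value among its items; it is not necessarily the exact model developed by the interviewee.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_sup):
    """Brute-force mining with a single minSup (support counts, not ratios)."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = support(itemset, transactions)
            if count >= min_sup:
                result[itemset] = count
    return result


def frequent_itemsets_mis(transactions, mis):
    """Brute-force mining with one minimum item support (MIS) per item.

    Assumption: an itemset is frequent if its support reaches the
    smallest MIS value among the items it contains.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            count = support(itemset, transactions)
            if count >= min(mis[item] for item in itemset):
                result[itemset] = count
    return result


# Hypothetical toy data: bread, milk and butter are frequent items,
# caviar and champagne are rare items.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"caviar", "champagne"},
    {"caviar", "champagne", "bread"},
]

high = frequent_itemsets(transactions, min_sup=4)  # misses the rare pair
low = frequent_itemsets(transactions, min_sup=2)   # finds it, plus extra itemsets
mis = {"bread": 4, "milk": 4, "butter": 4, "caviar": 2, "champagne": 2}
mixed = frequent_itemsets_mis(transactions, mis)

rare_pair = frozenset({"caviar", "champagne"})
print(rare_pair in high, rare_pair in low, rare_pair in mixed)
print(len(high), len(low), len(mixed))
```

With a high minSup the rare pair is lost; with a low minSup it is found, but so are additional combinations of the frequent items. Under the per-item thresholds assumed here, the rare pair is kept while those extra combinations are filtered out.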

Periodic itemsets are an important class of regularities that exist within the data. Most previous studies have tried to find periodic itemsets based on an implicit assumption that all transactions within the data occur at a fixed time interval. However, in many real-world applications, transactions occur irregularly within the data. For the past few years, I have been developing models to discover different types of periodic itemsets in irregular time series/temporal databases.
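To illustrate what periodicity can mean when timestamps are irregular, here is a minimal Python sketch. It assumes one common formulation from the periodic-frequent pattern literature: the periods of an itemset are the gaps between its consecutive occurrences (including the gaps to the first and last timestamps of the database), and the itemset is considered periodic if it occurs often enough and its largest gap does not exceed a user-specified maximum. The names `periods`, `is_periodic`, `min_sup` and `max_per`, as well as the toy database, are hypothetical, and the models mentioned in the interview may be defined differently.

```python
def periods(timestamps, ts_start, ts_end):
    """Gaps between consecutive occurrences, bounded by the database span."""
    points = [ts_start] + sorted(timestamps) + [ts_end]
    return [later - earlier for earlier, later in zip(points, points[1:])]


def is_periodic(itemset, temporal_db, min_sup, max_per):
    """Check the assumed criterion: support >= min_sup and largest gap <= max_per.

    temporal_db is a list of (timestamp, set_of_items) pairs; timestamps
    do not need to be evenly spaced.
    """
    occurrences = [ts for ts, items in temporal_db if itemset <= items]
    if len(occurrences) < min_sup:
        return False
    ts_start = min(ts for ts, _ in temporal_db)
    ts_end = max(ts for ts, _ in temporal_db)
    return max(periods(occurrences, ts_start, ts_end)) <= max_per


# Hypothetical temporal database with irregular timestamps.
temporal_db = [
    (1, {"a", "b"}),
    (2, {"a"}),
    (5, {"a", "b"}),
    (6, {"c"}),
    (7, {"a", "b"}),
    (12, {"a", "c"}),
    (13, {"b"}),
]

print(periods([1, 5, 7], 1, 13))                          # gaps for {a, b}
print(is_periodic({"a"}, temporal_db, min_sup=3, max_per=5))
print(is_periodic({"a", "b"}, temporal_db, min_sup=3, max_per=5))
```

Because the gaps are computed from the actual timestamps rather than from transaction positions, the same check works whether or not the transactions arrive at a fixed interval.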

2) What do you think are the key problems that remain to be solved in the field of pattern mining?

1. The rare item problem is still a major problem that needs to be addressed in many pattern mining models.

2. Non-support measures, such as occupancy, have to be investigated to assess the interestingness of an itemset.

3. Tuning is a common practice in pattern mining, so disk-based algorithms have been investigated to lower the operational cost.

3) What do you expect to achieve in the next 5 years?

In the near future, IoT devices will become the main source of data. The data generated by these devices is often large (petabytes of data) and typically has spatiotemporal characteristics. In the next few years, I would like to develop models that can extract useful information from spatiotemporal databases. In addition, I would like to investigate parallel and disk-based algorithms to find useful information in very large databases efficiently.

4) Do you think that it is important to collaborate with the industry? What are the keys to a successful collaboration?

Yes. I firmly believe it is important for an academic to collaborate with people in industry. Industrial collaboration helps an academic understand the limitations of current research on a particular topic, thereby enabling the academic to develop models and algorithms that can cater to industrial requirements. Mutual trust, regular discussions and openness are crucial factors for a successful collaboration.

5) What is the current state of data mining and artificial intelligence technology in Japan?

In my opinion, this is the hardest question to answer. The Japanese government has initiated a project called Society 5.0, which envisions a human-centered society that balances economic advancement with the resolution of social problems through a system that highly integrates cyberspace and physical space. In this context, most researchers in Japan are working on developing parallel deep neural network algorithms that can analyze real-world data effectively. In my lab at the University of Tokyo, researchers are working on language translation using deep neural networks.

6) Which conferences do you like to attend? Why?

I generally like to attend top international conferences (e.g., KDD, CIKM, PAKDD, SSDBM, EDBT, DASFAA and DEXA). The reasons are as follows: (1) to learn about the hot research problems that researchers are addressing; (2) to interact with the speakers and authors and gain in-depth insight into topics of interest; and (3) to collaborate with fellow researchers working on similar topics.

7) Do you have any advice for young researchers?

Have an open mind. Read as many research papers as possible, and make sure that you cover many topics. Try to grasp the implicit and explicit assumptions made by the authors of every research paper. Manage your time carefully. Try to collaborate with the senior research students and researchers in your lab.

Thanks for participating in this interview!
