How to choose a good thesis topic in Data Mining?

I have seen many people asking in data mining forums and on other websites how to choose a good thesis topic in data mining. Therefore, in this post, I will address this question.

The first thing to consider is whether you want to design/improve data mining techniques, apply data mining techniques, or do both. Personally, I think that designing or improving data mining techniques is more challenging than using existing techniques. Moreover, you can make a more fundamental contribution if you work on improving data mining techniques instead of applying them. However, you need to be aware that improving data mining techniques may require stronger algorithmic and/or mathematical skills.

The second thing to consider is what kind of techniques you want to apply or design/improve. Data mining is a broad field consisting of many techniques such as neural networks, association rule mining algorithms, clustering and outlier detection. You should try to get an overview of the different techniques to see which ones interest you most. To get a rough overview of the field, you could read an introductory book on data mining such as the book by Tan, Steinbach & Kumar (Introduction to Data Mining) or read websites and articles related to data mining. If your goal is just to apply data mining techniques to achieve some other purpose (e.g. analysing cancer data) but you don’t know which technique yet, you could skip this question.

The third thing to consider is which problems you want to solve or what you want to improve. This requires more thought. A good way is to look at recent good data mining conferences (KDD, ICDM, PKDD, PAKDD, ADMA, DAWAK, etc.) and journals (TKDE, TKDD, KAIS, etc.), or to attend conferences, if possible, and talk with other researchers. This helps you see what the current popular topics are and what kind of problems researchers are currently trying to solve. It does not mean that you need to work on the most popular topic. Working on a popular topic (e.g. social network mining) has several advantages. It is easier to get grants or, in some cases, to get your papers accepted in special issues, workshops, etc. However, there are also some “older” topics that are interesting even if they are not the current flavor of the day. Actually, the most important thing is that you find a topic that you like and will enjoy working on for perhaps a few years of your life. Finding a good problem to work on can require reading several articles to understand the limitations of current techniques and decide what can be improved. So don’t worry. It is normal that it takes time to find a more specific topic.

Fourth, one should not forget that helping to choose a thesis topic is also the job of the professor who supervises Master’s or Ph.D. students. Therefore, if you are looking for a thesis topic, it is good to talk with your supervisor and ask for suggestions. They should help you. If you don’t have a supervisor yet, then try to get a rough idea of what you like, and try to meet with professors who could become your supervisor. Some of them will perhaps have research projects and ideas that they could give you if you work with them. Choosing a supervisor is a very important and strategic decision that every graduate student has to make. For more information about choosing a supervisor, you can read this post: How to choose a research advisor for M.Sc. / Ph.D ?

Lastly, I would like to discuss the common question “please give me a Ph.D. topic in data mining”, which I read on websites and sometimes receive by e-mail. There are two problems with this question. The first problem is that it is too general. As mentioned, data mining is a very broad field. For example, I could suggest some very specific topics such as detecting outliers in imbalanced stock market data or optimizing the memory efficiency of subgraph mining algorithms for community detection in social networks. But will you like it? It is best to choose something by yourself that you like. The second problem with the above question is that choosing a topic is work that a researcher should do or learn to do. In fact, in research, it is as important to be able to find a good research problem as it is to find a good solution. Therefore, I highly recommend trying to find a research topic by yourself, as it is important to develop this skill to become a successful researcher. If you are a student, when searching for a topic, you can ask your research advisor to guide you.

Also, just for fun, here is a Ph.D thesis title generator.

If you like this blog, you can subscribe to the RSS Feed or my Twitter account (https://twitter.com/philfv) to get notified about future blog posts.

521 Responses to How to choose a good thesis topic in Data Mining?

  1. John Stan says:

    Good article.

  2. Dvijesh says:

    Really good one….

    • Mohd Shahid Khan says:

      Sir, i want to do Ph.D on data mining but i am not a good programmer so please tell me the topic of research on data mining or any other area which need no programming skills.

      plz take it urgent.
      Hoping for your reply at the earliest.

      • If you want absolutely NO programming and still want to do some data mining, then I think that you would have to go toward mathematics and statistics. Or you could do some applied data mining. For example, you could use existing data mining tools or software such as Matlab or R to analyze some data. The contribution would be to do something new with the data instead of proposing new algorithms.

        For finding a specific topic, you can read the blog post above, which explains the steps for finding a good topic.

        • neha says:

          i am interested in page ranking topic for my mtech thesis in data mining. Can you suggest me some sources from where i can get good information about the topic?

        • durgesh says:

          Yes Sir, Mr. Mohd Shahid Khan said right for those who can not do programming and the same problem with me also so please suggest those topics through which I can do my project in data mining in M-Tech as you said in his reply that “use some already made data mining tools or software such as Matlab or R to analyze some data. ” So please suggest me on this type of topic.

          • The goal of this blog post is to explain how to search for a topic, rather than to give a topic. It takes time to find a good topic, so I cannot do it for you. But you can read the blog post to understand how to search.

  3. Paolo says:

    HI. I want to do a Ph.D. in statistical data mining. But it is difficult to choose a topic. There are too many data mining topics. I’m thinking of improving decision tree mining algorithms with statistical validity tests. Is it a good idea?

    • I’m not working with decision trees, but it seems like an interesting topic. What is important is that you propose something new that has not been done before and that is useful. I think that it could meet these criteria. I would recommend reading about what has been done on statistical validity tests with decision trees, to see what can be done. Note that, even if you choose this as your project, it is possible to change the orientation of your research later on.

      Hope this helps,

      Philippe

      • Uday Kiran Rage says:

        Many people have already worked on extending statistics to various types of classifiers. Try to do research on Ensemble classifiers and other types. As you are interested in decision trees, I suggest you to look at the usage of emerging patterns.

  4. Murthy says:

    Sir, i want to do Ph.D on data mining but i am unable to find any problem in data mining so please tell me the problems on data mining or any other areas.
    plz take it urgent.
    Hoping for your reply at the earliest

    • Data mining is very broad. I could give you some topics, for example: applying neural networks to recognize music, using association rules to classify medical data, improving the memory efficiency of clustering algorithms or sequential pattern mining algorithms, etc. I mean, I could suggest a lot of topics like that. But data mining is a very broad field. It would be better to choose something that you like. Do you have any specific interests in data mining?

  5. arvind says:

    Hi ,

    I am looking for a research topic within data mining in Big data.
    Please suggest me a topic and where to start looking for it ?

    Thanks & Best Regards,
    arvind

  6. vinodkumar says:

    sir, i would like to work on the concpt of audio and video streaming in dataminig. which topic will suitable for me for resarch. Please suggest the topic,

  7. Azaim says:

    Sir, iam seraching for a thesis topic in Data Mining for my M.Tech course. I select Cloud Mining but in this topic what i have to do i dont know. Please suggest me.

    Thanks & Regards

  8. AP says:

    Thanks for the good articles.
    I want to do research on Data Mining Applications to Agricultural Field topic( I feel this interesting topic as of now).
    What would be the best topics on this.
    How to approach to these? Is that could be a good idea or not?

    Thanks

    • Yes, I think it can be a good idea. I’m not familiar with the agricultural field so I cannot tell you what the best topics related to it are. But you can find this out by doing a literature review. What you could do is find a problem that people in the agricultural field have and then see what data mining technique can be applied and what its limitations are. Then, if you could extend the data mining technique to address those limitations, it would be great.

      • AP says:

        Thanks Philippe for quick reply.
        I am reading all your articles and will follow them.
        Mean while understand the data mining algorithms using your suggestions(books,from open source software, etc). Try to implement small algorithms and get confidence….
        Will contact you If I need a suggestions.
        Thanks

  9. Anju says:

    Hello Sir,
    I am Student of M Tech(CSE) and i want to do research on data mining and mine area of
    Interest in “Web Usage Mining and Social Network Mining”.Can u suggest me Certain topic regarding this Mining Area……

  10. Deepa says:

    Hello sir,
    I like to do research in data mining sir but I do no what to do? And how to do? Please help me.. my guide is asking me research topic. I do no anything.. pls help me to find good topic sir in data mining or please tel me any nice topic. I do no what current process is going on data mining sir…

    Its very ugent sir please help me pls..

    thank u sir…

    • As I wrote in the blog post, the question “please give me a topic in data mining” is too general. You should first learn what data mining is and what your interests in data mining are, because data mining is a very broad field. If you just ask for a topic in data mining, then I could suggest any random topic such as: improving the memory efficiency of frequent subgraph mining algorithms for the case of uncertain data with applications in social networks. But will you like it? I think that it is better that you find a topic that you like. In the blog post, I explained how to find a good topic.

  11. Anju says:

    Thanks sir,

    And suppose i am selecting the Social Network Mining as a My Research Topic can u tell me for the Implementation purpose which Programming lang. or the tool is required?
    is there any specific Web Site from which i can get the Source Code

    • I do not work on social network analysis so I cannot help you too much. I know that some social network datasets are available on the web such as the DBLP dataset. There are some datasets here, for example:
      http://snap.stanford.edu/data/
      You can also find more by searching on Google.

      I know that there is a conference called ASONAM about social network mining. You could check what is published there to get some ideas of what researchers do on this topic. There are also some journals about social network mining such as this one: http://www.journals.elsevier.com/social-networks/

      For the programming language, I think that you can use anything you like. However, speed is sometimes important in data mining. So I would not choose a language like PHP that is considered slow. Personally, I would use a language like C++, Java or C#, because they have good performance. But it may depend on what you do…

      For source code, I don’t know. You would need to search.

  12. deepa says:

    hello sir.,

    In ‘improving the memory efficiency of frequent subgraph mining algorithms for the case of uncertain data with applications in social networks’ its already researched sir you can see in this website…
    http://dl.acm.org/citation.cfm?id=1646028

    thany u…

  13. M. Satari says:

    Very helpful post. I’ve somehow managed to choose a topic but I don’t know weather it’s a good topic to continue on my Ph.D or not. I’m masters student and I’m working on “online microbloging/social networks’ stream mining”. I want to know your opinion on this topic as an expert in data mining domain.

    Thank you

    • I think that it is a good topic. Social network and micro-blogging are popular research areas, right now, and it is good to choose something that is popular. Stream mining is also an interesting topic because it is more difficult than mining a static database.

  14. Ankita says:

    Hi Philippe ,
    A very Big thank for this article , It helps me lot .
    I m interested in “using association rules to classify medical data” , will you please give me more specific idea , in which way I could do research for this topic . how can I improve already exist research work in this topic.

    • I think it can be a good topic. In my opinion, for this kind of topic, you can make two contributions: (1) mining the medical data to do something useful and (2) perhaps improving some association rule mining algorithm so that it works better with your data.

      It would be important to find some medical dataset for your project. Would it be patient records? Datasets about drugs? Datasets about genetic data? Datasets about diseases? I don’t know. You would need to search for some data. You could have a look at what is available on the internet. Or you could also check with your supervisor whether you can obtain some medical data at your university if there is a medicine department. Ideally, you could collaborate with people working in the medicine department who could help you understand the medical data in your project and tell you what is important.

      Once you have your data and know what you want to do with it, you could apply association rule mining and see what problems you have with the current algorithms. Then, you could find some way to improve them so that they better suit your data and what you want to do with the data. For example, maybe you find that the current algorithms cannot do X. Then you find a way to modify them to do X, whatever X is. Or if you find that association rule mining is not the best solution, then you can use something else.

      For improving algorithms, you would need to understand the current algorithms and see what they cannot do for your medical data.
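As a rough illustration of the "apply association rule mining and see what happens" step described above, here is a minimal brute-force sketch in Python. The toy transactions, the attribute names and the function names `support` and `rules` are all hypothetical and chosen only for illustration; a real project would use an established implementation and real data.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the attributes observed for one patient.
transactions = [
    {"smoker", "high_bp", "heart_disease"},
    {"smoker", "high_bp"},
    {"high_bp", "heart_disease"},
    {"smoker", "heart_disease"},
    {"smoker", "high_bp", "heart_disease"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def rules(transactions, min_support, min_confidence):
    """Enumerate rules X -> y with a single-item consequent (brute force)."""
    items = sorted(set().union(*transactions))
    found = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            sup = support(itemset, transactions)
            if sup == 0 or sup < min_support:
                continue  # infrequent itemsets cannot yield interesting rules
            for consequent in itemset:
                antecedent = tuple(i for i in itemset if i != consequent)
                # confidence = support(X and y) / support(X)
                conf = sup / support(antecedent, transactions)
                if conf >= min_confidence:
                    found.append((antecedent, consequent, sup, conf))
    return found

for antecedent, consequent, sup, conf in rules(transactions, 0.4, 0.7):
    print(f"{set(antecedent)} -> {consequent} (support={sup:.2f}, confidence={conf:.2f})")
```

This enumerates every itemset, so it only works for a handful of items. Running into that kind of limitation on real data is exactly where an idea for improving an algorithm can come from.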

      Hope this helps,

      Philippe

  15. Anju says:

    hello sir can u suggest me some topic regarding “Frequent Pattern Mining”

  16. Theva says:

    hello sir,

    I like to do research in healthcare using data mining.

    please give me more specific idea about this topic.

    • You can make two kinds of contributions:
      – doing new things with the healthcare data by applying data mining algorithms
      – creating some new data mining algorithms or techniques that are better than existing techniques or deal with specific problems in healthcare data.

      For more specific topics, I cannot help you because I’m not a specialist in healthcare. Furthermore, I gave the steps in the blog post about how to search for a good thesis topic in data mining.

  17. dhivya says:

    hello sir, i am searching for the m.phil thesis topic in datamining and give some idea about the thesis paper.

  18. M DOORVASULU NAIDU says:

    sir i want to do research in data mining in the field of clustering please mention any new suggestions on this topic

    • Clustering is good. But you need to find something more specific. For example: improve the speed or memory usage of some clustering algorithms? Modify the algorithms to handle new types of data? Use the algorithms to perform some task better? Propose some new ways to detect outliers in clusters? These are just examples. You would need to do a literature review to know what is a good topic and what has been done in clustering (I don’t work in this sub-area of data mining).

      • MANU says:

        Sir, i want to prepare a journal on data mining.its a part of our syllabus.so could you please suggest one topic on data mining. take it urgent.
        Hoping for your reply at the earliest
        Thanku

  19. subeg says:

    Respected sir, I am Mtech student and m going for resarch in data mining…my guide says for selecting a particular field like data mining or cloud mining .sir plz suggest me which is best..

    • Data mining or cloud computing? I think that both are ok and popular in the industry. So which one to choose depends on what you like.

      • subeg says:

        sir if i select one from both data mining and cloud mining plz give your idea…….

        • You could design a new data mining algorithm that can run on the Hadoop technology for cloud computing. It could be a clustering algorithm, a pattern mining algorithm, a classification algorithm… or something else. Then, when you design an algorithm, you need to show that it is faster or more memory efficient than existing algorithms, that it does things that no other algorithm can do, or that it can be used to do something new. If you do a literature review, you should be able to find a good topic.
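To make the "show that it is faster or more memory efficient" part concrete, here is a minimal sketch of a measurement harness using only the Python standard library (`time.perf_counter` and `tracemalloc`). The functions `baseline_count` and `proposed_count` are hypothetical stand-ins for an existing algorithm and a new one; they are not taken from any real library.

```python
import time
import tracemalloc
from collections import Counter

def measure(func, *args):
    """Run func(*args) once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, elapsed, peak

# Hypothetical stand-in for an existing algorithm: count item occurrences.
def baseline_count(transactions):
    counts = {}
    for transaction in transactions:
        for item in transaction:
            counts[item] = counts.get(item, 0) + 1
    return counts

# Hypothetical stand-in for a proposed algorithm computing the same result.
def proposed_count(transactions):
    return dict(Counter(item for t in transactions for item in t))

data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"]] * 1000
for name, func in [("baseline", baseline_count), ("proposed", proposed_count)]:
    result, seconds, peak = measure(func, data)
    print(f"{name}: {seconds:.4f} s, peak {peak} bytes, {len(result)} distinct items")
```

Memory tracing slows execution down, so for a careful comparison it would be better to measure time and memory in separate runs, repeat each run several times, and check that both algorithms return the same output.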

  20. sara says:

    hi
    what about opinion mining? is it good topic for master thesis?

  21. sara says:

    what about data mining in cancer, diabets and heart disease data? are they good? or they are old?

    • There are certainly some things to do on these topics (you would need to do a literature review to see what has been done recently). For medical topics, a challenge is to obtain medical data and find a specialist who could guide you about what the data means and what would be important to do with it. If you can get access to medical data (maybe there is some on the web too), it could be great.

  22. priyanka says:

    sir i want to start mtech thesis in cloud mining or web mining sir plz suggest me any suggestion………..

  23. suzanne says:

    Hello sir, the subject of my thesis is the prediction of links in social networks. What is your opinion and do you can help me with some ideas.

    • I don’t know much about social network mining. If you want to know what is the latest research on this topic, you would need to do a literature review (search for articles on this topic in recent conferences/journals).

  24. jyoti upadhyay says:

    first of all i would like to give my thanks for the article. It is really vry helpful for new researchers . I want to do Phd in “Data Mining on e-learning ” plz suggest me some research topic on it.

    • You should look at recent conferences about data mining & e-learning to see what are the current topics and see if there is something that you like. For example: EDM, AIED, ITS, ICALT, EC-TEL etc.

  25. G.M.Sha says:

    Thanks for excellent support to researchers, my topic is privacy preservation in data mining. Is topic is good to work on? can u pls. suggest some problems on this topic.
    I am interested in consumer behavior and dental field.
    thanks in advance..

    • Privacy preservation is definitely an important topic, since more and more data is collected about individuals or other sensitive topics and it is important to protect it. For the challenges, and for finding a good specific topic, the only way to find out is to do a literature review (read recent articles on privacy preservation to see what other researchers have done on this topic). I don’t work on this topic myself, so I don’t know what the latest research is.

  26. Thanks for your help sir..

    Its great that science doesn’t require any boundary..

    Thanks for your expertise.

  27. Prakash Saket says:

    Hi Philippe ,
    A very Big thank for this article , It helps me lot .
    I m interested in “using association rules to classify medical data” , will you please give me more specific idea , in which way I could do research for this topic . how can I improve already exist research work in this topic.

    • You could:
      (1) improve association rule mining algorithms
      (2) apply them in novel ways on medical data or to perform something better by using association rules
      (3) do both (1) and (2).

      But you need to define more precisely what you want to do with medical data and what kind of medical data. Personally, I have no knowledge about medical data. The only way to find out is to read articles on this topic or to find an expert from the medical domain who could guide you about what is challenging in that domain. Also, maybe you can check if someone has done a similar topic before and identify the limitations of their work. This could be a starting point.

      You could also start by searching for medical data because you need data if you want to do data mining! Depending on the data that you can get and what you want to do with it, it may give you some hints about the problems of the current association rule mining algorithms for this task and what could be improved.

  28. Prakash Saket says:

    Hello sir,
    I like to do research in data mining sir but I do no what to do? And how to do? Please help me..

    • As I said in the blog post, data mining is a very broad field. If you don’t know what you like in data mining, then perhaps you need to read about data mining first (e.g. an introductory book), or read articles from recent data mining conferences and see if there is a topic that you like.

  29. adnan says:

    sir i have to choose a topic in data mining for thesis in mtech
    having to read this blog i got topic which i like are using association rules to classify medical data
    so suggest me how initiate work on this

  30. neha says:

    Hello sir,
    I like to do research in data mining sir but I do no what to do? And how to do? Please help me.. my guide is asking me research topic. I do no anything.. pls help me to find good topic sir in data mining or please tel me any nice topic. I do no what current process is going on data mining sir…i hv no knowledge of programming more.

    Its very ugent sir please help me pls..

    • As I said in the blog post, I will not answer the general question of “give me a topic in data mining”, as data mining is a very broad field and I will not do a literature review for you to find a good up-to-date topic. I suggest first reading about what data mining is and then looking at recent papers published in data mining conferences and choosing something that looks interesting.

  31. amir rezaei says:

    Hi
      If I want to do data mining using fuzzy logic, can you help me

    • You need to define your topic more clearly. Fuzzy logic is a technique for representing fuzzy things. But then, what do you want to do with fuzzy logic? What kind of data do you want to apply it to? Why do you want to apply fuzzy logic to this data? What is your goal?

  32. priyanka says:

    sir i m doing research in cloud plz suggest me any area or artical or topic.

    • You need to read articles from recent conferences to see what people have done on this topic recently, as I explained in the blog post. This will give you ideas about what is interesting for this topic.

  33. amir rezaei says:

    Hi
    I have great interest in data mining. but my professor field is not data mining.
    but I want work on data mining.
    Will you please help me in this regard. or suggest me person that work on data mining.
    I’ll give you the final results.
    I hope you can help me

  34. ankita says:

    hi
    i am M.tech student
    i want to do dissertation in data mining
    my interest to do research in security in cloud computing with data mining
    can you pls suggest better topic or any idea about data mining in cloud computing
    pls give me quick reply
    its very urgent!!!!!!! pls……..
    thanx

    • As I said in the blog post, I suggest reading recent articles related to your topic in recent data mining conferences/journals to see what is popular right now and find a topic that you like.

    • Firas M.Awaysheh says:

      Hi ankita ,
      I see that you post this almost a year ago and i have the same interests in my research. So how was it!? and what suggestions you may kindly offer to enhance my proposal.

      many thanks dear.
      Firas M.A

  35. Colm says:

    Hi Philippe,

    Fantastic website, Recently I’ve started my 4th year of college and have chosen data mining as an elective. Im looking into the major challenges faces in adopting data mining techniques within the public and private sectors. Just wondering could you steer me in the right direction.

    Many thanks,
    Colm

    • Hi Colm,

      Thanks. I’m glad that you appreciate the website.

      I’m into the technical aspect of data mining. I’m a professor in Computer Science. What I do is that I design some data mining algorithms and we apply them in some research projects. I don’t work closely with the public/private sectors since I’m working on the fundamental aspects of data mining (algorithms). So I cannot give you a lot of information about the challenges for adoption of data mining techniques.

      Some challenges that come to my mind are:
      – you need some trained people to understand what is data mining and how to analyze the data (e.g. data scientist)
      – data mining techniques offered in data mining software are not always well-suited to all domains. All data mining software offers a limited choice of data mining techniques. If they don’t fit with what the company wants then you perhaps need to hire some data mining specialist to design something custom.
      -….
      Philippe

      • Colm says:

        Hi Philippe,

        Fantastic, That’s a great start none the less. I appreciate your reply. Thank you again.

        Colm

  36. kiran says:

    sir m doin a research on Research issues on web data mining is this a good topic… i have read the introductory part nd some pdf also what else i can do in that in order to make my research better.

    • Yes, web data mining should be a good topic. Then, I think that you need to continue reading on this topic. By reading on the topic, you will see what people have done recently and get some ideas about what you can improve or do differently. It is important to know what other people have done before starting to do a research project.

  37. Mesfin Alemu Wotere says:

    I’m MSc student interested in data mining for my thesis. I have got a good approach how to do my thesis. Your suggestion is valuable. My thematic area is weather forecasting, what do you help me in identifying specific title.
    It is urgent!
    Thank you very much

  38. lap dat camera says:

    This is great information, thanks’ for share!

  39. P s raju says:

    Dear Sir/Madam

    I am pursuing M.Tech(CSE), I want do project in data mining, and also same project need to PhD thesis after MTech. Shall you give me good research topic on data mining?
    Please reply me
    Thank you

  40. Ahmed Mohammed says:

    i need to make my thesis in data mining using cluster algorithm for web pages to make semantic web

  41. Nishant Bhatt says:

    hi sir,

    I have choosen mood identification from music data as my topic.its a part of multimedia data mining.so i want to research using data mining algorithm.so can you give me suggestion on this topic.

    thank you sir.

  42. shakthi says:

    sir,
    i want to do research in data mining for my m.phil. how can i find a topic for my research or give some topics for suggesstion. or any website/book i can refer pls give that address/bookname

    thnks in advance

  43. parminder singh says:

    hi sir,
    i want to do my mtech theisis on clustering or classification. can u suggest any new topic in this field???????????? thank you sir

  44. agboola martins says:

    your type of article should be encouraged in research development. the article has directed my path on how I can come up with a researchable topics. I will still call on you after following your laid down approach to data mining. thank you Sir

  45. GAURAV AMETA says:

    Hello Sir

    I want to do my PhD work for privacy preserving in data mining for various transactions.Suppose I purchased items among a b c d e f g h
    1 0 1 0 1 1 01 . 1 means I purchased that particular item and 0 means not purchased.Which kind of technique I can apply in to protect binary data.I think the Privacy can be protected by using some cryptography algorithm.I am little bit confused how I can implement my idea with these binary transactions. please help.

    • Yes, I think that you can encrypt your data using an encryption algorithm.

      The fact that the transactions are binary or text should not cause any problem for an encryption algorithm. In any case, you will give data to the algorithm, which will generate encrypted data.

      To know how to do it in a particular language such as Java, you would need to see which encryption algorithms are available and search for some tutorials, I think.

  46. subeg singh says:

    i m doing thesis in cloud mining secuirty,sir plz suggest me any important topic………

    • I think that you need to define your topic more clearly. Cloud mining is too broad a topic. You need to do a literature review to see what is interesting in this area or discuss with your research advisor. I don’t work in this area of data mining myself and I will not do the literature review for you 😉

  47. Alex says:

    Hello. I’m a Business Analytics (M.Sc.) student. I would like to write my thesis about a specific problem on business data which can be solved by a data mining algorithm. I like the kind of problems from the Data Mining Cup. I can not decide on any specific algorithm right now but I know that the problem should be business oriented. I would be thankful for any advice.

  48. Abdul rasak says:

    sir
    i am an assistant professor in comp. aplications. i am interested to do a research in agricultural field especially in cardamom cultivation. it would be helpful if you suggest some related topics to the starting of the work

  49. amir rezaei says:

    Hi
    Your opinion do we work on the subject ‘ data mining based constraint’

  50. Animesh says:

    Sir,
    I m interested in “using association rules to classify medical data”, but i am not so good in programming.Is it possible for me to continue this topics without programming or is the source code available in the web site? if source code is needed , will you able to provide me some links for source code?

    • Hi,

      There is some source code available for association rule mining, for example in my own data mining library: http://www.philippe-fournier-viger.com/spmf/

      However, if you want to do a Ph.D. in Computer Science you will most likely need to do programming, because doing a Ph.D. or Master’s degree in Computer Science without doing programming does not make sense, I think.

      Another possibility is to do a thesis from a mathematical perspective if you don’t want to do programming.

      Philippe

  51. FA says:

    hello sir
    special thanks for the article.
    sorry if it is also includes in the “can’t be answered ” questions.
    I would like to do my Ms thesis regarding Social network analysis, I do some research but I can’t decide yet ,which filed: could you suggest me some more conference or journals regarding this topic despite from http://asonam.cpsc.ucalgary.ca/ and http://www.journals.elsevier.com/social-networks/ which you already mentioned
    or any other way to extract open areas in this field.

    thank you so much
    regards,

  52. Jagrut says:

    Sir, i have to work on “finding closed frequent patterns and their association rules”
    on data mining, i’ve tried but cant find better approach to work with, so please tell me some parameters and approaches that can improve the efficiency.
    thanks

    • Hello,

      If I had a very good idea about this I would keep it for myself, as I also work on pattern mining 😉 There are certainly some ideas, but it takes time to find a good one, and when you get a good idea it is better to keep it for your own research. If you cannot find one, then you can consider variations of the problem of association rule or itemset mining. For example, if you consider uncertain itemsets, fuzzy association rules, etc., then maybe these topics are less explored and it is easier to find some ideas. These are just some thoughts. In my opinion, it is easier to create a new problem or a new algorithm for a variation of the problem of association rule mining than to try to make something faster than the best algorithms for association rule mining.

      Best,

      Philippe

  53. Gajanan says:

    can you please tell me few research topics on fuzzy multimedia data mining related topics

  54. vaitheeswaran says:

    Hello Sir,
    I’m interested in Big data.
    1.
    How does data mining and Big data are related ?
    2.
    I scanned the internet for Big data architecture, i cant get clear picture of it. Can u suggest me a standard article or book for understanding Big data ?
    3.
    how to develop a own architecture for large data handling ?

  55. Divdeep Singh says:

    Sir, can you please tell me the research topics in data mining for m.tech thesis. i am not good at programming.

  56. Rohith says:

    Sir,I need some topics which have less programming to do as a project.please suggest me some ideas.

  57. Fatima says:

    sir! give me some idea that what new can i do for “feature subset selection” in data mining….

  58. Meenu says:

    Sir,
    Im dng Mtech my research area is datacube Materialization & MapReducing..
    can you gve some suggessions for me..

  59. Ravi Kumar says:

    Really good. Many thanks to Mr.Philippe for his replies.

  60. Devendra Vashi says:

    i am searching Ph.D research topic in privacy preserving data mining

  61. vilas says:

    Sir, May I get data set for doing my research on “data mining for elearning” pl repy.

  62. Neeta H. Jadhav says:

    Sir, I am doing M.Tech in Computer science and technology. i want to do final project on data mining but i am unable to find any problem in data mining so please tell me the problems on data mining or any other areas.
    please take it urgent. I look forward for hearing good response.
    Thank you.
    Neeta Jadhav

  63. maheep gupta says:

    sir, I m student of Mtech and I am not able to find topic for my thesis can you suggest me some interesting topics.

  64. maheep gupta says:

    Sir, I am seraching for a thesis topic in Data Mining for my M.Tech course.

    can you suggest me some recent topics, I m not able to find any topic.
    just suggest me some topic so that I can start reading about it.

    memory efficiency of clustering algorithms, is this topic is good.

  65. Tich Chigidhani says:

    I am doing an undergrad research on multimedia mining can you suggest a topic?

  66. Jagrut says:

    Hello sir,
    I’m doing my thesis on closed frequent patterns and association rules, i am trying to find it by combining “n-list” and “charm” , So, how can i use any frequent pattern finding structure to find closed patterns, or is it possible. Any guidance plz…..
    Thanks,

    • There are many ways to solve the problem of closed itemset mining. Several algorithms have been published. To my knowledge, LCM should be one of the fastest according to the last FIMI competition in 2004. Other fast algorithms are DCI_Closed, FPClose, Charm etc. So, yes there are certainly still possibilities to improve these algorithms.

      How to do it? I would say to read recent articles to see advances or ideas that could be transposed to closed itemset mining, or to try to find new ideas. However, much research has already been done on this topic. To make a significant contribution, one would preferably need to compare the new algorithm with the fastest closed itemset mining algorithms.

      About nList, I have briefly read about it and it seems to be another variation of the FP-Tree. Therefore, perhaps it would be easier to use it with FPClose than with Charm… I don’t know. Besides, I have noticed that the author of nList seems to be posting links to his own article on many internet forums, including several times on my own forum. I checked the paper, but I was personally not convinced by the experiments.
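
      For readers unfamiliar with the problem: a frequent itemset is closed if no proper superset of it has the same support. The brute-force sketch below only illustrates that definition on a toy database. It is not LCM, Charm, FPClose or any other published algorithm, it would not scale to real data, and the function names and the sample transactions are made up for this illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, minsup):
    """Return {itemset: support} for itemsets found in at least minsup transactions."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= minsup:
                result[itemset] = support
                found_any = True
        if not found_any:
            # Support can only decrease as itemsets grow, so we can stop here.
            break
    return result


def closed_itemsets(frequent):
    """Keep only the itemsets that have no proper superset with equal support."""
    closed = {}
    for itemset, support in frequent.items():
        absorbed = any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
        if not absorbed:
            closed[itemset] = support
    return closed


# Hypothetical toy database of five transactions.
transactions = [
    frozenset(t)
    for t in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"])
]

frequent = frequent_itemsets(transactions, minsup=2)
closed = closed_itemsets(frequent)

for itemset, support in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

      On this toy database, {c} is frequent but not closed because {a, c} appears in exactly the same transactions. A new algorithm would have to produce the same closed itemsets as such a reference implementation, only much faster, which is why comparing against the fastest existing algorithms matters.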

  67. balvinder taneja says:

    sir,
    i want to work on data mining in cloud computing? can u suggest me any appropriate topics related to this field.
    balvinder taneja

  68. malini joshi says:

    Sir, I am seraching for a thesis topic for my M.Tech course.
    can you suggest me some recent topics, I m not able to find any topic.
    just suggest me some topic so that I can start reading about it.
    n my area of interest is network security or data mining

    • As I said in the blog post, I will not suggest any topics. Looking for a topic takes time and I don’t have time to search for topics for other people. Moreover, as I explained in the blog post, you should choose a topic that you like, not a topic that someone else has chosen for you.

      If you have no idea what to do about data mining and network security, then a good start is to search for papers about data mining and network security in Google Scholar to see what other people have been doing on this topic.

  69. Kalpana says:

    Really nice article , I just want some research topics using clustering

  70. Umar Hayat says:

    Hello Sir,
    I am very glad to know that someone is guiding the humanity in some way, like you through such a honestly interesting page.
    Sir i am searching for my thesis topic ……..but after studing your kind blog and following your steps I am going to do my research in obtaining a data about earth properties and their environment and materials condition and based on that data i am going to know that to some extent that in this place there is occuring that ratio of iron , gold water etc..
    Is this will be good for me .
    Need your kind reply.
    thanks in advance.

  71. Jhosat says:

    sir,
    what possible topics can i do if i like to work on text mining
    i want to do a research that can be use to our school
    can you suggest some ideas ?

    plz reply
    thanks mr.phillipe

  72. Hossam says:

    I want researches about applying Data Mining techniques on master and PHD thesis and dissertations database. Thanks.

  73. neha says:

    Sir
    i want to do my mtech thesis on page ranking algorithm in data mining. Please suggest me some good topic in this for my thesis.

  74. ehsan says:

    Dear Sir,

    Firstly, I would thank you for useful article, then I am interested in clustering techniques for Data mining. Please guide me a topic for my PhD dissertation.
    My email is ehsan_amiri125@yahoo.com
    thank you.

    • Hi,
      As explained in the blog post, I would suggest that you do a literature review on this topic to find out what has been done recently on clustering. You can start by searching on Google Scholar to find some recent articles on clustering and to see what the current challenges are.

      Best regards,

  75. Anitha says:

    I am doing my research work in recommender systems (Data mining area) and concentrating attack concept. Now i am seeking classification, clustering algorithm. which one is the best and how to proceed?

    • I have no idea what “concentrating attack” is. I think it depends on what you want to do. Classification generally means assigning instances to a set of predefined categories. Clustering generally means creating the categories automatically. So they don’t exactly do the same thing.
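
      To make the difference concrete, here is a minimal sketch in plain Python. It assumes a simple nearest-centroid classifier and a naive k-means as stand-ins for classification and clustering in general; the data points, the labels and the function names are all hypothetical.

```python
def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def closest(point, centroids):
    """Return the key of the centroid nearest to the point."""
    return min(centroids, key=lambda k: squared_distance(point, centroids[k]))


def nearest_centroid_fit(points, labels):
    """Classification: the categories are predefined by the training labels."""
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


def kmeans(points, k, iterations=10):
    """Clustering: no labels are given, the groups are discovered from the data."""
    centroids = {i: points[i] for i in range(k)}  # naive init: first k points
    for _ in range(iterations):
        groups = {i: [] for i in range(k)}
        for p in points:
            groups[closest(p, centroids)].append(p)
        for i, members in groups.items():
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return [closest(p, centroids) for p in points]


# Hypothetical 2-D data.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
labels = ["low", "high", "low", "high", "low", "high"]

# Classification needs the labels and answers with one of them.
model = nearest_centroid_fit(points, labels)
predicted = closest((2.0, 1.0), model)

# Clustering ignores the labels and only returns anonymous group numbers.
clusters = kmeans(points, k=2)

print(predicted)
print(clusters)
```

      The classifier can only answer with a category it was given during training, while the clustering step produces group numbers whose meaning still has to be interpreted afterwards.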

  76. riya says:

    hii

    i have decided to do my thesis on focused web crawlers for gathering educational material on the web. i just wanted to know if this topic is relevant in today’s time , i hope it not a very old topic?

    • I think it is ok. It depends on how you do it. You need to read articles on this topic to see what has already been done. I personally don’t know what has been done on this topic. Depending on what has been done, to do good research work you need to improve on it or bring some new ideas. If you can do that on that topic, then the topic should be good.

  77. Sam says:

    Hi,
    I am looking for a research topic within data mining. I am interested in multimedia data mining in social media with social event detection as an application. My problem is to find a gap and a topic for my thesis.
    Please suggest me a topic and where to start looking for it?
    Best Regards,
    Sam

    • As I said in the blog post, looking for a topic takes time. You need to look for it by yourself or ask for help from your research advisor. If you have no idea where to start, then I suggest just searching for the keywords that you know in Google Scholar to find some articles, or perhaps looking for some general survey articles to get an idea of the whole domain first. Then, when you find something interesting, look deeper.

  78. sam says:

    Hi
    I am a Phd student of medical datamining but Im still trying to identify my research questions … I am looking on applying datamining algorithms on clinical datasets as i dont have strong programming skills…could you please suggest any new applicable methods that is yet to be explored as my possible reseach questions?

    Thanks, Hope to hear from u soonest!

  79. eli says:

    hello
    I like to do datamining on blood transfusion data plz guide me how I can define subject about it

  80. Francois says:

    Hi Philippe

    It is truly astonishing to see how many ‘graduate’ students completely misread your post! I stumbled on this post during my own search for masters ideas, and I guess I got my night’s entertainment too.

    Good post, and I applaud your patience.

  81. nastaran says:

    Hi, thanks about your recommendations they are so help full.
    can you plz recommend some topics about outliers in data mining. I read all of the comments but i could not find any related things . is it some topic that common or it was old and have not any new thing for research.
    thanks alote philip.

    • Hi,

      If a research area becomes less popular, it does not mean that there is nothing interesting left to do. There are always some research problems. You just need to find one.

      For specific topics, I recommend following the steps in the blog post, which start with doing a literature review.

  82. kutty says:

    hi sir
    can you please tell me related topics on Analytical hierarchy process in data mining for Ph.D. Please give me more specific idea about this topic. Plz reply s soon as possible

  83. Aadil Hussain says:

    Hello sir, I have to give a research proposal for M.phil in computer science. can you please suggest me some brief review type of research topic in neural networks(extensible).

  84. Kuldeep says:

    Sir i want to relate data mining in agriculture.. but i’m unable to find any specific thing. plz guide me for my PhD topic selection

  85. Muhammad Asadullah says:

    I want to do my research in data mining…

    please sir can I apply any data mining technique on DIP..

    or to develop any data mining technique for image processing…

    • I don’t work on image processing, but there are certainly some ways to apply data mining to images.

      For example, there is some work on spatial association rule mining to find patterns in geographical data. This geographical data could be extracted from satellite images before applying the data mining algorithm. This is just an example. You would need to search to find more information about the possibilities.

    • dev says:

      data mining have different technique like svm ,ANN , PCA, decision Tree ,rough set theory, clustering that are mostly used for DIP

  86. sara says:

    Hi
    If i can use current method of text mining for another language, for example English method for Arabic, would it be a new challenge for my thesis and write a new article?
    Arabic language is different from English, so by using current method in different langue i can achieve new result.

    • From a data mining perspective, it is a new challenge if the method does not work well with Arabic and you need to modify the method to make it work for Arabic. If the method just works for Arabic without modification, it would perhaps seem too simple. But text mining is not my specialty. You may have a look at papers about text mining for other languages to see how they have presented their work.

      • sara says:

        Hi
        It is about a year that a write a comment here :). On that time I choose opinion mining as my thesis topic. I work hard to fine a new approach in opinion mining. But you know, I was not so successful in this way. I make a new sentiwordnet for a new language from existing wordnet. but I couldn’t suggest a new method. some one told me, that by combing existing method I can introduce new one. but how can i do that ? for example by combing svm and naive bayes how can i create a new method?
        really I need your help. how can I continue my thesis in a way to find a new idea?
        thanks

  87. pankaj says:

    hi hello , im student of MTECH 1st year and want to do reseach in data mining , by which i can easily take admission in phd programe .. so tell me topic which have wide scope of research ?

  88. crystal says:

    I really need more help into the research of mining and what questions I should research more on it is due tomorrow and I do not know anything

  89. crystal says:

    I need help to answer the questions I am a year 12 pa student from international overseas

  90. Deepali says:

    Sir, can you please help me, i am searching for PHD research topic, i am thinking for context aware computing in mobile computing.is it a good topic to move on?

  91. zeynab says:

    hi,i need help i must impelementation this paper with c# but i dont know how do ?plz help me

    A Novel Method for Privacy Preserving in
    Association Rule Mining Based on Genetic
    Algorithms
    Mohammad Naderi Dehkordi
    Ph.D Student, Science & Research Branch, Islamic Azad University (IAU)
    Department of Computer Engineering, Tehran, Iran

  92. vrushali says:

    Yes good article
    i really like it
    i will find out how we can improve better library services though data mining
    i will also try to learn it

    Thanks

  93. nasrin says:

    Any management related data mining topic for MSc thesis? I don’t have much algorithm related knowledge , got programming knowledge in c/c++/sql.Please provide me some recent topic relatd to management related .

  94. Rafi ullah says:

    dear sir i am too much confused that what topic for thesis should i select. if you have any good topic so plz guide me thanx .

  95. salina says:

    if i want to do thesis in datamining first thing i need is the data sampel. but in our country people donot give any data.. then can you give me solution for findind the data samples for datamining thesis.

    • Then, you may have to collect your own data or use public data from another country, if some is available. There are many websites, which you can find using Google, that offer public datasets that you can download. If you are lucky, there will be some public datasets for your topic.

      Otherwise, depending on your topic, you could consider generating some artificial datasets but it may not be a good idea depending on your project. In general, it is recommended to use public data.

  96. salina says:

    can you say me how can i collect the data sets required for datamining because it is impossible in the country like nepal..

    • How to collect the data depends on what kind of data you want to collect.

      For example, if you are doing data mining on network data, then if you have access to a webserver or router, you could just use the logs from the server or router to perform data mining. Another example is if you are doing data mining on source code from software. You could just download any open-source project and perform data mining on the source code.

      On the other hand, for some types of data, it may be very difficult to collect data. For example, if you want to do data mining on medical records, it may be very difficult to get access to that kind of data from the government, unless some public data is already available.
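
      As an illustration of the web server case, the sketch below turns access log lines into one "transaction" per visitor (the set of pages successfully requested), which is the kind of input an itemset or association rule mining algorithm expects. It assumes the logs are in the Common Log Format; the sample lines, the addresses and the function name are hypothetical, and a real project would also need to handle sessions, robots and privacy.

```python
import re
from collections import defaultdict

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def logs_to_transactions(lines):
    """Group the successfully requested pages by client address."""
    pages_by_host = defaultdict(set)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not follow the expected format
        if match.group("status") != "200":
            continue  # keep only successful requests
        pages_by_host[match.group("host")].add(match.group("path"))
    return dict(pages_by_host)


# Hypothetical log excerpt using documentation-only IP addresses.
sample_log = [
    '192.0.2.1 - - [10/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2013:13:56:01 +0000] "GET /courses.html HTTP/1.1" 200 512',
    '192.0.2.7 - - [10/Oct/2013:14:02:10 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2013:14:02:15 +0000] "GET /missing.html HTTP/1.1" 404 209',
    'this line is not a valid log entry',
]

transactions = logs_to_transactions(sample_log)
for host, pages in sorted(transactions.items()):
    print(host, sorted(pages))
```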

  97. Msharifi says:

    it was very usefull article. thanks so much:-)

  98. niku says:

    Thanks yr bog provides very important guidelines.
    sir
    I want to do Ph D in Data Mining. I read yr blog & from that i decided to do work in Data Mining on Medical Data(gynec or orthopedic). I can get database of patient, diseases etc as one of the my family members is a doctor. so pls could you tell me what can i do after getting a database.

      • It is good that you have data, but what is your goal? What do you want to do with the data? Do you want to perform early prediction of who will get a particular disease? Do you want to predict the reaction of a patient to some drug prescription? You need to decide what you will do with the data. To do that, you should have a look at what other people have done by reading some articles on this topic.

      • niku says:

        Thanks for reply
        Sir i wanted to predict the baby born date and prediction of premature delivery.
        so can do , how to do

        • You would need to have some training data. Then you could train a neural network to do the task of prediction. Similar techniques to neural networks may also be used like SVM…

          • niku says:

            after selecting data how to select algorithm or method

          • In general, it is recommended to try the popular methods first. To know what the popular methods are for your topic, it is sometimes necessary to read articles on the topic. In your case, you could try neural networks directly, I think. If that does not work well, then you would need to find something else by reading. Sometimes in data mining we even need to modify the algorithms to achieve what we want, if no existing algorithm is appropriate for what we want to do.

  99. Nidhi says:

    Hello sir, I am pursuing mtech. I have chosen opinion mining under data mining. Can you suggest some topics in it? Can you suggest which tool I should learn? I am free for 2 months. I want to utilize this free time in doing something productive in topic. So can u please help me ?? Thank you

    • As said in the blog post, you should do a literature review, which means reading articles on your topic (opinion mining). This will allow you to find some topics. I cannot do that for you.

  100. Raghav says:

    hello sir, i am willing to do research on data mining field, i have decided some topics like data mining techiniqs for online social networks and analysis, web mining for predicting user behaviours , topic detection and tracking .. out of them which one is suitable or how can i combine all these issues..

    • These topics are quite general, so any of them would be ok. But in any case, you will need to define your project more precisely. To do that, the only way is to read papers on these topics and try to get some new ideas about what could be done better or differently, or what has not been done yet. Doing this takes time, and you need to do it by yourself or with the help of your supervisor.

  101. satish nathu bhadane says:

    I will be try for Phd in Psychological data & Data mining concept but my work that analysis of mind traffic.
    Psychological or spiritual concept & data mining Concept combined for research work &Is it contribution of Computer Engineering?
    suppose yes then please tell me right way……………….

    • Yes, it can be. It is definitely possible and interesting to combine Computer Science with other disciplines such as Psychology. Now, whether it deserves a Ph.D. in computer engineering depends on whether you will solely be a user of someone else’s software, or whether you will develop your own software program that solves non-trivial computer science problems for your application. You may also want to check previous theses published at your university to see examples of subjects that have been accepted for a Ph.D.

  102. devendiran says:

    dear sir i am completed mphil too much confused that what topic for thesis should i select. if you have any good topic so plz guide me thanx .

  103. krina says:

    sir i am interested in data mining in educational or learning field for my m.tech dissertation can u give me some specific ideas abt it???

    • Data mining in educational or learning field is still a very broad topic. You need to define something more precise.

      You could search the Educational Data Mining (EDM) and Learning Analytics (LAK) communities for papers on these topics, or also some e-learning related journals and conferences.

  104. krina says:

    sir for decision support in dental disease prediction which kind of further research i can do for masters can u give me some idea about this ?

  105. jalpa says:

    i need your guidence .. to choose my thesis topic on data mining

  106. Makisna says:

    Sir, I was choose datamining for my phd. 2 years over.but still i did not choose proble. but my dataset is medical imbalanced liver data dataset. It’s correct or not .how to choose problem based on this things and how. please help me.

    • I think you should discuss with your research advisor to ask for some help in choosing your problem. I cannot give you any ideas since I don’t work on medical data and choosing a problem takes time. Actually, what you probably need is a medical expert that can tell you what are the important problems in the medical field that you could solve. Otherwise, you could have a look at related papers and see what other people have published on this type of data.

  107. D Rajakumari says:

    Sir, I was choose datamining for my phd. 2 years over.but still i did not choose proble. but., my minor project topic is TECHNOLOGY AND ITS IMPACT IN THE CLASSROOM. i think continue this topic . pls help related to data mining or big data. It’s correct or not .how to choose problem based on this things and how. please help me.

    another question
    big data is new one. to choose related to the topic in TECHNOLOGY AND ITS IMPACT IN THE CLASSROOM. pls help

    • Hi, technology & impact in the classroom looks like an important topic for society.

      If you want to do data mining related to this, as I have described in this blog post, you will need to read research papers on this topic and to find something that has not been done before or that you think you can do better. I cannot help you to do this because it takes time to read papers. But you can ask your research advisor for help.

      Another important point in data mining is that you will need data for your research, either by downloading it or by collecting your own data. Since the availability of data may have an influence on what you can do for your topic, you could also search on the web or ask authors of papers if they can provide their data, if you plan to use the data of someone else rather than collecting your own data.

  108. Ananda says:

    Sir,
    i would like to do a Masters dissertation on the topic data mining in Automotive Diagnostics. Could you please suggest me some topics

    • For data mining, you need some data to do the mining. Can you obtain a database of automotive diagnostics? If yes, then you could try to build a system that automatically diagnoses the problem of a car based on what it learned from the database. That is just an idea. I don’t work on this topic and I did not do a literature review, so it does not mean it is a good idea. As I said in the blog post, what you should do first is a literature review.

  109. Namita Garg says:

    Sir I want to do Ph.d in Database. I have knowledge of Batch programming. But i am unable to find that how can i choose topic for research and how i use database and batch programming combination in research area?

    Please help me.

    • In my opinion, if you want to do a Ph.D in computer science, you need to know some programming language like C++ or Java. If you only know batch programming (such as .bat files), it seems hard to do a Ph.D. in computer science.

      • Namita Garg says:

        Sir i have good knowledge of database also. i have done my internship in oracle. and i am very interested to do ph.d in database. but i am confused to choose a topic for research.

  110. devipriya says:

    Helo sir, now i’m doing my m.phil computer scince, please give idea to select the topic in data mining area,

    • You can start by reading the blog post at the top of this page. It gives you the steps to find a good topic rather than giving you a topic, because finding a topic takes time and thus I cannot do it for other people.

  111. devipriya says:

    And also need some research paper using clustering, now I’m interested to do my project in mat lab so give topic to related to using mat lab

  112. soumya says:

    hello sir,
    i have chosen “stock price prediction using machine learning algorithms” as my project topic…and i got information from my senior that we need to consider the stocks of america or London as Indian stock market is not so strong and we can’t get much data..but i don’t know from where i can get stock data for mining..and how can i access data and make use of it for mining in my project…please guide me from where can i get stock details of these countries stock market and how to mine that data??

    • I have not worked with stock data before. But usually, here are the ways to get data for a data mining project:
      1. Find some data that has been collected by other researchers. If you read some papers that have used stock data, they may indicate that the data is available on their webpage. In that case, you can download it directly. If it is not, you may contact the authors of the paper directly by e-mail and ask them for their data. You may tell them that you will cite their paper if you use their data.
      2. Collect your own data. You could write a small program such as a web crawler to browse some website that offers stock data and collect it by yourself.
      3. Check some dataset repositories. There may also be some webpages that are dataset repositories, such as the UCI repository, that offer many datasets. You may check them. But I don’t think that there is stock data on UCI, for example.

      If you collect your own data, the advantage is that you will have all the data that you want. But you will spend more time collecting the data.

      If you use the dataset of someone else, then you may not have all the information that you want, because the other researchers may have collected only the information that they needed for their research.

  113. HIMA says:

    hi…..
    really good one. this page especially helped me a lot as i am in my initial days for selecting my research topics this helped me alot like how to start? where to start?
    i am really thankfull to the advisor.

  114. Thaddeus Ogwoka says:

    Dear
    Philippe Fournier-Viger
    I have real benefitted from your blog. Most appreciated, it has real enlightened me on choosing my MSc research proposal.

    Nevertheless, how do i write aproposal to be approved by the panel.
    Kind regards,
    Thaddeus

  115. nao says:

    Hi Philippe Fournier-Viger
    I am a beginner in data mining. As i am going to carried out my Ph.D in data mining applied in bioinformatics. I am really confused how to start my work.

  116. abuasba says:

    I really thank prof.Philippe Fournier-Viger for replies and help .

  117. fayaz ahmad says:

    sir i want to choose a research topic in data mining sir what can i do for choosing a research sir i need those topic which are so sample and are not complex sir plz say me today

  118. Abdulrauf says:

    hello sir, i did my M.Sc research on enhanced data mining algorithm on online Electronic shopping please i want to some topics related to this for my PhD research works.

  119. R.Siva says:

    Hello sir,
    I need your guidance to choose topic in conferrence,So sir please give ideas about some recent problems in data mining, and how to solve it?

  120. soumya says:

    sir,my topic is “stock market prediction using machine learning algorithm”, first thanks for your idea regarding collecting data from the different sources, sir now i am planning to use any machine learning algorithm in prediction and not understanding among SVM,ANN and K-Nearest Neighbour techniques which will be more suitable for stock prediction?? please do help

  121. galag says:

    hello sir,
    i have chosen “opinion mining” as my project topic, my question is what is new”hot” areas.
    i know some topic like:
    1. opinion mining classification
    2. opinion extraction
    3. opinion search and retrieved
    4. opinion spam detection
    5. quality of reviews
    6. lexicon generation
    i am interesting on recommendation system based on opinion mining . how about that and please suggest me new areas .
    my 2 ed q is how to find a real problem ???

    • Since I don’t work on opinion mining, I’m not aware of the recent topics in this area. You could find some ideas by reading the recent papers in conferences on data mining, AI or recommender systems (e.g. RecSys), or in journals. I see that you have done some literature review already. Then, you should continue reading 😉

  122. galag says:

    thank you prof Philippe Fournier for your guidelines, my second question is how to find a real problem ?

    • I’m not sure what you mean by “real”. I guess that you mean an important problem with real-world applications. If this is what you mean, then there are different ways to find interesting problems to solve. One way is to read papers and see which problems other people have solved. Then you may define the problem slightly differently from what they are doing. Or you can take their problem and propose a better solution to it. Or, if you want to find some ideas for problems that are more applied, you may talk with experts in the industry, etc.

  123. Raj says:

    I really have to commend you Philippe for continuing to respond to comments over a year and a half later! This blog post was very helpful. I have already been reading quite a bit of literature and I am running into the problem of having too many interests. I just have to focus on one and go with it. Thanks!

  124. sara says:

    Hi
    I wrote a comment but there was no answer 🙂 I write it again.
    Hi
    It is about a year that a write a comment here :). On that time I choose opinion mining as my thesis topic. I work hard to fine a new approach in opinion mining. But you know, I was not so successful in this way. I make a new sentiwordnet for a new language from existing wordnet. but I couldn’t suggest a new method. some one told me, that by combing existing method I can introduce new one. but how can i do that ? for example by combing svm and naive bayes how can i create a new method?
    really I need your help. how can I continue my thesis in a way to find a new idea?
    thanks

    • Hi, sorry for not answering you the first time. I have been a little bit busy recently. First, thank you for coming back to this website! Doing research is hard… it is not always easy to find new ideas. That is true. Sometimes, it may take time to find something interesting. Or sometimes, we may have a new idea but it may fail or provide bad results. This is normal. The more you do research, the easier it will get. Yes, it is possible to combine two methods to get a new method. But it should make sense to combine the two methods. We should not combine two methods just for the sake of combining them. For example, combining naive Bayes and SVM may not make sense depending on what you are doing. Also, combining two methods does not mean that you will get better results. Moreover, if the combination is straightforward, then it may not be a significant contribution because it may be too simple.

      If you are working on a new language, then you may try to find what challenges are specific to this language. If you can identify some new challenges for your target language, then you may find a way to address these specific challenges, and this could be your contribution.

      I am just talking in general because I don’t work on opinion mining. But don’t worry. If you work hard enough you should find some new ideas.

      • sara says:

        thanks. But sometimes I become hopeless. I think maybe opinion mining a research area that there I cant continue it because the are so many research about it and is not possible to find a new method.

  125. AMIT KUMAR says:

    Dear sir,

    kindly suggest me the topic for research on data mining so i can start my theses on data mining.

    thanks & regards
    amit kumar

  126. Nishat Raihana says:

    Hello Professor Philippe

    I am Nishat Undergraduate Final Year Student. I would like to do my undergraduate thesis on Data Mining. I have read several article on it still I am little bit confused about it. However, I would like to do my thesis on Data Mining specifically on Social Networking site like – facebook or twitter. However, I need your suggestions how to start working on them. It would be really great if you could provide me some suggestions. I am looking forward to hear from you.

    • Hi,
      Thanks for commenting on this blog post. My suggestions:
      – Read papers about social network mining to know what other researchers are doing. In your thesis, you need to do something that other people have not done yet, or do something better (faster, more accurate) than what they are doing. So you need to know what they are doing.
      – You will also need to find data. Either you collect your own data, in which case you should try the API provided by Twitter etc. to see what you can collect, or you can look for public datasets or contact the authors of other papers to get their data. This is important because you need data to do your research, and depending on what your data contains, you may or may not be able to do a given research project related to social network mining.
      Best,

      • Nishat Raihana says:

        Dear Professor Philippe

        Thank You for your Kind Suggestions.It shows me a way to start work on my thesis.

        Thank you

  127. Vayn says:

    So to start with, im helping my friend to find good reference in finding the thesis topic. she kind interested in school related topic sir. can you suggest me anything or give me a little hint as milestone maybe, i would really appreciate it. thanks

  128. salmooz says:

    Hello Professor Philippe

    I am student in computer scince . I would like to do my undergraduate thesis on Data Mining , I want to use wireshark to collect data from intranet (small lan (univercity))and track the movement about ip and protocols and use data mining tools to know types of protocols and IPs and content of massages, I need your suggestions how to start working on them. It would be really great if you could provide me some suggestions. I am looking forward to hear from you.

    • Hi,

      Thanks for your message. I cannot help you too much since I’m not working on this topic. But the most important thing in a data mining project is always to have data. You may try to collect some data to check the feasibility. Besides, what is even more important is to read what other people have done about applying data mining or machine learning techniques to analyse network data. Some people have worked on this topic before. You need to read their papers, at least briefly, to see what they have done. This is important because in research, it is expected that you will do something new that other people did not do, or something better than what other people did.

      Best,

  130. sara says:

    Hi,
    I have a data set with 1000 features. I use SVM and naive Bayes for classification, with both presence/absence and TF-IDF feature values, but the results are different. How can I find the reason for this difference? I want to know why SVM and NB give different results, and why using different feature values also changes the results.

    There are 1000 features and I can’t examine the correlation of each one by hand. Is there a method for that?

    • Hi,

      Different techniques give different results. Why is a difficult question to answer. You may need to run several experiments with your data to find out why. And as you said, with 1000 features, it may be complicated. But have you tried to use a technique such as PCA to reduce the number of dimensions? Or you could just do some pre-processing to remove some attributes and run some tests to see the results. Just some ideas.

      Best,

      • sara says:

        thanks, but using PCA only decreases the number of features. I still can’t tell which features have more effect on the performance of classification.
        can you guide me on which experiment I should do to understand the reason?

        • Ok. If you have a target attribute X and you want to know which of the attributes A1, A2… Am are more important with respect to X, then what about just calculating the Pearson correlation between each attribute and X?

  131. sara says:

    yes, but I think calculation is not logical and it is time consuming.

    • If you want to know if two attributes are correlated, calculating the correlation or covariance is a very good way to get an answer.

      You can write a short program that loops over all your attributes and calculates the correlation of each of them with the target attribute. Then you just run it and you will see which attributes are more correlated with the target attribute.
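
      A minimal sketch of such a loop, in Python with only the standard library, could look like the following. It assumes that the data fits in memory as a list of rows of numeric values plus a list of numeric target values; the function names and the tiny data set are made up for illustration. Keep in mind that Pearson correlation only measures linear relationships, so an attribute with a low score may still be useful to a classifier.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear information; report 0.0 here
        # instead of dividing by zero (a choice made for this sketch).
        return 0.0
    return covariance / math.sqrt(var_x * var_y)


def rank_attributes(rows, target):
    """Return (attribute index, correlation) pairs, strongest first."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((j, pearson(column, target)))
    # Sort by absolute value so strong negative correlations also rank high.
    scores.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return scores


# Hypothetical toy data: 4 records, 3 attributes, 1 numeric target.
rows = [
    [1, 5, 2],
    [2, 5, 1],
    [3, 5, 4],
    [4, 5, 3],
]
target = [2, 4, 6, 8]

ranking = rank_attributes(rows, target)
for index, score in ranking:
    print(f"attribute {index}: correlation {score:.2f}")
```

      With 1000 attributes the loop simply runs 1000 times, which is fast for data sets of ordinary size, so there is no need to examine the attributes one by one by hand.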

  132. Kittu says:

    Sir,
    I am doing M.tech( CSE)..
    I want to do my research on Proactive risk management on banking delivery channels (i.e ATM , net-banking , mobile- banking etc)
    can i use Data Mining for this??
    or plz suggest me sm data mining technique or tool, which i can use to analyze banking data

    • I’m not very familiar with banking. But data mining can be used on any kind of data. So there should not be any problem.

      Maybe classification algorithms like neural networks, SVM etc. could be a good technique if you want, for example, to classify the customers as at risk or low risk.

  133. asmi says:

    Hello prof Philippe,

    I’ll be starting my PhD.
    Topic which i have selected is “Improving existing data mining algo for big data on mapreduce ” what do you think about this topic? Can you give me any suggestions?

    • Hi,

      It looks like a good general topic since big data is popular and MapReduce is a popular technology. But it is maybe still a little bit too general. What kind of algorithms? What do you want to improve? Maybe you could be more specific. If you want to improve clustering algorithms, then you could say that in your topic. Or if you want to improve something else, then you could say it.

      Best,

  134. AJ says:

    Hi, I came across your blog while I was searching for articles to refine my area of research which is data mining. The post is very helpful.
    Since data mining is sort of a new area for me and I am reading on data mining in Manufacturing and Operations Management. My aim is to optimize the production process to keep one of the the raw material (fuel) cost at minimum. I have collected 10 years of daily records and I would be pleased if you could suggest on how I can approach this from your expertise.

    Regards

    • I am not familiar with this kind of problem. It seems like an optimization problem. In artificial intelligence, genetic algorithms, ant colony algorithms, etc. may be used for optimization problems, but I’m not sure that they would fit your problem.

  135. Neema says:

    Thanks Philippe your blog is very helpful. I did Msc computer sc and I want my PHD Topic to focus on improving data mining techniques for Big data. I am apply for PHD and I dont have a supervisor yet. So far I have been familiarizing myself with several data mining algorithms but because of time can you guide me somehow on data mining techniques whose improvement can be good for big data analysis.

    • Thanks for your comment on the blog. I cannot recommend much about that because data mining is very broad. With respect to big data, you could improve any kind of algorithm: clustering, pattern mining, classification, stream mining, outlier detection, etc. In other words, most topics in data mining can be combined with big data. So you should choose something that you like. I would just recommend that you continue reading and try to focus on a topic that you like.

  136. soheila mirzaei says:

    Hi,I am searching about topic for thesis of phd
    on the other hand, I love datamining with .may you help me how can these fields combine together , many professors said to me that not can define the topic in level of phd about datamining please guide me

  137. mitali says:

    hello sir..really a nice article..
    i am doing my masters in computer science and for my thesis i am interested in data mining and specially in web mining….. as i was starting from a scratch this helped me a lot…
    thank you…..

  138. Hiteshree Lad says:

    truly helpful article for beginners …starting from the scratch

  139. shashika says:

    Hi Sir,

    I’m an undergraduate.I hope to do my final year research in data mining.I think to do identify hidden purchase pattern of hotels customers through the data mining.sir,can you give some advices to successful my research?

  140. Ali Mohammed says:

    Hi
    Sir please tell me what is the missing on text mining topics related to Arabic language.
    I’m really confused about selecting my master topic.
    I have read about Ontology and summarization topics but until now the idea or the topic title is now clear for me.
    So, kindly please guide me on my research topic with respect of Arabic language text mining and NLP techniques.
    Thanks,,,

  141. Dipak Kawade says:

    Your Blog is very good for young researcher. I am working on frequent pattern mining. Can you suggest from where I got quality work related to this topic.
    Thanking you

    • Hi, thanks. I would recommend looking at the papers published in top conferences: KDD, PKDD, PAKDD, ICDM, CIKM, ADMA, etc. and in top journals: TKDE, TKDD, DMKD, etc. You may find many papers on Google Scholar. I would say try to look at what people have been doing recently to find something that you can do.

  142. MANU says:

    Sir i want prepare one journal on data mining..its a part of our syllabus..so could u please
    suggest one topic on data mining..and also the procedure for preparing a journal.
    Thank you

  143. vishnu murthy says:

    Thank you sir for your replies.Please continue the same in future also.
    I am doing ph.d and want to do research in clustering.
    Kindly guide me.I want to do on medical data.

  144. kamal says:

    Hi sir,

    My area has ben changed now as my Guide is changed. he told me to look for “Big data and web mining”.
    I have only one month to prepare the synopsis… Should I go in Content web mining or usage web mining. Which one is easier to work with? I want to take social networking sites for this. Will it be fine?

    Kindly Guide me..
    Regards

  145. Farhad Alam says:

    Hi Philip,

    I m Farhad and working as a DWH Engg and my skill set is ETL, Oracle SQL,PL/SQL, Teradata and Unix.

    so can you please suggest the topic in DWH, I know ETL very well.

    current working in eBay for e-commerce project.

    thanks

  146. p.praveen says:

    Hi,I am searching about topic for thesis of phd
    on the other hand, I love datamining with .may you help me how can these fields combine together ,memory efficiency of frequent subgraph mining algorithms for the case of uncertain data with applications in social network is it correct or not.if it is corrctet how to get the information about the problem tell me suggestion .thanq.

    • You need to read the papers on this topic to know whether someone has already done that and whether you like the topic. I have not read much on this topic so I cannot tell you. But if you like subgraph mining, then you can find some problems in this area. To find papers, just search on Google or Google Scholar and read them.

  147. Fatima says:

    Hi sir,
    I came across your blog while I was searching for articles on data mining. This is really a kind contribution..
    I have planned to do work on Educational data mining, i will get students data including social and economic attributes and then find their effects on grades, finally suggest some ways of improving grades.
    I want to know is this enough for an MS thesis or some thing more should be done???? I have also seen some other researchers done same work but mine data set and attributes will be different. Please give me some kind suggestion..

    • For an MS thesis, the requirements depend on the university. Some universities may have lower or higher requirements. The topic looks interesting, but you should discuss it with your supervisor to know if it is enough. If someone has done it before, it is not so original, but if you can show something new in it, it would be better.

  148. Ronesh says:

    Hello Sir,

    Greetings of the Day!

    Was searching for topic for my thesis on Data Mining and came across this site.
    Very useful indeed. 🙂

    I have been able to get transnational data of ATM of a bank which I got with recommendation from my college for the sole thesis purpose only.

    Basically, I want my Thesis report could suggest bank’s manager to predict how much Money is necessary to be kept in ATM Vault for a particular day as more money kept is no good.

    Now I want your suggestion whether I should stick to only One Algorithm and find the solution Or Should I analyze various Prediction Algorithms and find the best one among them?

    As being an Average Joe, I have to research and get into with numerous algorithms and techniques for comparative analysis of algorithms and afraid i miss my thesis deadline as well.

    Lastly, if anything else can be done with this transaction data apart from vault cash
    prediction, please suggest me so that I can think other aspects as well for my thesis.

    Looking forward for positive response.

    • Hi,

      I would suggest starting with one algorithm to get a first solution. After that, you can see if you can improve the work by using different algorithms. But before starting with a first algorithm, you still need to read a little bit to choose that algorithm well.

      Other topics? Yes, certainly. You could, for example, try to find the optimal moment for refilling the ATM with more cash. Should we fill it every day in the morning? Or … ? I think that you may find other variations of this problem.

      Best,

  149. Ronesh says:

    Thank you for your valuable suggestion.

    Good Day!

  150. kaveri says:

    Hello sir…Thank you for this blog post. I am new in data mining and and i find this field very interesting and want to do my thesis in this but unfortunately i am not getting any support from anywhere.I have learnt about all algorithms in theory and now i am planning to work on missing value imputation but i am not sure about is it a good field to proceed. I am not getting help from anywhere. Any of your suggestion would be of great help to me.
    Thank you.

    • My advice: try to look at recent research on this topic. If there is recent research on this topic in good conferences/journals, then it is a good topic. Otherwise, it may be an older topic. But there are certainly some challenges to solve. My final advice: work hard. The topic is important, but how hard you work is also important to do good research.

  151. Kasthuri says:

    sir,

    i am kasthuri. How will i select a research topic in data mining?

  152. Kasthuri says:

    sir,

    i am kasthuri. How will i select a good research topic in data mining?

  153. Kasthuri says:

    I wish to my part time time ph.d in data mining. give some ideas about how to choose research topics

  154. vikas says:

    Hello Sir ,
    i need your help I am doing master of computer Engineering i want list of topic in data mining for dissertation.Please help me as soon as possible .
    Thank You
    Vikas

  155. RSP says:

    hello sir..
    good afternoon…
    i have some confusion about selecting area of data mining.
    i like data mining but some people told me that data mining is very common to do research ..so give me some positive response.

    thank you

    • Yes, data mining is popular and it is good to work in a popular field. It means that you may eventually find a job in this field and that more people may read your research. It is much better to work on something popular than on something that is not.

  156. RSP says:

    thank you sir..
    sir give me some current research topic in data mining..
    i mean i read many paper but i am very confused to select some topic.
    give me some proper topic from current research.

    • It’s better that you choose a topic that you like. But if you really want some topics, I can give you some random topics: fuzzy sequential rules mining, incremental sequential rule mining etc.

  157. vaishnavi says:

    Sir,,, i m in last year so i want to project on data mining so plz suggesst me sutaible topic in this area.

  158. Preeti says:

    Hello Sir,

    I want to do Phd. I cleared written exam for Phd and my interview schedule on 14th of July. In Interview,I have to represent my broad area of research interest.

    After go through lots of Research Papers, I decide my broad area of research interest is Mining Web Log Files to improve the performance of website. Sir I want to use existing technique and make some modification in it and obtain better results.

    But Sir, I am really confuse in how to justify my research methodology in interview. What should i say when they ask question like “what kind of techniques you want to use?”
    Because I have no Idea. Please Sir Help.
    Reply ASAP.

    • Hi,

      Congratulations on your interview. I think you need to read a few papers on weblog mining to find out what the main techniques are. It is better if you can talk about a few techniques even if you just know the basic idea. For example, one technique that may be used is sequential pattern mining (discovering sequences frequently appearing in a set of sequences). It could be used to find what occurs often in the weblog to understand user behavior. Another technique is clustering, if you want to find similar web users. Or if you want to predict what will be clicked next, you may use something like our CPT+ model: http://www.philippe-fournier-viger.com/PAKDD2015_sequence_prediction.pdf
      Those are just some examples. There are many techniques that could be applied to weblog data. Actually, choosing a technique depends on WHAT you want to do (what your goal is). Saying that your interest is “mining log files” does not mean much. What is your goal? Do you want to find patterns? Do you want to cluster the users? Do you want to predict what people will click on next? … etc. I think that you need to define your goal more clearly. It may be more important than talking about the techniques.
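
      To make the idea of sequential pattern mining on weblogs a bit more concrete, here is a deliberately simplified Python sketch. It only counts patterns of length two (page A visited at some point before page B in the same session), whereas real algorithms such as PrefixSpan or GSP handle patterns of any length far more efficiently. The session data, page names and function name are invented for illustration.

```python
from collections import Counter


def frequent_ordered_pairs(sessions, min_support):
    """Find pairs (a, b) such that a appears before b in at least
    min_support sessions. Gaps between a and b are allowed."""
    support = Counter()
    for session in sessions:
        # Collect each pair at most once per session, so that support
        # counts sessions and not occurrences.
        pairs_in_session = set()
        for i, first in enumerate(session):
            for second in session[i + 1:]:
                pairs_in_session.add((first, second))
        support.update(pairs_in_session)
    return {pair: count for pair, count in support.items()
            if count >= min_support}


# Hypothetical weblog: each session is the ordered list of pages visited.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["search", "product"],
    ["home", "search", "cart"],
]

patterns = frequent_ordered_pairs(sessions, min_support=3)
for (first, second), count in patterns.items():
    print(f"{first} -> {second}: appears in {count} sessions")
```

      The minimum support threshold controls how many patterns are returned: lowering it to 2 on this toy data returns six patterns instead of one, which is why choosing this parameter matters in practice.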

  159. nidhi dadhich says:

    Hello to everyone , sir I want to do mtech thesis in data mining . sir can u suggest some topics for thesis / project

  160. nidhi dadhich says:

    Superb work sir really appreciating while reading .

  161. V.INDHUMATHI says:

    Sir, now i am final year MCA..i can choose my project in Research area..But i cant no idea about that..But,still i have learned data mining i like to doing in my project also data mining technique in Clustering also..So,If i want to one Topic for my project..Pls,Say sir….

  162. jyoti sharma says:

    Hello sir ,
    I read all your suggestions about research topics i am an m.tech student i have studies grid clustering in minor thesis now i have to work on major thesis can u kindly suggest related topic plz plz plz i want to discuss it with you how can i discuss. i will be very thankfull.

  163. jyoti sharma says:

    Respected Prof.Phillippe,

    I really appreciated your suggestion, you are guiding all of the needy people you did a great work sir. I also need your help in my research i will be very thankful. I have some topic in my mind but want to discuss for proceeding further.

  164. jyoti sharma says:

    Thank you so much for replying sir i just want suggestion about any open source tool of clustering which can perform grid clustering also, if you can help only any tool name i will work on it myself. i will be very thankful.

  165. jyoti sharma says:

    thank you sir
    i have studied comparison of some algorithms ,what can i further do in clustering or few of its algorithms .Is there any other option in clustering for my research.

    • There are a lot of things that one can do in clustering: make a faster algorithm, make a better clustering algorithm, apply clustering algorithms in new ways, combine clustering with some other techniques, etc. As I said in the blog post, to find a good topic, one needs to read several papers, and then you may get some ideas about what other people have been doing recently.

  166. Amuthavalli SathishKumar says:

    Thank u for your tips to start our Phd. Sir can suggest some topics related to
    data mining in biological data .

  167. jyoti sharma says:

    Ok sir thank you

  168. M.Durga says:

    Hello sir,
    I am studying M.sc 2yr,i have one research paper(project) for this sem sir…i don’t know that which area i select but i like to do in data mining..i know only little bit of knowledge in data mining sir…could u help me sir for which topic that iwill select in data mining..

  169. Anum says:

    Sir I want to start my thesis.. i want to work on data Mining specificly on clustering can u suggest me where to start or reach to find best problem?

  170. Sindhu says:

    Sir,
    I am working in a telecom company for the last 10+ years and mainly in the IT wing. Now I wish to do a Phd in large data analysis for which I am having the required data. I am not having a correct idea regarding the way I have to proceed. Can you guide me in this regard.
    Thanking you

  171. ROJ says:

    HI. I want to do a M.Sc. in data mining security . But it is difficult to choose a topic. There are too many data mining topics. Is it a good idea?

    thanks

  172. Rachana says:

    hey Philippe,

    i wanted to know whether it is necessary to have a good knowledge in statistics and maths to pursue research work in data mining? am interested in data mining but i don’t have good hold on stats and maths.. could you please suggest me?

    • Hi, it certainly helps. But it is not a requirement. You can choose a topic that is more algorithm-oriented, for example, where there is less math and statistics. For example, if you work on designing algorithms for “clustering” or “pattern mining”, you can do something with not much math.

  173. Leopord says:

    Hey Philippe,
    i wanted to know whether it is possible to combine classification and regression as hybrid model if i want to have to enhance classification prediction or regression models ,could you please suggest me some knowledge needed and tools .

    Thx

  174. Leopord says:

    Hey Philippe,
    i wanted to know whether it is possible to combine classification and regression as hybrid model if i want to enhance classification or regression prediction models ,could you please suggest me some knowledge needed and tools .

    Thx

    • Hi,
      It is certainly possible. Actually, regression can be used to perform classification. For example, you may want to guess the salary of some employee, and you have the salaries of other employees from the same area. You may do some regression to get that information, and it could be seen as classification. This is just a simple example, but there are certainly more complex ways. Since I don’t do research on that topic, I cannot say too much about it. You would need to search a little bit and read recent papers to know more about what has been done recently.
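
      One simple way to read the salary example, offered only as a sketch under stated assumptions, is to fit a regression that predicts a number and then turn that number into a class with a threshold. The Python code below uses ordinary least squares with a single input (years of experience). The data, the threshold of 45 and the “low”/“high” labels are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept, one input."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def classify_salary(years, slope, intercept, threshold):
    """Predict a salary by regression, then map it to a class label."""
    predicted = slope * years + intercept
    label = "high" if predicted >= threshold else "low"
    return predicted, label


# Hypothetical training data: years of experience and salary (in thousands).
years_of_experience = [1, 2, 3, 4, 5, 6]
salaries = [30, 35, 40, 45, 50, 55]

slope, intercept = fit_line(years_of_experience, salaries)
for years in (2, 7):
    predicted, label = classify_salary(years, slope, intercept, threshold=45)
    print(f"{years} years: predicted salary {predicted:.1f} -> class {label}")
```

      The regression step and the thresholding step are independent here, so the same thresholding idea could be placed on top of a more complex regression model.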

  175. I’m doing phd in big data mining.
    I wan to work on any real time scenario for research.
    Can you suggest me specific dataset/problem for it?

    • This is very broad. You could, for example, do stream mining to detect security attacks by analyzing network traffic. I think that there are datasets of network traffic available. Besides, I think that whatever topic you choose, something important is to find some datasets, whether you get them from some companies, make your own datasets or download some public datasets.

  176. Divya says:

    hello sir,
    can u suggest any topic for research even other then data mining i needed it for my final year thesis…im really confused since last few days unable to find any topic..

  177. Selva says:

    Hi,

    I’m Selva from India,I’m interested in doing my Ph.D work in Natural Disaster Management using spatial data mining.Since the natural disaster management is a vast area I’m struggling to drill down into the exact area and also I’m worried about getting the proper data-sets on this area. Could you please suggest your ideas on this.

    Thanks,
    Selva

    • Hi, I’m not familiar with this area. Maybe you should try to talk with some specialists in natural disasters. If there are some at your university, you may try to contact them, and perhaps they can provide some data. Also, maybe look at some government websites. Some places like Japan or Taiwan have many earthquakes. For example, I think you should be able to get data about the location and time of each earthquake from these government websites if you search carefully.

  178. Md. Zakir Hossain says:

    Dear Sir
    Thanks a lot for your such helps. A lot of students are facing these types of problems and you are helping in a very effective way. We are grateful to you. Could you tell me which are the latest research areas of data mining? In which conferences or journals I can find what people have been doing recently on data mining so that I can choose an area or topic on data mining for my PhD program?

    I will be grateful if you help me in this regard.

    Thanks………

    Zakir

    • Hi, you are welcome. You may look at the data mining conferences: KDD, PAKDD, ICDM, PKDD, ICDE, CIKM, DAWAK, DASFAA, DEXA, ADMA, etc. and also the data mining journals: TKDE, DMKD, TKDD, KAIS, etc.

      There are a lot of possible topics. Some current trends are big data, social networks, etc. But you can also choose something else.

  179. Md. Zakir Hossain says:

    Thank you again sir…………

  180. Shivani says:

    Sir..i want to do some work in data mining for my phd…sir my programming skill is weak .sir plz suggest me topic to research purpose which has less programming

    • If you are doing a PhD in computer science, you will certainly need to do some programming and to improve your programming skills if they are not so good. Data mining in general requires being good at programming, especially if you want to design algorithms, because you will want to make algorithms that are fast, since data mining algorithms are generally applied to large amounts of data.
      If you are not good at programming, perhaps you should choose a topic that involves less programming, such as research on e-learning, or you could still do data mining but more toward the applications of data mining rather than working on algorithms. For example, you could work on applying data mining to biomedical data, biology, e-learning etc. If you are good at math, another option is to take more of a math approach and use software such as MATLAB, etc.

  181. Shivani says:

    Thanks a lot sir

  182. Shivani says:

    Sir which language is used in data mining algorithm…can i learn this language at home?if i want to learn language for data mining ..which language i should learn..which book prefer to leraning that language…

  183. Md. Zakir Hossain says:

    Dear Sir
    I studied different papers from difference conferences and journals given by you. Thanks a lot for your suggestions. Actually I have very limited knowledge on data mining but I am interested too much to do my research in this area. I am ready to study a lot. But I have to prepare a PhD proposal within very short time. Now I guess I can work on “Spammer Detection Methods/mechanisms/techniques in social media systems or something related this”. But I can not be enough confident what should/might be the proper topic. Could you please help me in proper direction for choosing a topic in this regard.

    Thanks

    Zakir

  185. omsad says:

    I need PhD spicific topic in healthcare data mining, please

  186. bakry says:

    Hello sir,
    I am studying M.s.c and decision to do my research in data mining direction particular in revenue accounting fraud statement detection please sir i want to determine the algorithm that achieve my word correctly and display acceptance result
    thanks

  187. ELMLK says:

    Dear sir,
    I want to do some work for my phd in data mining especially sequential pattern mining on Big Data but i could not find any new specific problem in this area.
    would you please help me and suggest your idea to find a specific problem on this ?

    Thanks you.

    • Hello, there are a lot of possibilities. You can combine any two topics to obtain a new topic. For example, I have not seen any algorithms for sequential rule mining in big data, so designing a sequential rule mining algorithm that runs on a big data framework such as Hadoop or Spark is a good topic. Besides that, there are many other possibilities. For example, you could combine incremental mining + sequential pattern mining + big data, or any other combination of topics.

  188. saeed bahrami says:

    i interested in work on data mining . Also i have many years experience in fields of education. please introduce some topics about data mining in education.
    thank you.

    • You may read papers from the Educational Data Mining conference (EDM), or related conferences such as Artificial Intelligence in Education (AIED), Intelligent Tutoring Systems (ITS), and others to see what the current problems related to data mining and education are.

  189. Sheenu says:

    Sir, I want to do research in Parallelisation of Data Mining Algorithms. I am also good in Optimization techniques like Genetic algorithms. Please suggest me topic related to this.
    Thanks in advance…

    • I suggest reading recent papers from data mining conferences and journals to find a good topic. It takes time to find a good topic.

      • Sheenu says:

        Actually I want to work on Distributed framework or shared frame work of Data mining algorithms like clustering and classification algorithms either on Hadoop Mapreduce or using OpenMP/MPI, but which application I choose to apply data mining clustering or classification algorithm, I am little bit confused. I do not want to go towards Semantic web mining … Please suggest…..

        • Data mining can be applied in almost any domain: bioinformatics, medicine, psychology, education, etc. In my opinion, you should choose a domain (1) that you like, and (2) where you can easily collect data to be used by your data mining algorithms.

  190. prem kumar cahndrakar says:

    Sir , I want to work on agricultural data.please suggest current topic.
    am are work “study of agricultural land soil using classification techniques”
    in M.Phil. program. I am look a P.Hd Topic in data mining .

  191. latheefa says:

    Sir, I want to do P.hd in data mining.Please suggest me title for my work

  192. ELMLK says:

    Hi Sir ,
    would you please suggest me a problem in high utility sequential pattern mining ?

    Thanks alot

    • Phil says:

      You can combine high utility seq. pattern mining with any other topic such as fuzzy pattern mining, incremental pattern mining, stream mining, etc. You may read other papers and try to combine two ideas to get a problem.

  193. stefany says:

    Hi Sir,

    would you please suggest me a topics for thesis in information technology major in business analytics.

    thanks a lot.

    • I don’t really have time to find topics for people other than my own students. Sorry. You should read papers from recent conferences and journals, choose a topic that you like, and try to do better or to do something that other people have not done.

  194. Sai says:

    Hello sir,
    I have bit knowledge in Data Mining techniques.
    If i apply DM techniqes and do some analysis on any real time problem, does it consider for PhD.
    because i am not doing anything new,just applying DM techniques and getting some reslut.
    Please help me in this regard

    • It can be. The goal of a Ph.D. is to contribute to the advancement of knowledge in some field. You may work either on more fundamental problems or on more applied problems. For example, one may design a new data mining algorithm (a more fundamental problem), but one may also use existing data mining techniques to do something new or better in some field (for example, use existing algorithms in a new way to identify the authors of texts, or to discover communities in social networks). What is important is that a PhD needs to bring something novel. So if you work on an applied problem, you still need to bring some novelty, but at the application level. Personally, I prefer topics at a fundamental level (design of algorithms), but there are many people who work at a more applied level (for example, you may check any conference on e-learning; there is a lot of applied research).

  195. Abdul says:

    Hi
    how are you?i hope u are doing well
    actually im really interested in data mining topics such as clustring ,data mining in big data
    ive read many articles but the problem is i could’t find something interesting
    what sub-topics in “data mining in big dat”a that can i work on ?

    could you help me with

    • I cannot find a topic for you. You may ask your supervisor for help. Your supervisor should be able to help you. Otherwise, you may look at recent data mining conferences to find some topics.
      “Data mining” and “big data” are very broad. They can mean basically anything in data mining.

  196. Abdul says:

    im afraid that i would spend more time looking for topic,and after that ill find that the topic is already solved.

    i wan work on something likes this in general :processing big data and finding the relations and interesting pattern is difficult ,So ,what kind of challenges theta we maybe face when processing big date and how can we provide efficient data mining techniques to overcomes the issues?
    would be a good PhD topic? or i have to look to something different?
    thank in advance.

    • Before doing any research project, one should read enough papers about the topic to know what already exists, and make sure that the problem has not been solved. If you want to become a researcher, this is something that you need to learn to do. Even if it takes a lot of time, you still need to do it, and it is important that you do it.

      Ok, so you want to find patterns or relationships in big data… But that is still not a topic because it is too general. Finding patterns in big data could mean almost any data mining algorithm applied to big data. So you still have not defined a topic. To define a topic, you would need to 1) define what kind of patterns you want to find: clusters, outliers, frequent patterns, subgraphs, communities, etc., 2) define in what kind of data: graphs, transaction databases, multimedia data, spatial data, data streams, etc., and 3) perhaps also define what kind of approach you want to use to find these patterns.

      So I cannot say whether it is a good topic because your topic is still too general. Finding a good topic takes time. There is an infinite number of “good topics” that one could find. But as I said before, I cannot do the literature review for you. You need to do it by yourself and evaluate by yourself whether it is a good topic. Even in data mining, I’m not familiar with all the topics, and I will not read the papers for you to tell you if it is a good topic or not, unless you ask me about something that is directly related to what I’m doing.

  197. Abdul says:

    that’s really helpful .thank you very much.

  198. Hemant says:

    Please suggest me application of confabulation association rule mining for multidimensional association rule generation.

  199. Ahmed says:

    hi Philippe when i am reading your answer to the comments , now i am sure about the topic must be specific and not general .

    what i want is i need to do classification on cloud data can you help in this way

    preparing a paper of applying classification techniques on cloud data

    thanks a lot

  200. Rashid says:

    i am very much interest to purse my PhD degree in Data Mining field. i need your guild lines to select my PhD topic in opinion mining and sentiment analysis. please share some recent research areas in this field.

  201. Amlan says:

    I am very interested to pursue research in Data Mining in Healthcare Science Applications, Please suggest me some recent popular topics where I can explore to start my research.

  202. Ananya Sethi says:

    Hello Sir,
    I am a computer science student, entering into research for the first time. I have been searching for topics in data mining.
    I wanted to ask what are the topics related to stocks that I can research on or write a thesis on.
    There is a lot of research going in this field. I m really confused on which direction to go- something less explored and with a lot of scope of research.
    It will be great if you could suggest an approach on this topic.

    P.S. If you have any suggestion on a ‘popular topic’ -as you suggested, that would make my way easier to some conferences, would be of great help.

    • Hi,

      It would be hard to suggest topics about stocks since I don’t work on that. But some typical problems that I have seen before are to predict the stock market, to detect fraud on the stock market, and to find correlations or patterns indicating that some stocks are related and behave the same way on the stock market. I think you could find more ideas by searching a little bit on Google Scholar.

      Best regards

      • Ananya Sethi says:

        Thanks for your helpful suggestion.

        Do you know anybody who is currently working on this topic who i can ask questions from?
        I wish to get more insight on this.

        • No. I don’t personally know anybody working on that. But if you find some papers about stock markets on Google Scholar, you can always send e-mail to the authors to ask them questions or ask for their dataset. Sometimes they will give you some advice or maybe even their datasets.

  203. shaik says:

    Hi Philippe,
    Iiam looking forward my PhD in DataMining. Iam interested in Social Newtwork Mining. How and where can we apply datamining algorithms ( clustering, classification algorithms, etc.) on Social Network Analysis.
    And also how can we apply Neural Networks on datamining applications.

    Best regards.

    • Hi,
      I wish you good luck with your Ph.D. These are very broad questions. For applications of data mining on social networks, you could have a look at the papers published at the ASONAM conference (International Conference on Advances in Social Network Analysis and Mining), which is about data mining and social networks. You will see that there are many topics. It would be too long to list all of them here. You could also check journals related to that topic and papers on Google Scholar.

      Neural networks also have hundreds of applications. Basically, you could see neural networks as a data mining technique. So the applications of neural networks in data mining, would be all the applications of neural networks, and there are a lot.

      Best regards,

  204. J.Jayapandian says:

    i want good thesis topic in data mining for ph.d

  205. sweety says:

    sir am doing m.phil, my guide suggest to learn scikit website. pls suggest some topic regarding this.

    • Scikit is just a tool that you may use in your research. It can be helpful. But you should perhaps find a topic first instead of looking at scikit to find a topic. That is my opinion

      • sweety says:

        Thank you sir.I decided to do in medical field. Is it correct decision
        ? and now is it current trend? I am just confused sir. Please suggest me some of the field which are in current trend.

        • I don’t work in the medical field. I work in the field of data mining. The medical field is a possible application of data mining. If you can get some real medical data and know some people in the medical field who can guide you in your project, then why not choose that field? I guess it could be OK. But even if you choose a field, you still need to find a specific topic. The “medical field” is still very broad and could lead to an infinite number of topics.
          I suggest reading papers from recent conferences and journals related to your interests to see what people are working on.

          • sweety says:

            Ok sir. Data mining is my domain sir, in that i decided to do healthcare or big data application. Please tell me which is sufficient to collect data. Also please suggest me some of the other application which are more effective.

          • “Big data application” and “healthcare” are still very broad. You need to read some recent papers and see what other researchers are doing to find a good topic. I cannot do that for you. It can actually take quite a lot of time to find a good topic. But you can always ask your supervisor to help you with that.

  206. vidya says:

    sir , this blog is really useful. I am planning to do datamining in the feild of nano technology . any ideas or suggestions of topics related to this .plz suggest.

  207. Fiqsya says:

    Hi. Right now I’m planning on continue my research study on malware analysis. I’m thinking of using data mining approach in my research. However, since i’m doing in PhD level, my supervisor do not want me just applying the data mining technique. He always asked me about the weakness/limitations I’ve found in data mining. However, as far as i’m concern, there is none limitations on data mining. So, may I asked if you or any visitors in this blog have any idea about the data mining weaknesses/limitations in malware analysis that I have overlooked?

    Thanks again.

    • I understand what you mean. At first, it may seem that a technique has no limitations. But there are always some. What you could do is choose a technique from the literature and try it; after you have tried it, you will certainly think of some ways to improve it. For example, if you use some classification technique to classify malware, then what about also considering the time dimension? What about doing real-time classification? What about developing an approach where your algorithm can say how certain it is, as a percentage, that a program is malware? There is generally always some way to extend a technique, for example by considering more information such as time. Besides, another possibility is to combine characteristics from different approaches that you like. By reading a few papers, you can see the different approaches for classifying malware and then try to combine the best characteristics of all these approaches. These are just some ideas.
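
      To make the idea of a classifier that reports how certain it is more concrete, here is a minimal sketch. It is not taken from any malware paper: it uses a plain k-nearest-neighbour vote, and the two features, the training points and the function name `classify_with_confidence` are all hypothetical. The "confidence" is simply the share of the k nearest neighbours that agree with the predicted label.

```python
import math
from collections import Counter

# Hypothetical training data: (feature vector, label). The two features are
# invented for illustration only (e.g. a count of suspicious calls and a
# file entropy score); real malware features would come from the literature.
training = [
    ((9.0, 7.5), "malware"),
    ((8.0, 7.0), "malware"),
    ((7.5, 6.8), "malware"),
    ((1.0, 4.0), "benign"),
    ((2.0, 4.5), "benign"),
    ((1.5, 3.8), "benign"),
]


def classify_with_confidence(sample, training, k=3):
    """Return (label, confidence in percent) using a k-nearest-neighbour vote."""
    # Sort the training points by Euclidean distance to the sample, keep k.
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    # Count how many of the k neighbours carry each label.
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    # Confidence = share of the k neighbours that agree with the winner.
    return label, 100.0 * count / k


label, confidence = classify_with_confidence((5.0, 5.6), training)
print(label, round(confidence, 1))
```

      A vote share is only one possible way to express certainty. A probabilistic classifier would give a different and usually better calibrated value, so this is just a starting point for thinking about the question.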

      • Fiqsya says:

        Thanks for your quick reply. But I have questions about combining a characteristic from different approaches like you said. I had confused with the meaning of characteristic? is it characteristic of the malware or the technique in classification it use (eg c4.5, dt etc) ? and by mention different approaches, did you mean by combining statistic and machine learning? I’m so sorry for asking but I’m really stress right now. I’m glad you’re replying me.

        • I mean that it could be different characteristics of the data mining approach. For example, maybe there is an algorithm A for classification, another algorithm B for classification that considers time, and another algorithm C for classification that lets you add constraints when doing the classification. If you think that your approach needs classification + time + constraints, you could try to combine the algorithms A + B + C in a single new algorithm. In general, the more information you consider, such as time, the more complicated the problem becomes, and then you need to extend the original algorithms to consider this additional information. These are just some general ideas. You could see how they apply to your problem by reading some papers and comparing the different data mining approaches, and then try to take the best of each approach if possible. Or try to add something new that other people have not done.

  208. vidya says:

    Dear sir,
    presently i am teaching the pg and ug students for the c omputer science , and i had choosen to work for my PH.D. from different university privately.
    This is an request kindly suggest the topics which is convenient in teaching and working both related to my ph.d. work.

  210. Rasheed says:

    Hi Sir,
    I Rasheeduddin registered Ph.D In Data Mining Clustering My Title is

    “Improving level of efficiency through K Means Algorithm in Social Networking Data Base”.

    Actually i want to implement the K Means Algorithm in Social Networking Data Base.

    Please Suggest me any changes required to this Title and what type of Back Ground work is required to start.
    please guide me/ suggest me

  211. uday says:

    Sir, I have Qualified in OU Ph.D Entrance test. I want to do Ph.D on data mining. so please tell me the topic of research on data mining or any other area.

    Thank you sir

  212. Md. Zakir Hossain says:

    Could you please suggest me where I can get more information and where I can get the recent works on opinion mining?

  213. Md. Zakir Hossain says:

    Thanks again. Really I am grateful to you…………

  214. houssem says:

    my area of interest is bank data mining
    can you please suggest some topic on it ?

    • I don’t work on bank data, so it is hard for me to suggest something. But a typical topic for bank data is, for example, to evaluate the credit records of customers to predict whether they will pay back their debt or not. There are some papers on that already. You may find them using Google Scholar.

  215. houssem says:

    hi could you suggest some topic for ” Data mining for customer relationship management” please

  216. shaik says:

    Sir, I have Qualified in Ph.D Entrance test. I want to do Ph.D on data mining.
    I am interested to do phd in revenue management/system by using data mining technique . can u guide me this interest is good or bad for the project work

  217. Rabia says:

    I want to do Mtech Research in data mining. please suggest me a topic in data mining which is somehow related to networking.

  218. Rehan says:

    need opinion, target of my topic is “user identification over social network”.
    i just want to apply changes through these techniques “clustering and outliers”.
    sir kindly tell what is the scope of topic? if its fine then please generate impressive topic name for me.

  219. Rehan says:

    SIR what are the current problems in identification of user in social networks, which problems should i target?

  220. ko moe says:

    I want to get example title for my thesis in computer science.I am interested in data mining.Please sir.

  221. Ko moe says:

    Which method I can used for passport data analysis?
    Please guided me the latest or pooular method for that.

    • I have never worked on passport analysis. It depends on what you want to do. Analyzing a passport to do what? If you want to analyze the picture, then you could use some image processing techniques. But it really depends on what you want to do. The best would be to search for what other people have been doing on this topic recently, for example in Google Scholar. As I said, I don’t work on this topic, so I cannot tell you which methods people have been using.

      • Ko moe says:

        Hello sir
        I want to discover the travel pattern from passport data analysis.

      • Ko moe says:

        I want to discover the travel pattern from passport data analysis.So which method I can used sir?

        • Many methods could be applied:
          – Find people with similar travel patterns –> some clustering
          – Find people with abnormal travel patterns –> outlier detection
          – Find some frequent travel patterns –> pattern mining
          – Classify the travellers based on their travel patterns –> classification

          These are just some basic ideas. Those are some of the most popular areas in data mining. Depending on what your data looks like, you may choose different techniques.
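
          As a small illustration of the "frequent travel patterns" item above, here is a minimal sketch. The data, the traveller names and the function name `frequent_moves` are hypothetical, and the pattern type (a direct move from one country to the next) is only one simple choice among many. It counts, for each move, how many distinct travellers made it, and keeps the moves that reach a minimum support.

```python
from collections import Counter

# Hypothetical data: each traveller's trips as an ordered list of country codes.
trips = {
    "traveller_1": ["MM", "TH", "SG", "TH"],
    "traveller_2": ["MM", "TH", "SG"],
    "traveller_3": ["MM", "CN"],
    "traveller_4": ["TH", "SG", "MY"],
}


def frequent_moves(trips, min_support):
    """Return the consecutive (origin, destination) moves that were made by
    at least min_support distinct travellers, with their support counts."""
    support = Counter()
    for route in trips.values():
        # Use a set so that each traveller counts at most once per move.
        moves = set(zip(route, route[1:]))
        support.update(moves)
    return {move: count for move, count in support.items() if count >= min_support}


patterns = frequent_moves(trips, min_support=2)
print(patterns)
```

          With real passport data, longer sequences, time gaps between trips or constraints on the patterns would probably matter, which is where dedicated pattern mining algorithms come in.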

          • Ko moe says:

            Yes thank you sir.

            Please, guidence for the new method(update method) for this.My reference paper used like k-means, apriori algorithm but this methods are olds.So my supervisor told me to used new method.I want to apply the update method for this topics.

          • To apply a new method, you first need to know what researchers have already done before you. So as I said previously, what you need to do is read some research papers about discovering travel patterns. Then, you can understand what other people have done and do something new or different. There is no way to avoid that step. If you ask me to recommend something new, I cannot really tell you, because I have not read the papers on discovering travel patterns and I don’t know what people have done already. You need to do this by yourself.

  222. deeksha says:

    sir, i am going to join ph.d in data mining.
    can u suggest any research topic in data mining.
    please give me anyu problem definition

  223. reza says:

    HI
    I am a student at a major software and I want my dissertation in the field of data mining.
    In what context do better in your opinion?
    I thank you for your tips.

  224. mounika says:

    Hi Sir…I’m looking for a thesis problem on topic -time series data analysis using support vector machines.Can you suggest any?
    Thank you

  225. jumoke says:

    thanks for the advice so far sir. please i have interest in predicting students academic performance but with different articles have read am no still getting problems and limitations sir……I really need help

    • I have not read about that topic, so I cannot give very specific advice. But I think that many papers have probably been published already on this topic. Sometimes, it is easier to do research on a topic that is a little bit different, or where you add something new to the problem definition, rather than to work on a well-established topic that has been studied for many years.

      For example, what about predicting student performance using some type of data that other researchers did not consider? That could be a limitation. For example, if other researchers only considered the performance of previous years to predict the next year, but ignored data about gender, location, nationality, etc., then it could be a limitation. I just say that as an example (I have not read about this topic).

      Another kind of limitation could be in terms of the techniques that previous researchers have used. For example, you could try to apply some novel techniques not used by previous researchers to solve the same problem and hopefully get better results.

      Besides that, you could try to predict the performance in specific contexts that have not been studied. For example, there are different contexts for learning such as lectures, one-on-one learning, e-learning, self-learning, etc. If you find that your topic has not been studied in a specific context, then you may apply it to that new context.

      These are just some ideas. But I have not read about this topic for several years (I was working on e-learning before 2010), so I don’t really know what has happened in that field in recent years.

  226. jumoke says:

    ok sir am so happy for the reply sir . but am a little bit confuse about the type of data sir. Would be glad if you can shed more light on that sir

    • For a research project, you need to do something novel (something new). So you need to find a way to make sure that your project has some novelty.

      As I said it could be:
      – using another method
      – using a different type of data
      – doing the same thing in another context

      For the data, OK, I will give you some more details. Many researchers may have published papers on predicting student performance. But what kind of data have they used? Did they only use the grades of students from previous years? Did they also consider the student profile (age, gender)? Did they ask the students to fill in questionnaires about their personality to predict their performance? What I want to say is that many kinds of data could be used to predict the performance of a student. Above, I have mentioned three types of data. You could check what kind of data the researchers have used in previous work and think about whether there is something that they did not use. What about using data from mobile phones? Or other information?

      In any case, in a data mining project, you will need to think about the data. Because if you don’t have data, you cannot do your project. Either you can use some data from some websites such as PSLC datashop ( https://pslcdatashop.web.cmu.edu ) or you collect your own data. If you collect your own data, it takes more time but you can collect whatever you want or need. If you use the data of someone else, then you save time.

  227. jumoke says:

    thank you very much sir , i really appreciate. Will get back to you sir.
    Do enjoy your day sir

  228. Hello, I think it depends on what the expectations of your university are for this project. This kind of project would not be very original if you just use the same method as someone else on different data.
    If this is the project of a master thesis in computer science, then for me, it would be a weak project. The contribution would not be in computer science (since you would not create anything new in computer science); the contribution would only be about that data. Programming something that is already known is not a computer science research project. It is a programming project.

    But if your university is OK with this kind of project, then it could be OK. If it was at my university in Canada or China, I would not accept this kind of project. So you may want to ask the professors at your university what the expectations for the project are. Maybe it could be OK.

    Actually, it would be best if you could reimplement the method but also improve it with some additional features to address some of its limitations, if you can. Then this would be more like a research project. That would make a better project, and it would look more original. If you can think about some limitations of that method, and try to find solutions to improve it (and not just apply it to new data), it would be more interesting.

    • Hello, as I said in the blog post, those are just some examples. I did not give these examples to say that they are good topics. My point was that you should choose topics that you like instead of some random topics such as those. Actually, I just wrote this without even checking if they are good topics or what has been done. It was just an example.

      If you are interested in pattern mining, you could check sequential rule mining. There are many possibilities related to that for a project. Moreover, the source code of some algorithms and datasets are available in my SPMF library: http://www.philippe-fournier-viger.com/spmf/

    • Yes, I understand. Then I guess that they don’t expect you to do some breakthrough research in that case. Indeed you cannot do a big research project in 1.5 months.

      • I cannot find a topic for you. You may ask your supervisor to help you. I can only give some general comments about how to find a topic.

      • Hello,

        This paper is not a good paper. It is published in some unknown journal. So I just want to say that it is not a good example of good research. But since you have limited time, I guess that they don’t expect you to do something very innovative anyway. You can confirm this with the professors who will evaluate you. If they are OK with a simple project like that, then it is fine. I think it really depends on what the expectations of your school are.

    • Yes, I understand. Sometimes, supervisors do not help too much.

      These research topics are fine. But if you want to do good research, you need to read what people have done on these topics, and then find something that you can improve. Community detection is an important research topic. There have been many papers on this topic in recent years. You could start by reading the recent papers on this topic to see what has been done and try to come up with some new ideas.

      That is all I can say about this topic because I did not read about these topics.

  229. ko moe says:

    Hello sir
    Where I can get a referece paper, documentation and code for Artificial bee colony optimization? Please guidence me for this.

    • Have you searched using Google Scholar? I guess that you can easily find some papers about that on Google Scholar.

      For code, I don’t use bee optimization in my research so I don’t know. But I guess that you can find some by searching a little bit. Besides, you can always try to contact the authors of papers to ask for their implementation. Often people are willing to share their code or binary files.

  230. jumoke says:

    hi sir,mining educational data to reduce student profession redundancy meaning to predict students that graduate and don’t work in their course of study.
    my question is that is it a good idea

    • Yes, why not. If nobody has done it, then it is certainly useful and novel to do that. But if someone has done it already then you would need to find a way to do it better or add some other kind of novelties in your topic.

      I don’t know if someone has done it already or not. So you would need to read some articles to find out.

  231. jumoke says:

    thanks

  232. reza says:

    hi
    I need date set for my project… i need Instagram data set, where can i found it??!!!
    Thanks

    • You could get the data from… Instagram, by using a web crawler. But of course, you would need to make sure that it is not against their TOS (terms of service). Otherwise, they may ban your account.

      Another way is to look for public Instagram datasets by searching with a web search engine. Also, if you have read some papers that have used an Instagram dataset, you may try to contact the authors to ask if they can share their data with you.

  233. reza says:

    I want to study in field of ” Detecting fake account on online social network by content analysis ” for Master’s Thesis. Is there the possibility of working in this area?

  235. Harshi says:

    Sir, iam seraching for a thesis topic in Data Mining for my M.Tech course. I select Cloud computing but in this topic what i have to do i dont know. Please suggest me.

    • You can start by finding some recent papers related to what you are interested in. Then, you can think about how to improve their work, or combine two topics to create a new topic. It is not very easy to find a topic. But finding a topic should always start by reading some papers to know what other people have done recently in top conferences and journals in your field.

  236. Jumoke says:

    Sir, I read through online social networks and came across understanding user behaviour in online social network and also read an article which is just a survey ,they studied from different perspectives such as connection.traffic activity .user behavior and – malicious behaviour….then I went on to read on malicious behavior but found so many things that has been done in it…..please can you put me tru

    • That is good. Reading papers is the first step. But to do a project on social networks, you also need to have data, because without data you cannot do the research project anyway. The easiest way to get data is to download public data from a website like the Stanford Large Network Dataset Collection (SNAP). So thinking about the data is also very important when choosing a topic. If you choose a topic for which you cannot get data, then it is not a good topic. Of course, you could also collect your own data, but it is more difficult. Or you could contact the authors of the papers to ask them to share their data.

      As for choosing the topic itself, you need to read papers and you should also discuss with your supervisor, who is supposed to help you choose a topic. In my country, it is actually the job of the supervisor to help students find topics, as the supervisor should have the knowledge to identify good topics.

      If you really cannot find anything, then choose a paper that you like and try to improve the method by adding new features, or design something better by using a different approach.
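
      As a rough illustration of the data step, here is a minimal Python sketch that reads a graph stored as a plain-text edge list and computes node degrees. It assumes the file has one "u v" pair per line and comment lines starting with "#", which should be checked against the specific dataset that you download. The function name read_edge_list and the sample data are made up for this example.

```python
import io
from collections import defaultdict


def read_edge_list(lines):
    """Build an undirected adjacency map from 'u v' lines, skipping '#' comments."""
    graph = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        u, v = line.split()[:2]
        graph[u].add(v)
        graph[v].add(u)
    return graph


# Tiny in-memory stand-in for a downloaded file (made-up data).
SAMPLE = io.StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n2\t3\n")
graph = read_edge_list(SAMPLE)

# Degree of each node = number of distinct neighbours.
degrees = {node: len(neighbours) for node, neighbours in graph.items()}
print(sorted(degrees.items()))
```

      For a real file, you would pass an open file object instead of SAMPLE, for example open("my-graph.txt"), where the file name is hypothetical.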

  237. Jumoke says:

    Got the data from there already.

  238. Rey says:

    Good day sir Philippe, your blog really helps me decide what will be my research.
    I decided to modify a data mining algorithm to improve its accuracy. I would like to ask your advice: which data mining algorithms can still be improved in terms of accuracy and are not too complex to work on? Your help will be very much appreciated. Thank you.

    • Hello, glad that the blog is helpful. I think most algorithms can be improved in one way or another. Sometimes an algorithm can be improved in terms of speed or memory. Sometimes, it can be improved in terms of accuracy or other measures. Sometimes, it can be improved by adding new features, such as modifying the algorithm so that it can take some constraints into account. Those are all different ways of improving an algorithm. Of course, some algorithms are better studied than others. For example, for some classic algorithms like K-Means, there are probably hundreds of extensions. So proposing another version of K-Means may not be the most original thing to do. But it could still be done. Which algorithms? It depends on what you like. Data mining is a broad field and there are thousands of algorithms. So it is hard to say which one should be modified or improved. Personally, I work a lot on topics related to pattern mining. If you like that topic, you can check my software called SPMF, which provides the source code and datasets for hundreds of algorithms. You could start from some of those algorithms and quickly modify them with new features.

      • Rey says:

        Thank you very much, sir, for your very quick and informative reply. Actually, my adviser advised me to do research on a modified algorithm, especially in the areas of AI, data mining and information security, but I am more inclined towards data mining. I will check the link you gave, and I hope I can still ask you more about this. Thank you, sir.

  239. Robishna says:

    Hello Sir, it was very helpful to come across your website for information about data mining. My field of interest includes genetic algorithms in data mining. Could you help me find a thesis topic in this area? I would be very grateful to you. I hope you can suggest some interesting topics. Thank you for your time.

    • Hello,
      Glad you like the website. Genetic algorithms and other evolutionary algorithms like PSO are used for finding approximate solutions to difficult problems. Personally, I work in the field of pattern mining, where we want to find patterns in databases. This is a difficult problem. For that problem, genetic algorithms and other evolutionary algorithms are used either to hide sensitive patterns in order to preserve people’s privacy, or to discover patterns. For example, the idea of hiding patterns using this kind of algorithm is used in some papers by my colleague:

      Lin, J. C-W., Liu, Q., Fournier-Viger, P., Hong, T.-P., Pan, J. S. (2015). A Swarm-based Sanitization Approach for Hiding Confidential Itemsets. Proc. of the Eleventh Intern. Conference on Intelligent Information Hiding and Multimedia Signal Processing, pp. 572-583.

      Lin, J. C., Hong, T.-P., Fournier-Viger, P., Liu, Q., Wong, J.-W., Zhan, J. (2017). Efficient Hiding of Confidential High-Utility Itemsets with Minimal Side Effects. Journal of Theoretical and Experimental Artificial Intelligence, Taylor and Francis (to appear).

      Another possibility is to use a genetic algorithm to find the patterns. Here is an example paper by my colleague:

      Lin, J. C.-W., Yang, L., Fournier-Viger, P., Hong, T.-P., Voznak, M. (2016). A binary PSO approach to mine high-utility itemsets. Soft Computing, Springer. 19 pages (accepted, to appear).

      Besides these papers, many people have worked on similar topics to find or hide various types of patterns.

      For some of the algorithms by my colleague, you can get the Java source code and datasets from my SPMF data mining software:

      http://www.philippe-fournier-viger.com/spmf/

      There you will get the code of the following algorithms:

      Algorithms for mining high-utility itemsets in a transaction database using genetic algorithms:
      – the HUIM-GA algorithm (Kannimuthu et al., 2014)
      – the HUIM-GA-tree algorithm (Lin et al., 2016)
      Algorithms for mining high-utility itemsets in a transaction database using particle swarm optimization:
      – the HUIM-BPSO algorithm (Lin et al., 2016)
      – the HUIM-BPSO-tree algorithm (Lin et al., 2016)

      You could use some of these algorithms as a starting point for your project, for example by improving them or adding some novel features. Using the code of these algorithms could save you a lot of time.
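
      To give a rough idea of how a genetic algorithm can search for high-utility itemsets, here is a minimal sketch in Python. It is not the HUIM-GA or HUIM-BPSO code from SPMF (which is in Java), only a simplified illustration. The toy database, the profit values and the names utility, decode and genetic_search are all made up for this example.

```python
import random

# Toy transaction database: each transaction maps item -> purchase quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "d": 1},
]
# Unit profit of each item (made-up values).
PROFIT = {"a": 5, "b": 2, "c": 1, "d": 4}
ITEMS = sorted(PROFIT)


def utility(itemset):
    """Sum of quantity * profit over the transactions containing every item."""
    total = 0
    for t in TRANSACTIONS:
        if itemset and all(i in t for i in itemset):
            total += sum(t[i] * PROFIT[i] for i in itemset)
    return total


def decode(bits):
    """Turn a bit vector (one bit per item) into an itemset."""
    return frozenset(i for i, b in zip(ITEMS, bits) if b)


def genetic_search(min_utility, pop_size=10, generations=30, seed=0):
    """Return the high-utility itemsets encountered during the search."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in ITEMS] for _ in range(pop_size)]
    found = {}
    for _ in range(generations):
        # Record every individual whose utility reaches the threshold.
        for bits in pop:
            s = decode(bits)
            u = utility(s)
            if u >= min_utility:
                found[s] = u
        nxt = []
        while len(nxt) < pop_size:
            # Tournament selection of two parents (fitness = utility).
            p1 = max(rng.sample(pop, 2), key=lambda b: utility(decode(b)))
            p2 = max(rng.sample(pop, 2), key=lambda b: utility(decode(b)))
            # Single-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, len(ITEMS))
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < 0.1 else b for b in child]
            nxt.append(child)
        pop = nxt
    return found


for itemset, u in sorted(genetic_search(15).items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), u)
```

      Since the search is approximate, it may miss some high-utility itemsets. This is the usual trade-off of evolutionary approaches compared with exact algorithms.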

  240. ashu says:

    hello sir,
    I am interested in sequential pattern mining. Can you suggest a recent research topic,
    and important books, websites and journals for research? Thanks!

    • Hello,
      For an overview of the topic, you can read my recent survey of sequential pattern mining:

      Fournier-Viger, P., Lin, J. C.-W., Kiran, R. U., Koh, Y. S., Thomas, R. (2017). A Survey of Sequential Pattern Mining. Data Science and Pattern Recognition (DSPR), vol. 1(1), pp. 54-77.

      For implementations of sequential pattern mining algorithms, you can check the SPMF library: http://www.philippe-fournier-viger.com/spmf/ It will give you source code and datasets, which will be useful for starting your research.

      For the topics, there are many possibilities. Basically, you can look at the different topics in sequential pattern mining or sequential rule mining and combine two of them to create a new topic. For example, one topic is sequential rule mining. Another topic is negative patterns. You can combine these two topics to obtain negative sequential rules, and modify an existing algorithm to mine them. Ideally, you want a topic that is a little bit challenging, that is, where the combination of the two topics creates some new challenges. If it is too easy, it may not be a good topic.
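
      As a small illustration of this kind of combination, here is a Python sketch that computes the support and confidence of a sequential rule X → Y under one common definition (all items of X appear before all items of Y in a sequence). It then shows a hypothetical "negative" variant X → not Y. The negative definition is only an assumption made for this example, not an established one, and the database and function names are made up.

```python
def occurs_before(sequence, x, y):
    """True if the sequence can be cut so that x is in the first part and y in the rest."""
    for k in range(1, len(sequence)):
        left = set().union(*sequence[:k])
        right = set().union(*sequence[k:])
        if x <= left and y <= right:
            return True
    return False


def contains(sequence, x):
    """True if every item of x appears somewhere in the sequence."""
    return x <= set().union(*sequence)


def rule_measures(database, x, y):
    """Support and confidence of the sequential rule x -> y."""
    n_rule = sum(occurs_before(s, x, y) for s in database)
    n_x = sum(contains(s, x) for s in database)
    support = n_rule / len(database)
    confidence = n_rule / n_x if n_x else 0.0
    return support, confidence


def negative_rule_measures(database, x, y):
    """Hypothetical variant: x occurs but is NOT followed by y."""
    n_neg = sum(contains(s, x) and not occurs_before(s, x, y) for s in database)
    n_x = sum(contains(s, x) for s in database)
    support = n_neg / len(database)
    confidence = n_neg / n_x if n_x else 0.0
    return support, confidence


# Toy sequence database: each sequence is a list of itemsets (made-up data).
DB = [
    [{"a"}, {"b"}, {"c"}],
    [{"a"}, {"c"}],
    [{"c"}, {"a"}],
    [{"b"}, {"a", "d"}, {"c"}],
]

print(rule_measures(DB, {"a"}, {"c"}))           # (0.75, 0.75)
print(negative_rule_measures(DB, {"a"}, {"c"}))  # (0.25, 0.25)
```

      The challenging part of such a topic would be to define the negative rules properly and to design an efficient algorithm to mine them, rather than checking one rule at a time as done here.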

  241. prachi says:

    Hello Sir.
    I am an undergraduate student. I would like to know whether there is any new concept apart from classification, clustering, prediction, association rules, etc. that I can work on for my project. For example, is there any completely new concept in data mining that is still under research?

    • In data mining, the main research areas are those that you mentioned. But you could also add topics such as outlier detection to that list, or some subtopics such as stream mining, graph analysis and spatial data mining. Moreover, you can address these topics from various angles: (1) new applications, (2) faster, more memory-efficient, more scalable or more accurate algorithms, (3) algorithms with more features, (4) theoretical contributions, etc.
      All the topics that you have mentioned are core research areas in data mining, but they are still under active research and there is still a huge number of research opportunities on all of them. If you dig deeper into any of these topics by reading recent papers from top conferences and journals, you can see what researchers are currently working on, and find something that you can improve. Or you can create new topics by combining two existing topics in a non-trivial way. There is rarely anything that is completely new. For example, “big data” is just a buzzword to talk about scalable algorithms, a problem that has been studied for decades. Another example: “deep learning” is another buzzword for a special type of “artificial neural network” (ANN), and work on ANNs started more than 50 years ago. So my advice is to choose a research area that you like, and then read the current papers to find some up-to-date problems to solve or methods to improve.
