Hi all, This is to let you know the good news that the UDML workshop on Utility Driven Mining and Learning will be back this year, at IEEE ICDM 2020, for the third edition.
This is a good venue to submit your papers about data mining and machine learning, especially given that all accepted papers will be published in the IEEE ICDM workshop proceedings, just like last year! Also, we are planning to have a special issue in a good SCI/EI journal for the best papers of the workshop (to be confirmed).
But we are also looking for papers on other more general topics related to the concept of utility, such as analyzing or learning the important factors (e.g., economic factors) in the data mining or machine learning process. Here is a non-exhaustive list of some potential topics:
Theory and core methods for utility mining and learning
Utility patterns mining in large datasets, e.g., high-utility itemset mining, high-utility sequential patterns/rules mining, high-utility episode mining, and other novel patterns
Analysis and learning of novel utility factors in mining and learning process
Predictive modeling/learning, clustering and link analysis that incorporate utility factors
Incremental utility mining and learning
Utility mining and learning in streams
Utility mining and learning in uncertain systems
Utility mining and learning in big data
Knowledge representations for utility patterns
Privacy preserving utility mining/learning
Visualization techniques for utility mining/learning
Innovative applications in interdisciplinary domains, like finance, biomedicine, healthcare, manufacturing, e-commerce, social media, education, etc.
New, open, or unsolved problems in utility-driven mining
Submissions are limited to 10 pages and must be formatted according to the IEEE 2-column format. Papers will be evaluated based on the evaluation criteria of the main IEEE ICDM 2020 conference for research papers. In particular, papers must present original research that is not under consideration in other journals, conferences and workshops.
In this blog post, I will share another talk that I have recorded recently. This time, I will explain a new paper from my team about discovering cost-effective patterns using some algorithms called CEPB and CEPN. Mining cost-effective patterns is a new topic in pattern mining that combines the concept of utility with that of cost.
Hope you will enjoy this video! If you want more details about this topic, you can read this paper:
Today, I will share a short keynote talk (28 min) about discovering interpretable high utility patterns in data that I have presented at the CCNS 2020 conference. This talk gives an overview of techniques for finding interesting and useful patterns that can help to understand data.
Hope you will enjoy this video! If you want to know more about how to find interesting and useful patterns in data, I have written a series of blog posts on this topic.
I have also published various videos that you can find on this blog. Moreover, to apply this in your projects, you can use the SPMF open-source data mining software (of which I am the founder). It provides more than 150 algorithms for identifying useful patterns in data.
In this blog post, I will talk about how to record a research talk on a computer as a video. This is an important topic for researchers for at least three reasons. First, sharing videos about your research can help to promote it. Second, a researcher may be invited to send a video of his talk to a conference if he cannot attend it because of issues such as not obtaining a travel visa. Third, recording a video of a talk is useful as a backup plan when giving a talk online.
The steps to record a presentation as video on a computer are as follows.
Step 1. What kind of presentation do you want to give?
The first step is to decide on the type of presentation that you want to record. The most common types are:
A) Slides with voice-over: A person will record some slides with a voice-over.
B) Video of a talk: A person will record a video of himself talking without slides.
C) Complex presentation: A person will combine multiple elements such as a presentation with slides, a video of himself, and audio.
Doing a presentation of type A) or B) is easier than of type C). But a more complex presentation may sometimes appear more interesting.
Step 2. Make sure that you have the right equipment
Recording a presentation can be done using very basic equipment like a cellphone or the microphone and webcam of a laptop computer. However, the quality of built-in webcams and microphones is often poor. To record video presentations, I use:
A professional microphone. I have bought one that is not so expensive, can be plugged in by USB, and comes with a tripod (the SAMSON C01UPRO; see below). Using such a microphone makes a huge difference in sound quality compared to the built-in microphone of my laptop. Some people will also buy additional accessories for their microphone like a pop filter and a microphone shock mount.
A good webcam. I have also bought a good webcam (Logitech C922 Pro Stream), which can record in high definition with good colors. A nice feature is that the webcam can also be mounted on a tripod and that it has a free background removal feature that I will talk more about later.
Light. A good lighting source is also important if you are going to record videos of yourself using a camera and want to look good. Some cheap LED lamps or LED panels can for example be purchased and installed on your desk.
The above are perhaps the most important pieces of equipment to increase the quality of recorded talks. Other equipment could also be added like tripods, a green screen for shooting videos, good headphones, etc. Here is a picture of my relatively simple setup for recording videos. I use two LED lamps, and an external webcam and microphone.
Step 3. Prepare your presentation
Before recording a talk, it is recommended to prepare your talk well and rehearse it a few times. This is true for any talk, so I will not discuss it further here.
Step 4. Record the video
Depending on the type of presentation that you will make, it will be more or less complicated to record the presentation. I will discuss a few cases below.
For a presentation of type A) (slides + voice-over), it is quite simple. One can prepare his slides with a software such as Microsoft Powerpoint and then use the “Record slide show” feature to add voice to the presentation. This is done by clicking on the button below:
The result is a Powerpoint presentation that can be played with audio on any computer equipped with Powerpoint. Then, for more convenience, there is some software to convert a Powerpoint presentation to a video.
For a presentation of type B) (video from a camera), one can use some basic software to record from a camera such as a webcam. Some basic software to record a video comes packaged with most operating systems (e.g. the Camera app in Windows 10). However, there also exist many other software programs that let you record videos but also add special effects, transitions, texts and other elements to your videos. Some video editing software is quite powerful and easy to use (e.g. Wondershare Filmora, Movavi Video editor) while some is harder to learn but more powerful (e.g. Adobe Premiere).
For a presentation of type C) (slides + video of the person), it is more complicated to record because you need to record not only your slides but also a video of yourself at the same time, and then put them together in a video. Here is a picture of the result that we may want to achieve:
To do such video recording, I use the Camtasia software, which allows me to record my screen or a Powerpoint presentation with a webcam at the same time, and then to edit the resulting video with effects, transitions, text, etc. This software is not free, but it is very easy to use and powerful. Other alternative software could certainly be used.
I first open my slides with Microsoft Powerpoint.
Then I open the “Camtasia Recorder“. You can see the interface, below:
There, I first select the part of the screen that I want to record by clicking the “Custom” button. Then, I choose to record using my webcam by clicking “Camera on” and using my microphone by clicking “Audio on“. Then, to start recording, I click the “rec” button.
Then, after I finish recording my presentation, I click “Stop“.
After clicking “Stop” this opens the Camtasia editor and there I can edit the video that I have recorded. The interface of the Camtasia editor looks like this:
In the editor, it is possible to cut some part of the videos, add effects, and many other things. As you can see in the picture above, I have two different video tracks (at the bottom), one for the video recorded from the webcam (with a green background), and one for the presentation.
Then, since I have shot the video of myself with a green background, I can remove the background behind me. This is done by clicking on the video track of me and adding the “Remove a color” visual effect where I choose “green” as the color to be removed. (see a screenshot below):
This effect, called “Chroma key”, is useful for making a nice presentation. It gives my video a transparent background so that I can overlay it on top of my slides! If you also want to do this, you first need to shoot the video of yourself with a green background. There are two ways to do this. The traditional way is to shoot with a green screen behind you like this (source of the picture: Amazon).
However, buying a green screen is actually not necessary. A simpler solution is to use a virtual webcam software like ChromaCam that will use machine learning to automatically remove your background and put a green background behind you.
This is what I have done in the example above to avoid buying a real green screen. The latter would of course give a better effect but it would require additional space and money. The virtual webcam software ChromaCam can be used for free but in that case, it will add a watermark to your videos. To remove the watermark, it is possible to buy a license. Or if you have a webcam like the C922 Pro Stream or Brio from Logitech, then ChromaCam will be free to use for the Chroma key effect. So this is one of the reasons why I chose to buy the C922 Pro Stream for my setup. There are some other alternatives to ChromaCam like XSplit VCam but it is also not free and worse, it is based on a subscription model that requires paying every month. There might be some other free alternatives to ChromaCam but I did not find a good one that is easy to use and gives good results. Here is a picture of the Chroma key effect obtained using the ChromaCam software:
As you can see above, it can remove the background quite well, although it may cut a bit of my hair and shoulder sometimes 😉
Another important thing that I do using the Camtasia Editor is to add a “Cursor effect” so that my mouse pointer is highlighted in yellow in my videos. The result looks like this:
To do that, I click on the video track of my slides in Camtasia Editor and select a Cursor effect from the one offered:
Lastly, after recording the video, the last step is to encode it in an appropriate format. I usually choose MP4 because it can be played by most browsers and devices. Then I publish the video. There exist various websites for publishing videos with different features. In my case, I already pay a hosting company to host my website. Thus, I put my videos on that platform.
In this blog post, I have provided some tips about how to record a research talk on a computer. Hope this has been interesting and will be useful!
If you have some comments or other complementary advice, please leave a comment below.
Today, I present the CPT and CPT+ sequence prediction models in a video. Sequence prediction is an important task in data mining which consists of predicting the next symbols of a sequence. It can be used for example to predict the next word that someone will type on a keyboard, or the next location where someone will go.
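To make the task of sequence prediction more concrete, here is a minimal Python sketch. Note that this is NOT the CPT or CPT+ model presented in the video. It is only a very simple baseline that I use for illustration, which predicts the symbol that most often followed the last observed symbol in the training sequences. The function names and the toy sequences of locations are made up for this example.

```python
from collections import Counter, defaultdict


def train_next_symbol_model(sequences):
    """Count, for each symbol, which symbols directly follow it in the training sequences."""
    followers = defaultdict(Counter)
    for sequence in sequences:
        for current, following in zip(sequence, sequence[1:]):
            followers[current][following] += 1
    return followers


def predict_next(model, sequence):
    """Predict the symbol seen most often after the last symbol of `sequence`."""
    if not sequence or sequence[-1] not in model:
        return None  # nothing to base a prediction on
    return model[sequence[-1]].most_common(1)[0][0]


# Toy training data (hypothetical): sequences of locations visited by a person.
training = [
    ["home", "cafe", "office"],
    ["home", "cafe", "office", "gym"],
    ["home", "gym"],
]
model = train_next_symbol_model(training)
print(predict_next(model, ["home"]))
print(predict_next(model, ["home", "cafe"]))
```

Such a baseline only looks at the last symbol of a sequence. More advanced sequence prediction models aim to make use of more of the available information.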
First, I would like to wish a happy new year to all readers of this blog. I wish you health, happiness and also success in your research projects! I am also thankful to all those who have used and/or contributed to the SPMF data mining software, which I founded already a decade ago! Time goes fast, but the project is still active, and I am preparing a new release with about 10 new algorithms that will be released in one or two weeks. The new algorithms have been contributed by various people. By the way, if you would like to contribute code to SPMF, it is also welcome.
Now, I want to talk a little bit about the new year. The new year is a good time to think about past achievements and update our goals or set new goals. Having clear goals and working hard towards these goals is key to being successful.
That is all I wanted to say for today!
== Philippe Fournier-Viger is a full professor and the founder of the open-source data mining software SPMF, offering more than 170 data mining algorithms. If you like this blog, you can tweet about it and/or subscribe to my twitter account @philfv to get notified about new posts.
So you have a paper accepted for presentation at an academic conference and you wonder how to prepare for attending the conference? In this blog post, I will discuss this topic.
Making a travel plan
For an international conference, the first thing to do before attending the conference is to check the travel requirements. Travelling to several countries or territories requires applying for a visa, and obtaining a visa can sometimes take a long time and require having various documents ready such as an invitation letter. Thus, it is better to start the process of applying for a visa early if needed. One may also need to obtain approval from his university or company to attend a conference. If one cannot attend the conference, he should also let the organizers know about it or arrange for someone else to replace him.
After ensuring that you can enter the country/territory where the conference is held, the second most important thing is to have a transportation plan. For international conferences or domestic conferences that are far away, one should reserve an airplane/bus/train ticket early, as prices may increase and fewer choices may be available over time. Generally, I would recommend arriving in the city where the conference is held at least one day before it starts.
You may also want to pay for a travel insurance and check if some vaccines are required. Travel insurance can sometimes be purchased with your airplane ticket.
Then, one should also book a hotel room early. When a conference is held in a famous city, sometimes the most affordable hotels or those that are the closest to the conference may become fully booked quickly.
Preparing your talk, and giving a good talk
If you are planning to give a talk (a presentation of your research work) at a conference, you should prepare your presentation BEFORE the trip. I have previously written a blog post about how to give a good oral presentation at an academic conference and another one here. You may read these blog posts, which give a lot of advice about how to prepare and deliver a good talk. Then, after your presentation is ready, if you are using electronic slides such as PPT slides, you should put them on your laptop, on a USB drive and perhaps also keep a copy in your e-mail to avoid any problem.
If one has to present a poster at an academic conference, he should also prepare the poster in advance and keep some time for printing it.
Preparing a networking plan
In my opinion, the most important reason for attending an academic conference is to meet other researchers because all the papers presented at a conference can be read online anyway. To take advantage of the networking opportunities offered by a conference, you may look at the list of attendees before attending the conference and make a list of people that you would like to meet and have a discussion with. Meeting other researchers is important for the career of a researcher as it allows one to exchange ideas, develop collaborations and look for opportunities such as finding a post-doctoral, researcher or faculty position.
At a conference, many people will ask you where you are from and what kind of research you are doing. It is good to have a short 30-second or 1-minute answer ready for these questions, as it may help to start a discussion. It is also good to bring your business cards if you have some, and it is useful to invite the people that you meet to connect on a professional social network like LinkedIn, so you may want to install its app on your phone. By the way, if you don’t already have a website, or a profile on LinkedIn or on academic social networks like ResearchGate, it is a good idea to have one for your career so that people can find you online.
When you participate in an academic conference, you should also look at the schedule and make a plan of the activities that you want to attend to use your time well. In particular, you should not miss the networking activities like coffee breaks, the banquet, the reception, and poster sessions to talk with other people. Also don’t be shy. If you don’t know anyone, then remember that most people attending the conference also probably don’t know anyone and will be happy to talk with you.
Taking the airplane
If you fly to a conference, it is important to prepare your luggage well and what you will carry in the airplane. I generally prepare a luggage and also a backpack or small bag that I bring with me in the airplane. In that latter bag, I carry:
My passport, a printed copy of airplane tickets (because you may have to show your return ticket when arriving in another country), visa or other required travel documents, and travel insurance.
Computer and accessories (USB drive, charger, laser pointer, mouse, adapter to connect the computer to a projector, etc.), cellphone.
Earplugs (for the noise in the airplane), headphones and an adapter for using them in an airplane (because headphones provided in airplanes are sometimes quite bad)
Pens (always useful for filling forms when arriving in another country)
International power plug adapter (you should check if needed before travelling) to be able to use your electronic equipment
Cash, debit cards, credit cards, and other valuable items (jewelry, etc.).
Medicines (if needed)
Book (if I want to read in the airplane)
I also bring a very thin sport jacket to wear in the airplane in case it is too cold (but you can also ask the flight attendant for a blanket).
I then put all other things in my luggage. For a conference, it is important to bring some nice clothes, but they do not need to be highly formal either.
Before boarding the airplane, you should also choose your seat when checking in. In an airplane there are some good seats and some bad seats. For a long flight, I prefer to have an aisle seat (a seat beside the walking alley) because if I need to go to the washroom or walk a bit, I don’t need to ask other people to let me pass (they may be sleeping), and there is no one beside me on one side. The second best seat is the window seat, because there is also no one beside you on one side and you can lean on the window to have a rest. The worst seats are the middle seats, because you may be squeezed between two persons, you can’t enjoy the window view, and you still need to ask other people to let you pass if you need to go to the washroom or walk a bit.
Arriving at the conference
When you arrive at the conference, the first thing to do is to register at the registration desk. Then, you can enjoy the various activities of the conference.
In this short blog post, I gave some advice about attending a conference that I hope will be useful, especially to those attending an academic conference for the first time. If you have some questions or if you think that I forgot to mention something important, then please leave a comment below!
In this short blog post, I will answer the question: what is the difference between Machine Learning and Data Mining? I will first explain what is artificial intelligence, machine learning and data mining. Then, I will answer the question.
What is artificial intelligence and machine learning?
Artificial intelligence is a field of research, which aims at developing software that can do some tasks that require intelligence. What counts as a task that requires intelligence is open to debate. It can be for example to play chess, translate documents, write a novel, or choose the best route to drive from one location to another. This broad definition of artificial intelligence that I have given is based on the behavior of a software program (what a software program can do rather than how it works). Some people define artificial intelligence in a stricter way by requiring that an artificial intelligence should also simulate the mechanisms that intelligent beings such as humans use for producing intelligent behavior. In other words, an intelligent program should not only appear to behave intelligently but should also mimic how the brain works, for example.
There exist many types of artificial intelligence techniques. Some early research on artificial intelligence proposed the so called expert systems where a human expert would give knowledge to the system (for example, as a set of IF-THEN rules), which the system would then apply to behave intelligently. A problem with this approach is that writing knowledge by hand is time-consuming and prone to error for complex tasks, and that it is not always easy for a human expert to encode his knowledge. Such systems have also been called knowledge-based systems.
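To illustrate the idea of a system applying IF-THEN rules provided by a human, here is a minimal Python sketch of a rule engine that repeatedly applies rules to a set of known facts until nothing new can be derived (this strategy is often called forward chaining). The rules and facts are toy examples that I made up for illustration and are not taken from any real expert system.

```python
def forward_chain(facts, rules):
    """Apply IF-THEN rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all its conditions are known facts.
            if conclusion not in known and set(conditions) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules written by hand: (IF all these conditions, THEN this conclusion).
rules = [
    (["has_fever", "has_cough"], "may_have_flu"),
    (["may_have_flu"], "recommend_rest"),
]

print(sorted(forward_chain(["has_fever", "has_cough"], rules)))
```

Even in this tiny example, all the knowledge has to be written by hand as rules, which is the limitation discussed above.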
Another type of artificial intelligence systems does not require knowledge or data. This is the case for example of algorithms such as A* (a-star), which are used for example to play games. Consider a simple game like Tic Tac Toe. All the possible moves in this game can be viewed as leading to different states, including some states where one wins or loses. Because the number of possible states for such games is rather small, a simple algorithm to play such games can search through all the possible states or a subset of them to select the best move to perform.
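The idea of searching through the possible states of Tic Tac Toe can be sketched in Python as follows. Note that this sketch uses the classic minimax search rather than A*, because minimax is a simple way of exploring all states of a two-player game. The board representation (a string of 9 characters) and the function names are my own choices for this example.

```python
from functools import lru_cache

# The eight lines (rows, columns, diagonals) that win the game.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (score, move) when `player` moves next.

    The score is from the point of view of X: +1 if X wins,
    -1 if O wins and 0 for a draw, assuming both sides play perfectly.
    """
    won = winner(board)
    if won is not None:
        return (1 if won == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # board is full: draw
    best_score, best_move = None, None
    for move in moves:
        child = board[:move] + player + board[move + 1:]
        score, _ = minimax(child, "O" if player == "X" else "X")
        # X tries to maximize the score, O tries to minimize it.
        if best_score is None or (score > best_score if player == "X" else score < best_score):
            best_score, best_move = score, move
    return best_score, best_move


# X has two marks on the top row and it is X's turn to play.
board = ("XX "
         "OO "
         "   ")
print(minimax(board, "X"))
```

Since the number of states is small, this search explores the whole game from any position. For games with many more states such as chess, an exhaustive search like this is not feasible and only a subset of the states can be explored.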
Other artificial intelligence systems are not preprogrammed and are designed to learn by themselves from data. The field of research aiming at designing such systems is machine learning. Some popular types of machine learning systems are artificial neural networks, which are very loosely inspired by the brain. Such systems are generally trained to do some specialized task using some training data indicating what the expected behavior is in a given situation. The system then generalizes from this data to make decisions in new but similar situations. This process is called supervised learning. This is for example the case of a system for reading handwritten texts. Such a system can be trained using handwritten letters where correct answers are provided by a human. After training the system with many examples of letters, the system can then recognize new letter drawings. There also exist some artificial intelligence systems that can learn from data without knowing the correct answers beforehand. This is called unsupervised learning. To summarize, machine learning is a subfield of artificial intelligence where a software program can learn from data.
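As a small illustration of supervised learning, here is a Python sketch of one of the simplest possible learners, a nearest neighbor classifier (not a neural network). It predicts the label of a new example by finding the most similar labeled training example. The two numeric features describing each letter drawing are purely hypothetical values that I invented for this example.

```python
import math


def nearest_neighbor_predict(training_examples, features):
    """Return the label of the training example closest to `features`."""
    closest = min(training_examples,
                  key=lambda example: math.dist(example[0], features))
    return closest[1]


# Hypothetical training data: (features of a letter drawing, correct label given by a human).
training_examples = [
    ((1.0, 1.2), "a"),
    ((1.1, 0.9), "a"),
    ((3.0, 3.1), "b"),
    ((2.9, 3.3), "b"),
]

# A new drawing that was not seen during training.
print(nearest_neighbor_predict(training_examples, (2.8, 3.0)))
```

The correct answers provided by a human are the labels "a" and "b" in the training data, and the program generalizes to a new drawing that is similar to the ones it has seen.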
What is data mining?
Data mining has a different focus. As the name implies, data is key to data mining. Without data, one cannot do data mining. The goal of data mining is to analyze data by discovering knowledge hidden in the data. For example, a classic data mining task is frequent pattern mining, which consists of finding the sets of values that frequently appear in data (e.g. discovering that many people buy bread with cheese and a chocolate bar at a supermarket). This task is unsupervised and has the sole purpose of discovering something new in the data. Generally, such techniques can be used to understand the past or predict the future.
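The task of frequent pattern mining can be sketched in a few lines of Python. This is a naive version that counts every combination of items in each transaction, which is only practical for tiny datasets. Efficient algorithms avoid enumerating all combinations. The toy transactions and the minimum support threshold below are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return each itemset appearing in at least `min_support` transactions, with its count."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        # Count every non-empty combination of items of this transaction once.
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: count for itemset, count in counts.items()
            if count >= min_support}


# Hypothetical supermarket transactions.
transactions = [
    ["bread", "cheese", "chocolate"],
    ["bread", "cheese"],
    ["bread", "cheese", "chocolate", "milk"],
    ["milk"],
]

result = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

In this toy data, bread and cheese are bought together in three of the four transactions, which is the kind of pattern that this task aims to reveal.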
Some other data mining techniques are explicitly designed for extracting models from data that can then be used for making predictions. This is the case of techniques such as neural networks, decision trees, and regression models. Now, you probably remember that I already talked about neural networks as a machine learning technique. This is because data mining is actually overlapping with machine learning. In other words, some data mining techniques can also be called machine learning techniques.
What is the difference between machine learning and data mining?
Though machine learning and data mining overlap, and both require data, data mining traditionally focuses more on providing knowledge or models that are explainable or interpretable by humans, while machine learning studies are often more focused on what a model does. As a result, several machine learning models are designed to provide a high accuracy for some tasks such as handwritten character recognition, but appear to work like a black box to humans. There is thus currently an important need to build more interpretable or explainable machine learning models. The problem of black-box machine learning models is illustrated in this funny picture from XKCD (credit: https://xkcd.com/1838/ ):
That is all for this blog post. I just wanted to discuss differences and similarities between machine learning and data mining. If you would like to add something to this, you can post a message in the comment section, below.
— Philippe Fournier-Viger is a full professor working in China and founder of the SPMF open source data mining software.
This week, I attended the 7th Big Data Analytics conference (BDA 2019), which was held in Ahmedabad, India from the 17th to the 20th of December 2019. This was a great event with good keynote speeches, invited talks, research papers, tutorials, a workshop on IT for agriculture, a panel and social activities. In this blog post, I will give a brief report about the conference.
The Big Data Analytics (BDA) conference
The BDA conference is an international conference about Big Data Analytics, Data Mining, Machine Learning and related topics. This year is the 7th edition of the conference. BDA is held every year in different cities of India but it attracts papers from several countries. This year, authors from 13 countries published papers, and the program committee, invited talks and keynote speeches comprised experts from numerous countries, as well as local experts. There were about 150 to 200 people attending the conference.
The proceedings of the Big Data Analytics (BDA 2019) conference are published by Springer in the LNCS (Lecture Notes in Computer Science) series, which ensures a good visibility to the published papers. The papers are indexed by EI, DBLP and other major indexes for computer science. This is the proceedings book, which is available electronically to attendees:
It was a pleasure for me to work as Program Committee co-chair for the conference to help select papers and build the program. This year, there were about 53 submissions, from which 13 were selected for publication (an acceptance rate of about 25%), and five invited papers were also published, for a total of 18 papers. The idea of having invited papers from top researchers was a good one, as it brought some really good papers.
Location of the BDA 2019 conference
The conference was held at Ahmedabad University. It is a relatively new university (10 years old). The university is located in the city of Ahmedabad, in the state of Gujarat, India.
Ahmedabad is famous for being a place where Mahatma Gandhi lived, among other things. It also has some historical buildings and structures in and around the city that are quite interesting. People living in this city are mostly vegetarian, and in that state, all alcohol is prohibited (unlike in other parts of India). There is also a local language spoken by the population. It was interesting to visit the city.
The local organization was very well done. Everything was well arranged. For example, an airport pickup service was offered to all international attendees, and e-mails were always answered very quickly by local organizers.
Day 1. Registration
On the first day, I registered and received a nice bag with a pen, notebook, schedule and other things inside.
The conference badges offered by the conference are of good quality. They are made of a wood-like material where names and affiliations appear to have been etched into the material.
Day 1. Tutorial and Workshop on IT in Agriculture
On the first day of the conference, there were tutorials. Moreover, there was a workshop on IT in agriculture. I listened to the keynote by Prof. P. Krishna Reddy, which was quite interesting. He talked about how he has developed computer systems to provide advice to farmers in India, in various projects for more than 10 years. This is interesting as it is not just theory but has real practical applications that can change the lives of many people.
Day 2, 3, 4 – Paper presentations
The paper presentations were quite interesting. I will not report on the details of each paper. But the papers covered a wide range of topics from pattern mining, information extraction, online review helpfulness prediction, urban tree type classification to data warehousing.
As I am a researcher working on pattern mining, I am particularly interested in this topic. There were three papers on pattern mining:
Duong, H., Truong, T., Le, B., Fournier-Viger, P. (2019). An Explicit Relationship between Sequential Patterns and their Concise Representations. Proc. of 7th Intern. Conf. on Big Data Analytics (BDA 2019), Springer, pp. 341-361. (this is a paper about a new way of finding frequent sequential patterns using generator and closed sequential patterns).
P. P. C. Reddy, R. Uday Kiran, Koji Zettsu, Masashi Toyoda, P. Krishna Reddy, Masaru Kitsuregawa: Discovering Spatial High Utility Frequent Itemsets in Spatiotemporal Databases. 287-306 (this is a paper about extending high utility itemset mining for spatial data)
Day 2 – Cultural performance and reception
On the evening of the second day, there was a music and dance show, performed by students of Ahmedabad University. Although the students may not be professionals, the show was quite good. It presented some traditional dances and Indian songs. The show was followed by a dinner.
Day 3 – Panel: Big Data Analytics is not AI
On the third day, there was a panel titled “Big Data Analytics is not AI”, organized by Anirban Mondhal, which sparked a lot of discussion. I was one of the panel members, along with Goce Trajcevski, Shashi Shekhar, Ladjel Bellatreche, Sanjay Madrias and others. Here is a picture (some panel members not shown):
The topic was the relationship between machine learning and big data analytics. Four questions were asked to panel members, and then the audience asked additional questions.
Should CS students learn theory and skills related to both BDA and ML? My answer: Artificial intelligence and big data analytics are popular. It is thus good for students to at least become familiar with these topics. Moreover, if one wants to become a user of these techniques, he should not only learn how to utilize the many libraries available that are easy to use but also understand the theory and the assumptions behind these techniques. This is important because if one does not understand the assumptions or theory behind these techniques, one may apply them wrongly. Also, before learning big data analytics and machine learning, it is better to have a strong foundation in the core concepts behind them such as databases, linear algebra and statistics.
Should researchers work across both BDA and ML or specialize in any one of these areas? My answer: As researchers, we always tend to specialize in some area. This is reasonable because we are expected to publish state-of-the-art research, which requires knowing the research in a given field well.
Having said that, I would like to talk about the relationship between big data analytics and machine learning. Generally, the goal of artificial intelligence is to build software that can perform tasks that are said to require intelligence. On the other hand, the goal of big data analytics or data mining is to discover useful information or build useful models from data to understand the past or predict the future. Thus, artificial intelligence and big data analytics have different goals. They are nonetheless related. The main link is that many techniques from artificial intelligence require data to train models. The artificial intelligence techniques that are not explicitly programmed but instead learn from data are called machine learning. For these techniques, the requirements for cleaning, preparing, transforming, storing and handling data may be the same as in big data analytics. But there also exist artificial intelligence techniques that do not require training data. This is the case for some traditional AI techniques such as theorem provers, path planners and logic reasoners.
There are also some differences between machine learning and big data analytics. An important one is that machine learning tends to focus on building models that perform a task well or are accurate but are often black boxes (the model works, but the user does not know why or how it makes its predictions; this is the case for many deep learning models, for example). In contrast, many big data analytics techniques focus on discovering interpretable insights and on the visualization of results. For AI researchers, there is a lot to learn from data science/data mining about building explainable and interpretable models.
That said, machine learning and big data analytics/data mining are also overlapping fields. Some techniques, such as neural networks, can be said to belong to both machine learning and big data analytics.
In the future, will the industry have separate roles for BDA and ML specialists? My answer: In the industry, it depends on the size of the company. Bigger companies tend to have people doing more specialized tasks, while smaller companies may have people doing many tasks. Recently, it has been interesting to see on websites like LinkedIn that many specialized job titles have appeared, such as:
• Data scientist
• Data engineer
• Data architect
• Data developer
• Data analyst
• Data warehouse software engineer
• Database engineer
• Statistician
• Business analyst
• Machine learning engineer
• Predictive modeler…
I personally do not clearly know the differences between all these job titles, and I often see contradictory definitions of them.
From a long-term perspective, do you see BDA and ML converging as a single research area or will they grow independently? My answer: No, I do not see them converging into a single research area. As I said previously, big data analytics and machine learning have many things in common but also some different goals. Besides, in academia, there are some clearly defined communities, such as statistics, data mining and machine learning, and researchers tend to stay in their field and publish in the journals and conferences of their community. It would take some time and major effort to redefine these communities.
Day 3 – Banquet
On the evening of the third day, there was a banquet outside. There were some tables serving Indian food and some chairs for those who wanted to sit, while others ate standing and chatted. As always, banquets are good for networking with other researchers. I had some good discussions with friends and met some other international and local researchers. Moreover, I was happy to talk with some local students who attended the conference and asked me questions about how to learn data science and machine learning. I was also happy to meet some professors from local universities who told me that they were using my SPMF data mining software for teaching data mining.
Here is a group photo of BDA attendees:
Next year: BDA 2020
Next year, the BDA 2020 conference will be held in New Delhi, India. Then, BDA 2021 will be held in Allahabad, India.
In this blog post, I have given a brief report about the 7th Big Data Analytics conference (BDA 2019), from my perspective. Overall, it was a great conference, and I am very happy to have attended it. It was the first time that I went to India, and it was a good experience. The quality of the papers was quite high, and the invited talks, tutorials and keynote speeches were very interesting. I will try to attend again next year.