Brief report about the BIBM 2022 conference

This week, I have attended the BIBM 2022 (IEEE International Conference on Bioinformatics and Biomedicine) conference online. I will give a brief report about this conference, although I did not have time to attend many sessions.

BIBM 2022 was held from the 6th to 8th December 2022 in Las Vegas, USA and from Changsha, China. However, due to the COVID situation, attendance in Changsha has been changed to online attendance.

BIBM is an international conference for bioinformatics. Interestingly, it has been held for over 15 years while being most of the times organized either in USA or China.

A bilingual conference

BIBM is actually held partly in English and partly in Chinese. Some sessions from USA were completely in English while some session from China are completely in Chinese. But authors from USA and China could in theory watch the talks from the other location. Talks from USA could be watched using Zoom, while talks from Changsha could be watched using Tencent Meetings.

Proceedings of BIBM

The proceedings of the BIBM conference are published by IEEE. These proceedings contain regular papers, workshop papers as well as poster papers. It will be interesting for some authors that all these papers are published in the same proceedings.

The proceedings were quite large with numerous papers. It was a PDF file of over 1 GB and over 3900 pages! Luckily, the Edge browser of Windows on my computer can open such large file without problem. B

This year, there was also a good range of workshops to choose from, with over 25 workshops hosted at BIBM. So even if a paper was not accepted as regular paper, there was a good choice of workshop to choose from for publishing a paper.

Day 1 – Opening ceremony and first keynote in Las Vegas

On the first day, I stayed up late (until 1:30AM) to watch the conference opening in the USA, which was in the US. Unfortunately, there was some technical problem at the conference site in Las Vegas and we could not hear anything during the first maybe 45 minutes. The sound came back during the first keynote, which was about bioinformatics. Due to the technical problem I did not get much information from the opening. I will try to get the slides from the opening and update this post later with the information.

Day 1 – Opening ceremony and first keynote in Changsha

Then, after waking up, I attended the second opening ceremony for people attending from Changsha China that gave also some general information about the conference. The opening and the following keynote talks from that session were given in Chinese.

The opening was explained that some part of the registration fee will be refunded to attendees as the conference is online, which is good news. The details will be explained after the conference.

And also, from what I understood, a best paper award was announced.

Below, I show some pictures from the opening in Changsha. There was a lot of attendees (over 250 at some point!).

And here are a few slides from the first keynote on “Brain imaging pattern recognition methods and imaging representation of mental disorders” by Huafu Chen, which had some interesting content.

A second keynote was by Min Li about “Computational solutions to explore genomic 3D organization. Here are a few slides:

An interesting paper on sequential patterns for insurance claims

I also noticed a very interesting paper on utilizing sequential patterns with the CM-SPAM algorithm and periodic patterns with the LPP-Growth algorithm to analyze courses of medical treatment to obtain insights about anomalous insurance claims.

Kemp, J., Barker, C., Good, N., Bain, M. (2022) Sequential pattern detection for identifying courses of treatment and anomalous claim behaviour in medical insurance. Proc. of BIBM 2022.

That is a cool topic and a real application. Using the periodic patterns and sequential patterns, previously unknown anomalous claim patterns were found, which confirmed previously suspected anomalous claim pattern. Authors said that up to $486,617.60 in potentially recoverable costs were identified and a benefit of using a pattern mining approach is interpretability.

Analyzing COVID protein structures

My research team also published a paper about the analysis and classification of protein structures from COVID-19. You may read the paper below:

Nawaz, S., Fournier-Viger, P, He, Y. (2022). S-PDB: Analysis and Classification of SARS-COV2 Spike Protein Structures. HPC4COVID-19 workshop at IEEE BIBM 2022, to appear.

Registration fee

Although the program of the BIBM conference is good, the registration fee of BIBM is I would say very expensive at 985 USD to publish a single paper. I think this is the main drawback that I see from this conference, which would make me think twice about publishing there again (since some conferences are much cheaper). But with the refund due to the conference being online (which is expected to be around 300 USD), the price is now more reasonable.

Next year: BIBM 2023

It was announced that BIBM 2023 will be held in Istanbul, Turkey. That is a nice location.

Conclusion

This was a short blog post to talk about the BIBM 2022 conference, which I have attended. Hope it is interesting.

Posted in Conference | Tagged , , , , , | Leave a comment

SPMF Upcoming feature: Graph viewer

Today, I will give you a preview of another upcoming feature of SPMF, which will be released in the next version of SPMF (2.59). It is the Graph Viewer tool.

The Graph Viewer is a simple tool for visualizing graphs. The Graph Viewer is designed to display graphs that can be directed or undirected, and have labels. The Graph Viewer can also automatically choose an appropriate layout for visualizing a graph.

Why a Graph Viewer in SPMF? It will be used to allow users to visualize input files containing graphs and output files containing frequent subgraphs. This is useful to visualize the inpu files of frequent subgraph mining algorithms such as gSpan, cgSpan and TKG, as well as the patterns that are discovered by these algorithms (frequent subgraphs).

I have completely implemented the Graph Viewer in Java, without using external libraries so as to avoid dependencies and to make it as lightweight and fast as possible, a long-time design goal of SPMF. In fact, unlike many other data mining libraries and open-source projects, SPMF do not have any external dependencies and the code is well optimized. This ensure the stability of the project and avoid problems that could arise from relying on external libraries.

Let me now show you the current features of the Graph Viewer, which may still be updated or improved in the final release.

Opening a graph file

The first feature is to open an input file containing one or more graphs. This is done by selecting the Graph Viewer tool:

Then, let’s say that we open the example file contextTKG.txt offered in SPMF, which contains three graphs. The Graph viewer will display graphs in a window like this:

Here we see the third graph from the file. At the bottom, there are two buttons < > for navigating to the previous or the next graph. In the above picture, the third graph is shown (Graph 3 of 3). This graph has ID 3, and contains 4 nodes and 4 edges, as indicated in the bottom part of the Window. The nodes are displayed with a text of the form x:y where x is the node ID and y is the node label. Edges are displayed in blue color with their labels.

To display the graph in a pleasant way, I have implemented a forced directed graph layout algorithm, which is the Fruchterman/Reingold (1991) algorithm. It automatically places the nodes in an appropriate location so that the graph can be displayed in a beautiful way.

Opening a pattern file

We can also use the graph viewer tool to display the frequent subgraphs found by an algorithm such as TKG. For example, here I apply the TKG algorithm and select the “Graph Viewer” tool to open the result file.

The result is 16 frequent subgraphs, which are displayed by the Graph Viewer as follows:

In the above picture, we see the frequent subgraph 9. We can use the buttons <> to move to the previous or next frequent subgraph, and thus view all of the 18 subgraphs that have been found. The support of each subgraph is displayed.

Moving the graph nodes with the mouse

Another feature of the Graph Viewer is that we can move the nodes with the mouse by dragging them over the panel:

Running the graph viewer from the command line

It will be also be possible to call the Graph Viewer from the command line, just like almost all algorithms and tools from SPMF. For example, if we put the spmf.jar file in the same folder as the file contextTKG.txt, we can apply this command:

java -jar spmf.jar run Open_graph_database_file_with_graph_viewer contextTKG.txt

Then, this will start the Graph Viewer to display the file:

Displaying other types of graphs

The Graph Viewer is designed in a quite general way so that it could also display other types of graphs and be used for other functions in SPMF in the future. For example, below, I show an example graph that is created programmatically rather than by reading a file.

I use this example to show the display of directed and undirected edges. Also, we can also see that the automatic layout algorithm works quite well and display the graph in a proper way. Here is another example:

In the Java code, we can also change how the nodes are displayed. I did not offer this option in the user interface as I think it is less important though. What do you think?

Update: Choosing different types of graph layout

I had one hour of free time this morning, so I decided to add an option to choose different types of graph layout algorithm. For example, here we see three types of layout:

1) Using the Fruchterman/Reingold (1991) algorithm:

2) Using a random layout:

3) Using a grid layout:

4) Using a circle layout:

I might add more graph layout algorithms later. I think that these algorithms are quite interesting.

Conclusion

Hope that this blog post has been interesting. My goal was to show you some upcoming feature, which I think will be useful for those working on frequent subgraph mining. If you have some suggestions to improve this tool, you may let me know in the comments below. I will consider them. Also, I might still improve this tool before it is released.


Philippe Fournier-Viger is a professor of Computer Science and also the founder of the open-source data mining software SPMF, offering more than 120 data mining algorithms.

Posted in Data Mining, Data science, open-source | Tagged , , , , , , | Leave a comment

SPMF upcoming feature: Algorithm Explorer

Today, I want to show you an upcoming feature of SPMF, which is a new tool called the Algorithm Explorer to explore the list of algorithms and tools offered in SPMF. It is actually a simple tool, but I think it can be useful, as there are many algorithms.

Note that this is a preview of the tool, as it will released in the next version of SPMF (2.59).

To open the new Algorithm Explorer, in the GUI of SPMF, we have to choose:

Then, this will open the new tool called Algorithm Explorer, where algorithms are shown as a tree on the left, classified by categories, and we can see information about the selected algorithm on the right:

In the above picture, we selected the AFEM-Rules algorithm. Thus, we can see the name of the algorithm, the category, the authors of the implementation, the link to the example page, the input and output types as well as the parameters.

Searching for similar algorithms

Update: I also added two buttons that allows to search for algorithms that are similar to a selected algorithm. More precisely, we can search for (1) algorithms with the same input, output and mandatory parameters, and (2) algorithms with the same input and output but that may not have the same parameters.

For example, if we select the RuleGrowth algorithm for sequential rule mining on the left and click on this button:

It will highlight all algorithms that have the same input and output types as RuleGrowth:

And if we instead click on this button:

It will highlight the algorithms that not only have the same input and output types as RuleGrowth but also the same mandatory parameters:

In this case, we can notice that TRuleGrowth is not included anymore because although it has the same input type and output type as RuleGrowth, it has an extra parameter that is the window length.

Let me show you one more example. Let’s say that I choose a high utility itemset mining algorithm like EFIM. Then, I can quickly find that many algorithms have the same input and output types and also the same mandatory parameters:

Conclusion

That is the current version of this tool. I will think about other potential improvements. If you have any suggestions, you may tell me in the comments below, either to add more functions or improve the user interface. I will try to take them into account, when I have time.


Philippe Fournier-Viger is a full professor working in China and founder of the SPMF open source data mining software.


Posted in open-source | Tagged , , , | Leave a comment

Brief report about IEEE ICDM 2022

In this blog post, I will give a brief report about the IEEE ICDM 2022 (International Conference on Data mining), which I have attended. It started on November 28th 2022. I have attended ICDM as a workshop organizer this year and as a co-author.

What is ICDM?

ICDM is a good conference for data mining and machine learning research. I have attended it for several years. You can see for example my reports about ICDM 2021 and ICDM 2020 on this blog. The proceedings are published by IEEE, and the conference was held offline with some online participants (for those who could not attend in person due to the pandemic).

Proceedings and acceptance rate

The proceedings were made available online to authors on the first day of the conference.

For the main proceedings, at ICDM 2022, 885 submissions were received from 54 countries. All papers were reviewed using a triple-blind process. From this, 85 regular papers and 89 short papers were accepted. Thus, the full paper acceptance rate is 9.77% and the overall acceptance rate is 20%.

For the workshop proceedings at ICDM 2022, there were 326 submissions in 15 workshops. And 135 papers (41%) were accepted.

Day 1: Workshops, but some problems

The first day was dedicated to workshops. There were many workshops on a variety of emerging topics. In particular, I co-organized the UDML workshop on utility-driven mining and learning, which is the 5th edition this year, and has been quite successful this year with about 20 submissions, 7 accepted papers, and over 20 participants.

However, on Day 1, I faced problems with the online platform that was adopted for the conference. It is Zoom Events, which can be viewed as a special version of Zoom designed to host events such as conferences with more functionalities than the basic Zoom platform. Zoom events provide a message board and a schedule of all events as well as the possibility to chat with other participants. The problems that I and other participants encountered are as follow.

First, our workshop was supposed to start at 10:00 AM but there was no “start” button to allow us to start our workshop. Thus, hours before, I sent several messages to the organizers on the platform and by e-mail, but we got no answer before the start time of our workshop. There were also a few other workshop organizers that had the same problem and were talking about how to solve the problem in the chat. As we did not receive any answer, for our workshop we decided to create a public Zoom link and send it to authors by e-mail to start the workshop on time. Then, at around 10:40, I received an e-mail telling me that I could start our workshop but we had started it already using our Zoom link, 40 minutes earlier…

Second, multiple authors from our workshop did not receive instructions about how to login to the conference before our workshop started. From 7 papers in our workshop, this affected 5 papers. So before the workshop, I receive multiple messages from authors about how to login. In one case, there was also an author who could not login because he used a different e-mail for his zoom account than for registering to the conference. And all these problems were not solved before our workshop start time. So this is another reason why we had to use a public Zoom link for our workshop.

From these problems, I think there are two lessons to learn: (1) It is better to do a rehearsal in advance with workshop organizers if a new system is used (like it was done for ICDM in Singapore in ICDM 2020, for example) to avoid issues, and (2) it would have been much simpler to just use Zoom public links instead of using Zoom events, because this latter creates problems for logging due to the restriction on e-mails who can log in.

Day 2 – Opening ceremony and …

On the second day, I wanted to attend the opening ceremony and check the keynote talks as this is usually very interesting. However, I did not receive any Zoom Events link for accessing the second day of the conference in my e-mail. So it seems that only the first day of the conference is online. That is unfortunate but I can understand as it is supposed to be an in person conference. So that will be the end of this post about ICDM 2022.

Conclusion

This was a blog post about the ICDM 2022 conference. Hope it has been interesting. Will be looking forward to next year’s ICDM 2023 conference.


Philippe Fournier-Viger is a full professor working in China and founder of the SPMF open source data mining software.

Posted in Big data, Conference, Data Mining | Tagged , , , , , , | 4 Comments

How to package SPMF as an EXE file with JPackage?

In this blog post, I will explain how to use the jpackage tool that is provided with Java to (1) package the JAR file of SPMF into an EXE file for Windows, and (2) to create an installer for SPMF.

Requirements

It is necessary to have:

  • A computer with Windows (as I will give instructions for this platform, but you may do something similar on other operating system)
  • A recent version of the Java JDK (at least version 14) so that you have the jpackage command available.
  • The JAR file of SPMF. You can download it here: spmf.jar or from the website of SPMF.
  • The ICO file of SPMF (if you want to make an application that has an icon): SPMF.ico

How to create an EXE application of SPMF for Windows

Step 1. On a windows computer, create two folders /input/ and /output/ on the desktop.

Step 2. Put the files spmf.jar and SPMF.ico in the folder /input/ that you have just created.

Step 3. Open the command line of Windows and execute this command:

jpackage --input C:\Users\philippe\Desktop\input\ --dest C:\Users\philippe\Desktop\output\ -n “SPMF” --main-jar spmf.jar --main-class ca.pfv.spmf.gui.Main --type app-image --icon C:\Users\philippe\Desktop\input\SPMF.ico

where C:\Users\philippe\Desktop\ should be replaced by the path to your desktop on your computer.

After executing this command, an EXE file will have been created in the output directory \output\

Step 4. To launch the software, you can now click on SPMF.exe.

How to create an installer for SPMF for Windows?

You can follow the same steps as above but use this command:

jpackage --input C:\Users\philippe\Desktop\input\ --dest C:\Users\philippe\Desktop\output5\ -n “SPMF” --main-jar spmf.jar --main-class ca.pfv.spmf.gui.Main --type exe --icon C:\Users\philippe\Desktop\input\SPMF.ico --win-dir-chooser --win-menu --win-shortcut-prompt

Then, this will create an installer, which looks like this:

This is the installation process on Windows 11:

And those are the files after installation:

Customization and generating installers for other platforms

There are also many other options offered by jpackage, including generating packages for other platforms. For more information see the documentation of the jpackage command.

Conclusion

This was just a short blog post to show how to package SPMF.jar into a native application. I think the process of using jpackage is quite simple. In the past, I had used some other commercial tools to create EXE files for Java programs but the process was more complicated. I am thus happy to have found jpackage.


Philippe Fournier-Viger is a distinguished professor of computer science

Posted in Java, open-source | Tagged , , , | Leave a comment

New version of SPMF (2.58)

This is to announce that a new version of SPMF has been released on the 27th November 2022. This version has 7 new pattern mining algorithms:

  • the HUCI-Miner algorithm to mine closed high utility itemsets and generators at the same time (thanks to Jayakrushna Sahoo et al. for the original code )
  • the FHIM algorithm to mine all high utility itemsets (thanks to Jayakrushna Sahoo et al. for the original code)
  • the HGB algorithm to mine non redundant high utility association rules (thanks to Jayakrushna Sahoo et al. for the original code)
  • the HGB-all algorithm to derive all high utility association rules from the non redundant high utility association rules (thanks to Jayakrushna Sahoo et al. for the original code)
  • algorithms for mining sequential patterns with flexible constraints in a time-extended sequence database (eg. MOOC data)
    • the SPM-FC-L algorithm fi (Thanks to Wei Song et al. for the original code)
      • the SPM-FC-P algorithm (Thanks to Wei Song et al. for the original code)

Besides, it has several new features such as:

(1) An integrated text editor to open output file (to give an alternative to the system’s default text editor)

(2) Some improvements to the graphical user interface, such as shown below, such as colors to highlight algorithm categories and a window icon:

And some bugs have been fixed.

Besides a new MOOC.txt dataset of sequences of courses with timestamps has been added to the dataset page of SPMF.

Thanks again to all users and contributors to SPMF!


Philippe Fournier-Viger is a distinguished professor of computer science

Posted in Data Mining, Data science, open-source | Tagged , , , , , , | Leave a comment

Brief report about the MEDI 2022 conference

In this blog post, I will talk about the 11th International Conference on Model & Data Engineering (MEDI 2022), which I have attended this week. It was held from the 21st to 23rd November 2022, in Cairo, Egypt, and also online.

MEDI 2022 conference

MEDI is a good conference for modelling, database, and related topics. MEDI is ranked C in the CORE 2021 ranking. It has been held in various countries over the years such as: Egypt, Estonia, France, Morocco, Spain, Cyprus, Italy and Portugal.

The local organization of the 2022 edition of MEDI is by the Nile University.

Nile University

This year, I am glad to play the role of Program Committee (PC) Chair at this conference. Below, I give an overview of the event.

Opening ceremony

The opening ceremony was at 10:00 AM. The conference was introduced, and an overview of the program was given. Below, I present some slides from the opening ceremony with some more details.

Countries were MEDI was held:

MEDI conference by countries

Organizations. Here is an overview of the committees behind MEDI 2022:

MEDI Conference committees

Paper selection. About the program, this year 65 papers have been submitted to the main conference, and from that 18 were accepted for long presentation (28%) and 11 for short presentation (17%). The program committee consisted of 60 researchers from 23 countries, which provided 200 reviews, for an average of 3.5 reviews per paper.

MEDI 2022 statistics

Proceedings. Full presentation papers are published in a Springer LNCS volume, while short presentation papers are published together with workshop papers in a Springer CCIS volume.

MEDI 2022 conference proceedings

Totally, there was 190 authors and submissions were made from 18 countries on five main topics: (1) Modelling, (2) Image processing and diagnosis, (3) Machine learning and optimization, (4) Natural language processing and (5) Database systems.

Workshop. A workshop called DETECT was co-organized with MEDI 2022, and 4 papers were accepted in that workshop (44%).

Special issues. Two special issues are organized for best papers of MEDI 2022

MEDI 2022 special issues

Keynote talks

There was three keynote talks.

The first keynote talk was by Prof. Vincent S. Tseng from National Yang Ming Chiao Tung University about “Broad and Deep Learning of Big Heterogeneous Health Data for Medical AI: Opportunities and Challenges“.

Vincent S. Tseng keynote talk

The second talk was “A service-based approach to drone service delivery in Skyway networks” by Prof. Athman Bouguettaya from The University of Sydney, Australia

Athman Bouguettaya keynote talk

The third talk was “Safety and security are key considerations in the design of critical systems” by Dr. Colin Snook from University of Southampton, United Kingdom.

Best paper awards of MEDI 2022

The best paper awards were announced during the closing ceremony. These papers were selected based on the peer reviews and also based on the presentations given at the MEDI conference.

Conclusion

This is all for this blog post! Hope this has been interesting. Looking forward to MEDI 2023!
In about 1 week, I will talk to you about ICDM 2022.


Philippe Fournier-Viger is a distinguished professor of computer science

Posted in Conference, Database | Tagged , , , , , | Leave a comment

Brieft report about the MIWAI 2022 conference

In this blog post, I will talk about the 15th International Conference on Multi-disciplinary Trends in Artificial Intelligence (MIWAI 2022), which was held as a virtual event on November 18th, 2022. I have attended this conference to present two papers related to pattern mining.

MIWAI is a conference that has been held every year in the pacific/Asia region, for 15 years. In the past, MIWAI has been held in countries such as Vietnam, India, Thailand, China, Malaysia and Brunei. I have also attended MIWAI in Malaysia in 2019 (see my blog post about MIWAI 2019 here).

Proceedings of MIWAI 2022

This year, the MIWAI conference received 42 papers from which 19 were accepted, which includes 14 full papers (acceptance rate of 33% for full papers) and 5 short papers (acceptance rate of 45% for full + short papers).

The proceedings of the conference are published by Springer in a book from the LNAI series. Hence, the papers are well-indexed in various publication databases such as DBLP, which is good.

Schedule of the conference

The conference was organized during a single day (November 18) and online using the Zoom platform. It started with an opening session, followed by a keynote talk by Prof. Rapeepan Pitakaso from Thailand about “Artificial Multiple Intelligence System (AMIS) and its Applications”. Then, there was paper presentations organized as parallel sessions during the rest of the day.

Opening

During the opening, the organizers talked about the program, the review process, the organization, etc. It was nice to see several people that I knew already from MIWAI 2019. The organizers are very friendly and professional.

It was announced that next year, MIWAI 2023 will be in Hyderabad, India. The submission deadline is planned for 10th January 2023.

Then, MIWAI 2024 will be in Chongqing, China.

Awards were also announced. I am please that the best paper award was given to “LCIM: Mining Low Cost High Utility Itemsets“, which is my paper. There was also another paper who received an award.

Keynote talk

There was a keynote talk “AMIS – Artificial Multiple Intelligence System: Theory and Application” by Prof. Rapeepan Pitakaso from Ubon Ratchathani Unviersity, Thailand. The talk was about a system called Artificial Multiple Intelligent System (AMIS), that aims to combine multiple types of intelligence, just like humans do (verbal, linguistic intelligences, etc.). The system is based on heuristic algorithms and also combines CNN (Convolutional Neural Networks) and other techniques. Here are some pictures and slides from this presentation.

Papers on pattern mining

As readers of this blog knows, I am interested in the field of pattern mining. At the conference, there was two papers about pattern mining (my papers):

5121Philippe Fournier-Viger, M. Saqib Nawaz, Yulin He, Youxi Wu, Farid Nouioua and Unil YunMaxFEM: Mining Maximal Frequent Episodes in Complex Event Sequences [paper]
[source code] [ppt]
6226M. Saqib Nawaz, Philippe Fournier-Viger, Naji Alhusaini, Yulin He, Youxi Wu and Debdatta BhattacharyaLCIM: Mining Low Cost High Utility Itemsets [paper]
 [ppt][source code and data]

These papers are about episode mining and high utility pattern mining.

Affordable registration fee

A good thing about this conference is that the registration fee is cheap. For one paper, it costs 250$ USD and for two papers, I spent only 350$ USD. This is much cheaper than many other conferences published by Springer and also IEEE. For example, an alternative that I considered would have been to publish in some European conferences published by Springer such as DAWAK or DEXA but registering two papers for those conferences would have cost me over 1240 euros instead of 350$ USD! This is a big difference. There are also many IEEE top conferences that are very expensive and have an increasing price in recent years. For example, this year, IEEE ICDM has a registration price of 1300 USD, while the price 10 years ago was about only 500$ USD. Thus, I appreciate that MIWAI offers registration at a very reasonable price. I think it can allow researchers from all around the world to more easily publish their papers, especially when research funding is limited.

Conclusion

MIWAI 2022 was a good conference. It is not a very large conference but there are some interesting papers of good quality, and the conference is well-organized. I will try to attend again for MIWAI 2023.
Later this month, I will talk to you about the MEDI 2022 and ICDM 2022 conference.


Philippe Fournier-Viger is a distinguished professor of computer science

Posted in artificial intelligence, Conference, Data Mining | Tagged , , , | Leave a comment

How to change the language of Tableau Desktop from the registry?

This is a short post about how to change the language of Tableau if the option is not available in the Help Menu.

  1. Close Tableau
  2. On Windows, open Regedit
  3. Search for “Language code” to find the keys containing the languages options of Tableau.
  4. Change the three following keys: “Language code”, “Repository language” and “SamplesLanguage” to your preferred language. For example “en_US” is for English and “zh_CN” is for Chinese.

5. Start Tableau again and the language will have been changed.

Posted in Data science | Tagged , , | Leave a comment

An integrated text editor in SPMF

Today, I will talk about some upcoming feature for the next release of SPMF (2.58), which is a simple integrated text editor. This new release should happen in about 1 or 2 weeks (as I am very busy recently) and will contain some new algorithms. But also, it will contain an integrated text editor. This may seem very strange? Why a text editor? I will explain briefly the idea in this blog post to give an overview of this feature as I am testing it now. If you have comments, you may leave them in the comment section below

Why a text editor in SPMF?

Well, in previous version of SPMF, there was a few options to open the output files produced by the algorithms: using the system’s default text editor (e.g. Notepad on Windows) or using the Pattern Viewer tool of SPMF. Although using the system’s text editor can be good, I was thinking that having a customized text editor could be interesting and it is actually not difficult to implement. So, I will explain how it works below.

The SPMF text editor

In the next release of SPMF, there will be a new option to open the output file using the SPMF text editor instead of the default system text editor.

The SPMF text editor looks like this:

It has some useful features such as showing the line count and the line and column numbers in the status bar at the bottom. This bar can also be hidden.

Also a useful feature is to always highlight the current line (which is not done by NotePad, for example):

Besides, it is possible to activate a “night mode” from the menu:

And there is also a search bar that is very convenient for highlighting some keywords in a file, and works like the search bar in some Web browsers:

Also, there is of course the possibility of changing the font and the font size.

Besides, all the user-defined preferences are saved. So next time that you open SPMF, the same font, font size, night mode preference, window size and location, and other preferences of the SPMF text editor are kept.

The SPMF text editor can also display the size of the last file that is opened:

And here is an overview of the menus:

It has the basic important functions such as “Line wrap” and “Word wrap”.

Conclusion

Interesting? You will be able to try it in the next version of SPMF to be released soon.

For now, the features are quite simple because the aim is to provide an alternative to the system text editor specialized for opening the output files of SPMF. It is not designed to compete with a more complex text editor and to be used as a general-purpose text editor (although it could).

A limitation of the SPMF text editor is that text files are loaded in memory so it is restricted to opening files that are not too big.

Many other features could be added like highlighting keywords in the output file. So there is something to think about. Which features would be useful? You may let me know what you think. If some features are not too hard to implement and useful, I may add them. I will also do some more debugging before it can be released.

If you have any suggestions or ideas, you can let me know in the comment section or by e-mail at philfv AT qq.com

Posted in Data Mining, open-source | Tagged , , , | Leave a comment