Hi everyone,
In this blog post, I will discuss how I applied an open-source tool named Code Analyzer ( http://sourceforge.net/projects/codeanalyze-gpl/ ) to analyze the source code of SPMF, my open-source data mining software.
I first applied the tool to the previous version of SPMF (0.92c). It printed the following result:
Metric                           Value
-------------------------------  --------
Total Files                      360
Total Lines                      50457
Avg Line Length                  30
Code Lines                       31901
Comment Lines                    13297
Whitespace Lines                 6583
Code/(Comment+Whitespace) Ratio  1,60
Code/Comment Ratio               2,40
Code/Whitespace Ratio            4,85
Code/Total Lines Ratio           0,63
Code Lines Per File              88
Comment Lines Per File           36
Whitespace Lines Per File        18
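The ratio and per-file rows in this report can be recomputed from the raw counts. Here is a minimal sketch in Python. The function name `derived_metrics` is my own, and the rounding rules (ratios rounded to two decimals, per-file averages rounded down) are assumptions inferred from the numbers in the report, not documented behavior of Code Analyzer.

```python
def derived_metrics(files, total_lines, code, comment, whitespace):
    """Recompute the derived rows of the report from its raw counts.

    Hypothetical helper: the rounding rules are inferred from the report,
    not taken from the Code Analyzer documentation.
    """
    return {
        "Code/(Comment+Whitespace) Ratio": round(code / (comment + whitespace), 2),
        "Code/Comment Ratio": round(code / comment, 2),
        "Code/Whitespace Ratio": round(code / whitespace, 2),
        "Code/Total Lines Ratio": round(code / total_lines, 2),
        # Assumption: the per-file rows use integer division (rounded down).
        "Code Lines Per File": code // files,
        "Comment Lines Per File": comment // files,
        "Whitespace Lines Per File": whitespace // files,
    }


# Raw counts copied from the report for SPMF 0.92c.
v092c = derived_metrics(files=360, total_lines=50457,
                        code=31901, comment=13297, whitespace=6583)
for name, value in v092c.items():
    print(f"{name:<32} {value}")
```

Python prints the ratios with a decimal point, while the tool's report uses a decimal comma.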
What is interesting is how the result changes when I apply the same tool to the latest version of SPMF (0.93):
Metric                           Value
-------------------------------  --------
Total Files                      280
Total Lines                      53165
Avg Line Length                  32
Code Lines                       25455
Comment Lines                    23208
Whitespace Lines                 5803
Code/(Comment+Whitespace) Ratio  0,88
Code/Comment Ratio               1,10
Code/Whitespace Ratio            4,39
Code/Total Lines Ratio           0,48
Code Lines Per File              90
Comment Lines Per File           82
Whitespace Lines Per File        20
As these statistics show, I did a lot of refactoring for the latest version. There are now 280 files instead of 360. Moreover, I have shrunk the code from 31901 lines to 25455 lines, without removing any functionality!
I have also added a lot of comments to SPMF. The “Code/Comment” ratio has thus dropped from 2.40 to 1.10, and the “Comment Lines Per File” value went up from 36 to 82. In total, there are now around 10,000 more lines of comments than in the previous version (the number of comment lines has increased from 13297 to 23208).
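To make the comparison concrete, here is a small Python sketch that computes the change between the two versions from the raw counts in the two reports. The helper name `change` and the dictionary layout are my own choices for illustration.

```python
# Raw counts copied from the two Code Analyzer reports.
v092c = {"files": 360, "code": 31901, "comment": 13297}
v093 = {"files": 280, "code": 25455, "comment": 23208}


def change(before, after):
    """Return the absolute change and the change relative to `before`, in percent."""
    delta = after - before
    return delta, round(100 * delta / before, 1)


for key in v092c:
    delta, percent = change(v092c[key], v093[key])
    print(f"{key:<8} {v092c[key]:>6} -> {v093[key]:>6}  ({delta:+d}, {percent:+.1f}%)")
```

By this calculation, the code shrank by 6446 lines (about 20%), while the comments grew by 9911 lines, which is the "around 10,000" mentioned above.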
That’s all I wanted to write for today! If you like this blog, you can subscribe to the RSS feed or follow my Twitter account (https://twitter.com/philfv) to get notified about future blog posts. Also, if you want to support this blog, please tweet and share it!