Hi all! In this blog post, I announce that I have decided to close the data mining forum, which was hosted on this website at http://forum2.philippe-fournier-viger.com/ . The forum was a small website connected to my other websites. It was used for discussing data mining topics and was powered by a version of phpBB. I will now explain what happened and why I decided to close the forum.
First, let’s go back in time to a few weeks ago, in early March 2025. I was trying to access my main website, and I noticed that my website periodically became unavailable with this error: Error 500: Internal Server Error.
I would try to connect to any page of my website, and this error would sometimes occur and sometimes not. I first thought that there was a problem with the server hosting my website. I pay a web hosting company to host my websites. So, I logged into the administration panel of my websites and had a look. I did not find anything suspicious. Then, after a few hours, my website started to work again, so I thought that it was a temporary problem and that it was solved.
But no! Today, on April 7th 2025, the website went down again and was barely accessible for the whole morning. Then, I started to investigate again. I decided to download all the access logs from the server to see if I could get some idea about what was going on.
Here is what I found. First, I looked at the summary of the HTTP requests to my websites by month:
As you can see above, the number of requests was around 3 million per month in 2024, but in March 2025 it suddenly increased roughly tenfold to around 32 million requests per month, which is extremely suspicious.
Then, I looked at the data for the first days of April, and I found that the number of requests even peaked at 6 million per day, which is a ridiculously high number for my small website.
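For readers who want to do a similar check on their own logs, here is a minimal Python sketch of counting requests per month and per day. It assumes that the access logs are in the usual Apache Common/Combined Log Format, which may not match every hosting setup. The function name count_requests and the sample lines are made up for illustration; they are not the tool or the data that I used.

```python
import re
from collections import Counter

# Timestamp field of a Common/Combined Log Format line,
# e.g. [07/Apr/2025:09:15:02 +0000] (assumed log format)
TIMESTAMP = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")

# Month abbreviations as written by Apache, mapped to month numbers
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def count_requests(lines):
    """Return (requests per month, requests per day) as two Counters."""
    per_month = Counter()
    per_day = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match is None or match.group(2) not in MONTHS:
            continue  # skip lines that do not look like log entries
        day = match.group(1)
        month = MONTHS[match.group(2)]
        year = match.group(3)
        per_month[f"{year}-{month:02d}"] += 1
        per_day[f"{year}-{month:02d}-{day}"] += 1
    return per_month, per_day


# Made-up sample lines (documentation IP ranges), for illustration only
SAMPLE_LOG = [
    '192.0.2.10 - - [03/Mar/2025:10:00:01 +0000] "GET /index.php HTTP/1.1" 200 512',
    '198.51.100.7 - - [07/Apr/2025:09:15:02 +0000] "GET /viewtopic.php?t=1 HTTP/1.1" 200 2048',
    '203.0.113.5 - - [07/Apr/2025:09:15:02 +0000] "GET /ucp.php?mode=login HTTP/1.1" 200 1024',
]

per_month, per_day = count_requests(SAMPLE_LOG)
for month, total in sorted(per_month.items()):
    print(month, total)
```

With a real log file, the lines could be read with open(...) and passed to the same function; a sudden jump between consecutive months or days is then easy to spot.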
Then, I looked into the detailed log and found that more than 90% of the requests were coming from Brazil and were made to access different pages of my forum. Here is a sample of some of those requests:

As can be seen in the screenshot above, dozens of requests were sent with the same timestamp from multiple IP addresses, mainly from Brazil.
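This kind of pattern can also be detected automatically. The following is a rough Python sketch that groups log lines by timestamp and reports the seconds in which many requests arrived, together with the number of distinct IP addresses involved. It again assumes the Common/Combined Log Format; the function name find_bursts, the threshold min_requests and the sample lines are hypothetical and only serve as an illustration.

```python
import re
from collections import defaultdict

# Client IP and timestamp at the start of a Common/Combined Log Format line
LOG_LINE = re.compile(r"^(\S+) \S+ \S+ \[([^\]]+)\]")


def find_bursts(lines, min_requests=3):
    """Map each timestamp with at least min_requests requests to a pair
    (number of requests, number of distinct IP addresses)."""
    ips_by_second = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            ip, timestamp = match.groups()
            ips_by_second[timestamp].append(ip)
    return {
        timestamp: (len(ips), len(set(ips)))
        for timestamp, ips in ips_by_second.items()
        if len(ips) >= min_requests
    }


# Made-up sample lines (documentation IP ranges), for illustration only
SAMPLE_BURST = [
    '192.0.2.10 - - [07/Apr/2025:09:15:02 +0000] "GET /ucp.php?mode=login HTTP/1.1" 200 1024',
    '198.51.100.7 - - [07/Apr/2025:09:15:02 +0000] "GET /viewtopic.php?t=1 HTTP/1.1" 200 2048',
    '203.0.113.5 - - [07/Apr/2025:09:15:02 +0000] "GET /viewtopic.php?t=2 HTTP/1.1" 200 2048',
    '192.0.2.10 - - [07/Apr/2025:09:20:40 +0000] "GET /index.php HTTP/1.1" 200 512',
]

for timestamp, (requests, distinct_ips) in find_bursts(SAMPLE_BURST).items():
    print(timestamp, requests, distinct_ips)
```

Many requests in the same second from many different IP addresses suggests coordinated bots rather than a single heavy user.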
I then did a reverse lookup of some of these IP addresses to find where they came from, and found that they belong to several internet service providers in Brazil.
It is not clear why this unusual traffic happened. But the most likely explanation is that some bots tried to spam my forum with advertisements and repeatedly tried to log in and post. In my forum, the bots were unable to post since I required manual approval for all new users. However, this did not discourage the bots from accessing my webpages millions of times, to the point of causing all my websites to go down.
Facing this situation, I had to decide whether to block all requests from Brazil or to improve the security of the website or of the forum itself. But since the requests were coming from many different IP addresses, blocking them is not simple. And I do not want to pay for an extra security service.
Thus, for this reason, and because few people had been using the forum in recent years, I have decided to simply close it. Given the low usage, I think that it is not a big issue. In the future, I might set up a more modern alternative to the forum, perhaps a Reddit group or a WhatsApp group. If you have suggestions, feel free to let me know in the comment section below.
And since I have closed the forum, the speed of this website and all my other websites has greatly increased!
So that’s the story about this! Hope this blog post has been interesting.
Update 1 (9th April – 1 day later) – traffic decrease: The number of HTTP requests has dropped sharply after closing the forum:
This confirms that the forum was a magnet for bots and spam, and it was a good decision to close it.
Update 2 (12th April) – bot traffic and CDN
I have done further analysis of the traffic to my websites, and it is also interesting to see that a large share of it comes from these bots:

Some of these bots do not bring any meaningful benefit to my websites. For instance, AhrefsBot and AwarioBot are primarily used for SEO monitoring and competitor analysis. Since I do not use these services, allowing their bots to crawl my website only consumes bandwidth. Similarly, TurnitinBot indexes content for proprietary systems. Hence, to prevent these bots from crawling my website, I’ve added the following rewrite rule to my .htaccess file:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (GPTBot|AwarioBot|TurnitinBot|AhrefsBot|SemrushBot|DotBot) [NC]
RewriteRule .* - [F,L]
This rule ensures that these bots receive a 403 Forbidden response and are effectively blocked from accessing any part of the website, provided that they identify themselves with one of these names in their User-Agent header. This should improve the website performance a little more.
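Before relying on such a rule, it can be useful to check how many logged requests it would actually match. Here is a small Python sketch that applies the same pattern, case-insensitively like the [NC] flag, to the User-Agent field of each log line. It assumes the Combined Log Format, where the User-Agent is the last quoted field. The function name blocked_requests is made up, and the sample user-agent strings are simplified illustrations, not the exact strings sent by these bots.

```python
import re
from collections import Counter

# Same alternation as in the RewriteCond above; IGNORECASE mirrors [NC]
BLOCKED = re.compile(
    r"(GPTBot|AwarioBot|TurnitinBot|AhrefsBot|SemrushBot|DotBot)",
    re.IGNORECASE,
)

# Last quoted field of a Combined Log Format line (assumed: the User-Agent)
USER_AGENT = re.compile(r'"([^"]*)"\s*$')


def blocked_requests(lines):
    """Count, per bot name (lowercased), the requests the rule would match."""
    hits = Counter()
    for line in lines:
        agent = USER_AGENT.search(line)
        if agent is None:
            continue  # no quoted field at the end of the line
        bot = BLOCKED.search(agent.group(1))
        if bot:
            hits[bot.group(1).lower()] += 1
    return hits


# Made-up sample lines with simplified user agents, for illustration only
SAMPLE_AGENTS = [
    '192.0.2.10 - - [12/Apr/2025:08:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; AhrefsBot/7.0)"',
    '192.0.2.11 - - [12/Apr/2025:08:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "mozilla/5.0 (compatible; ahrefsbot/7.0)"',
    '198.51.100.7 - - [12/Apr/2025:08:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [12/Apr/2025:08:00:03 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

for bot, total in blocked_requests(SAMPLE_AGENTS).most_common():
    print(bot, total)
```

Note that this only approximates what the server does: a bot that sends a different User-Agent string is matched neither by this script nor by the rewrite rule.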
Besides, today I also reactivated the CDN (Content Delivery Network) with Cloudflare for this website to boost the speed.
That’s all for today! Thanks for reading.