Hi all,
I just wanted to give some information about the performance issues over the last few weeks and what was done to mitigate and hopefully contain them.
==========================
THE PROBLEM
At some point in May, the traffic to our website increased nearly tenfold within a few days. Until then, our daily number of visits from unique IP addresses (mostly people, plus many search and AI bots) was in the 200,000 range, with the total number of hits (pages accessed) in the 5-6 million range. By May 31, the number of visits had climbed to 1.6 million, but the hits remained fairly steady at 6-7 million. This means that the roughly 1.4 million additional visits were not actual users, but bots and other bad actors repeatedly hitting our server with access requests. And not from just one IP address either, but from thousands of new IP addresses (a "DDoS", Distributed Denial of Service).
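As a rough sanity check on those numbers, here is a minimal sketch of the hits-per-visit arithmetic. The midpoints of the quoted ranges (5.5 million and 6.5 million hits) are my own assumption for illustration, and the function name is made up; none of this is taken from the actual server statistics.

```python
def hits_per_visit(hits: float, visits: float) -> float:
    """Average number of pages requested per unique visiting IP address."""
    return hits / visits


# Assumed midpoints of the ranges quoted above (illustrative only).
before = hits_per_visit(5_500_000, 200_000)    # typical day before the spike
during = hits_per_visit(6_500_000, 1_600_000)  # May 31

print(f"before the spike: {before:.1f} hits per visit")
print(f"on May 31:        {during:.1f} hits per visit")
```

Under these assumptions the average drops from about 27 pages per visit to about 4, which is what you would expect if most of the new "visitors" are addresses that each send only a handful of requests rather than people browsing the site.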
This increased traffic proved to be too much for our single server, and it set off a cascade of individual problems. These issues recurred whenever load spikes were happening:
- Server logs filled up the root partition on the server, leaving no disk space for essential services (e.g. cron jobs).
- Flood protection thresholds at our hosting provider were triggered repeatedly, shutting down email and caching services and affecting core features of the site.
- At some points the high server load degraded website performance so dramatically that many users only got 503 (Service Unavailable) response codes when trying to access the site.
==========================
THE CAUSE
At this point the origin of the attacks and their motivation is still entirely unclear.
However, now that we know more about those attacks and what kind of havoc they caused, we have a better understanding of the attack vectors and how we can (hopefully) contain those threats.
==========================
THE SOLUTION
As announced in this space before, I have added a dedicated third-party firewall package to the site. In some ways it is probably a wonder that we were able to exist so long without one, considering how the internet has changed in the 24 years since the site was born.
The firewall includes monitoring of certain services, a number of additional measures to shut down certain attack vectors from bad actors, and most importantly, DDoS protection. It took some time to set everything up, but the last week has already shown a marked improvement in terms of server load, although there were still a number of issues to work around in finding the right balance between functionality and restrictions.
Most importantly, since July 1, around noon, all website traffic has been routed through the firewall with no exceptions. Our "visits" for yesterday are around 200,000, which is right at the number we had before all of this mess started. It's just one day, but it is already looking very promising.
==========================
FINAL THOUGHTS
Let me remind everyone reading this that CAGEMATCH is a hobby project, custom-built and coded by myself and maintained by a fantastic group of volunteers who dedicate some of their valuable free time towards adding or maintaining entries in our database. We are not a multi-million dollar corporation (and no, we are not getting paid by one either) that can just throw money and a bunch of developers at this problem (or any other problem). This has been a stressful couple of weeks, for sure, trying to figure out what is happening and how we can possibly fix this.
Right now I feel for the first time in weeks that we may have this ship back under control. Hopefully I am not tempting fate with this sentence, so I am going to knock on any wood I see for the rest of the day.
Thank you for your patience.
Philip