Intermittent mysql outages and lots of odd traffic
Recently I had been experiencing a number of spikes in swap I/O, which kept causing my MySQL instance to shut down. Restarting mysql fixed the problem temporarily (for 48h+ at a time), and I eventually increased the size of my swap disk, which seems to have addressed that issue.
I am now, however, experiencing very brief outages of MySQL. From reading around, it sounds like I may be receiving spam traffic that is causing the crashes. As a result I installed tcptrack and recorded (literally) the following traffic:
Excuse my ignorance, but this looks a lot like I'm being spammed heavily. Lots of opening and closing connections…
My Apache config has MaxClients set to 16, with KeepAlive disabled.
Does anyone have any advice on what I should be doing? And does this look like the root cause of these MySQL crashes?
3 Replies
There are two common types of attacks: one hits xmlrpc/comments (trying to post something) and the other hits wp-login (trying to log in to the dashboard).
In both cases, the hits are translated to SQL queries that overload the server.
The solution is to use a plugin that cuts down automated bot access (captcha, JavaScript hints, etc.); that way, only about 1% of those brute-force attempts will actually become SQL queries.
That has been my experience; maybe it's something similar or entirely different in your case, but it's worth investigating the Apache logs to see what they are actually hitting.
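For example, something along these lines (assuming the default Debian/Ubuntu log location at /var/log/apache2/access.log; adjust the path for your setup) will show how much of the traffic is aimed at those two endpoints, and from where:

    # count requests hitting the usual WordPress attack targets
    grep -cE "wp-login\.php|xmlrpc\.php" /var/log/apache2/access.log

    # break it down by client IP to spot the worst offenders
    awk '/wp-login\.php|xmlrpc\.php/ {print $1}' /var/log/apache2/access.log | sort | uniq -c | sort -rn | head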
PS:
One mitigation trick is to limit how hard your SQL database can be DoS'd. The default of 500 connections is way too high for a Linode server; you can take that down to 10-50 and your sites will still work fully (plus, MariaDB uses less memory when connections are capped like that).
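For example, in a my.cnf override (the exact file location varies by distro, and the numbers below are only illustrative):

    # e.g. /etc/mysql/mariadb.conf.d/50-server.cnf or /etc/my.cnf
    [mysqld]
    max_connections = 20   # 10-50 is plenty for a small Linode
    wait_timeout    = 60   # drop idle connections sooner so they don't pile up

Then restart mysql/mariadb for the change to take effect.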
I did, however, make changes to the connection limits, and that seems to have helped. I believe the Linode guide suggests 32, but all three Linodes I've set up have experienced the same SQL crash after small spikes of activity on the server. I have found that <20 resolves the issue, along with making sure KeepAlive is off.
My suggestion is to enable it, but with a very low timeout, like 2 to 5 seconds. That way a client browser will use keep-alive to receive content faster, but will also close the socket quickly enough not to tie up server resources.
Take a look at this sample configuration:
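The directive names below are the standard Apache prefork ones; the values are only illustrative starting points for a small server, not a prescription:

    # Keep-alive on, but with a short timeout so idle sockets close quickly
    KeepAlive On
    KeepAliveTimeout 3
    MaxKeepAliveRequests 100

    # Prefork MPM limits sized for a small Linode
    <IfModule mpm_prefork_module>
        StartServers          2
        MinSpareServers       2
        MaxSpareServers       5
        MaxClients           16
        MaxRequestsPerChild 500
    </IfModule>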