Preventing (accidental) short-term DoS from tabs
I'm currently running a small but PHP-heavy site with relatively few users. My static files are served by nginx and my dynamic pages by PHP (FastCGI). However, I'm only running 4 PHP-CGI processes, as each tends to take about 25MB, and that's all the memory I can spare [MySQL gets another 120 or so, and the rest is for misc. applications: squid, openvpn, etc.].
My concern is that one person can easily DoS the site (by accident) just by opening a few tabs at once with long-ish PHP actions. Then, when things don't load immediately, they'll hit reload [or open some more tabs while the first ones load], and everything backs up in iowait; system load goes up to 5, 10, 15, etc., while all of the processes wait for their turn to run (MySQL gets backed up, as does journald, etc.).
Any suggestions for preventing this? Should I run fewer PHP processes and let nginx either return a "proxy not available" error or hold the requests in a queue? Is there any way I can assign priorities, or resource limits, or something else?
Thanks
EDIT: I'm already running APC.
11 Replies
iptables seems like a good idea; I'll look into that. Hopefully, clients will handle it gracefully.
1. Use client side technologies wherever possible (aggregation, computation, …)
2. Server side caching, unless the data has to be always very fresh.
2+1. Server side caching of JSON data to be imported into pages. Yeah, you get more hits, but those are to static files (see the sketch after this list).*
3. Offload to external processes, i.e. your web application starts an external application/process and the client checks its status via async JS (aka AJAX)
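For item 2+1, a minimal sketch of the idea, assuming a cache path and TTL I made up (build_expensive_data() is just a placeholder for your existing PHP logic): regenerate the JSON into a real file under the nginx docroot, so subsequent hits are served as plain static files.
<?php
// Hypothetical path and TTL; regenerate the JSON at most once per minute,
// then let nginx serve it as a plain static file.
$cacheFile = '/var/www/static/data/latest.json';
$ttl       = 60;

if (!file_exists($cacheFile) || time() - filemtime($cacheFile) > $ttl) {
    $data = build_expensive_data();   // placeholder for your real logic
    file_put_contents($cacheFile, json_encode($data), LOCK_EX);
}
// Pages then request /data/latest.json directly, never touching PHP.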
The proposed iptables solution should work, but imho it's a hack, especially because your pages might not load properly if each page additionally relies on a number of images, stylesheets, or JS files. Even if you do client-side caching, browsers might still send conditional (If-Modified-Since) requests.
*If you use sessions, a request locks the session file until it finishes, meaning all additional requests (under that session id) are serialized, waiting for the session file lock. So if the first hit is a hog, all the others, however quick they are, will wait until the first one finishes, regardless of CPUs or processes involved.
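If that serialization bites, one mitigation in plain PHP (a sketch, not tied to any framework) is to release the session lock as soon as you're done writing to $_SESSION:
<?php
session_start();                  // acquires the per-session file lock
$_SESSION['last_seen'] = time();  // do all session writes up front
session_write_close();            // release the lock so other tabs/requests can proceed

// ...long-running work continues here without holding the session lock...
// Note: writes to $_SESSION after this point won't be saved for this request.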
Some example code; you'd put this early in the PHP page (here using a lock file with flock):
<?php
// Try to take a non-blocking, per-page lock. If another request already
// holds it, answer 503 and tell the client when to retry.
$fp = fopen('/tmp/heavy-page.lock', 'c');
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    header('HTTP/1.1 503 Service Unavailable');
    header('Retry-After: 10');          // "n" seconds, see below
    echo 'Server busy, please retry in a moment.';
    exit;
}

// ...the rest of your page goes here...

flock($fp, LOCK_UN);                    // release the lock
fclose($fp);
This would permit only one pending request for this particular page at a time. The value of n depends on how long your page typically takes to complete whatever it does (could even be calculated based on load).
As Azathoth pointed out, the iptables method may cause problems because it's pretty common for browsers to open many simultaneous connections to load images, CSS files, and such.
However, rejecting requests based on load might (in theory) be a good idea. I'd suggest using APC variables or memcached to track the number of requests per second (simply reset the counter if the current $_SERVER['REQUEST_TIME'] is not the same as the cached one) and reject with a 503 if there are too many.
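A rough sketch of that counter using the APC user cache (the key prefix and the limit of 20 requests/second are arbitrary values picked for illustration):
<?php
// One counter bucket per second; reject with a 503 once the current second
// has already seen more than $limit requests. Key name and limit are arbitrary.
$key   = 'reqs_' . $_SERVER['REQUEST_TIME'];
$limit = 20;

apc_add($key, 0, 2);              // create the bucket if missing, expire after 2s
if (apc_inc($key) > $limit) {
    header('HTTP/1.1 503 Service Unavailable');
    header('Retry-After: 1');
    exit;
}
// ...normal page processing continues here...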
But still, that's all hackery to patch things over temporarily. The real solution is to reduce the processing time as much as possible, and once you're done with that, to invest in more resources if needed.
P.S. one of my clients is a Kodak photo frame which requests images in 128KB blocks using partial GETs. The server is running G2 [gallery.menalto.com], which serves images from behind an image firewall by sending them through PHP. This is a bad combination… I'm hoping squid will help here.
Do you do anything special with G2, or are you simply displaying your photos on your website? If that's all you're doing, take a look at G3, the next iteration of Gallery. It doesn't act as an image firewall, so there's an immediate performance boost right there.
If you need to continue using G2, here are some other docs about squeaking out more performance:
I haven't moved to G3 yet, since (a) it's Apache-only [although I do have my testing env. set up to proxy from nginx to Apache] and (b) many of the things I use are not yet supported. I probably will, but not just yet.
> I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.
All the more reason to use it. Gallery doesn't use the database for caching; it caches DB queries as well as derivatives, pages, comments, etc., and it will cache everything that's viewed by your guests or registered users (depending on how you configure it).
As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.
For G2, you really should look into its performance features; you'll reduce the DB and memory resources the webserver needs. As for browsing, if you hack G2 to not do any view counting, then when people are browsing your site there won't be any DB writes at all, just reads, and even fewer of those if you use the caching. Check out the links I provided earlier.
@waldo:
> I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.

> All the more reason to use it. Gallery doesn't use the database for caching; it caches DB queries as well as derivatives, pages, comments, etc., and it will cache everything that's viewed by your guests or registered users (depending on how you configure it).
Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.
@waldo:
> As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.
I actually had some success getting nginx and G3 to work together, but it started becoming more trouble than it was worth (especially when Kohana and all of the AJAX calls began to be problematic). The actual permissions weren't so bad; I once sent a patch that added nginx support, subject to the rather annoying restriction of needing to send a HUP every time permissions changed.
Thanks for your comments; I'll look into the caching again.
> Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.
Hmmm, that shouldn't happen, but it may depend on your settings. You may want to chime in on this active thread on the G2 forums: