Preventing (accidental) short-term DoS from tabs

Hi,

I'm currently running a small but PHP-heavy site with relatively few users. Static files are served by nginx, and dynamic pages by PHP (FastCGI). However, I'm only running 4 PHP-CGI processes, as each tends to take about 25MB, and that's all the memory I can spare [MySQL gets another 120MB or so, and the rest goes to misc. applications: squid, openvpn, etc.].

My concern is that one person can easily DoS the site (by accident) just by opening a few tabs at once with long-ish PHP actions. Then, when things don't load immediately, they'll hit reload [or open some more tabs while the first ones load], and everything backs up in iowait; the system load climbs to 5, 10, 15, etc., while all of the processes wait for their turn to run (MySQL gets backed up, as does the journaling daemon, and so on).

Any suggestions for preventing this? Should I run fewer PHP processes and let nginx either return a "proxy not available" error or hold requests in its queue? Is there any way I can assign priorities, or resource limits, or something else?

Thanks

EDIT: I'm already running APC.

11 Replies

I think the best thing is to speed up whatever PHP page is taking so long. If that's not possible, limit who can access that page and/or how often they can load it. A blunt method would be to use the connlimit module in iptables to restrict the number of simultaneous connections per client IP.

I have two pages that are particularly slow [and cannot really be optimized]; one makes some external web requests, and one is third-party code with several complex searches.

iptables seems like a good idea; I'll look into that. Hopefully, clients will handle it gracefully.

It's a problem of application design. You have several options:

1. Use client-side technologies wherever possible (aggregation, computation, …).

2. Server-side caching, unless the data always has to be very fresh.

2+1. Server-side caching of JSON data to be imported into pages. Yes, you get more hits, but those go to static files.*

3. Offload to external processes, i.e. your web application starts an external application/process and the client checks its status via async JS (aka AJAX).

The proposed iptables solution should work, but IMHO it's a hack, especially because your pages might not load properly if each page also relies on a number of images, stylesheets, or JS files. Even with client-side caching, browsers might still send If-Modified-Since requests.

*If you use sessions, a request locks the session file until it finishes, meaning all additional requests (under that session ID) are serialized, waiting for the session file lock. So if the first hit is a hog, all the others, however quick they are, will wait until the first one finishes, regardless of how many CPUs or processes are involved.
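If the slow page only needs to read from the session, one way around this is to release the lock early with session_write_close(). A minimal sketch, assuming PHP's default file-based session handler (the session key below is just a placeholder):

session_start();                 // acquires the per-session file lock
$userId = $_SESSION['user_id'];  // read whatever the page needs up front

session_write_close();           // releases the lock; other requests under
                                 // this session ID can now proceed

// ...the long-running work goes here; treat $_SESSION as read-only from now on...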

4. Handle the overload condition in PHP rather than iptables.

A rough sketch in PHP; you'd put this early in the page:

// Example lock-file path; flock() with LOCK_NB fails immediately instead
// of waiting if another request already holds the lock.
$fp = fopen('/tmp/slow-page.lock', 'c');
if (!flock($fp, LOCK_EX | LOCK_NB)) {
  header('HTTP/1.1 503 Service Unavailable');
  header('Retry-After: 30');   // "n" seconds; see below
  echo 'The server is busy; please try again in a moment.';
  exit;
}

// ...the rest of your page goes here...

flock($fp, LOCK_UN);   // release the lock
fclose($fp);

This would permit only one pending request for this particular page at a time. The value of n depends on how long your page typically takes to complete whatever it does (could even be calculated based on load).

As Azathoth pointed out, the iptables method may cause problems because it's pretty common for browsers to open many simultaneous connections to load images, CSS files, and such.

I'm not sure how good an idea serializing requests to PHP is. It could backfire.

However, rejecting requests based on load might (in theory) be a good idea. I'd suggest using APC variables or memcached to track the number of requests per second (simply reset the counter if the current $_SERVER['REQUEST_TIME'] is not the same as the cached one) and reject with a 503 if there are too many.
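For example, a minimal sketch using APC's user cache (the key name and limit here are made up, and rather than resetting a single counter it just uses a new key for each second with a short TTL; memcached's add/increment operations would work the same way):

$limit = 20;                                 // max requests per second; tune to taste
$key   = 'rps_' . $_SERVER['REQUEST_TIME'];  // one counter per second

apc_add($key, 0, 2);                         // create this second's counter (2s TTL) if missing
if (apc_inc($key) > $limit) {
  header('HTTP/1.1 503 Service Unavailable');
  header('Retry-After: 1');
  exit('Too busy, please retry in a moment.');
}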

But that's all still temporary patching. The real solution is to reduce the processing time as much as possible, and once you're done with that, to invest in more resources if needed.

Thanks for all of your advice. As I said, I really can't optimize the applications at this time [nor change one of the clients; see the postscript], so I added a limit of one simultaneous instance of the really slow script, and am currently looking into using squid as a reverse proxy for some dynamic data; that seems to offer the best "drop-in" solution without rewriting existing packages to use memcached or something similar.

P.S. One of my clients is a Kodak photo frame that requests images in 128KB blocks using partial GETs. The server is running G2 [gallery.menalto.com], which serves images from behind an image firewall by sending them through PHP. This is a bad combination… I'm hoping squid will help here.

Have you optimized G2 to use its performance caching options under Site Admin > Performance?

Do you do anything special with G2, or are you simply displaying your photos on your website? If that's all you're doing, take a look at G3, the next iteration of Gallery; it doesn't act as an image firewall, so there's an immediate performance boost right there.

If you need to continue using G2, here are some other docs about squeaking out more performance:

http://codex.gallery2.org/Gallery2:Performance_Tips

http://codex.gallery2.org/Gallery2:ACL_Performance

I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.

I haven't moved to G3 yet, since (a) it's Apache-only [although I do have my testing environment set up to proxy from nginx to Apache] and (b) many of the things I use are not yet supported. I probably will, but not just yet…

> I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.

All the more reason to use it. Gallery doesn't use the database for its caching; it caches DB queries as well as derivatives, pages, comments, etc. It will cache everything that's viewed by your guests or registered users (depending on how you configure it).

As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.

For G2, you really should look into its performance features; you'll reduce the DB and memory resources needed for the webserver. As for browsing, if you hack G2 to not do any view counting, then when people are browsing your site there won't be any DB writes at all, just reads, and even fewer of those if you use the caching. Check out the links I provided earlier.

@waldo:

> I'm not using the caching in G2, since I have no more memory to allocate for mysql, and I found that the caching would most often just cache things that a spider accessed once.

> All the more reason to use it. Gallery doesn't use the database for its caching; it caches DB queries as well as derivatives, pages, comments, etc. It will cache everything that's viewed by your guests or registered users (depending on how you configure it).

Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.

@waldo:

> As for G3, yeah, as long as you're using more advanced features available in G2, G3 isn't for you (yet). Lighttpd and Nginx probably won't ever be supported by the core team, but I've had G3 working under Lighty without problems and someone will eventually come up with rewrite rules that work for the image protection features.

I actually had some success getting G3 working with nginx, but it started becoming more trouble than it was worth (especially when Kohana and all of the AJAX calls began to be problematic). The actual permissions weren't so bad; I once sent a patch that added nginx support, subject to the rather annoying restriction of needing to send nginx a HUP every time the permissions changed.

Thanks for your comments; I'll look into their caching again.

> Hrm. I remember problems with g2_cacheMap getting _huge_; i.e., hundreds of megabytes.

Hmmm, that shouldn't happen, but it may depend on your settings. You may want to chime in on this active thread on the G2 forums:

http://gallery.menalto.com/node/94053
