Troubleshooting 504 Gateway Time-out on nginx

1GB Linode

Ubuntu 12.04 LTS

nginx 1.1.19 + PHP5-FPM + MySQL

I host 5 WordPress blogs. Lately, they are all consistently erroring out with 504 Gateway Time-out errors. When they're not erroring out, they're taking ridiculously long amounts of time to load. I've done about everything I know to try, and been googling for hours now… and I'm at a loss.

I have my nginx configured based on this article: http://codex.wordpress.org/Nginx

MySQLTuner…

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.31-0ubuntu0.12.04.1-log
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster 
[--] Data in MyISAM tables: 232M (Tables: 128)
[--] Data in InnoDB tables: 7M (Tables: 2)
[--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17)
[!!] Total fragmented tables: 6

-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 33m 4s (3K q [1.993 qps], 270 conn, TX: 114M, RX: 1M)
[--] Reads / Writes: 79% / 21%
[--] Total buffers: 208.0M global + 704.0K per thread (20 max threads)
[OK] Maximum possible memory usage: 221.8M (22% of installed RAM)
[OK] Slow queries: 0% (0/3K)
[OK] Highest usage of available connections: 10% (2/20)
[OK] Key buffer size / total MyISAM indexes: 16.0M/5.9M
[OK] Key buffer hit rate: 96.8% (19K cached / 640 reads)
[OK] Query cache efficiency: 26.7% (774 cached / 2K selects)
[OK] Query cache prunes per day: 0
[OK] Sorts requiring temporary tables: 3% (4 temp sorts / 131 sorts)
[OK] Temporary tables created on disk: 25% (95 on disk / 375 total)
[OK] Thread cache hit rate: 99% (2 created / 270 connections)
[OK] Table cache hit rate: 25% (179 open / 699 opened)
[OK] Open file limit used: 29% (313/1K)
[OK] Table locks acquired immediately: 100% (2K immediate / 2K locks)
[OK] InnoDB data size / buffer pool: 7.5M/128.0M

-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    MySQL started within last 24 hours - recommendations may be inaccurate

I can post anything else that would be helpful, too.

12 Replies

Gateway timeouts suggest the cause is PHP-FPM, so I'd check your php-fpm logs. They'll probably say something about a worker limit being reached, in which case you have to tweak the PHP-FPM config.

You were correct!

[25-Sep-2013 03:47:55] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 9 total children
[25-Sep-2013 03:47:56] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
[25-Sep-2013 04:24:27] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it

...

[25-Sep-2013 05:28:35] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
[25-Sep-2013 05:28:57] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
[25-Sep-2013 06:57:50] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
[25-Sep-2013 06:58:05] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
[25-Sep-2013 07:07:51] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it
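
To see how often the limit is being hit, the warnings can be tallied per hour. This is only a rough sketch: the regular expression assumes the exact log format shown above, and `count_max_children_hits` is a made-up helper name, not part of PHP-FPM.

```python
import re
from collections import Counter

# Matches only the "server reached pm.max_children" warnings, in the
# format shown in the log excerpt above (an assumption about the layout).
MAX_CHILDREN_RE = re.compile(
    r"^\[(?P<day>\d{2}-\w{3}-\d{4}) (?P<hour>\d{2}):\d{2}:\d{2}\] WARNING: "
    r"\[pool (?P<pool>[^\]]+)\] server reached pm\.max_children setting "
    r"\((?P<limit>\d+)\)"
)


def count_max_children_hits(lines):
    """Count 'reached pm.max_children' warnings per (day, hour)."""
    hits = Counter()
    for line in lines:
        match = MAX_CHILDREN_RE.match(line.strip())
        if match:
            hits[(match.group("day"), match.group("hour"))] += 1
    return hits


# A few lines copied from the log excerpt above.
sample = [
    "[25-Sep-2013 03:47:55] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 9 total children",
    "[25-Sep-2013 03:47:56] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it",
    "[25-Sep-2013 04:24:27] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it",
    "[25-Sep-2013 05:28:35] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it",
    "[25-Sep-2013 05:28:57] WARNING: [pool www] server reached pm.max_children setting (10), consider raising it",
]

hits = count_max_children_hits(sample)
for (day, hour), count in sorted(hits.items()):
    print(f"{day} {hour}:00  {count} hit(s)")
```

To run it against a real log, pass an open file object instead of `sample`; the log location depends on the install, so check the `error_log` setting in the PHP-FPM config.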

I've bumped up the max_children to 20, and now sites are loading much quicker. Is doubling that value (10 to 20) too much of an increase at first? Is it not enough of an increase?

I found a few articles/posts that seemed to caution against changing pm.start_servers, pm.min_spare_servers and pm.max_spare_servers. Should I consider increasing those at all?

Thanks for your help!

This really depends on your sites. The start/spare server values can be raised; if I understand it correctly, they control how many workers are kept idle and ready for incoming work. With a worker ready, a request can be executed immediately; otherwise a worker first has to be spawned, which adds a small overhead. However, keeping lots of spare workers increases RAM usage, so you'll have to see how much you can support. If you have too many workers, they all become fully utilized, and the server runs out of memory, it'll crash.

I bumped it back down to 15. (So, from 10 to 20 to 15). I was definitely seeing the ram take a hit:

             total       used       free     shared    buffers     cached
Mem:          1000        957         43          0         90        103
-/+ buffers/cache:        764        236
Swap:         1023         31        992

Is it unrealistic to think that 1GB was enough to handle 5 sites?

Depends on the sites. You can easily run quite a lot of sites with just a little bit of traffic. But if you have sites that get a lot of traffic you'll notice of course that you might need a bit more ram to run as many workers as you need.

WordPress also tends to be a memory hog. There are various things you can do to improve performance. At a basic level, installing PHP APC and a caching plugin will help. A more complicated option would be to use memcached as a WordPress object store, and use a separate caching system such as Varnish or Nginx's fastcgi_cache.

Basically, if you can make your PHP requests as fast as possible or avoid them altogether, you'll be able to handle more traffic on a 1GB node.

Thanks for everyone's help thus far.

So far, here's what I have done:

  • went ahead and upgraded my Linode to the 2GB plan.

  • installed PHP-APC.

  • installed the nginx helper and w3-total-cache plugins into WordPress.

I also started playing around with the pm.max_children value. It seems that no matter what value I set it to, it's not high enough. Right now, my settings are this:

pm = dynamic
pm.max_children = 50
pm.start_servers = 5
pm.min_spare_servers = 4
pm.max_spare_servers = 5
pm.max_requests = 250

And I'm still seeing this in my php5-fpm log:

[26-Sep-2013 12:41:10] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 0 idle, and 12 total children
[26-Sep-2013 12:41:11] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 0 idle, and 16 total children
[26-Sep-2013 12:41:12] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 20 total children
[26-Sep-2013 12:41:13] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 24 total children
[26-Sep-2013 12:41:14] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 28 total children
[26-Sep-2013 12:41:15] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 32 total children
[26-Sep-2013 12:41:16] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 36 total children
[26-Sep-2013 12:41:17] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 40 total children
[26-Sep-2013 12:41:18] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 44 total children
[26-Sep-2013 12:41:19] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 0 idle, and 48 total children
[26-Sep-2013 12:41:20] WARNING: [pool www] server reached pm.max_children setting (50), consider raising it

And I'm using almost the full 2GB of ram:

 ➜  ~  free -m
                   total       used       free     shared    buffers     cached
Mem:          2013       1890        123          0        307        763
-/+ buffers/cache:        819       1194
Swap:         1023          3       1020

I'm not sure what the right way to configure this is. I'm really just guessing. It seems like no matter how high I set max_children, it's not enough and I'm just using up RAM. Increasing the max_children value seems to help sites load quicker at first, until the RAM gets eaten up.

Help?

A max_children of 50 is a bit high for a 2GB node. If each process uses 64MB (not unreasonable for WordPress), you'd be using 3.2GB.

What happens is this: nginx passes requests to PHP. If there's a free child, PHP executes the request using that child. If there aren't any free children, it spawns a new child, assuming you've not hit the max_children limit. If you have, the request goes into a queue and is processed when a child becomes free.

In an ideal world you'd always have at least one free child. That's impractical if you get hit with a lot of requests at once, so you have to set a reasonable max_children based on the amount of RAM you have.
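
As a rough illustration of that sizing, here's a small sketch. The 64MB per child is the figure mentioned above; the 512MB reserved for MySQL, nginx and the OS is purely an assumed number for the example, and `estimate_max_children` is a hypothetical helper. Measure your own per-process usage before trusting any of this.

```python
def estimate_max_children(total_ram_mb, reserved_mb, per_child_mb):
    """Return how many PHP-FPM children fit in the RAM left over after
    reserving memory for everything else (MySQL, nginx, the OS)."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(available_mb // per_child_mb, 0)


# Worst case for the current setting: 50 children at 64 MB each.
worst_case_mb = 50 * 64
print(worst_case_mb)  # 3200 MB, i.e. about 3.2GB

# Assumed budget: 2013 MB total (from free -m), 512 MB reserved (a guess).
suggested = estimate_max_children(2013, 512, 64)
print(suggested)  # 23
```

The point isn't the exact number, just that max_children multiplied by the per-child memory has to fit in what's left after everything else on the box.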

By any chance do you have some sort of dynamic image plugin such as timthumb? If so this would account for a lot of requests since each image would involve a PHP request.

Unfortunately.

Out of the 5 WordPress installs, one's theme uses timthumb, and a second one uses the Shopp plugin, which uses timthumb for the store images. And timthumb has been a constant frustration.

If you knew to ask about timthumb, I'm hoping you have some advice on how to deal with it. :)

A caching server such as Varnish or Nginx's fastcgi_cache would be best here. It can cache any generated images as well as whole pages. Even with W3 Total Cache, requests are still passed to PHP, just handled quicker (basically the plugin generates a static version of the page and serves that via PHP instead of regenerating the whole page).

@brdoco:

And I'm using almost the full 2GB of ram:

 ➜  ~  free -m
                   total       used       free     shared    buffers     cached
Mem:          2013       1890        123          0        307        763
-/+ buffers/cache:        819       1194
Swap:         1023          3       1020

Good news: you're using less than half your RAM. http://www.linuxatemyram.com/
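
To spell out the arithmetic: memory held in buffers and cache is given back when applications need it, so it shouldn't be counted as "used". A minimal sketch using the numbers from the output quoted above (`memory_used_by_applications` is just an illustrative name):

```python
def memory_used_by_applications(used_mb, buffers_mb, cached_mb):
    """Subtract reclaimable buffers and page cache from the 'used' column."""
    return used_mb - buffers_mb - cached_mb


total_mb = 2013
app_used_mb = memory_used_by_applications(used_mb=1890, buffers_mb=307, cached_mb=763)
percent_used = round(100 * app_used_mb / total_mb)

print(app_used_mb)   # 820
print(percent_used)  # 41
```

That matches the "-/+ buffers/cache" line, which reports 819; the 1MB difference is presumably rounding in the per-column MB values.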

@obs I installed Varnish last night, and things.. so far.. have been great. Sites are consistently loading very quickly.

@Vance Thanks for the tip! I didn't know about that.
