Nginx + php-fpm + WordPress with very high CPU load
(1) - One load balancer doing a round robin with least connections to the following two nodes
(2) - 2GB nodes, each running Nginx + php-fpm + APC and connected to a separate MySQL instance running on a dedicated machine over a local IP address. Each of the nodes has (4) dedicated cores.
I am running a WordPress site on each node; basically, each node runs an identical copy of the same WordPress PHP code.
Everything is running as it should, but the CPU load is very high. We are averaging the following load on the nodes:
Node 1: Load average: 4.11 3.84 3.95
Node 2: Load average: 4.20 3.94 4.95
Why are the cores spiking and holding the load? The whole website gets about 2.1 million requests a day, balanced over the (2) nodes.
Is this an nginx configuration or php-fpm issue, or is it just a matter of needing to add one or two more nodes?
Thank you in advance for your help.
Dave
My nginx.conf
[ddavtian@mobilefood-1 nginx]$ more nginx.conf
#######################################################################
#
# This is the main Nginx configuration file.
#
# More information about the configuration options is available on
# * the English wiki - http://wiki.nginx.org/Main
# * the Russian documentation - http://sysoev.ru/nginx/
#
#######################################################################
#----------------------------------------------------------------------
# Main Module - directives that cover basic functionality
#
# http://wiki.nginx.org/NginxHttpMainModule
#
#----------------------------------------------------------------------
user nginx;
worker_processes 4;
worker_rlimit_nofile 30000;
error_log /var/log/nginx/error.log;
#error_log /var/log/nginx/error.log notice;
#error_log /var/log/nginx/error.log info;
pid /var/run/nginx.pid;
#----------------------------------------------------------------------
# Events Module
#
# http://wiki.nginx.org/NginxHttpEventsModule
#
#----------------------------------------------------------------------
events {
worker_connections 1024;
}
#----------------------------------------------------------------------
# HTTP Core Module
#
# http://wiki.nginx.org/NginxHttpCoreModule
#
#----------------------------------------------------------------------
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
#access_log /var/log/nginx/access.log main;
sendfile on;
#tcp_nopush on;
#keepalive_timeout 0;
keepalive_timeout 15;
#gzip on;
# Load config files from the /etc/nginx/conf.d directory
# The default server is in conf.d/default.conf
include /etc/nginx/conf.d/*.conf;
}
My fastcgi_params
fastcgi_param QUERY_STRING $query_string;
fastcgi_param REQUEST_METHOD $request_method;
fastcgi_param CONTENT_TYPE $content_type;
fastcgi_param CONTENT_LENGTH $content_length;
fastcgi_param SCRIPT_NAME $fastcgi_script_name;
fastcgi_param REQUEST_URI $request_uri;
fastcgi_param DOCUMENT_URI $document_uri;
fastcgi_param DOCUMENT_ROOT $document_root;
fastcgi_param SERVER_PROTOCOL $server_protocol;
fastcgi_param GATEWAY_INTERFACE CGI/1.1;
fastcgi_param SERVER_SOFTWARE nginx/$nginx_version;
fastcgi_param REMOTE_ADDR $remote_addr;
fastcgi_param REMOTE_PORT $remote_port;
fastcgi_param SERVER_ADDR $server_addr;
fastcgi_param SERVER_PORT $server_port;
fastcgi_param SERVER_NAME $server_name;
fastcgi_connect_timeout 60;
fastcgi_send_timeout 180;
fastcgi_read_timeout 180;
fastcgi_buffer_size 128k;
fastcgi_buffers 4 256k;
fastcgi_busy_buffers_size 256k;
fastcgi_temp_file_write_size 256k;
fastcgi_intercept_errors on;
# PHP only, required if PHP was built with --enable-force-cgi-redirect
fastcgi_param REDIRECT_STATUS 200;
My default.conf
[ddavtian@mobilefood-1 conf.d]$ more default.conf
#
# The default server
#
server {
listen 80;
server_name mobilefoodblog.com;
#charset koi8-r;
access_log off;
#access_log logs/host.access.log main;
location /nginx_status {
stub_status on;
access_log off;
allow 67.23.12.32;
deny all;
}
location / {
root /var/www/html/blog;
index index.php index.html index.htm;
try_files $uri $uri/ /index.php?q=$uri&$args;
}
error_page 404 /404.html;
location = /404.html {
root /usr/share/nginx/html;
}
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
# proxy the PHP scripts to Apache listening on 127.0.0.1:80
#
#location ~ \.php$ {
# proxy_pass http://127.0.0.1;
#}
# pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
#
#location ~ \.php$ {
# root html;
# fastcgi_pass 127.0.0.1:9000;
# fastcgi_index index.php;
# fastcgi_param SCRIPT_FILENAME /scripts$fastcgi_script_name;
# include fastcgi_params;
#}
location ~ \.php$ {
root /var/www/html/blog;
fastcgi_pass 127.0.0.1:9000;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /var/www/html/blog$fastcgi_script_name;
include fastcgi_params;
set_real_ip_from 192.168.255.0/24;
real_ip_header X-Forwarded-For;
}
# deny access to .htaccess files, if Apache's document root
# concurs with nginx's one
#
#location ~ /\.ht {
# deny all;
#}
}
My php-fpm.conf
[ddavtian@mobilefood-1 php-fpm.d]$ more
; Start a new pool named 'www'.
[www]
; The address on which to accept FastCGI requests.
; Valid syntaxes are:
; 'ip.add.re.ss:port' - to listen on a TCP socket to a specific address on
; a specific port;
; 'port' - to listen on a TCP socket to all addresses on a
; specific port;
; '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = 127.0.0.1:9000
; Set listen(2) backlog. A value of '-1' means unlimited.
; Default Value: -1
;listen.backlog = -1
; List of ipv4 addresses of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any
listen.allowed_clients = 127.0.0.1
; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions.
; Default Values: user and group are set as the running user
; mode is set to 0666
;listen.owner = nobody
;listen.group = nobody
;listen.mode = 0666
; Unix user/group of processes
; Note: The user is mandatory. If the group is not set, the default user's group
; will be used.
; RPM: apache Choosed to be able to access some dir as httpd
user = apache
; RPM: Keep a group allowed to write in log dir.
group = apache
; Choose how the process manager will control the number of child processes.
; Possible Values:
; static - a fixed number (pm.max_children) of child processes;
; dynamic - the number of child processes are set dynamically based on the
; following directives:
; pm.max_children - the maximum number of children that can
; be alive at the same time.
; pm.start_servers - the number of children created on startup.
; pm.min_spare_servers - the minimum number of children in 'idle'
; state (waiting to process). If the number
; of 'idle' processes is less than this
; number then some children will be created.
; pm.max_spare_servers - the maximum number of children in 'idle'
; state (waiting to process). If the number
; of 'idle' processes is greater than this
; number then some children will be killed.
; Note: This value is mandatory.
pm = dynamic
; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes to be created when pm is set to 'dynamic'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.
; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI.
; Note: Used when pm is set to either 'static' or 'dynamic'
; Note: This value is mandatory.
pm.max_children = 60
; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2
pm.start_servers = 10
; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 5
; The desired maximum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.max_spare_servers = 35
; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.
; Default Value: 0
;pm.max_requests = 500
; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. By default, the status page shows the following
; information:
; accepted conn - the number of request accepted by the pool;
; pool - the name of the pool;
; process manager - static or dynamic;
; idle processes - the number of idle processes;
; active processes - the number of active processes;
; total processes - the number of idle + active processes.
; The values of 'idle processes', 'active processes' and 'total processes' are
; updated each second. The value of 'accepted conn' is updated in real time.
; Example output:
; accepted conn: 12073
; pool: www
; process manager: static
; idle processes: 35
; active processes: 65
; total processes: 100
; By default the status page output is formatted as text/plain. Passing either
; 'html' or 'json' as a query string will return the corresponding output
; syntax. Example:
; http://www.foo.bar/status
; http://www.foo.bar/status?json
; http://www.foo.bar/status?html
; Note: The value must start with a leading slash (/). The value can be
; anything, but it may not be a good idea to use the .php extension or it
; may conflict with a real PHP file.
; Default Value: not set
;pm.status_path = /status
; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
; anything, but it may not be a good idea to use the .php extension or it
; may conflict with a real PHP file.
; Default Value: not set
;ping.path = /ping
; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong
; The timeout for serving a single request after which the worker process will
; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_terminate_timeout = 0
; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_slowlog_timeout = 0
; The log file for slow requests
; Default Value: not set
; Note: slowlog is mandatory if request_slowlog_timeout is set
slowlog = /var/log/php-fpm/www-slow.log
; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024
; Set max core size rlimit.
; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0
; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: chrooting is a great security feature and should be used whenever
; possible. However, all PHP paths will be relative to the chroot
; (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =
; Chdir to this directory at the start. This value must be an absolute path.
; Default Value: current directory or / when chroot
;chdir = /var/www
; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Default Value: no
;catch_workers_output = yes
; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin
;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp
; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
; php_value/php_flag - you can set classic ini defines which can
; be overwritten from PHP call 'ini_set'.
; php_admin_value/php_admin_flag - these directives won't be overwritten by
; PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.
; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.
; Default Value: nothing is defined by default except the values in php.ini and
; specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f www@my.domain.com
;php_flag[display_errors] = off
php_admin_value[error_log] = /var/log/php-fpm/www-error.log
php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 32M
pm.max_children = 60
That looks pretty high, even on a 2GB Linode. Imagine what would happen if a server tried to generate 60 web pages simultaneously. That's a lot of CPU, a lot of connections to the database, and potentially a lot of RAM, too.
What does your memory usage look like when the load spike happens? Post the output of
free -m
I'd normally recommend an aggressive caching plugin for WordPress, but caching gets tricky when you have more than one server that needs to produce identical results. Memcached might help, but I don't have any experience with WordPress caching plugins in a load-balanced situation so I'll leave that topic to someone else.
By the way, some nice recipes you've got there. I'm getting hungry!
I have tried lowering
> pm.max_children = 60
to lower values; if memory serves me correctly, I tried things like 30, 40, and 50, and the CPU load still stayed about the same as mentioned above. The servers seem to be using a normal amount of memory, or at least they are not swapping.
[ddavtian@mobilefood-1 php-fpm.d]$ free -m
total used free shared buffers cached
Mem: 1997 535 1462 0 33 268
-/+ buffers/cache: 234 1763
Swap: 2047 0 2047
[ddavtian@mobilefood-1 php-fpm.d]$
[ddavtian@mobilefoodblog-2 ~]$ free -m
total used free shared buffers cached
Mem: 1997 804 1193 0 38 463
-/+ buffers/cache: 302 1694
Swap: 2047 0 2047
[ddavtian@mobilefoodblog-2 ~]$
As for caching, we are using PHP-based banner ads and I am always fearful that if I start caching the content (which will help with the CPU, I am sure) it will start caching the banner ads as well.
Thank you for the kind words about the content.
Thanks
Dave
What does top show?
Do you have APC installed?
Is your database server working fine?
As for adjusting pm.max_children, did you restart php5-fpm after each adjustment, or did you restart nginx instead? Unlike with Apache, restarting nginx will not have any effect on PHP. (Pretty basic stuff here, but sometimes people miss it.)
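For what it's worth, on an RPM-based box something along these lines should do it (the init script is named php5-fpm instead on Debian/Ubuntu):
# Restart the FPM master so it re-reads the pool configuration.
sudo service php-fpm restart
# Then confirm the new limit by counting the running children:
ps aux | grep '[p]hp-fpm: pool www' | wc -l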
You could also try disabling WordPress plugins, one at a time, for a few minutes each, on only one server. Do the same with the ad script if possible. See if the load average goes down. This could help identify any misbehaving component of your site.
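If you happen to have WP-CLI on the nodes, cycling plugins from the shell is quick; otherwise the Plugins screen in wp-admin does the same job. The plugin slug below is just a placeholder:
# Run from the WordPress root, on one node only.
wp plugin list
# Deactivate one plugin, watch htop / load average for a few minutes...
wp plugin deactivate example-plugin
# ...then put it back and move on to the next one.
wp plugin activate example-plugin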
Update: I disabled all of the plugins except WPTouch, which is really needed for the content to render on mobile phones, and disabling the plugins didn't change the CPU load numbers.
Yes, memory consumption is quite reasonable and neither of the nodes is swapping. Here's the output from htop; as you can see, php-fpm is what seems to be consuming the most CPU here.
1 [||||| 12.3%] Tasks: 53, 4 thr; 1 running
2 [|||| 8.8%] Load average: 2.34 2.28 2.20
3 [|||| 7.9%] Uptime: 2 days, 02:11:33
4 [|||| 8.2%]
Mem[|||||||||||||||| 379/1997MB]
Swp[ 0/2047MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
11750 apache 20 0 508M 28264 16620 S 2.0 1.4 0:06.10 php-fpm: pool www
11727 apache 20 0 513M 33252 16968 S 2.0 1.6 0:08.99 php-fpm: pool www
11757 apache 20 0 508M 27308 15852 S 2.0 1.3 0:06.04 php-fpm: pool www
11762 apache 20 0 508M 27380 15852 S 2.0 1.3 0:05.75 php-fpm: pool www
11775 apache 20 0 508M 28352 16620 S 2.0 1.4 0:05.75 php-fpm: pool www
11806 apache 20 0 508M 28088 16620 S 2.0 1.4 0:01.48 php-fpm: pool www
11807 apache 20 0 508M 27340 15840 S 2.0 1.3 0:00.92 php-fpm: pool www
11758 apache 20 0 508M 28236 16636 S 1.0 1.4 0:06.32 php-fpm: pool www
11774 apache 20 0 508M 27536 15852 S 1.0 1.3 0:05.65 php-fpm: pool www
11742 apache 20 0 509M 29484 17200 S 1.0 1.4 0:08.75 php-fpm: pool www
11747 apache 20 0 508M 27408 15916 S 1.0 1.3 0:08.42 php-fpm: pool www
11761 apache 20 0 508M 28300 16620 S 1.0 1.4 0:05.96 php-fpm: pool www
11754 apache 20 0 508M 27572 15852 S 1.0 1.3 0:06.14 php-fpm: pool www
11779 apache 20 0 508M 27376 15908 S 1.0 1.3 0:03.87 php-fpm: pool www
11755 apache 20 0 508M 28100 16624 S 1.0 1.4 0:06.11 php-fpm: pool www
11751 apache 20 0 508M 27640 15896 S 0.0 1.4 0:06.06 php-fpm: pool www
11778 apache 20 0 508M 28336 16620 S 0.0 1.4 0:04.00 php-fpm: pool www
11743 apache 20 0 508M 27976 16024 S 0.0 1.4 0:08.50 php-fpm: pool www
11745 apache 20 0 508M 28176 16720 S 0.0 1.4 0:08.31 php-fpm: pool www
11780 apache 20 0 508M 27356 15908 S 0.0 1.3 0:03.56 php-fpm: pool www
11777 apache 20 0 508M 28108 16620 S 0.0 1.4 0:05.65 php-fpm: pool www
11748 apache 20 0 508M 28316 16792 S 0.0 1.4 0:08.50 php-fpm: pool www
11749 apache 20 0 508M 28336 16656 S 0.0 1.4 0:06.20 php-fpm: pool www
11759 apache 20 0 508M 28084 16620 S 0.0 1.4 0:06.11 php-fpm: pool www
11781 apache 20 0 508M 28052 16620 S 0.0 1.4 0:02.73 php-fpm: pool www
11756 apache 20 0 508M 28316 16624 S 0.0 1.4 0:06.16 php-fpm: pool www
11760 apache 20 0 508M 28096 16616 S 0.0 1.4 0:05.98 php-fpm: pool www
11776 apache 20 0 508M 28320 16620 S 0.0 1.4 0:05.66 php-fpm: pool www
11744 apache 20 0 508M 28104 16648 S 0.0 1.4 0:08.72 php-fpm: pool www
> Do you have APC installed?
Yes APC is installed on both nodes
> Is your database server working fine?
Yes, I even ran mysqltuner to make sure all the numbers are intact.
> As for adjusting pm.max_children, did you restart php5-fpm after each adjustment, or did you restart nginx instead? Unlike with Apache, restarting nginx will not have any effect on PHP. (Pretty basic stuff here, but sometimes people miss it.)
Yes, after every change to php-fpm I did issue a restart of php-fpm to make sure the changes are reflected.
> You could also try disabling WordPress plugins, one at a time, for a few minutes each, on only one server. Do the same with the ad script if possible. See if the load average goes down. This could help identify any misbehaving component of your site.
Thank you will try this next.
Dave
Of course, if you are relying on long-lived PHP scripts or a remote database, the rules change.
Your htop output shows that you're only using ~40% CPU even though your load average is above 2. Does your Dashboard show a similar level of CPU usage? Does the server feel sluggish at all when the load average is above 4? Your site seemed to load pretty quickly when I checked it out earlier today. If your CPU usage and disk I/O are under control and the site doesn't feel slow, you might not need to worry about the load average all that much. The load average isn't a particularly accurate representation of system resource utilization anyway. It just means that you have a lot of processes competing for CPU time, so you might want to reduce the number of processes you're running. Which leads back to Guspaz's comment above:
@Guspaz:
Yikes, that's insane overkill. I've always been of the opinion that there's not much point running much more than 6-8 PHP processes on any size of linode. You've only got 4 cores, so as long as you're not blocked waiting on something else non-CPU related (such as a database on a different machine, disk IO, etc), and you don't have any long-living scripts you're not getting any real additional benefit except to unnecessarily increase contention and RAM usage.
Very good idea. If 30 didn't help, try reducing it even further. Since WordPress relies very heavily on database calls and you're not using any caching plugins, you might want to aim a little higher than Guspaz's suggestion: start with 12-16 and make adjustments over time to find your sweet spot. If you're skeptical about this experiment, fire up a smaller Linode (a 768 should be more than enough) with the lower settings and add it to the load balancer to see how it performs. You might have been wasting money on a pair of oversized Linodes. :P
(Until recently, php5-fpm shipped with 150 children by default. That's even more insane than Apache's prefork MPM using 150 children by default, because at least some of those Apache children would be serving static requests. Fortunately, the default value was reduced in PHP 5.3.9, released just a few days ago.)
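To make those numbers concrete, here is a rough sketch of the relevant pool directives with a more conservative starting point; the values are illustrative, not a prescription for your exact workload:
; Illustrative starting point for a 4-core node that serves mostly PHP.
; Watch htop and the nginx error log, then adjust.
pm = dynamic
; hard cap on concurrent PHP workers
pm.max_children = 16
pm.start_servers = 6
pm.min_spare_servers = 4
pm.max_spare_servers = 8
; recycle children periodically to contain slow memory leaks
pm.max_requests = 500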
Node 1 CPU:
1 [|||||||||||||||||||||||||||||||||||||||94.4%] Tasks: 62, 9 thr; 17 running
2 [|||||||||||||||||||||||||||||||||||||| 82.9%] Load average: 4.81 4.78 4.11
3 [|||||||||||||||||||||||||||||||||||| 77.4%] Uptime: 2 days, 04:16:31
4 [||||||||||||||||||||||||||||||||| 72.6%]
Mem[||||||||||||||||||| 469/1997MB]
Swp[ 0/2047MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
29186 apache 20 0 511M 30656 18032 S 11.0 1.5 1:02.53 php-fpm: pool www
29203 apache 20 0 511M 29308 17296 R 10.0 1.4 0:59.24 php-fpm: pool www
29208 apache 20 0 510M 32336 20564 R 9.0 1.6 0:58.67 php-fpm: pool www
29184 apache 20 0 511M 30372 17748 S 9.0 1.5 1:02.48 php-fpm: pool www
29210 apache 20 0 510M 32552 20780 S 9.0 1.6 0:58.96 php-fpm: pool www
29197 apache 20 0 510M 29088 17336 S 9.0 1.4 0:59.09 php-fpm: pool www
29196 apache 20 0 514M 40232 24704 R 9.0 2.0 0:58.52 php-fpm: pool www
29191 apache 20 0 512M 37376 23940 S 9.0 1.8 1:02.39 php-fpm: pool www
29282 apache 20 0 510M 28840 17296 R 9.0 1.4 0:11.29 php-fpm: pool www
29183 apache 20 0 512M 30644 17508 S 9.0 1.5 1:02.14 php-fpm: pool www
29189 apache 20 0 509M 28436 17488 R 9.0 1.4 1:03.10 php-fpm: pool www
Node 2 CPU:
1 [||||||||||||||||||||||||| 70.6%] Tasks: 61, 9 thr; 4 running
2 [||||||||||||||||||| 54.9%] Load average: 2.04 2.17 2.27
3 [||||||||||||||| 42.5%] Uptime: 2 days, 04:17:21
4 [||||||||||||| 34.7%]
Mem[||||||||||||||||| 427/1997MB]
Swp[ 0/2047MB]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
12961 apache 20 0 510M 28228 16404 S 6.0 1.4 0:14.71 php-fpm: pool www
12956 apache 20 0 510M 31316 19488 S 6.0 1.5 0:15.21 php-fpm: pool www
12950 apache 20 0 511M 28728 16888 S 6.0 1.4 0:15.32 php-fpm: pool www
12967 apache 20 0 510M 28208 16388 S 6.0 1.4 0:13.10 php-fpm: pool www
12965 apache 20 0 510M 27684 15860 R 6.0 1.4 0:13.04 php-fpm: pool www
13056 apache 20 0 510M 27520 15840 S 6.0 1.3 0:01.41 php-fpm: pool www
12952 apache 20 0 510M 28028 16388 S 6.0 1.4 0:15.25 php-fpm: pool www
12968 apache 20 0 511M 28232 16396 S 5.0 1.4 0:13.28 php-fpm: pool www
12987 apache 20 0 510M 28216 16396 S 5.0 1.4 0:09.88 php-fpm: pool www
12959 apache 20 0 510M 28376 16556 S 5.0 1.4 0:14.91 php-fpm: pool www
13034 apache 20 0 510M 28000 16368 S 5.0 1.4 0:03.03 php-fpm: pool www
As for graphs:
Node 1 CPU:
http://dl.dropbox.com/u/77881/node_1_cpu.png
Node 1 IO:
http://dl.dropbox.com/u/77881/node_1_op.png
Node 2 CPU:
http://dl.dropbox.com/u/77881/node_2_cpu.png
Node 2 IO:
http://dl.dropbox.com/u/77881/node_2_io.png
> Does the server feel sluggish at all when the load average is above 4?
Not really, things are loading quite fast
> Guspaz's suggestion: start with 12-16 and make adjustments over time to find your sweet spot
Right now php-fpm is running with 30; I will try to reduce this further and keep an eye on things. I have now also installed memcached on both servers and am watching to see if it makes any difference.
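For a quick sanity check that memcached is up and actually being hit on each node, I'm watching something like this from the shell (assuming the default 127.0.0.1:11211 listener):
# Dump a few memcached counters; rising get_hits means the object cache is being used.
echo -e 'stats\nquit' | nc 127.0.0.1 11211 | egrep 'curr_connections|get_hits|get_misses'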
Thank you!
Dave
Meanwhile, here's another question: Why does node 2 only peak at 110% when node 1 peaks at 350-400%? Does node 1 get 3 times as much traffic as node 2 does? No wait, node 2 has nearly constant I/O regardless of CPU usage. Why the difference?
Other than reducing pm.max_children and playing with plugins, here are a few other changes that you can try:
- Make the ads load separately from the pages themselves, using iframes or JavaScript. Then you can use aggressive caching on WordPress without worrying about your ads getting cached.
- Use 4 Linode 1024s instead of 2 Linode 2048s. That way, you'll still have plenty of RAM, but your CPU usage will be more widely spread out. Using 350% on a single node is not nice to your neighbors.
> Meanwhile, here's another question: Why does node 2 only peak at 110% when node 1 peaks at 350-400%? Does node 1 get 3 times as much traffic as node 2 does? No wait, node 2 has nearly constant I/O regardless of CPU usage. Why the difference?
Excellent question; no idea. Both nodes are configured with the same "weight" of 100 within the NodeBalancer, so in essence both nodes should get the same amount of traffic. The only thing is that the NodeBalancer is configured to send traffic based on "least connections": it determines (somehow) which node has the fewest active connections and routes each new connection to that node.
> Other than reducing pm.max_children and playing with plugins, here are a few other changes that you can try:
I have been playing with this number all day on both of the nodes. If I reduce it, nginx starts dropping connections and throwing "no connections" errors; if I increase it, CPU usage goes up. It's like a chicken-and-egg game now. I am currently at about 35 on both nodes.
> - Make the ads load separately from the pages themselves, using iframes or JavaScript. Then you can use aggressive caching on WordPress without worrying about your ads getting cached.
The ad provider ONLY provides a PHP-based SDK for this, so JavaScript is really out of the question, unless I write my own JavaScript and have it call the PHP page itself. An iframe is a good solution; I will give it a try.
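Something like this minimal sketch is what I have in mind (the path and dimensions are placeholders); the surrounding page can then be cached aggressively while the ad request still hits PHP on every view:
<!-- Hypothetical ad slot: the page around it is served from cache,
     while ads.php is fetched fresh for every page view. -->
<iframe src="/path/to/ads.php" width="728" height="90"
        frameborder="0" scrolling="no"></iframe>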
> - Use 4 Linode 1024s instead of 2 Linode 2048s. That way, you'll still have plenty of RAM, but your CPU usage will be more widely spread out. Using 350% on a single node is not nice to your neighbors.
Agreed, 350% CPU is not being a friendly neighbor.
Thanks for all the help today, I appreciated it.
Dave
@ddavtian:
Excellent question; no idea. Both nodes are configured with the same "weight" of 100 within the NodeBalancer, so in essence both nodes should get the same amount of traffic. The only thing is that the NodeBalancer is configured to send traffic based on "least connections": it determines (somehow) which node has the fewest active connections and routes each new connection to that node.
What do the traffic graphs look like? If traffic looks more or less the same, then there must be some other difference between the two nodes. If traffic is skewed, try using a different load balancing algorithm.
@ddavtian:
I have been playing with this number all day on both of the nodes. If I reduce it, nginx starts dropping connections and throwing "no connections" errors; if I increase it, CPU usage goes up. It's like a chicken-and-egg game now. I am currently at about 35 on both nodes.
502 Bad Gateway or 504 Gateway Timeout?
@ddavtian:
The ad provider ONLY provides a php based SDK for this, so javascript is really out of the question, unless I write my own javascript and have it call the php page itself. iFrame is a good solution, will give it a try.
iframes would work perfectly if the ads are not contextual, but the ad provider might notice that the referer is always the same, because it's always the page inside the iframe that gets used as the referer. In extreme cases, the linked page might load inside the iframe, which is not only useless but also looks a lot like you're trying to fake the clicks.
If you have jQuery, you can do something like
$('#ad_space').load('/path/to/ads.php');
and the ads will be loaded into the <div> with id="ad_space". Much cleaner than using an iframe, and the referer will be correct, too. But please do check with your ad provider whether this is permitted.
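Spelled out a little more fully (assuming the theme already loads jQuery; the div id and the ads.php path are placeholders):
<!-- Empty placeholder in the cached page -->
<div id="ad_space"></div>

<script>
// Once the DOM is ready, pull the PHP-rendered ad markup into the slot,
// so the ad request bypasses whatever page cache sits in front of WordPress.
jQuery(function ($) {
    $('#ad_space').load('/path/to/ads.php');
});
</script>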
> Use 4 Linode 1024s instead of 2 Linode 2048s. That way, you'll still have plenty of RAM, but your CPU usage will be more widely spread out. Using 350% on a single node is not nice to your neighbors.
Thank you very much for your continuous help here. I took your advice (a very valid one, for that matter) and switched to (4) 1024 instances. Things are operating much better now; I will know more tonight when the traffic picks up. In any case it should be better now as far as the CPUs are concerned, since we went from 8 CPUs to 16 CPUs.
> 502 Bad Gateway or 504 Gateway Timeout?
I was seeing 504s.
Thanks again.
Dave
Do you have a wordpress caching plugin enabled?
Are you using plugins to deter trackback/spam ?
If you've got any plugins (php scripts) doing DNS lookups/IP translations, that'd be sure to spike your php5-fpm usage.
For giggles, I'd recommend you disable a batch of your anti-spam plugins (e.g. popular plugins like Akismet, Simple Trackback Validation with Topsy Blocker, etc.). Restart nginx/php5-fpm and observe htop.
Also, bear in mind that the newest WordPress cores, especially 3.3.1, have deprecated many functions, so while older plugins may still work, they do so with significantly more overhead. So, in some cases, rolling back to the last known stable core is also particularly helpful.
> Do you have a wordpress caching plugin enabled?
Yes, W3 Total Cache is enabled with very minimal caching, i.e. object caching, database caching, and a configured CloudFlare account.
> Are you using plugins to deter trackback/spam ?
Yes, Akismet has been configured to block spam.
Right now we have spread the load over 4 nodes using a load balancer and things are OK, but I will take your advice and disable some of the above and see if anything changes as far as load goes.
Thank You!
Dave
Also, DB on dedicated? Did you get one in the same datacenter with Linodes?
@Azathoth:
Why is he a bad neighbor if he uses 400% of CPU on a single node? Linode is not overselling, so the allotted and guaranteed CPU time is yours to use. Or am I missing something obvious here?
Linode does not guarantee 400% CPU, only a fair share of the host's resources. The fair share depends on what everyone else is doing, because the host probably has only 8-16 CPUs. Call it overselling if you want, but if you constantly use 400% CPU, they'll open a ticket and gently ask you to fix your server.
@fhumayun:
If you've got any plugins (php scripts) doing DNS lookups/IP translations, that'd be sure to spike your php5-fpm usage.
That would be worth looking into.
@fhumayun:
For giggles, I'd recommend you disable a batch of your anti-spam plugins (e.g. popular plugins like akismet, Simple Trackback Validation with Topsy Blocker, etc).
This shouldn't be a problem unless OP is getting tons of comments and trackbacks. Most blogs are read-heavy, so the amount of processing power devoted to comment filtering may be negligible.
@Azathoth:
Also, DB on dedicated? Did you get one in the same datacenter with Linodes?
The DB is running on another dedicated machine, same data center via a local IP address.
Dave
@ddavtian:
@Azathoth:Also, DB on dedicated? Did you get one in the same datacenter with Linodes?
The DB is running on another dedicated machine, same data center via a local IP address.
I think Azathoth misunderstood that. You're using a Linode that is dedicated to the database, not a "dedicated server" with another company, right?
@hybinet:
@ddavtian:
@Azathoth:Also, DB on dedicated? Did you get one in the same datacenter with Linodes?
The DB is running on another dedicated machine, same data center via a local IP address.
I think Azathoth misunderstood that. You're using a Linode that is dedicated to the database, not a "dedicated server" with another company, right?
Correct, a Linode running a dedicated MySQL server over a local IP address in the same data center.
As for CPU… I was under the impression that with Xen you can limit how much CPU a single node will consume, so it's not like OpenVZ or similar where one hog can bog down everyone else on the host.
So if we can't use more than allotted, and it is allotted fairly (of X nodes on the host, each can use 1/X of the CPU time), where's the problem? Or am I under the wrong impression re: Xen or Linode's Xen setup for nodes?
@Azathoth:
So if we can't use more than alloted, and it is alloted fairly (of X nodes on the host, each can use 1/X of the CPU time), where's the problem? Or am I under wrong impression re: Xen or Linode's Xen setup for nodes?
The issue is that nodes can burst higher than their minimum allotted fair share, all the way up to 400%, and users usually expect to be able to do so. It's rude to take advantage of that excessively and force your neighbors into getting less than 400%. It's about sharing and the tragedy of the commons. (Or at least that's what your hapless neighbors will think.)
Edit: Copy editing.
Now, each node has access to 4 cores, resulting in a 400% max on the graphs. It is my understanding that if you had 4 processes crunching numbers and the guest/host kernel spread them each onto its own core, the graphs would show 400%, but your node in total is actually using only 1/X of the host's CPU time*, where X is the number of nodes on the host doing the same thing.
*) CPU time is the keyword here. You can use over 9000 cores if available, but on each you're assigned 1/X of the CPU time in the situation where X nodes are requesting max CPU.
So the graphs may show 400% because the guest kernel scheduler sees 4 processes active every time it looks, each on its own core, but is oblivious to the actual time slots assigned by the host kernel scheduler.
Am I wrong?
My understanding is that when a guest thinks it's using 400%, that means it's really using 400% of real CPUs.
I note that guests are aware of steal time, which is when the guest asks for CPU time but the host says no and schedules some other guest. (Look for an 'st' column in your favorite tool like vmstat.) I don't know how that fits into the guest's scheduling estimation and reporting, though.
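For example, assuming your procps is new enough to report that column:
# Sample CPU stats once a second for five samples; the last column ('st')
# is steal time, i.e. CPU the guest wanted but the host gave to someone else.
vmstat 1 5
# top shows the same thing as %st on its "Cpu(s):" line.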
I also note that the Linode manager graphs aren't generated by the guest, so they have access to what really happens, not whatever is going on in the mind of the guest's kernel.
Incidentally, Linode is believed to have more than 400% available on the hosts*, so a single node can't actually suck everything up on its own.
* At least I hope they're not paying for dual-capable CPUs and then putting only one in each host… (Or lying about what CPUs they use.)
Edit: Clarification, copy editing.
@mnordhoff:
I also note that the Linode manager graphs aren't generated by the guest, so they have access to what really happens, not whatever is going on in the mind of the guest's kernel.
True, but that still doesn't rule out the 400% shown in the graphs being the sum of only the CPU time that is actually available to the guest.
I seriously doubt that 400% on the graph means 400% of the underlying hardware. Because then, on 8-core hosts (and I think I read somewhere that Linode uses 8 cores), with 10 nodes per host, each shouldn't be using more than 80% (800% being the total, 1/X = 800 / 10 = 80%), or 40% with 20 nodes per host.
It would be great if someone from Linode could clarify this.
@Azathoth:
I seriously doubt that 400% on the graph means 400% of the underlaying hardware. Because then, on 8 core hosts (and I think I read somewhere Linode uses 8 cores), 10 nodes per host, each shouldn't be using more than 80% (800% being total, 1/x = 800 / 10 = 80%), or 40% with 20 nodes per host.
It would be great if someone from Linode could clarify this.
Like I said, Linode doesn't limit you to your fair share. If there isn't contention, you can burst all the way up to 400%. Edit: The fair share is a worst-case scenario.
@mnordhoff:
Like I said, Linode doesn't limit you to your fair share. If there isn't contention, you can burst all the way up to 400%. Edit: The fair share is a worst-case scenario.
In which case, back to my original question: if fair sharing is enforced, how is anyone a bad neighbor? If you're using all of the CPU and I don't need it, why should I complain? If I start needing it, I can rely on fair sharing to get what "belongs" to me.
If 20 of us contend for the CPU on the same host (say we all run SETI@Home), it will be shared equally, which is what we paid for (and everything above 1/X is a "free" bonus). Unless Linode is overselling*, and we know they're not.
*) Which would be, in this case, that the number of nodes per host is so great that individual nodes grind to a halt or have visibly poor performance, which I often saw with OpenVZ hosts but never here on Linode in the 2 years that I've been here.
@Azathoth:
In which case, back to my original question. If fair sharing is enforced, how's anyone a bad neighbor? If you're using all of the CPU, and I don't need it, why should I complain. If I start needing it, I can rely on fair sharing to get what "belongs" to me.
I think it's a question of equality vs. expectations. Yes, if multiple guests are all going full bore with CPU, they'll all at least get an equal share of the available cores that they share*. So in that sense, the CPU is being shared equally.
However, in practice, especially at the lower Linode sizes, there are lots of guests on a host, and it's pretty rare for them all to use maximum CPU simultaneously for long periods of time, so expectations for performance end up being established based on what actually happens. This is definitely different from a guarantee (e.g., on a Linode 512 you're only really guaranteed something like 20% of a single core, assuming 20 guests per 4 cores*), but part of what makes Linode work so well is that you virtually always have larger burst levels available just due to the statistics of sharing.
The concept of a "bad neighbor" is soft, so I don't think it can be precisely defined. But I do think there are expectations that most Linodes can burst quite high CPU-wise, so if a small number of guests on a host are burning all available CPU (even if it is shared equally), the experience will suffer in general. Taken to the extreme, if everyone tended to use 100% of the CPU they could get (even if shared), I suspect Linodes would be a whole lot less attractive.
In other words, over the long haul we all actually benefit by no one guest (or small number of guests) trying to take as much as possible, even if it is fairly distributed. The exception, of course, is a guest that would otherwise really use the maximum CPU for a long period of time; that guest runs slower over the long term by not taking as much. But that's where the concept of a good or bad neighbor comes in. The upthread reference to the tragedy of the commons is reasonable, I think, if imperfect, since CPU isn't a resource that gets depleted except over short time periods.
The nice part of all this is that it usually "just works" - e.g., do what you need and it all sort of evens out. But when you start talking about stuff like SETI (which will take whatever CPU it can get for absolutely as long as possible) that begins to break the statistical sharing that, in reality, we all benefit from.
– David
* I believe that on a typical 8-core host, the available guests are all set to a maximum burst of 4 cores, so it's possible if two guests are actually on disjoint subsets of 4 cores they wouldn't interfere at all.
@db3l:
I think it's a question of equality vs. expectations.
I think you're absolutely right. There have been many blog posts comparing the performance of a Linode with similarly priced Amazon EC2 instances and whatnot. Most of these benchmarks are CPU-bound. Linode always wins by a significant margin. Why? It's not like Linode can magically squeeze twice the performance out of the same hardware. The easiest answer is that Linode allows you to burst all the way up to 400%, unlike many other "cloud computing" services.
If one really wants a "non-oversold" service, one should try Amazon EC2. I don't think anyone will like their CPU performance, especially after having one's expectations spoiled by Linode. Overselling gets a bad name only because stupid hosts do it wrong. Some hosts overreact to this, and try to do away with overselling altogether. But both of these are extremes. Overselling is OK if the company is competent enough to manage it properly, and Linode has obviously mastered the art of CPU allocation over the last 8+ years. This gives us the best of both worlds. Everyone can use as much CPU as they need, as long as most people are courteous.
@Azathoth:
In which case, back to my original question. If fair sharing is enforced, how's anyone a bad neighbor?
I would not want to share a server with 40 other people who all hammer the CPU, even though I'd still get my fair share. Fairness alone can't get us the fantastic performance that we've come to expect from a Linode. It would be so much better for everyone if the host's load were low enough that the enforcement mechanism didn't even need to kick in. Just because the police exist to enforce law and order doesn't mean I wouldn't rather live in a neighborhood where they are seldom needed in the first place.
But we're going way off topic here…
> If fair sharing is enforced, how's anyone a bad neighbor?
To put it simply:
Bad neighbour in the sense that if everyone were like that, all the Linodes on the machine would get a fixed percentage of CPU time instead of being able to burst up to all available processing power when needed, such as to handle a surge in traffic.
I think Linode should group all the bad neighbours together on a host (still keeping the node/server ratio of course) and let them see what equally sharing the CPU is like.