disk IO is going insane
Output from free:
free
             total       used       free     shared    buffers     cached
Mem:        720956     705504      15452          0       1044      16748
-/+ buffers/cache:     687712      33244
Swap:       524280     233108     291172
Output from top:
top - 21:35:19 up 1 day, 22:56,  3 users,  load average: 11.74, 16.23, 20.50
Tasks: 684 total,  58 running, 626 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.6%us, 61.4%sy,  0.0%ni, 29.0%id,  6.0%wa,  0.0%hi,  0.4%si,  0.6%st
Mem:    720956k total,   714392k used,     6564k free,      268k buffers
Swap:   524280k total,   320472k used,   203808k free,    11104k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6694 mysql 20 0 360m 5020 1912 S 4386 0.7 256:19.94 mysqld
183 root 20 0 0 0 0 R 99 0.0 8:45.06 kswapd0
7375 www-data 20 0 56052 14m 2892 R 15 2.0 0:02.22 apache2
8046 www-data 20 0 59672 14m 3008 S 14 2.0 0:00.54 apache2
7686 www-data 20 0 59676 14m 2968 R 13 2.0 0:00.78 apache2
7461 www-data 20 0 59672 14m 2952 S 12 2.0 0:00.77 apache2
7579 www-data 20 0 59580 14m 2964 D 9 2.0 0:00.43 apache2
7459 www-data 20 0 59708 14m 2952 D 8 2.0 0:00.86 apache2
8077 www-data 20 0 59672 14m 3016 S 7 2.0 0:00.93 apache2
8041 www-data 20 0 53004 14m 2956 R 6 2.1 0:00.71 apache2
8038 www-data 20 0 56144 14m 2940 R 6 2.1 0:00.36 apache2
7238 www-data 20 0 59672 14m 2944 S 5 2.0 0:00.53 apache2
7262 www-data 20 0 59580 13m 2964 R 5 2.0 0:00.35 apache2
7292 www-data 20 0 59672 13m 2960 R 5 2.0 0:00.44 apache2
7879 www-data 20 0 59672 14m 3020 R 5 2.0 0:00.18 apache2
7104 www-data 20 0 53004 14m 2808 R 5 2.0 0:00.70 apache2
7113 www-data 20 0 59672 13m 2936 R 5 2.0 0:01.10 apache2
7206 www-data 20 0 59672 14m 2964 R 5 2.0 0:01.22 apache2
7216 www-data 20 0 59448 14m 3148 R 5 2.1 0:00.57 apache2
7635 www-data 20 0 59672 14m 2952 R 5 2.0 0:00.22 apache2
7664 www-data 20 0 59676 14m 2932 R 5 2.0 0:00.79 apache2
7869 www-data 20 0 59680 14m 3020 D 5 2.0 0:00.39 apache2
7928 www-data 20 0 59552 14m 3244 R 5 2.1 0:00.50 apache2
7067 www-data 20 0 59672 14m 2948 S 5 2.0 0:00.63 apache2
7055 www-data 20 0 59672 13m 2960 S 4 2.0 0:00.50 apache2
7148 www-data 20 0 59832 14m 2964 D 4 2.0 0:00.60 apache2
7241 www-data 20 0 50668 14m 2796 R 4 2.0 0:00.64 apache2
7389 www-data 20 0 50640 14m 2904 R 4 2.1 0:00.50 apache2
7404 www-data 20 0 59672 13m 2964 S 4 1.9 0:00.69 apache2
7233 www-data 20 0 59572 13m 2940 R 4 1.9 0:00.21 apache2
7890 www-data 20 0 59680 14m 3008 D 4 2.0 0:00.32 apache2
7910 www-data 20 0 59672 14m 3020 S 4 2.0 0:00.15 apache2
7931 www-data 20 0 59568 13m 2428 S 4 1.9 0:00.11 apache2
7249 www-data 20 0 59672 14m 2948 R 4 2.0 0:00.33 apache2
7239 www-data 20 0 59572 13m 2948 R 3 1.9 0:00.59 apache2
7355 www-data 20 0 59672 13m 2968 R 3 2.0 0:00.44 apache2
7650 www-data 20 0 59672 14m 2956 D 3 2.0 0:00.87 apache2
8043 www-data 20 0 59696 14m 3008 D 3 2.0 0:00.55 apache2
7715 www-data 20 0 59580 13m 2984 R 3 1.9 0:00.39 apache2
7856 root 20 0 8164 1096 836 S 3 0.2 0:00.18 sshd
8042 www-data 20 0 59992 14m 3056 R 2 2.1 0:00.96 apache2
7077 www-data 20 0 59672 14m 2940 D 2 2.0 0:00.32 apache2
7269 www-data 20 0 59672 13m 2952 S 2 2.0 0:00.23 apache2
7343 www-data 20 0 59672 14m 2968 R 2 2.0 0:00.19 apache2
7457 www-data 20 0 59988 14m 2972 S 2 2.0 0:00.45 apache2
7525 www-data 20 0 59672 14m 2964 D 2 2.0 0:00.89 apache2
7199 www-data 20 0 59672 14m 2996 R 2 2.0 0:00.22 apache2
7347 www-data 20 0 59672 14m 2948 D 2 2.0 0:01.02 apache2
7468 www-data 20 0 59672 14m 2956 R 2 2.0 0:00.46 apache2
7486 www-data 20 0 59672 14m 2952 D 2 2.0 0:00.91 apache2
7745 www-data 20 0 59672 13m 2988 S 2 1.9 0:00.79 apache2
7203 www-data 20 0 59672 13m 2956 R 1 2.0 0:00.37 apache2
7590 www-data 20 0 59580 14m 2960 S 1 2.0 0:00.76 apache2
7699 www-data 20 0 59568 14m 2972 R 1 2.0 0:00.76 apache2
7922 www-data 20 0 59576 13m 2440 R 1 1.9 0:00.05 apache2
My apache2.conf:
#
ServerRoot "/etc/apache2"
#
# The accept serialization lock file MUST BE STORED ON A LOCAL DISK.
#
#<IfModule !mpm_winnt.c>
#<IfModule !mpm_netware.c>
LockFile /var/lock/apache2/accept.lock
#</IfModule>
#</IfModule>
#
# PidFile: The file in which the server should record its process
# identification number when it starts.
# This needs to be set in /etc/apache2/envvars
#
PidFile ${APACHE_PID_FILE}
#
# Timeout: The number of seconds before receives and sends time out.
#
Timeout 300
#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On
#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 80
#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 15
##
## Server-Pool Size Regulation (MPM specific)
##
# prefork MPM
# StartServers: number of server processes to start
# MinSpareServers: minimum number of server processes which are kept spare
# MaxSpareServers: maximum number of server processes which are kept spare
# MaxClients: maximum number of server processes allowed to start
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_prefork_module>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
# worker MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_worker_module>
    StartServers          2
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit          64
    ThreadsPerChild      25
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>
# event MPM
# StartServers: initial number of server processes to start
# MaxClients: maximum number of simultaneous client connections
# MinSpareThreads: minimum number of worker threads which are kept spare
# MaxSpareThreads: maximum number of worker threads which are kept spare
# ThreadsPerChild: constant number of worker threads in each server process
# MaxRequestsPerChild: maximum number of requests a server process serves
<IfModule mpm_event_module>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadLimit          64
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
# These need to be set in /etc/apache2/envvars
#User ${APACHE_RUN_USER}
User www-data
#Group ${APACHE_RUN_GROUP}
Group www-data
#
# AccessFileName: The name of the file to look for in each directory
# for additional configuration directives. See also the AllowOverride
# directive.
AccessFileName .htaccess
#
# The following lines prevent .htaccess and .htpasswd files from being
# viewed by Web clients.
#
<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
</Files>
#
# DefaultType is the default MIME type the server will use for a document
# if it cannot otherwise determine one, such as from filename extensions.
# If your server contains mostly text or HTML documents, "text/plain" is
# a good value. If most of your content is binary, such as applications
# or images, you may want to use "application/octet-stream" instead to
# keep browsers from trying to display binary files as though they are
# text.
#
DefaultType text/plain
#
# HostnameLookups: Log the names of clients or just their IP addresses
# e.g., www.apache.org (on) or 204.62.129.132 (off).
# The default is off because it'd be overall better for the net if people
# had to knowingly turn this feature on, since enabling it means that
# each client request will result in AT LEAST one lookup request to the
# nameserver.
#
HostnameLookups Off
# ErrorLog: The location of the error log file.
# If you do not specify an ErrorLog directive within a <VirtualHost>
# container, error messages relating to that virtual host will be
# logged here. If you *do* define an error logfile for a <VirtualHost>
# container, that host's errors will be logged there and not here.
#
ErrorLog /dev/null
#
# LogLevel: Control the number of messages logged to the error_log.
# Possible values include: debug, info, notice, warn, error, crit,
# alert, emerg.
#
LogLevel alert
# Include module configuration:
Include /etc/apache2/mods-enabled/*.load
Include /etc/apache2/mods-enabled/*.conf
# Include all the user configurations:
Include /etc/apache2/httpd.conf
# Include ports listing
Include /etc/apache2/ports.conf
#
# The following directives define some format nicknames for use with
# a CustomLog directive (see below).
# If you are behind a reverse proxy, you might want to change %h into %{X-Forwarded-For}i
#
#LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
#LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combi ned
#LogFormat "%h %l %u %t \"%r\" %>s %O" common
#LogFormat "%{Referer}i -> %U" referer
#LogFormat "%{User-agent}i" agent
#
# Define an access log for VirtualHosts that don't define their own logfile
# CustomLog /var/log/apache2/other_vhosts_access.log vhost_combined
# Include of directories ignores editors' and dpkg's backup files,
# see README.Debian for details.
# Include generic snippets of statements
Include /etc/apache2/conf.d/*.conf
# Include the virtual host configurations:
Include /etc/apache2/sites-enabled/
And my my.cnf:
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# especially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
# Here are entries for some specific programs
# The following values assume you have at least 32M ram
# This was formerly known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
#
# * Basic Settings
#
#
# * IMPORTANT
# If you make changes to these settings and your system uses apparmor, you may
# also need to also adjust /etc/apparmor.d/usr.sbin.mysqld.
#
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
bind-address = 127.0.0.1
#
# * Fine Tuning
#
key_buffer = 128M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
max_connections = 200
table_cache = 128
thread_concurrency = 10
#
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 128M
#
# * Logging and Replication
#
# Both locations get rotated by the cronjob.
# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!
#general_log_file = /var/log/mysql/mysql.log
#general_log = 1
#
# Error logging goes to syslog due to /etc/mysql/conf.d/mysqld_safe_syslog.cnf.
#
# Here you can see queries with especially long duration
log_slow_queries = /var/log/mysql/mysql-slow.log
long_query_time = 1
#log-queries-not-using-indexes
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
#server-id = 1
#log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
#binlog_do_db = include_database_name
#binlog_ignore_db = include_database_name
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
#
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
#no-auto-rehash # faster start of mysql but no tab completion
[isamchk]
key_buffer = 32M
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
Any help would be appreciated. I don't know what I did wrong but Disk IO is screaming high and it was never this bad when I was having a lot more traffic.
Thanks.
@sdlvx:
Any help would be appreciated. I don't know what I did wrong but Disk IO is screaming high and it was never this bad when I was having a lot more traffic.
Best guess is the I/O is a lot of swapping, as you are over-committing your Linode. You're allowing far more Apache client processes to exist than the memory your Linode has can support. With a Linode, you really want to avoid swapping if at all possible, and in your case, you're using swap amounting to almost 1/3 of your physical memory.
I don't know which Apache worker model you're using, but MaxClients is probably too large for any of them, as the first main knob.
There are several threads in the forums that discuss tuning Apache appropriately. The default settings that come with most distributions are nowhere near realistic for a VPS environment, where memory is limited and I/O overhead a significant performance hit.
I don't have a thread reference handy, but suspect that searching for MaxClients or apache will turn up some. If nothing else, dropping your MaxClients down to 15-20 is a quick first step, but you'll need to experiment a little to find the best value for your specific Linode.
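To make that concrete, a minimal sketch of a more conservative prefork section — the exact numbers are illustrative assumptions, not tested values for your workload:
<IfModule mpm_prefork_module>
    StartServers          3
    MinSpareServers       3
    MaxSpareServers       5
    MaxClients           18
    MaxRequestsPerChild 500
</IfModule>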
– David
EDIT:
Thanks, I'll search around. I was in a giant panic, today has been horrible. My internet is down and I've been doing this all through SSH over my cell phone. What a nightmare. q.q
@sdlvx:
it went back down to 700. It peaked at 20k. Is that even possible?
Sure. I think I/O is counted in 1K blocks, though not absolutely certain. But at 1K blocks, 20k I/O requests would only be 20MB/s. In tests I can easily generate write I/O on my Linode 360 at 5x+ that rate - though not all the data is probably flushed in that timing - and can get hdparm read timings (-t) at 8-9x+ that.
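For reference, that hdparm read timing test is just the following (/dev/xvda is an assumption — substitute whatever your Linode's actual disk device is):
hdparm -t /dev/xvda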
Of course, the Linode monitoring is also 5min average, so you could have peaked quite a bit higher at points.
On the bright side, when there's contention on a host for I/O, performance can really tank (I've had some occasions of seeing 40-50% iowait). So given that you had pretty high I/O rates, and at least in the posted top output iowait was relatively modest at 6%, during your overload you might not have been seriously affecting too many peers on your host.
Definitely worth tuning the configuration for though, since you're likely at least impacting your own performance.
– David
And vmstat:
# vmstat 1 5
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0    920  32772  56796 155204    0    0     1     1    1    0  0  0 100  0
 1  0    920  32764  56796 155204    0    0     0     0   37   20  0  0 100  0
 0  0    920  32824  56796 155204    0    0     0     0   34   34  0  0 100  0
 0  0    920  32824  56796 155204    0    0     0     0   22   18  0  0 100  0
 0  0    920  32824  56796 155204    0    0     0     0   31   30  0  0 100  0
If I set these values any lower my site slows to a crawl and it takes forever for a page to load.
I did lower my cache settings for MySQL and that seemed to help, but I noticed it was taking longer for pages to generate.
In the end it looks like this is a serious problem. I already have a 720 because I needed to move up from the 360. I don't think I have much headroom the way my Linode is set up.
But when it comes to upgrading, it shouldn't be a big deal as long as this ad company I'm using doesn't rip me off. If these Apache settings need to be this high, it pretty much means the only way to make things better is better hardware, right?
I'm pretty overwhelmed. I have two startups that I coded myself, and I'm doing the IT work and keeping the server going, while also doing the social stuff that goes along with one of them. I'm also kind of new to running servers, but I have a bachelor's in CS and everything we did was on Linux, so I have a good idea of what's going on.
Anyway, the problem is more than likely entirely in MySQL for now. Disk IO right now is at 288, CPU at ~20%, and outgoing at ~700 kbps.
EDIT: A lot of the problem was getting hammered and then MySQL resorting to a filesort on a large table (30k+ rows).
@sdlvx:
In the end it looks like this is a serious problem. I already have a 720 because I needed to move up from the 360. I don't think I have much headroom the way my Linode is set up.
But when it comes to upgrading, it shouldn't be a big deal as long as this ad company I'm using doesn't rip me off. If these Apache settings need to be this high, it pretty much means the only way to make things better is better hardware, right?
I still think you really have Apache configured too high. It sounds to me like you're shrinking resource use by MySQL as an alternative to fixing the root resource issue being introduced by Apache. MySQL may well have been sorting to a file, but that wouldn't explain your swap usage, which was very high for a 720.
Have you perhaps rebooted and/or restarted Apache in the course of your recent work, too? If so, then you just got a temporary reprieve by recovering the resources of all of those prior Apache processes. But it'll just happen again when you get enough simultaneous requests. Note that improving processing time for a single request (such as removing the use of the database) is another way to keep down the number of simultaneous Apache processes, but it doesn't fix the root problem, which is still waiting to occur again.
You know your application best, but are you absolutely certain you need to support that many Apache clients? Your current configuration just isn't tuned for your available resources.
Having fewer clients (and depending on worker model, more requests serviced per client as suggested in another response) need not drop the peak load your system can handle - and in fact can improve it because you aren't overloading and swapping.
And realize that the process listing you showed only had around 50 Apache processes, so you were nowhere near your client limit of 150 even when overloaded; I fail to see how that can be a valid setting. Basically, if MaxClients times the size of an Apache process (with whatever plugins you have) is more than your physical memory - or the portion not used by other processes - it's set too high.
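To put rough numbers on that, using the ~14m RES per apache2 process visible in your top output (your real per-process size may differ, and the ~100MB set aside for MySQL and everything else is just a guess):
150 clients x 14MB         = ~2100MB needed at full load
720MB total - ~100MB other = ~620MB available for Apache
620MB / 14MB per process   = ~44 processes, tops
So on a 720, anything much above 40 or so MaxClients guarantees swapping under full load, before you even allow a safety margin.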
If you literally can't tune Apache better, then yes, your only recourse is going to be to increase your Linode to the point where the number of Apache client processes you need fits within memory. You can also improve things by relocating the database to a second Linode, but that's just another way to free up memory for Apache.
I don't really think that's the right solution for your case though, and I'd suggest revisiting your conclusion that MySQL was at fault, though it may have contributed indirectly by letting the apache configuration burn through all available resources.
– David
You only have four virtual CPUs. If you're running much more than one or two Apache processes per CPU, you're not able to handle any more users; you just slow down the access per user. Additional accesses will just get queued if there isn't a worker ready to handle them, I believe.
You've got Apache extremely misconfigured, and if your site is database heavy, you need MORE memory for MySQL, not less. Properly configuring Apache will free up that memory for more important things.
@db3l:
I still think you really have Apache configured too high. It sounds to me like you're shrinking resource use by MySQL as an alternative to fixing the root resource issue being introduced by Apache. […]
After googling around, I think there might be something wrong with my MPM. If my ServerLimit and MaxClients aren't really high, it takes forever for pages to load. I have a timer in my application that tells how long the page took in PHP and MySQL, and it's always a good time. I've been googling, and it sounds like new connections are being queued.
After reading about MPM, I'm thinking I messed it up and I don't have any MPM, and each Apache process handles a single request. I was googling and couldn't find out how to see which MPM my server is running. I recall reading that maximum simultaneous users should be MaxClients * the number of threads in your MPM. It seems like I'm only getting MaxClients * 1.
@Guspaz:
The thing is you don't need Apache that high, and so it's consuming all your RAM, leaving none for MySQL or other things. […]
I have two sites on it, both of which are pretty popular (one is huge, but very simple), and I wasn't planning on them getting this big.
I ran ps -ef |grep apache2 |wc and I think it said I had 222 processes open.
EDIT: One last edit. I gave up and just restored backups and it seems to be working now that I set
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
and I removed MaxClients and ServerLimit from httpd.conf
@sdlvx:
After googling around, I think there might be something wrong with my MPM. If my ServerLimit and MaxClients aren't really high, it takes forever for pages to load.
How much processing is required for you to deliver a page? It sounds like individual requests take a long time to render, in which case, yes, a configuration to only support a few simultaneous requests will cause queuing.
But a setting that permits your Linode to go into heavy swapping is still going to be counter-productive. In other words, as long as you don't overflow memory, more processes are fine; beyond that point you're actually going to hurt yourself.
> After reading about MPM, I'm thinking I messed it up and I don't have any MPM, and each Apache process handles a single request. I was googling and couldn't find out how to see which MPM my server is running.
Well, you definitely have an MPM, since any Apache build has at least one MPM selected, but it's certainly true that the default is often prefork, which uses a single process per request. Depending on your distribution, you can probably tell by which packages you installed. Or I believe "apache2 -V" should show the MPM compiled into the server.
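For example (assuming a Debian-style install where the binary is apache2; elsewhere it may be httpd):
apache2 -V | grep -i mpm
which should print a line like "Server MPM: Prefork".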
> I recall reading that maximum simultaneous users should be MaxClients * the number of threads in your MPM. It seems like I'm only getting MaxClients * 1.
If you're using prefork, that would be right, as it only has one thread/client per process.
> I ran ps -ef |grep apache2 |wc and I think it said I had 222 processes open.
Given the resource demands per-worker in your prior posting, I have to imagine you were thrashing/swapping horribly with that many. Not sure how you hit 222 with a MaxClients of 150, but it could be you were thrashing so badly the system wasn't able to clean up the old processes fast enough, especially as with MaxRequestsPerChild it has to shut down and start a new child process for each incoming request.
> EDIT: One last edit. I gave up and just restored backups and it seems to be working now that I set
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
and I removed MaxClients and ServerLimit from httpd.conf
For what it's worth, that's still almost certainly too high a MaxClients value for your environment.
The fact that it's working now is more likely due to the reload/restart (which cleared all the processes) than any configuration correction. You're just sitting on a time bomb whose clock you have reset, but it's likely to blow up again as soon as you get enough load in terms of simultaneous requests being processed.
Again - your best bet is to tune Apache so that under full load you are barely swapping, taking into account all processes. That will most likely require dropping MaxClients.
You could increase MaxRequestsPerChild (as recommended by another poster) which won't change the simultaneous request limit nor peak memory footprint, but helps minimize process creation since it lets a child handle that many requests before restarting it. That should be safe unless you have "messy" code running per request or a buggy request processing chain.
You could also try switching to a threaded MPM model, since multiple threads per child process are a little lighter weight, but not all embedded interpreters may like that, nor do I suspect it'll make a massive difference in throughput, since if you're tying up a 720's worth of memory with apache processes, the individual rendering of a page is likely bottlenecked elsewhere.
The bottom line in all this is to find the sweet spot of configuration where you are maximally using your available resources, but not exceeding them. As you approach that point in tuning you will see your performance (requests/second you can service) steadily increase, but if you cross past it, your performance is going to tank like going off a cliff. So when in doubt, start conservatively. You may be a little more sluggish than you need to be, but at least you won't tank.
One thing you could do to provide yourself with more room for experimentation is to temporarily allocate a second 720. Clone your current box over (sounds like you have backups, so just restore to the new box and tweak hostnames and what not), and then experiment against that. Tools such as ab can help load down your server (be sure to request URLs that exercise your full database path), and help find an appropriate tuning point.
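A sketch of such an ab run — the hostname, page, and request/concurrency counts are placeholders; point it at one of your real database-backed pages:
ab -n 2000 -c 20 http://test-linode.example.com/some-dynamic-page.php
Watch vmstat or top on the server while it runs, and step -c upward until you see swapping begin; that's roughly where your tuning ceiling is.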
I really do suspect you'll be able to get quite good performance out of a 720 even with a reasonably inefficient rendering path, while still protecting yourself against becoming overloaded, but you really do want to find an appropriate configuration that won't let the box fall over the cliff if load gets too high. In such a case, it's better to queue (or even drop) requests since at least only a few suffer rather than killing everyone's performance and making your site essentially unusable.
– David
Here are some suggestions:
1. If you haven't installed APC yet, do it right now (see the sketch after this list). That's a free performance boost for you.
2. Running Apache with the Prefork MPM and PHP as a module is usually quite bad for performance. Consider switching to something like nginx, or at least offload all the static files (css, js, images) to nginx. That could save you hundreds of MB of memory.
3. Minimize database calls with some smart caching. Memcached is easy, and it can be incredibly useful once you get the hang of it. Even if your data changes every 30 seconds, you should still cache it during the 30 seconds that it doesn't change.
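For point 1, a minimal sketch of the install on a Debian/Ubuntu box of this vintage — the package name is an assumption, check your distro (pecl install apc is the other route):
apt-get install php-apc
/etc/init.d/apache2 restart
Then confirm apc shows up in your phpinfo() output.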
I do not have APC installed AFAIK, and I think that moving to nginx with PHP and MySQL is my best bet. After looking into it, that seems like the wisest choice. I just jumped on Apache2 because I considered it the de facto standard.
The traffic on my site has died way down and I'm not sure if it's going to pick back up again.
I will try setting up NGINX monday. Right now all I have for internet is my cell phone and it occasionally goes out for long periods of time. I don't want to end up with hours of downtime because I don't have internet.
And yes, it is written in PHP. I tried to hack in mpm-worker since I saw it gives roughly double the performance, but it doesn't like PHP, and the instructions I was given must have been outdated, or I am dumb.
@sdlvx:
Thanks again David. I think that apache2 is not efficient enough for my needs on a linode.
Doubtful. I tested apache handling over ten thousand requests a second for static content even on a Linode 360.
Of course, once you layer a processing intensive PHP application, and the I/O overhead (and processing) of a database, you'll get nowhere near that, but I'd hardly blame apache for a php or database overhead. Nor is such overhead likely to change by switching web servers, since your overhead is not in the web server. At least not without restructuring your PHP architecture, or applying other changes to it such as the caching previously mentioned.
It's true that using nginx for purely static content can help a bit, because then you have a smaller footprint for the static content (no need to start an Apache process with the PHP overhead), and can leave more memory for the PHP-based requests. But I still don't buy that you really need that yet, if you were simply to keep Apache from using up all your memory under load. I still don't think you've given a decently tuned environment a fair shake before trying to make much larger scale changes to your setup.
> The traffic on my site has died way down and I'm not sure if it's going to pick back up again.
Well, my prior suggestion still stands: explicitly stress-test your application on a separate test Linode you create for a short period. It'll help you identify where your bottlenecks are and find a good set of configuration parameters before changing the production system.
> I will try setting up NGINX monday. Right now all I have for internet is my cell phone and it occasionally goes out for long periods of time. I don't want to end up with hours of downtime because I don't have internet.
Why not do your experiments on a test Linode to remove any risk of downtime?
Good luck with your site - it doesn't seem like this thread has done much to influence your efforts, but I wish you the best in resolving your setup.
– David
@sdlvx:
I do not have APC installed AFAIK
DO THIS FIRST. APC is a php cache that is trivial to install and use. Try it before going through the trouble of switching web servers. You'll want it either way.
@glg:
DO THIS FIRST. APC is a php cache that is trivial to install and use.
Until it starts corrupting its cache randomly.
The application itself is extremely simple. It's one of those stupid Facebook "like" sites where people just type in a little blurb of text, it's saved in a MySQL database, and then they like the page they made and it spreads like crazy. It's not resource intensive at all.
I have whos.amung.us installed and I'm only seeing a peak of around 500 users on at once.
I bought more RAM, but I know something just flat out isn't right. This 720 had no problem reaching one million uniques in a day before, and now it's struggling to handle a hundred thousand.
Google isn't helping me much.
@sdlvx:
I installed APC and it seemed to help. However, my server still runs like crap. I still need to set the prefork MaxClients to at least 100 to see decent performance. […]
Apache is still misconfigured…
You may want to just give up and install lighttpd or nginx or some other web server, since they're configured sanely out of the box. Or perhaps switch to mpm_worker? I'm not that familiar with configuring Apache since I switched to lighttpd years ago.
From a database perspective, is your schema properly indexed? On large data sets, this can be the difference between 1ms per query and 1000ms per query.
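A quick way to check: run your hottest query through EXPLAIN and look at the key column; NULL there means a full table scan. The database, table, and column names below are made up — substitute your own:
mysql -u root -p -e "EXPLAIN SELECT * FROM pages WHERE slug = 'foo'\G" mydb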
@rsk:
You can't use mod_php with worker, as PHP is not threadsafe. You'd need to use mod_fastcgi, and while it's possible, it takes a bit of work to get it almost right, and it's impossible to get it completely right unless you hack mod_fastcgi's code to not kill subprocesses on SIGUSR1. Yeah, mod_fastcgi is hell to get right. mod_fcgid is just a tiny bit better, but again much more hassle than just switching to nginx or lighttpd.
OP, did you try profiling your slow pages? Add little hooks here and there to measure the time taken to process each stage of page generation (request parsing, database access, templating, etc.) You could use microtime(true), or grab any open-source PHP profiling library. That could help you pinpoint the source of the slowness.
Also, check your MySQL tables using the CHECK command. Those beasts tend to get corrupted from time to time. Also try optimize and reindex.
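From the shell, mysqlcheck wraps those statements — the database name is a placeholder:
mysqlcheck --check mydb
mysqlcheck --optimize mydb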
@sdlvx:
I think there might be something wrong with my mpm after googling around. If my ServerLimit and MaxClients is not really high, it takes forever for pages to load. I have a timer in my application to tell how long the page took during PHP and mysql, and it's always a good time. I've been googling and it sounds like new connections are being queued.
Turn KeepAlives off. Or at the very least, turn the timeout down to 1 or 2 seconds. You're seeing that behavior because connections are being opened and Apache has to sit there waiting for about 15 seconds after it finishes sending data because the client might send another request down the same connection - this ties up that whole Apache process so it can't serve other requests. Turning off KeepAlives causes Apache to serve the data to the client, then close the connection so it can move on to serving up the next client's request.
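In apache2.conf that's one directive either way — the 2-second value is just the suggestion above, not a universal default:
KeepAlive Off
or
KeepAlive On
KeepAliveTimeout 2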
-James
If you run PHP but a lot of the requests you receive are non-PHP (i.e. static files), you should be able to reduce this to about 3 or 4MB each for non-PHP requests by switching from mod_php to mod_fcgid and using PHP via FastCGI. Note that you actually have to disable the mod_php module, though. You can then also switch your MPM from prefork to worker, which can save a little overall overhead with Apache.
Takes a bit of setting up - maybe practice in a virtual machine. Ah yes, I notice that others above have complained about mod_fcgid being difficult to set up - note that you can solve that, though: you just have to make sure PHP does not spawn any FastCGI threads itself, and that all of that is handled by mod_fcgid. mod_fcgid is not compatible with PHP's behaviour of spawning FastCGI threads itself; mod_fcgid needs to do that.
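A rough sketch of the Apache side of that switch — directive names changed between mod_fcgid releases and the wrapper path is an assumption, so treat it as a starting point, not a drop-in config:
# after disabling mod_php (a2dismod php5) and installing php5-cgi
<IfModule mod_fcgid.c>
    AddHandler fcgid-script .php
    FcgidWrapper /usr/bin/php-cgi .php
</IfModule>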
I stumbled upon the server-status page and found out that I was very easily hitting 50 open connections, all of which were in the keep-alive state. I disabled KeepAlive and saw that number drop to between 2 and 25, but it's hanging in the upper single digits mostly.
It's too early in the night to be stress testing the new setup, but I think this has made a very big difference.
Each process is still using a ton of memory. top is reporting about 2.8% usage each, and I have a 720 slice with 360MB of extra RAM.
> From a database perspective, is your schema properly indexed? On large data sets, this can be the difference between 1ms per query and 1000ms per query.
Yeah, I spent a long time making sure everything was properly indexed. The slow query log is empty (and I set the threshold to 1 second). The problem is not with pages rendering slowly, but with them actually being served. I have timer code at the start and end of every PHP page. If a page takes 5 seconds to load, it still reports being generated in about .01 seconds. I thought that was enough to rule out my application's performance.
> Turn KeepAlives off. Or at the very least, turn the timeout down to 1 or 2 seconds. […]
James, this advice was A+. However, when I look at the server-status page, it looks like everyone is in a "waiting for connection" state. Is this normal and acceptable? I'm guessing it is, because instead of waiting around, Apache is serving a page and moving on as soon as it's done.
> Takes a bit of setting up - maybe practice in a virtual machine. […]
I have a nearly identical VM locally that I test on. I think I will try some different servers and see what I can do. I'm trying to run this stupid Facebook like-page thing and a pretty complex startup website all by myself. Instead of fixing bugs and improving things, I spend most of my time fiddling with the server. Not very helpful, and there are a lot better things I should be doing instead of breaking my server more and more while users say the site is slow and sucky.