IO wait: where to start investigating
We recently moved one of our dedicated servers to a Linode 4096 and the performance was great for the past week or two, with the site having a load average of less than 0.10 during the day and down to 0.05 in the evening.
This evening, the load is up to 0.50 under the same level of user activity (we've not had a traffic spike or anything).
So I'm wondering where we start investigating what's increasing the load.
This is what collectl -10 -sCDN reports:

# SINGLE CPU STATISTICS
#  Cpu  User Nice  Sys Wait  IRQ Soft Steal Idle
     0     3    0    1    0    0    0     0   94
     1     0    0    0    0    0    0     0   99
     2     0    0    0    0    0    0     0   98
     3     0    0    0    0    0    0     0   99

# DISK STATISTICS (/sec)
#       <---------reads--------> <--------writes--------> <-------averages-------> Pct
# Name  KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize QLen Wait SvcTim  Util
xvda         1      0    0    4      67      5   12    6       5    0   79      3     4
xvdb         0      0    0    0       0      0    0    0       0    0    0      0     0
As you can see, the CPU is doing pretty much nothing, but there's wait on the (non-swap) disk. The site is serving perhaps 1-2 pages a second. Conversely, another smaller Linode server delivering 20+ banner ads per second has zero wait.
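A quick cross-check with plain vmstat (from procps, so it should be on any CentOS box) shows the same picture; the "wa" column is IO-wait and "b" counts processes stuck in uninterruptible (disk) sleep:

# Three 5-second samples; watch the "b" and "wa" columns
vmstat 5 3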
Here's the output of top:
top - 08:53:16 up 19:01,  4 users,  load average: 0.63, 0.69, 0.99
Tasks: 113 total,   1 running, 112 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.0%sy,  0.0%ni, 86.1%id, 13.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4194508k total,  1791364k used,  2403144k free,   149148k buffers
Swap:   262136k total,   104808k used,   157328k free,   430576k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 8803 mysql     15   0  717m 214m 4284 S  0.3  5.2  27:00.06 mysqld
16741 nobody    16   0  521m  94m 4752 S  0.3  2.3   0:01.24 httpd
15766 nobody    16   0  524m  97m 4896 S  0.2  2.4   0:01.59 httpd
16739 nobody    16   0  522m  94m 4756 S  0.2  2.3   0:00.56 httpd
16740 nobody    16   0  522m  94m 4768 S  0.2  2.3   0:00.41 httpd
15767 nobody    16   0  523m  95m 4888 S  0.1  2.3   0:01.82 httpd
16375 nobody    16   0  522m  94m 4792 S  0.1  2.3   0:00.93 httpd
17988 root      15   0 12744 1272  932 R  0.1  0.0   0:00.04 top
    1 root      15   0 10348  188  160 S  0.0  0.0   0:09.98 init
    2 root      RT   0     0    0    0 S  0.0  0.0   0:00.24 migration/0
    3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
    4 root      RT   0     0    0    0 S  0.0  0.0   0:00.29 migration/1
    5 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/1
Again, the CPU isn't doing much, and there's plenty of spare memory.
Do you know how I can find which process is associated with the disk wait?
The server is running CentOS 5.5, MySQL 5.5, and Apache 2.2 + mod_perl.
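One generic way to narrow this down is to look for processes in uninterruptible "D" state, which usually means blocked on disk. A sketch, with the caveat that /proc/<pid>/io only exists if the kernel was built with per-task IO accounting, which a stock CentOS 5 kernel may not have:

# List processes currently in uninterruptible sleep (state D)
ps -eo state,pid,user,comm | awk '$1 ~ /^D/'

# If the kernel exposes per-process IO counters, this shows the bytes
# a given PID has actually pushed to/from disk (e.g. mysqld above)
cat /proc/8803/io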
Here's the my.cnf; SHOW PROCESSLIST just shows 12 connections (one for each httpd process):
key_buffer = 192M
table_cache = 512
sort_buffer_size = 2M
read_buffer_size = 1M
join_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 32
table_open_cache = 512
query_cache_type = 1
query_cache_size = 24M
query_cache_limit = 2M
max_heap_table_size = 56M
tmp_table_size = 48M
thread_concurrency = 8
Here's the httpd.conf; there are never more than 12-13 httpd processes, so it's not running out of those, IMO (a quick check is shown after the listing):
StartServers 10
MinSpareServers 5
MaxSpareServers 15
MaxClients 45
MaxRequestsPerChild 200
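A simple way to confirm the worker count stays well under MaxClients, using nothing beyond standard ps:

# Count live httpd workers; with MaxClients 45 there is plenty of headroom
ps -C httpd --no-headers | wc -l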
Maybe your scripts do something with the disk? I don't know; maybe generating previews, for example.
Your MySQL settings look very modest for 4096 MB of RAM.
key_buffer can be increased if necessary (take 256M as a minimum for this config and 1024M as a maximum, though I doubt that's necessary).
sort_buffer_size, read_buffer_size, query_cache_size and query_cache_limit (definitely) can also be increased.
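If you want to see how much of the key buffer is actually in use before resizing it, something like this should work (just the mysql command-line client; used bytes are roughly key_buffer_size minus Key_blocks_unused times key_cache_block_size):

# Total key buffer size and block size of the MyISAM key cache
mysql -e "SHOW GLOBAL VARIABLES LIKE 'key_buffer_size'; SHOW GLOBAL VARIABLES LIKE 'key_cache_block_size'"
# Unused blocks: used = key_buffer_size - Key_blocks_unused * key_cache_block_size
mysql -e "SHOW GLOBAL STATUS LIKE 'Key_blocks_unused'"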
This script:
can analyze your MySQL variables and give more concrete tuning tips.
Or this one:
but I prefer the first one.
Also, check the Apache error logs; maybe somebody is probing your site for vulnerabilities.
If MySQL didn't have enough resources, then surely the disks would be busy reading data that's not in memory.
My problem is that the CPU is idle and the disks are nearly idle, yet requests are stuck in "wait".
I use mysqltuner.pl all the time, and it didn't have any suggestions that would relate to my problem (IMO).
Here's what it says:
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.8
[OK] Operating on 64-bit architecture
-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 642M (Tables: 97)
[--] Data in InnoDB tables: 560K (Tables: 11)
[--] Data in PERFORMANCE_SCHEMA tables: 0B (Tables: 17)
[--] Data in MEMORY tables: 2M (Tables: 6)
[--] Data in SPHINX tables: 0B (Tables: 2)
[!!] Total fragmented tables: 24
-------- Performance Metrics -------------------------------------------------
[--] Up for: 18h 39m 43s (1M q [22.085 qps], 2K conn, TX: 4B, RX: 296M)
[--] Reads / Writes: 28% / 72%
[--] Total buffers: 408.0M global + 8.2M per thread (151 max threads)
[OK] Maximum possible memory usage: 1.6G (40% of installed RAM)
[OK] Slow queries: 0% (10/1M)
[OK] Highest usage of available connections: 12% (19/151)
[OK] Key buffer size / total MyISAM indexes: 192.0M/410.1M
[OK] Key buffer hit rate: 100.0% (465M cached / 117K reads)
[OK] Query cache efficiency: 32.2% (178K cached / 552K selects)
[!!] Query cache prunes per day: 26487
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 96K sorts)
[OK] Temporary tables created on disk: 8% (5K on disk / 58K total)
[OK] Thread cache hit rate: 99% (19 created / 2K connections)
[OK] Table cache hit rate: 96% (335 open / 346 opened)
[OK] Open file limit used: 34% (405/1K)
[OK] Table locks acquired immediately: 99% (1M immediate / 1M locks)
[OK] InnoDB data size / buffer pool: 560.0K/128.0M
Variables to adjust:
    query_cache_size (> 24M)
Having a query_cache_size that's too small wouldn't result in IO waits.
> [OK] Key buffer size / total MyISAM indexes: 192.0M/410.1M
Your buffer is smaller than the indexes; set key_buffer to 512M.
> [OK] Query cache efficiency: 32.2% (178K cached / 552K selects)
It's not 'OK'; this value should be around 70-80%.
> Having a query_cache_size too small wouldn't result in IO waits.
Mmm… not enough buffer/cache space is a reason for full table scans, which means reading from disk.
@OZ:
> [OK] Key buffer size / total MyISAM indexes: 192.0M/410.1M
> Your buffer is smaller than the indexes; set key_buffer to 512M.
> Having a query_cache_size too small wouldn't result in IO waits.
> Mmm… not enough buffer/cache space is a reason for full table scans, which means reading from disk.
To be sure, I've now increased key_buffer_size, but I'm inclined to think it's something else, because prior to moving to Linode the server ran with a 192MB setting for the last 3 years without issue (and the dataset is the same size).
If it were a case of not enough buffers with reads coming from disk, I would expect to see the disks busy in "SvcTim" rather than "Wait":
# DISK STATISTICS (/sec)
#       <---------reads--------> <--------writes--------> <-------averages-------> Pct
# Name  KBytes Merged  IOs Size  KBytes Merged  IOs Size  RWSize QLen Wait SvcTim  Util
xvda         1      0    0    4      67      5   12    6       5    0   79      3     4
xvdb         0      0    0    0       0      0    0    0       0    0    0      0     0
After changing values in my.cnf, MySQL should be restarted.
If the cause isn't MySQL, check the Apache error logs.
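(For testing, key_buffer_size happens to be a dynamic variable in MySQL 5.5, so as a sketch it can also be raised on the fly; my.cnf still needs the new value for it to survive a restart.)

# Raise the key buffer to 512M at runtime, then verify
mysql -e "SET GLOBAL key_buffer_size = 512 * 1024 * 1024"
mysql -e "SHOW GLOBAL VARIABLES LIKE 'key_buffer_size'"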
@OZ:
I can't (and don't want to) argue; I can only make assumptions.
Your help is appreciated.
One last piece of advice: try using 'iotop' to monitor IO usage.
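For example (with the caveat that iotop needs a kernel with per-task IO accounting, which stock CentOS 5 kernels may lack):

# -o shows only processes actually doing IO; -b -n 3 takes
# three batch-mode samples instead of the interactive screen
iotop -o -b -n 3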
Here's an excerpt from sysstat's iostat -x 10:
Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz   await  svctm  %util
xvda       0.00   24.60  1.60  18.20   18.40  325.60     17.37      2.51  124.97  12.78  25.31
xvdb       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00    0.00   0.00   0.00
As I understand it, await is the total time to satisfy an IO request, including the time to service it (svctm), and according to the man pages await is supposed to be very close to svctm.
Looking at my iostat figures, await minus svctm is over 110ms (124.97 - 12.78 for xvda), a long time for requests to sit queued.
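Given the column layout in the excerpt above (await and svctm are fields 10 and 11), a rough per-device queueing figure can be pulled out with awk; for xvda that works out to roughly 112ms spent queued rather than being serviced:

# Two 10-second reports; print await - svctm for each xvd device
iostat -x 10 2 | awk '/^xvd/ { printf "%s queued-ms: %.2f\n", $1, $10 - $11 }'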
Turns out it was a disk performance issue.
I contacted Linode. Within 5 minutes I had a reply asking for some iostat information. After supplying that information, I had another reply within 5 minutes saying an engineer was being dispatched to investigate.
I enquired whether it was possible to migrate my Linode to another physical server, and again had an almost immediate reply saying that my Linode had been configured for migration and I just needed to press the button to get the server moved.
20 minutes later, my Linode was running perfectly on another machine.
iostat -x 10 for the new Linode is:
await: 1.29 svctm: 1.29
As you can see, the response time from the disk is now 1.29ms, all of which is spent servicing the request.
Compare that with the old server, which spent over 100ms "waiting" over and above the service time (svctm).
The funny thing about monitoring is that the data collectl reports is identical to that of the sysstat tools like sar, mpstat, iostat, etc., but in my opinion it's much more readable. At least the columns for disk stats line up, and if you try "collectl -sD --home" you'll see something more like a 'top' format, only for disks. If you've ever tried to use nfsstat, you'll immediately see the value of collectl for monitoring NFS servers/clients.
But it also does so much more, like making data available for plotting or loading into a spreadsheet, plus it can collect a whole lot more than just the basic stuff.
The other real key is that collectl can log to disk and play the data back, like sar, but again it's much more user friendly, at least in my opinion. Perhaps that's why some of the world's largest high-performance clusters use it.
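A minimal sketch of that record-and-playback flow, assuming collectl's usual -f (record to a file/directory) and -p (playback) options:

# Record disk samples to /var/log/collectl, then replay them later
collectl -sD -f /var/log/collectl
collectl -sD -p /var/log/collectl/*.raw.gz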
-mark