more host45 slowness...
The load spikes occasionally, and I've seen it at 4 or higher. Is anyone else experiencing slowness on host45?
% cat /proc/io_status
iocount=83319305 iorate=0 iotokens=500 tokenrefill=50 token_max=500
% vmstat 2 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 4888 29564 12600 57752 0 0 7 2 6 1 3 3 94 0
0 0 4888 29560 12600 57752 0 0 0 0 103 64 0 1 99 0
0 0 4888 29560 12600 57752 0 0 0 0 102 69 1 0 99 0
0 0 4888 29552 12608 57752 0 0 0 45 122 73 0 0 100 0
0 0 4888 29552 12608 57752 0 0 0 0 103 84 0 0 100 0
10 Replies
Yeah, host45's been really slow for me all weekend. Only today did it get better.
@Bdragon:
Are you the one that was thrashing host45 all weekend?
:x Yeah, host45's been really slow for me all weekend. Only today did it get better.
Gosh, I don't think so. Like I said… I shut down everything and it was still slow and never recovered. I finally rebooted a day later and, like you, it is better. Before I rebooted lsof only showed a handful of files open and I had absolutely no io_tokens. I don't understand why they didn't come back.
@edavis:
Before I rebooted lsof only showed a handful of files open and I had absolutely no io_tokens. I don't understand why they didn't come back.
default values: tokenrefill = 512; token_max = 400000
your values: tokenrefill = 50; token_max = 500
In your first post you have iotokens=500, i.e. the maximum value currently allowed for your Linode. You need to raise a ticket to have tokenrefill and token_max reset to the default values once you have fixed whatever caused them to be reduced in the first place.
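For anyone who wants to check this on their own Linode, here is a minimal sketch of the comparison described above. It assumes the line from /proc/io_status is a space-separated list of key=value pairs, as in the output posted in this thread, and it uses the default values quoted above. The function names are made up for illustration.

```python
# Defaults quoted in this thread; key names follow the /proc/io_status output above.
DEFAULTS = {"tokenrefill": 512, "token_max": 400000}


def parse_io_status(line):
    """Turn 'key=value key=value ...' into a dict of ints."""
    return {key: int(value)
            for key, value in (field.split("=", 1) for field in line.split())}


def reduced_limits(status, defaults=DEFAULTS):
    """Return {key: (current, default)} for every limit below its default."""
    return {key: (status[key], default)
            for key, default in defaults.items()
            if status.get(key, default) < default}


# The line from the first post in this thread.
line = "iocount=83319305 iorate=0 iotokens=500 tokenrefill=50 token_max=500"
status = parse_io_status(line)
print(reduced_limits(status))
# {'tokenrefill': (50, 512), 'token_max': (500, 400000)}
```

An empty result would mean both limits are at (or above) the defaults, as in the later io_status lines posted in this thread.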
top - 10:11:53 up 1 day, 21:33, 1 user, load average: 11.68, 13.23, 10.56
Tasks: 114 total, 1 running, 112 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0% user, 1.0% system, 0.0% nice, 99.0% idle
Mem: 195620k total, 185532k used, 10088k free, 5688k buffers
Swap: 263160k total, 12k used, 263148k free, 72868k cached
iocount=4243005 iorate=8 iotokens=399968 tokenrefill=512 token_max=400000
iocount=4251010 iorate=13 iotokens=399982 tokenrefill=512 token_max=400000
@edavis:
Waited for the load to settle down. It got to .5-ish. I then looked at my iostatus and then performed an ls -l… top running in the background… the load immediately shot up to 3+ and ls took 30+ seconds to return. Here was my iostatus at the time:
iocount=4251010 iorate=13 iotokens=399982 tokenrefill=512 token_max=400000
In the bad old days, when the host I was on suffered from other Linodes hogging all of the disk I/O, this is the behavior I would get. All of my processes that wanted to touch the disk would get stuck and the load average would go up. I guess those processes were somehow counted as runnable, adding to the load average instead of sleeping on I/O.
Anyway, it was the I/O tokens mechanism that solved most of this problem by keeping other Linodes from thrashing the box so hard, and the remainder of the problem was mostly solved by moving me to a quieter host. Now there are rare occasions where this kind of poor performance happens, but it's not very frequent and doesn't last very long.
My guess, and it's just a guess, is that other Linodes on your host are totally swamping the I/O. If 4 or 5 Linodes are all constantly doing disk I/O even at the maximum rate allowed by the token system, performance for everyone else on your host will suffer. Sounds like this is happening to you.
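To make the token idea concrete, here is a rough sketch of a generic token bucket. This is not Linode's actual implementation, which isn't described in this thread. It assumes one token per I/O request and one refill per time interval, and the class and function names are invented for the example. It only shows why a small tokenrefill and token_max throttle sustained I/O while the defaults do not.

```python
class TokenBucket:
    """Generic token bucket: a sketch, not the host's real mechanism."""

    def __init__(self, refill, maximum):
        self.refill = refill      # tokens added per interval (assumed)
        self.maximum = maximum    # cap on stored tokens
        self.tokens = maximum     # start with a full bucket

    def tick(self, wanted):
        """Advance one interval: refill, then serve as many requests as tokens allow."""
        self.tokens = min(self.maximum, self.tokens + self.refill)
        served = min(wanted, self.tokens)
        self.tokens -= served
        return served


def simulate(refill, maximum, wanted_per_tick, ticks):
    """Return the number of requests served in each interval."""
    bucket = TokenBucket(refill, maximum)
    return [bucket.tick(wanted_per_tick) for _ in range(ticks)]


# Reduced limits from the first post: the burst runs out, then I/O is capped at the refill rate.
print(simulate(50, 500, 200, 5))        # [200, 200, 200, 50, 50]
# Default limits: the same workload is never throttled.
print(simulate(512, 400000, 200, 5))    # [200, 200, 200, 200, 200]
```

Note that even with every Linode held to its own limit, several of them running flat out at the same time can still add up to a busy disk, which is the situation guessed at above.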
If it's high, you have processes waiting for disk I/O from the host to complete. If it's 98-100 for more than a couple of readings, the host is probably thrashing somewhat…
(I think this column only works properly on 2.6 kernels)
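Here is a small sketch of that check, assuming the column being discussed is vmstat's "wa" (the post above doesn't name it) and reading "more than a couple of readings" as three in a row. Both are assumptions, and the function names are invented. The sample is the vmstat output from the first post.

```python
def wa_readings(vmstat_text):
    """Extract the 'wa' column from vmstat output, located via the header row."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    header_idx = next(i for i, cols in enumerate(rows)
                      if "wa" in cols and "id" in cols)
    col = rows[header_idx].index("wa")
    return [int(cols[col]) for cols in rows[header_idx + 1:] if len(cols) > col]


def looks_like_thrashing(readings, threshold=98, min_run=3):
    """True if at least min_run consecutive readings are at or above threshold."""
    run = 0
    for value in readings:
        run = run + 1 if value >= threshold else 0
        if run >= min_run:
            return True
    return False


sample = """
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 4888 29564 12600 57752 0 0 7 2 6 1 3 3 94 0
0 0 4888 29560 12600 57752 0 0 0 0 103 64 0 1 99 0
0 0 4888 29560 12600 57752 0 0 0 0 102 69 1 0 99 0
0 0 4888 29552 12608 57752 0 0 0 45 122 73 0 0 100 0
0 0 4888 29552 12608 57752 0 0 0 0 103 84 0 0 100 0
"""

readings = wa_readings(sample)
print(readings, looks_like_thrashing(readings))
```

On the sample from the first post every wa reading is 0, so this check would not flag anything for that run.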