Disk performance and being limited by IO token_refill
This all came to my attention when our Linode on server host6 seemed to have trouble getting to disk. After digging around the forums, it appears that I'm not the only person seeing this issue, and it seems that Linodes can have different token_refill and token_max values.
I've read that some people get token_refill=100 for being a hassle and using too much IO … and later have it set to a higher limit once someone deems it appropriate. Did this happen to me? If so… no big deal… but I want to avoid having issues with disk IO. If I need to change something, knowing that I have been a problem with disk IO would help me understand that I have an issue that needs further attention.
My Linode currently shows the following, which is a lower amount than other Linodes. (Again, I don’t know anything about this stuff… so lower numbers could indeed be better… who knows!?)
> io_count=888185 io_rate=0 io_tokens=19989 token_refill=100 token_max=20000
I guess I have 2 more questions.
1. How does it get decided what values are used for our linode?
2. How do I know that I've surpassed “appropriate” disk IO usage?
Thanks for any light you could shed on this.
Doubleg
4 Replies
@doubleg:
So bear with me… I'm just learning about this IO limits function of UML.
More info here: http://www.theshore.net/~caker/patches/
@doubleg:
This all came to my attention when our Linode on server host6 seemed to have trouble getting to disk. After digging around the forums, it appears that I'm not the only person seeing this issue, and it seems that Linodes can have different token_refill and token_max values.
I've read that some people get token_refill=100 for being a hassle and using too much IO … and later have it set to a higher limit once someone deems it appropriate. Did this happen to me? If so… no big deal… but I want to avoid having issues with disk IO. If I need to change something, knowing that I have been a problem with disk IO would help me understand that I have an issue that needs further attention.
It's usually one of two reasons: either you're using too much I/O because of swap-thrashing, or I tweak the values at certain times when the hosts are very busy to more evenly distribute disk bandwidth. Host kernel upgrades (coming soon) have some decent improvements for sharing I/O, so I'm looking forward to that.
Anyway, if this is the account I'm thinking of, your swap was completely full. Not a good sign. You might want to double check swap usage next time you witness slowness (cat /proc/swaps).
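For anyone who would rather check this from a script than by eye, here is a minimal Python sketch that totals the Size and Used columns of /proc/swaps. It assumes the usual five-column layout (Filename, Type, Size, Used, Priority, with sizes in KiB). The sample text and the device name /dev/ubdb in it are made up for illustration and are not taken from the account discussed here.

```python
def swap_usage(text):
    """Return (used_kib, total_kib) summed over every swap area in the text."""
    used = total = 0
    for line in text.splitlines()[1:]:      # first line is the column header
        # Split from the right so a file name containing spaces stays intact.
        parts = line.rsplit(None, 4)
        if len(parts) != 5:
            continue                        # blank or unexpected line
        _name, _kind, size_kib, used_kib, _priority = parts
        total += int(size_kib)
        used += int(used_kib)
    return used, total


def percent_used(text):
    """Return swap usage as a percentage (0.0 if no swap is configured)."""
    used, total = swap_usage(text)
    return 100.0 * used / total if total else 0.0


# Hypothetical sample: one swap partition that is completely full.
SAMPLE = (
    "Filename\t\t\t\tType\t\tSize\tUsed\tPriority\n"
    "/dev/ubdb                               partition\t262136\t262136\t-1\n"
)
```

On a live system you would pass the contents of /proc/swaps instead of SAMPLE, for example percent_used(open("/proc/swaps").read()). A value near 100 while the machine feels slow points at swap-thrashing rather than at the IO limits themselves.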
@doubleg:
My Linode currently shows the following, which is a lower amount than other Linodes. (Again, I don’t know anything about this stuff… so lower numbers could indeed be better… who knows!?)
io_count=888185 io_rate=0 io_tokens=19989 token_refill=100 token_max=20000
If io_tokens is zero or negative, you're effectively being limited to no more than token_refill IO operations/sec.
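To make the numbers easier to picture, here is a rough token-bucket sketch in Python. It is a simplified model and not the actual UML patch: it assumes one token per IO operation, a refill once per second, and a bucket capped at token_max. The class and function names are hypothetical, and unlike the real counter this model never lets the token count go negative.

```python
class IoTokenBucket:
    """Toy model of the limiter: a bucket of tokens refilled once per second."""

    def __init__(self, token_refill, token_max):
        self.token_refill = token_refill    # tokens added each second
        self.token_max = token_max          # bucket capacity (burst allowance)
        self.tokens = token_max             # start with a full bucket

    def tick(self, requested_ops):
        """Advance one second and return how many operations are allowed."""
        # Refill first, never exceeding the cap.
        self.tokens = min(self.token_max, self.tokens + self.token_refill)
        # Assumption: each IO operation costs exactly one token.
        allowed = min(requested_ops, self.tokens)
        self.tokens -= allowed
        return allowed


def simulate(requested_ops, seconds, token_refill, token_max):
    """Return the operations allowed in each second of a steady workload."""
    bucket = IoTokenBucket(token_refill, token_max)
    return [bucket.tick(requested_ops) for _ in range(seconds)]
```

With the values from the post (token_refill=100, token_max=20000) and a steady demand of 5000 operations per second, simulate(5000, 10, 100, 20000) allows the full 5000 for the first four seconds, 400 in the fifth, and then settles at 100 per second, which is the token_refill rate. A larger token_max only lengthens the initial burst; the sustained rate is set by token_refill.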
@doubleg:
1. How does it get decided what values are used for our linode?
The defaults are token_refill of 512 and token_max of 400k.
@doubleg:
2. How do I know that I've surpassed “appropriate” disk IO usage?
Look for negative io_tokens.
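If you want to check that automatically, a small Python sketch along these lines could parse the status line and flag throttling. It assumes the line is a run of space-separated key=value pairs with integer values, as in the output quoted above. The function names are hypothetical, and underscores are stripped from the key names so that io_tokens and iotokens are treated alike.

```python
def parse_io_status(line):
    """Parse 'key=value key=value ...' into a dict of integers."""
    fields = {}
    for pair in line.split():
        key, sep, value = pair.partition("=")
        if sep:                                   # ignore anything without '='
            # Normalise the key: lower case, underscores removed.
            fields[key.replace("_", "").lower()] = int(value)
    return fields


def is_throttled(fields):
    """True when the token count is zero or negative."""
    return fields["iotokens"] <= 0


# The status line from the original post.
POSTED = "io_count=888185 io_rate=0 io_tokens=19989 token_refill=100 token_max=20000"
```

For the posted line, is_throttled(parse_io_status(POSTED)) is False, since 19989 tokens remain out of a maximum of 20000. Where the status text comes from is left to the caller, because the exact file path is not given in this thread.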
Hope that helps,
-Chris
And then you can add my io_token monitoring plugin discussed in this thread:
Good luck!
Bryan
Since I've noticed a change in our settings in the IO_status file, I assume you were correct when you noticed our Linode using up all of its swap space. Thanks for the adjustment. You posted some very useful information, and it has helped me learn a bit about this today.
But that leads me to another question, more out of curiosity than anything else…
Do you have a program that watches disk I/O and adjusts these settings on the fly, or do you modify them yourself?
You must have spent a great deal of time writing this… impressive!
DoubleG
@doubleg:
Do you have a program that watches disk I/O and adjusts these settings on the fly, or do you modify them yourself?
I have the beginnings of an automated script, but it's not deployed yet. Usually I just monitor the hosts and adjust when necessary…
@doubleg:
You must have spent a great deal of time writing this… impressive!
Thanks!
-Chris