Linode getting stuck on iowait


forum:glg 14 years, 1 month ago

In the last week or so, one of my linodes has been repeatedly getting stuck in some kind of iowait loop.

The system is unresponsive, but one of the services I run on it answers connections with a message that the connection is refused due to high load (33 this time).

I was on the box during one of the occurrences, I ran top, and nothing was using any CPU, it was all in iowait. I left a user logged into the console, and when it just happened now, I couldn't even get w to run to see the current load.

I have apache on this box, but it's very lightly used. It wasn't tuned before, but I went ahead and tuned it just in case.

I'm at a loss trying to troubleshoot this one. Any ideas for what I should put in place to help trace this?

6 Replies

forum:db3l 14 years, 1 month ago

@glg:

> I was on the box during one of the occurrences, I ran top, and nothing was using any CPU, it was all in iowait. I left a user logged into the console, and when it just happened now, I couldn't even get w to run to see the current load.

Sounds like you could have been heavily swapping - did you save the top output? If not, the next time it occurs I'd look more towards memory usage than cpu.

An untuned Apache configuration could certainly in theory cause this - even if the box is usually unloaded, a brief spike in traffic that was enough to push your box into swapping due to Apache processes might take a while to clear.

I suppose alternatively it could be that other guests on your host are getting into periods of heavy disk use which in turn is blocking your Linode, but that shouldn't have too much impact if your Linode is lightly loaded unless you're still trying to do a decent amount of I/O yourself.

– David

forum:glg 14 years, 1 month ago

@db3l:

> Sounds like you could have been heavily swapping - did you save the top output? If not, the next time it occurs I'd look more towards memory usage than cpu.

yeah, possibly an OOM situation.

@db3l:

> An untuned Apache configuration could certainly in theory cause this - even if the box is usually unloaded, a brief spike in traffic that was enough to push your box into swapping due to Apache processes might take a while to clear.

That can be ruled out though, as I did tune apache on Monday and it's happened again.

@db3l:

> I suppose alternatively it could be that other guests on your host are getting into periods of heavy disk use which in turn is blocking your Linode, but that shouldn't have too much impact if your Linode is lightly loaded unless you're still trying to do a decent amount of I/O yourself.

It could be users. I guess I'm looking for suggestions of something I can look at now, or something to install that would capture some information later.

I installed munin, but it's not showing anything abnormal other than a gap right when it happened.

Sorry, I forgot to mention one thing: I upgraded this server from Ubuntu 9.10 to 10.04 on 10/22. The first occurrence of this lockup was 10/30.
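For the "capture some information later" part, one option is a small logger for processes stuck in uninterruptible sleep (state D), since on Linux those tasks count toward the load average even when the CPU is idle. This is only a minimal sketch, assuming Python 3 is available on the box; the function names, the log path, and the 10-second interval are illustrative choices, not anything from an existing tool.

```python
import os
import time

def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid, _, comm = head.partition("(")
    return int(pid), comm, tail.split()[0]

def blocked_tasks(proc="/proc"):
    """List (pid, comm) for every process currently in state D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except OSError:
            continue  # process exited between listdir() and open()
        if state == "D":
            found.append((pid, comm))
    return found

def watch(logfile="/var/tmp/dstate.log", interval=10):
    """Append a timestamped line per blocked process, forever.

    logfile and interval are hypothetical defaults for illustration.
    """
    while True:
        tasks = blocked_tasks()
        if tasks:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            with open(logfile, "a") as log:
                for pid, comm in tasks:
                    log.write(f"{stamp} pid={pid} comm={comm}\n")
        time.sleep(interval)
```

Calling `watch()` from a screen session or an init script would start the loop. One caveat: if the disk holding the log is the one that is blocked, the logger may stall as well, so the last entries written before a lockup are likely the most useful ones.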

forum:obs 14 years, 1 month ago

Can you tell us what else is on the box, e.g. databases, WordPress, etc.?

Try running iotop (apt-get install iotop)

Also what kernel are you running? (uname -a)

forum:glg 14 years, 1 month ago

@obs:

> Can you tell us what else is on the box, e.g. databases, WordPress, etc.?

inn2 is the big thing and user shell accounts.

@obs:

> Try running iotop (apt-get install iotop)
>
> Also what kernel are you running? (uname -a)

installing iotop now, thanks.

The 64-bit latest paravirt:

Linux ftupet 2.6.35.4-x86_64-linode16 #1 SMP Mon Sep 20 16:03:34 UTC 2010 x86_64 GNU/Linux

Just happened again. Here's the upper part of top:

top - 13:07:37 up 3:23, 1 user, load average: 60.59, 57.33, 49.50

Tasks: 247 total, 1 running, 245 sleeping, 0 stopped, 1 zombie

Cpu(s): 0.1%us, 0.0%sy, 0.0%ni, 0.0%id, 99.9%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 504916k total, 459008k used, 45908k free, 27772k buffers

Swap: 524284k total, 3772k used, 520512k free, 128632k cached

Doesn't look like it's swapping much, if at all.
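To put numbers on that, here is a quick back-of-the-envelope check using only the figures from the top output above. Treating buffers and cache as reclaimable is an approximation (not all of the page cache can always be dropped), and the function name is just illustrative.

```python
# Values in kB, copied from the top output above.
mem_total = 504916
mem_free = 45908
buffers = 27772
cached = 128632
swap_total = 524284
swap_used = 3772

def reclaimable_kb(free, buffers, cached):
    """Approximate memory usable without swapping: free + buffers + page cache."""
    return free + buffers + cached

available = reclaimable_kb(mem_free, buffers, cached)
print(f"roughly available: {available} kB ({available / mem_total:.0%} of RAM)")
print(f"swap in use: {swap_used / swap_total:.1%}")
```

That works out to about 202312 kB, roughly 40% of RAM, still free or reclaimable, with under 1% of swap in use, which is hard to square with an out-of-memory condition at the moment the snapshot was taken.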

here's iostat:

avg-cpu: %user %nice %system %iowait %steal %idle

0.09 0.15 0.19 29.39 0.03 70.16

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn

xvda 0.67 17.01 1.52 207842 18624

xvdb 0.01 0.06 0.63 768 7680

xvdc 0.16 2.92 0.83 35696 10096

xvdd 4.41 66.78 31.58 816104 385944
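One thing to keep in mind with these numbers: when iostat is run without an interval argument, its report is an average since boot, so a stall can be diluted by hours of normal operation. A rough way to see iowait over a short window is to sample the aggregate `cpu` line of /proc/stat twice and compare. The sketch below assumes Python 3 and the standard field order of that line (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for this example.

```python
import time

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")

def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, map(int, parts[1:1 + len(FIELDS)])))

def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0

def sample(path="/proc/stat"):
    """Read one sample; the aggregate line is the first line of the file."""
    with open(path) as f:
        return parse_cpu_line(f.readline())

def measure(interval=5):
    """Return the iowait percentage over the next `interval` seconds."""
    before = sample()
    time.sleep(interval)
    return iowait_percent(before, sample())
```

Something like `print(measure(5))` during a stall would show the iowait share for just those five seconds. Running `iostat` with an interval argument should give comparable per-interval figures if the box is responsive enough to start it.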

forum:hoopycat 14 years, 1 month ago

That's not very high. I'd say it's probably trouble ticket time.

forum:glg 14 years, 1 month ago

@hoopycat:

> That's not very high. I'd say it's probably trouble ticket time.

I opened one, but they looked and said OOM. I think I'll open another.
