high cpu usage - linode unavailable

Hi all!

I've been a happy Linode owner for a few days now, and I'm experimenting with it to improve my Linux skills.

I've turned it into an Ubuntu web server (LAMP) and mail server (firewalled by Shorewall), and everything seemed to work fine.

Until early this morning, when for some reason something went berserk and CPU usage was so high that I had to reboot the Linode to be able to access it:

![CPU usage graph](http://www.ilashed.com/cpu_usage.jpg)

Network traffic and disk I/O were zero.

I have no idea where I should look to find out what caused this.

Thanks a lot!

Sam

12 Replies

I had a problem like that a couple of times; it turned out I had run out of memory. Check your logs for the oom-killer.
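Something like this should turn up any traces, assuming the default Ubuntu log locations (paths vary by distro):

# look for OOM-killer activity in the kernel logs
grep -i oom /var/log/syslog /var/log/kern.log

# or check the kernel ring buffer directly
dmesg | grep -i oom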

Any joy finding the cause of the problem?

Cheers,

Michael.

I had this happen to my Linode today; I'm still looking into why.

As far as I can tell the server was totally unloaded. It runs two 1000-slot TeamSpeak servers, but both had under 10 users apiece.

What kernel are you running? I just had to reboot my Linode this weekend because it had run out of memory (the same kernel memory leak I've mentioned before). Fortunately I caught it before the OOM killer started breaking things; I'd only just started swapping.
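For what it's worth, a quick way to see whether you're actively swapping is to watch the si/so columns:

# five one-second samples; non-zero si/so means pages are moving to/from swap
vmstat 1 5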

The latest flavor of Ubuntu. Unless memory spiked within an hour of my last check, it was below 20% usage with zero swap usage.

@sweh:
> What kernel are you running?

@MrRx7:
> the latest flavor of Ubuntu

Your kernel is supplied by the Linode host, not the distro you are running.
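You can check which kernel you're actually booted into with:

uname -r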

Interesting; I'm using 2.6.23.17-linode43

The memory leak shows up in "free" output:

             total       used       free     shared    buffers     cached
Mem:        356116     253748     102368          0      18320     194404
-/+ buffers/cache:      41024     315092
                                  ^^^^^^
Swap:       263160        576     262584

The highlighted number is the important one. (The "free" figure on the Mem: line above it should be low; it indicates free memory is being used for buffers/cache, which improves performance.)

When the memory leak occurs, this number goes down, and even stopping almost every process on the system doesn't free it up. The only way to fix it is to reboot.

It doesn't happen very often, but it happens enough.

If you're not seeing this (i.e. you have plenty of free memory) then you've got a different problem, and could well have had a berserk process.
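If you want to keep an eye on that figure, something like this works (the awk pattern assumes the older free output shown above, with a -/+ buffers/cache line):

# print just the "free" column of the -/+ buffers/cache line
free | awk '/buffers\/cache/ {print $NF}'

# or just watch memory over time
watch -n 60 free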

Output from my "free"

             total       used       free     shared    buffers     cached
Mem:        737484     141652     595832          0      34644      45212
-/+ buffers/cache:      61796     675688
Swap:       262136          0     262136

and ps aux

root@none:~ # ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   2056   708 ?        Ss   Aug30   0:00 init [2]
root         2  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/1]
root         5  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/1]
root         6  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/2]
root         7  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/2]
root         8  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/3]
root         9  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/3]
root        10  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/0]
root        11  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/1]
root        12  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/2]
root        13  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/3]
root        14  0.0  0.0      0     0 ?        S<   Aug30   0:00 [khelper]
root        15  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kthread]
root        17  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xenwatch]
root        18  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xenbus]
root        27  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/0]
root        28  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/1]
root        29  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/2]
root        30  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/3]
root        31  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/0]
root        32  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/1]
root        33  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/2]
root        34  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/3]
root        36  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kseriod]
root       116  0.0  0.0      0     0 ?        S    Aug30   0:00 [pdflush]
root       117  0.0  0.0      0     0 ?        S    Aug30   0:00 [pdflush]
root       118  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kswapd0]
root       119  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/0]
root       120  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/1]
root       121  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/2]
root       122  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/3]
root       124  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsIO]
root       125  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       126  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       127  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       128  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       129  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsSync]
root       130  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/0]
root       131  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/1]
root       132  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/2]
root       133  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/3]
root       134  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/0]
root       135  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/1]
root       136  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/2]
root       137  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/3]
root       746  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/0]
root       747  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/1]
root       748  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/2]
root       749  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/3]
root       756  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kpsmoused]
root       759  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/0]
root       760  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/1]
root       761  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/2]
root       762  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/3]
root       763  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kmirrord]
root       773  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kjournald]
root       873  0.0  0.0   2308   616 ?        S

I've only experienced the issue once so far; not sure of the trigger.

Well, according to Linode my Linode is at 104% CPU at the moment, and has been for 6 hours now.

But I am able to get into the box with no issues, and none of my processes are using much CPU at all.

Is this just a reporting bug?

ps aux

root@none:~ # ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   2056   708 ?        Ss   Aug30   0:01 init [2]
root         2  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/0]
root         3  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/0]
root         4  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/1]
root         5  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/1]
root         6  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/2]
root         7  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/2]
root         8  0.0  0.0      0     0 ?        S    Aug30   0:00 [migration/3]
root         9  0.0  0.0      0     0 ?        SN   Aug30   0:00 [ksoftirqd/3]
root        10  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/0]
root        11  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/1]
root        12  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/2]
root        13  0.0  0.0      0     0 ?        S<   Aug30   0:00 [events/3]
root        14  0.0  0.0      0     0 ?        S<   Aug30   0:00 [khelper]
root        15  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kthread]
root        17  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xenwatch]
root        18  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xenbus]
root        27  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/0]
root        28  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/1]
root        29  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/2]
root        30  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kblockd/3]
root        31  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/0]
root        32  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/1]
root        33  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/2]
root        34  0.0  0.0      0     0 ?        S<   Aug30   0:00 [cqueue/3]
root        36  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kseriod]
root       116  0.0  0.0      0     0 ?        S    Aug30   0:00 [pdflush]
root       117  0.0  0.0      0     0 ?        S    Aug30   0:00 [pdflush]
root       118  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kswapd0]
root       119  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/0]
root       120  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/1]
root       121  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/2]
root       122  0.0  0.0      0     0 ?        S<   Aug30   0:00 [aio/3]
root       124  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsIO]
root       125  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       126  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       127  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       128  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsCommit]
root       129  0.0  0.0      0     0 ?        S<   Aug30   0:00 [jfsSync]
root       130  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/0]
root       131  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/1]
root       132  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/2]
root       133  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfslogd/3]
root       134  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/0]
root       135  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/1]
root       136  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/2]
root       137  0.0  0.0      0     0 ?        S<   Aug30   0:00 [xfsdatad/3]
root       746  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/0]
root       747  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/1]
root       748  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/2]
root       749  0.0  0.0      0     0 ?        S<   Aug30   0:00 [net_accel/3]
root       756  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kpsmoused]
root       759  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/0]
root       760  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/1]
root       761  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/2]
root       762  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kcryptd/3]
root       763  0.0  0.0      0     0 ?        S<   Aug30   0:00 [kmirrord]
root       773  0.0  0.0      0     0 ?        S<   Aug30   0:03 [kjournald]
root       873  0.0  0.0   2308   616 ?        S

top

top - 04:48:26 up 6 days, 12:11,  1 user,  load average: 0.14, 0.08, 0.29
Tasks:  86 total,   1 running,  85 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2%us,  0.1%sy,  0.0%ni, 99.6%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:    737484k total,   727084k used,    10400k free,    36840k buffers
Swap:   262136k total,       48k used,   262088k free,   607040k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2931 root      34  19 99.7m 7980 1748 S    1  1.1  25:48.66 server_linux
    1 root      15   0  2056  708  608 S    0  0.1   0:01.54 init

Clues?

Does htop yield different results (or more clues) than top?

"sudo apt-get install htop" if you don't have it yet.

James

htop is pretty nice, but yeah, I'm showing all cores at 0%, with random jumps to 10% thanks to SQL/TeamSpeak.

Well, the graphs are back to normal, but I see no difference from before, so no clue. :-)
