High CPU usage - Linode unavailable
I have been a happy Linode owner for a few days now and I'm experimenting with it to improve my Linux skills.
I've turned it into an Ubuntu web server (LAMP) and mail server (firewalled by Shorewall), and everything seemed to work fine.
Until early this morning, when for some reason something went berserk
and CPU usage was so high I had to reboot the Linode to be able to access it:
![](http://www.ilashed.com/cpu_usage.jpg)
Network traffic and disk I/O were zero.
Does anyone have an idea where I should look to find out what caused this?
Thanks a lot!
Sam
12 Replies
Cheers,
Michael.
As far as I can tell the server was totally unloaded. It runs two 1000-slot TeamSpeak servers,
but both had under 10 users apiece.
@sweh:
What kernel are you running?
@MrRx7: the latest flavor of Ubuntu
Your kernel is supplied by the Linode host, not the distro you are running.
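You can confirm which kernel the host actually booted you with (as opposed to whatever kernel packages apt has installed):

```shell
#!/bin/sh
# Show the running kernel version. On a Linode this comes from the host's
# boot configuration, not from the distro's kernel packages.
uname -r
```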
The memory leak shows up in "free" output:
             total       used       free     shared    buffers     cached
Mem:        356116     253748     102368          0      18320     194404
-/+ buffers/cache:      41024     315092
                                  ^^^^^^
Swap:       263160        576     262584
The highlighted number is the important one. (The free value on the Mem: line above it should be low; that just means free memory is being used for buffers/cache, which improves performance.)
When the memory leak occurs this number goes down, and even stopping almost every process on the system doesn't free it up. The only way to fix it is to reboot.
It doesn't happen very often, but it happens enough.
If you're not seeing this (i.e. you have plenty of free memory) then you've got a different problem, and could well have had a berserk process.
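If you want to watch for this without eyeballing `free` each time, the same "-/+ buffers/cache free" figure can be computed from /proc/meminfo. A minimal sketch, assuming a Linux guest (the 50 MB threshold is just an example):

```shell
#!/bin/sh
# Compute the "-/+ buffers/cache: free" figure (in kB) from /proc/meminfo.
# It is MemFree + Buffers + Cached, i.e. memory that could be reclaimed.
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
buffers_kb=$(awk '/^Buffers:/ {print $2}' /proc/meminfo)
cached_kb=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
avail_kb=$((free_kb + buffers_kb + cached_kb))
echo "effectively free: ${avail_kb} kB"

# Example check: warn when it drops below ~50 MB.
if [ "$avail_kb" -lt 51200 ]; then
    echo "warning: low reclaimable memory" >&2
fi
```

On a host with the leak described above, this number keeps shrinking even after you stop nearly everything, which is the signature to look for.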
             total       used       free     shared    buffers     cached
Mem:        737484     141652     595832          0      34644      45212
-/+ buffers/cache:      61796     675688
Swap:       262136          0     262136
and ps aux
root@none:~ # ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2056 708 ? Ss Aug30 0:00 init [2]
root 2 0.0 0.0 0 0 ? S Aug30 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S Aug30 0:00 [migration/1]
root 5 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/1]
root 6 0.0 0.0 0 0 ? S Aug30 0:00 [migration/2]
root 7 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/2]
root 8 0.0 0.0 0 0 ? S Aug30 0:00 [migration/3]
root 9 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/3]
root 10 0.0 0.0 0 0 ? S< Aug30 0:00 [events/0]
root 11 0.0 0.0 0 0 ? S< Aug30 0:00 [events/1]
root 12 0.0 0.0 0 0 ? S< Aug30 0:00 [events/2]
root 13 0.0 0.0 0 0 ? S< Aug30 0:00 [events/3]
root 14 0.0 0.0 0 0 ? S< Aug30 0:00 [khelper]
root 15 0.0 0.0 0 0 ? S< Aug30 0:00 [kthread]
root 17 0.0 0.0 0 0 ? S< Aug30 0:00 [xenwatch]
root 18 0.0 0.0 0 0 ? S< Aug30 0:00 [xenbus]
root 27 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/0]
root 28 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/1]
root 29 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/2]
root 30 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/3]
root 31 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/0]
root 32 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/1]
root 33 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/2]
root 34 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/3]
root 36 0.0 0.0 0 0 ? S< Aug30 0:00 [kseriod]
root 116 0.0 0.0 0 0 ? S Aug30 0:00 [pdflush]
root 117 0.0 0.0 0 0 ? S Aug30 0:00 [pdflush]
root 118 0.0 0.0 0 0 ? S< Aug30 0:00 [kswapd0]
root 119 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/0]
root 120 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/1]
root 121 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/2]
root 122 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/3]
root 124 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsIO]
root 125 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 126 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 127 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 128 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 129 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsSync]
root 130 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/0]
root 131 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/1]
root 132 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/2]
root 133 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/3]
root 134 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/0]
root 135 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/1]
root 136 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/2]
root 137 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/3]
root 746 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/0]
root 747 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/1]
root 748 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/2]
root 749 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/3]
root 756 0.0 0.0 0 0 ? S< Aug30 0:00 [kpsmoused]
root 759 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/0]
root 760 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/1]
root 761 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/2]
root 762 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/3]
root 763 0.0 0.0 0 0 ? S< Aug30 0:00 [kmirrord]
root 773 0.0 0.0 0 0 ? S< Aug30 0:00 [kjournald]
root 873 0.0 0.0 2308 616 ? S
I've only experienced the issue once so far, so I'm not sure of the trigger,
but I am able to get into the box with no issues and none of my processes are using much CPU at all.
Is this just a reporting bug?
ps aux
root@none:~ # ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 2056 708 ? Ss Aug30 0:01 init [2]
root 2 0.0 0.0 0 0 ? S Aug30 0:00 [migration/0]
root 3 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S Aug30 0:00 [migration/1]
root 5 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/1]
root 6 0.0 0.0 0 0 ? S Aug30 0:00 [migration/2]
root 7 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/2]
root 8 0.0 0.0 0 0 ? S Aug30 0:00 [migration/3]
root 9 0.0 0.0 0 0 ? SN Aug30 0:00 [ksoftirqd/3]
root 10 0.0 0.0 0 0 ? S< Aug30 0:00 [events/0]
root 11 0.0 0.0 0 0 ? S< Aug30 0:00 [events/1]
root 12 0.0 0.0 0 0 ? S< Aug30 0:00 [events/2]
root 13 0.0 0.0 0 0 ? S< Aug30 0:00 [events/3]
root 14 0.0 0.0 0 0 ? S< Aug30 0:00 [khelper]
root 15 0.0 0.0 0 0 ? S< Aug30 0:00 [kthread]
root 17 0.0 0.0 0 0 ? S< Aug30 0:00 [xenwatch]
root 18 0.0 0.0 0 0 ? S< Aug30 0:00 [xenbus]
root 27 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/0]
root 28 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/1]
root 29 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/2]
root 30 0.0 0.0 0 0 ? S< Aug30 0:00 [kblockd/3]
root 31 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/0]
root 32 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/1]
root 33 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/2]
root 34 0.0 0.0 0 0 ? S< Aug30 0:00 [cqueue/3]
root 36 0.0 0.0 0 0 ? S< Aug30 0:00 [kseriod]
root 116 0.0 0.0 0 0 ? S Aug30 0:00 [pdflush]
root 117 0.0 0.0 0 0 ? S Aug30 0:00 [pdflush]
root 118 0.0 0.0 0 0 ? S< Aug30 0:00 [kswapd0]
root 119 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/0]
root 120 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/1]
root 121 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/2]
root 122 0.0 0.0 0 0 ? S< Aug30 0:00 [aio/3]
root 124 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsIO]
root 125 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 126 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 127 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 128 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsCommit]
root 129 0.0 0.0 0 0 ? S< Aug30 0:00 [jfsSync]
root 130 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/0]
root 131 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/1]
root 132 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/2]
root 133 0.0 0.0 0 0 ? S< Aug30 0:00 [xfslogd/3]
root 134 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/0]
root 135 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/1]
root 136 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/2]
root 137 0.0 0.0 0 0 ? S< Aug30 0:00 [xfsdatad/3]
root 746 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/0]
root 747 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/1]
root 748 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/2]
root 749 0.0 0.0 0 0 ? S< Aug30 0:00 [net_accel/3]
root 756 0.0 0.0 0 0 ? S< Aug30 0:00 [kpsmoused]
root 759 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/0]
root 760 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/1]
root 761 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/2]
root 762 0.0 0.0 0 0 ? S< Aug30 0:00 [kcryptd/3]
root 763 0.0 0.0 0 0 ? S< Aug30 0:00 [kmirrord]
root 773 0.0 0.0 0 0 ? S< Aug30 0:03 [kjournald]
root 873 0.0 0.0 2308 616 ? S
top
top - 04:48:26 up 6 days, 12:11, 1 user, load average: 0.14, 0.08, 0.29
Tasks: 86 total, 1 running, 85 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.1%sy, 0.0%ni, 99.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 737484k total, 727084k used, 10400k free, 36840k buffers
Swap: 262136k total, 48k used, 262088k free, 607040k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2931 root      34  19 99.7m 7980 1748 S    1  1.1  25:48.66 server_linux
    1 root      15   0  2056  708  608 S    0  0.1   0:01.54 init
Clues?
"sudo apt-get install htop" if you don't have it yet.
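If the spike comes back and you can still get a shell, it also helps to grab a snapshot of system state before rebooting, so there is something to inspect afterwards. A rough sketch (the log path is just an example):

```shell
#!/bin/sh
# Append a one-shot snapshot of system state to a log file.
LOG=/tmp/cpu-snapshot.log
{
    date
    uptime
    # top in batch mode: one iteration, no terminal needed
    top -b -n 1 | head -n 20
    # top 10 processes by CPU
    ps aux --sort=-%cpu | head -n 11
} >> "$LOG"
```

Run from cron every few minutes, this would also capture the state leading up to a lockup, at the cost of a growing log.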
James
But yeah, I'm showing all cores at 0%, with random jumps to 10% thanks to SQL/TeamSpeak.
Well, the graphs are back to normal, but I see no difference from before, so no clue.