Consistently Experiencing Lag on my Linode

Hello,

I have been experiencing sporadic lag on my Linode for a while now. Even simple tasks such as SSH sessions and downloading/searching with apt are slow. I opened a support ticket last night and ultimately showed them an iostat report where my I/O wait was at 14.3%. Support suggested I migrate to a new host and, if the problem persists, ask the forums. It does seem a bit faster after the migration and the wait is way down, but the lag still comes and goes. One thing we noticed is that the Linode is swapping more than it should; it even seems to do so when there is RAM available. I don't know a lot about this stuff, but here is the output of some commands:

top - 00:30:13 up 22:08, 3 users, load average: 0.86, 0.56, 0.37

Tasks: 116 total, 2 running, 114 sleeping, 0 stopped, 0 zombie

Cpu(s): 10.6%us, 3.8%sy, 0.0%ni, 85.6%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

Mem: 509084k total, 448428k used, 60656k free, 64072k buffers

Swap: 262140k total, 120236k used, 141904k free, 213672k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

1978 zgp 20 0 99.1m 90m 1564 R 100 18.2 255:42.54 moo

1506 mysql 20 0 137m 6480 4556 S 0 1.3 15:13.18 mysqld

1677 list 20 0 9664 2832 2456 S 0 0.6 0:10.32 python

1681 list 20 0 10584 5428 2676 S 0 1.1 0:11.73 python

1690 root 20 0 21876 6920 2764 S 0 1.4 0:40.20 orbited

8098 root 20 0 2592 1108 840 R 0 0.2 0:00.02 top

1 root 20 0 2164 584 584 S 0 0.1 0:01.28 init

2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd

3 root 20 0 0 0 0 S 0 0.0 0:00.73 ksoftirqd/0

4 root 20 0 0 0 0 S 0 0.0 0:02.34 kworker/0:0

5 root 20 0 0 0 0 S 0 0.0 0:00.03 kworker/u:0

6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0

7 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1

8 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/1:0

9 root 20 0 0 0 0 S 0 0.0 0:00.60 ksoftirqd/1

10 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/2

11 root 20 0 0 0 0 S 0 0.0 0:00.00 kworker/2:0

zanosoft:~# free -m

total used free shared buffers cached

Mem: 497 438 59 0 62 208

-/+ buffers/cache: 166 330

Swap: 255 117 138

zanosoft:~#

zanosoft:~# iostat

Linux 3.0.4-linode38 (zanosoft.net) 10/23/2011 i686 (4 CPU)

avg-cpu: %user %nice %system %iowait %steal %idle

2.16 0.00 0.57 0.13 0.02 97.12

Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn

xvda 4.22 19.92 202.14 1589458 16127584

xvdb 2.32 11.35 9.01 905824 718856

zanosoft:~#

We ran mtr reports to and from the Linode and they seem fine.

Any help would be appreciated.

13 Replies

Try dumping "ps aux" out so everyone can see what processes are running.

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

root 1 0.0 0.1 2164 584 ? Ss Oct22 0:01 init [2]

root 2 0.0 0.0 0 0 ? S Oct22 0:00 [kthreadd]

root 3 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/0]

root 4 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/0:0]

root 5 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/u:0]

root 6 0.0 0.0 0 0 ? S Oct22 0:00 [migration/0]

root 7 0.0 0.0 0 0 ? S Oct22 0:00 [migration/1]

root 8 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/1:0]

root 9 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/1]

root 10 0.0 0.0 0 0 ? S Oct22 0:00 [migration/2]

root 11 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/2:0]

root 12 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/2]

root 13 0.0 0.0 0 0 ? S Oct22 0:00 [migration/3]

root 14 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/3:0]

root 15 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/3]

root 16 0.0 0.0 0 0 ? S< Oct22 0:00 [khelper]

root 17 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/u:1]

root 21 0.0 0.0 0 0 ? S Oct22 0:00 [xenwatch]

root 22 0.0 0.0 0 0 ? S Oct22 0:00 [xenbus]

root 148 0.0 0.0 0 0 ? S Oct22 0:00 [sync_supers]

root 150 0.0 0.0 0 0 ? S Oct22 0:00 [bdi-default]

root 152 0.0 0.0 0 0 ? S< Oct22 0:00 [kblockd]

root 162 0.0 0.0 0 0 ? S< Oct22 0:00 [md]

root 246 0.0 0.0 0 0 ? S< Oct22 0:00 [rpciod]

root 248 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/0:1]

root 279 0.0 0.0 0 0 ? S Oct22 0:08 [kswapd0]

root 280 0.0 0.0 0 0 ? SN Oct22 0:00 [ksmd]

root 281 0.0 0.0 0 0 ? S Oct22 0:00 [fsnotify_mark]

root 285 0.0 0.0 0 0 ? S Oct22 0:00 [ecryptfs-kthrea]

root 287 0.0 0.0 0 0 ? S< Oct22 0:00 [nfsiod]

root 290 0.0 0.0 0 0 ? S Oct22 0:00 [jfsIO]

root 291 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit]

root 292 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit]

root 293 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit]

root 294 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit]

root 295 0.0 0.0 0 0 ? S Oct22 0:00 [jfsSync]

root 296 0.0 0.0 0 0 ? S< Oct22 0:00 [xfs_mru_cache]

root 297 0.0 0.0 0 0 ? S< Oct22 0:00 [xfslogd]

root 298 0.0 0.0 0 0 ? S< Oct22 0:00 [xfsdatad]

root 299 0.0 0.0 0 0 ? S< Oct22 0:00 [xfsconvertd]

root 300 0.0 0.0 0 0 ? S< Oct22 0:00 [glock_workqueue]

root 301 0.0 0.0 0 0 ? S< Oct22 0:00 [delete_workqueu]

root 302 0.0 0.0 0 0 ? S< Oct22 0:00 [gfs_recovery]

root 303 0.0 0.0 0 0 ? S< Oct22 0:00 [crypto]

root 865 0.0 0.0 0 0 ? S Oct22 0:00 [khvcd]

root 979 0.0 0.0 0 0 ? S< Oct22 0:00 [kpsmoused]

root 1004 0.0 0.0 0 0 ? S Oct22 0:03 [kworker/1:1]

root 1007 0.0 0.0 0 0 ? S Oct22 0:02 [kjournald]

root 1044 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/2:1]

root 1045 0.0 0.0 0 0 ? S Oct22 0:03 [kworker/3:1]

root 1208 0.0 0.0 0 0 ? S Oct22 0:03 [flush-202:0]

root 1223 0.0 0.0 2236 424 ? Ss Oct22 0:00 dhclient3 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp3/dhclient.eth0.leases eth0

root 1323 0.0 0.2 28488 1264 ? Sl Oct22 0:00 /usr/sbin/rsyslogd -c3

113 1336 0.0 0.0 2716 104 ? Ss Oct22 0:00 /usr/bin/dbus-daemon --system

root 1347 0.0 0.1 5624 792 ? Ss Oct22 0:00 /usr/sbin/sshd

root 1358 0.0 0.4 8372 2136 ? Ss Oct22 0:00 sshd: zgp [priv]

amavis 1364 0.0 0.3 29972 1836 ? Ss Oct22 0:00 amavisd (master)

amavis 1376 0.0 0.2 30996 1164 ? S Oct22 0:00 amavisd (virgin child)

amavis 1377 0.0 0.2 30996 1148 ? S Oct22 0:00 amavisd (virgin child)

root 1391 0.0 0.2 2924 1116 ? S Oct22 0:00 /bin/sh /usr/bin/mysqld_safe

mysql 1506 1.1 1.2 140812 6432 ? Sl Oct22 15:36 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306

root 1507 0.0 0.0 1804 468 ? S Oct22 0:00 logger -t mysqld -p daemon.error

zgp 1523 0.0 0.2 8532 1028 ? S Oct22 0:00 sshd: zgp@pts/0

zgp 1524 0.0 0.2 5584 1404 pts/0 Ss Oct22 0:00 -bash

root 1662 0.0 0.5 8616 2660 ? S Oct22 0:03 python /usr/sbin/denyhosts --daemon --purge --config=/etc/denyhosts.conf --config=/etc/denyhosts.conf

list 1675 0.0 0.1 10072 940 ? Ss Oct22 0:00 /usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start

list 1676 0.0 1.2 11392 6472 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=ArchRunner:0:1 -s

list 1677 0.0 0.5 9664 2836 ? S Oct22 0:10 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=BounceRunner:0:1 -s

list 1679 0.0 1.0 10400 5488 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s

list 1680 0.0 0.5 9772 2788 ? S Oct22 0:10 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=NewsRunner:0:1 -s

list 1681 0.0 1.0 10584 5412 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s

list 1682 0.0 0.5 9936 2964 ? S Oct22 0:11 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=VirginRunner:0:1 -s

list 1683 0.0 0.5 9652 2764 ? S Oct22 0:00 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=RetryRunner:0:1 -s

root 1690 0.0 1.1 21876 5792 ? Sl Oct22 0:40 /usr/bin/python /usr/bin/orbited --config=/etc/orbited.cfg

list 1710 0.0 0.5 9784 2972 ? S Oct22 0:11 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=CommandRunner:0:1 -s

root 1792 0.0 0.3 5916 1604 ? Ss Oct22 0:01 /usr/lib/postfix/master

postfix 1805 0.0 0.3 6120 1708 ? S Oct22 0:00 qmgr -l -t fifo -u

root 1818 0.0 0.0 8800 360 ? Ss Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5

root 1819 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5

root 1820 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5

root 1822 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5

root 1823 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5

root 1833 0.0 0.1 4100 712 ? S Oct22 0:00 /usr/sbin/vsftpd

root 1869 0.0 0.1 4076 552 ? Ss Oct22 0:03 /usr/sbin/dovecot -c /etc/dovecot/dovecot.conf

root 1871 0.0 0.3 10160 1676 ? S Oct22 0:00 dovecot-auth

root 1883 0.0 0.3 10160 1596 ? S Oct22 0:00 dovecot-auth -w

daemon 1884 0.0 0.0 2296 220 ? Ss Oct22 0:00 /usr/sbin/atd

root 1904 0.0 0.1 4436 896 ? Ss Oct22 0:00 /usr/sbin/cron

root 1919 0.0 0.9 37812 4596 ? Ss Oct22 0:03 /usr/sbin/apache2 -k start

www-data 1941 0.0 0.8 40484 4560 ? S Oct22 0:01 /usr/sbin/apache2 -k start

www-data 1942 0.0 0.9 38492 4712 ? S Oct22 0:00 /usr/sbin/apache2 -k start

www-data 1943 0.0 0.9 40108 4916 ? S Oct22 0:01 /usr/sbin/apache2 -k start

root 1945 0.0 0.3 13180 1844 ? Ss Oct22 0:01 /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf

root 1947 0.0 0.0 1820 464 hvc0 Ss+ Oct22 0:00 /sbin/getty 38400 hvc0

zgp 1978 19.6 17.8 101524 90996 ? S Oct22 267:38 ./moo -l moo.log moo.db moo.db.new 3500

zgp 1980 0.0 0.0 4956 396 ? S Oct22 0:00 (MOO name-lookup master)

www-data 1983 0.0 0.9 41004 4960 ? S Oct22 0:01 /usr/sbin/apache2 -k start

zgp 1984 0.0 0.1 4912 736 pts/0 S+ Oct22 0:00 screen

zgp 1985 0.0 0.1 5204 648 ? Ss Oct22 0:00 SCREEN

zgp 1986 0.0 0.2 5600 1428 pts/2 Ss Oct22 0:00 /bin/bash

root 2005 0.0 0.1 4600 980 pts/2 S Oct22 0:00 su

root 2007 0.0 0.2 5068 1376 pts/2 S+ Oct22 0:00 bash

www-data 2023 0.0 0.8 41116 4572 ? S Oct22 0:01 /usr/sbin/apache2 -k start

www-data 2025 0.0 0.9 38648 4868 ? S Oct22 0:00 /usr/sbin/apache2 -k start

www-data 2026 0.0 0.9 38548 4692 ? S Oct22 0:00 /usr/sbin/apache2 -k start

www-data 2034 0.0 0.9 38372 4776 ? S Oct22 0:00 /usr/sbin/apache2 -k start

www-data 2035 0.0 0.8 38480 4448 ? S Oct22 0:00 /usr/sbin/apache2 -k start

postfix 2070 0.0 0.3 6364 2028 ? S Oct22 0:00 tlsmgr -l -t unix -u -c

www-data 5811 0.0 0.8 38416 4512 ? S Oct22 0:00 /usr/sbin/apache2 -k start

root 6306 0.0 0.4 8372 2376 ? Ss Oct22 0:00 sshd: root@pts/3

root 6308 0.0 0.2 5060 1400 pts/3 Ss+ Oct22 0:00 -bash

postfix 7824 0.0 0.2 5932 1488 ? S Oct22 0:00 pickup -l -t fifo -u -c

zgp 8052 0.0 0.1 5108 836 ? S 00:16 0:00 (MOO name-lookup slave)

root 8263 0.0 0.5 8528 2680 ? Ss 01:04 0:00 sshd: root@notty

root 8265 0.0 0.3 5032 1576 ? Ss 01:04 0:00 /usr/lib/openssh/sftp-server

root 8272 0.0 0.5 8372 2680 ? Ss 01:05 0:00 sshd: root@notty

root 8274 0.0 0.2 5032 1352 ? Ss 01:05 0:00 /usr/lib/openssh/sftp-server

root 8275 0.1 0.5 8372 2772 ? Ss 01:05 0:00 sshd: root@pts/4

root 8277 0.0 0.3 5048 1748 pts/4 Ss 01:05 0:00 -bash

root 8283 0.0 0.2 4484 1084 pts/4 R+ 01:06 0:00 ps aux

Just a tip: post output from commands in between code tags. You can do that by clicking the Code button, pasting the output, then clicking Code again, or by putting "[ code ]" (no spaces or quotes) at the start and "[ /code ]" (no spaces or quotes) at the end. That will format your text and separate the output from the rest of the post. Using your output as an example of this:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND 
root 1 0.0 0.1 2164 584 ? Ss Oct22 0:01 init [2] 
root 2 0.0 0.0 0 0 ? S Oct22 0:00 [kthreadd] 
root 3 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/0] 
root 4 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/0:0] 
root 5 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/u:0] 
root 6 0.0 0.0 0 0 ? S Oct22 0:00 [migration/0] 
root 7 0.0 0.0 0 0 ? S Oct22 0:00 [migration/1] 
root 8 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/1:0] 
root 9 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/1] 
root 10 0.0 0.0 0 0 ? S Oct22 0:00 [migration/2] 
root 11 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/2:0] 
root 12 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/2] 
root 13 0.0 0.0 0 0 ? S Oct22 0:00 [migration/3] 
root 14 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/3:0] 
root 15 0.0 0.0 0 0 ? S Oct22 0:00 [ksoftirqd/3] 
root 16 0.0 0.0 0 0 ? S< Oct22 0:00 [khelper] 
root 17 0.0 0.0 0 0 ? S Oct22 0:00 [kworker/u:1] 
root 21 0.0 0.0 0 0 ? S Oct22 0:00 [xenwatch] 
root 22 0.0 0.0 0 0 ? S Oct22 0:00 [xenbus] 
root 148 0.0 0.0 0 0 ? S Oct22 0:00 [sync_supers] 
root 150 0.0 0.0 0 0 ? S Oct22 0:00 [bdi-default] 
root 152 0.0 0.0 0 0 ? S< Oct22 0:00 [kblockd] 
root 162 0.0 0.0 0 0 ? S< Oct22 0:00 [md] 
root 246 0.0 0.0 0 0 ? S< Oct22 0:00 [rpciod] 
root 248 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/0:1] 
root 279 0.0 0.0 0 0 ? S Oct22 0:08 [kswapd0] 
root 280 0.0 0.0 0 0 ? SN Oct22 0:00 [ksmd] 
root 281 0.0 0.0 0 0 ? S Oct22 0:00 [fsnotify_mark] 
root 285 0.0 0.0 0 0 ? S Oct22 0:00 [ecryptfs-kthrea] 
root 287 0.0 0.0 0 0 ? S< Oct22 0:00 [nfsiod] 
root 290 0.0 0.0 0 0 ? S Oct22 0:00 [jfsIO] 
root 291 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit] 
root 292 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit] 
root 293 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit] 
root 294 0.0 0.0 0 0 ? S Oct22 0:00 [jfsCommit] 
root 295 0.0 0.0 0 0 ? S Oct22 0:00 [jfsSync] 
root 296 0.0 0.0 0 0 ? S< Oct22 0:00 [xfs_mru_cache] 
root 297 0.0 0.0 0 0 ? S< Oct22 0:00 [xfslogd] 
root 298 0.0 0.0 0 0 ? S< Oct22 0:00 [xfsdatad] 
root 299 0.0 0.0 0 0 ? S< Oct22 0:00 [xfsconvertd] 
root 300 0.0 0.0 0 0 ? S< Oct22 0:00 [glock_workqueue] 
root 301 0.0 0.0 0 0 ? S< Oct22 0:00 [delete_workqueu] 
root 302 0.0 0.0 0 0 ? S< Oct22 0:00 [gfs_recovery] 
root 303 0.0 0.0 0 0 ? S< Oct22 0:00 [crypto] 
root 865 0.0 0.0 0 0 ? S Oct22 0:00 [khvcd] 
root 979 0.0 0.0 0 0 ? S< Oct22 0:00 [kpsmoused] 
root 1004 0.0 0.0 0 0 ? S Oct22 0:03 [kworker/1:1] 
root 1007 0.0 0.0 0 0 ? S Oct22 0:02 [kjournald] 
root 1044 0.0 0.0 0 0 ? S Oct22 0:02 [kworker/2:1] 
root 1045 0.0 0.0 0 0 ? S Oct22 0:03 [kworker/3:1] 
root 1208 0.0 0.0 0 0 ? S Oct22 0:03 [flush-202:0] 
root 1223 0.0 0.0 2236 424 ? Ss Oct22 0:00 dhclient3 -pf /var/run/dhclient.eth0.pid -lf /var/lib/dhcp3/dhclient.eth0.leases eth0 
root 1323 0.0 0.2 28488 1264 ? Sl Oct22 0:00 /usr/sbin/rsyslogd -c3 
113 1336 0.0 0.0 2716 104 ? Ss Oct22 0:00 /usr/bin/dbus-daemon --system 
root 1347 0.0 0.1 5624 792 ? Ss Oct22 0:00 /usr/sbin/sshd 
root 1358 0.0 0.4 8372 2136 ? Ss Oct22 0:00 sshd: zgp [priv] 
amavis 1364 0.0 0.3 29972 1836 ? Ss Oct22 0:00 amavisd (master) 
amavis 1376 0.0 0.2 30996 1164 ? S Oct22 0:00 amavisd (virgin child) 
amavis 1377 0.0 0.2 30996 1148 ? S Oct22 0:00 amavisd (virgin child) 
root 1391 0.0 0.2 2924 1116 ? S Oct22 0:00 /bin/sh /usr/bin/mysqld_safe 
mysql 1506 1.1 1.2 140812 6432 ? Sl Oct22 15:36 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 
root 1507 0.0 0.0 1804 468 ? S Oct22 0:00 logger -t mysqld -p daemon.error 
zgp 1523 0.0 0.2 8532 1028 ? S Oct22 0:00 sshd: zgp@pts/0 
zgp 1524 0.0 0.2 5584 1404 pts/0 Ss Oct22 0:00 -bash 
root 1662 0.0 0.5 8616 2660 ? S Oct22 0:03 python /usr/sbin/denyhosts --daemon --purge --config=/etc/denyhosts.conf --config=/etc/denyhosts.conf 
list 1675 0.0 0.1 10072 940 ? Ss Oct22 0:00 /usr/bin/python /usr/lib/mailman/bin/mailmanctl -s -q start 
list 1676 0.0 1.2 11392 6472 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=ArchRunner:0:1 -s 
list 1677 0.0 0.5 9664 2836 ? S Oct22 0:10 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=BounceRunner:0:1 -s 
list 1679 0.0 1.0 10400 5488 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=IncomingRunner:0:1 -s 
list 1680 0.0 0.5 9772 2788 ? S Oct22 0:10 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=NewsRunner:0:1 -s 
list 1681 0.0 1.0 10584 5412 ? S Oct22 0:12 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=OutgoingRunner:0:1 -s 
list 1682 0.0 0.5 9936 2964 ? S Oct22 0:11 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=VirginRunner:0:1 -s 
list 1683 0.0 0.5 9652 2764 ? S Oct22 0:00 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=RetryRunner:0:1 -s 
root 1690 0.0 1.1 21876 5792 ? Sl Oct22 0:40 /usr/bin/python /usr/bin/orbited --config=/etc/orbited.cfg 
list 1710 0.0 0.5 9784 2972 ? S Oct22 0:11 /usr/bin/python /var/lib/mailman/bin/qrunner --runner=CommandRunner:0:1 -s 
root 1792 0.0 0.3 5916 1604 ? Ss Oct22 0:01 /usr/lib/postfix/master 
postfix 1805 0.0 0.3 6120 1708 ? S Oct22 0:00 qmgr -l -t fifo -u 
root 1818 0.0 0.0 8800 360 ? Ss Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5 
root 1819 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5 
root 1820 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5 
root 1822 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5 
root 1823 0.0 0.0 8800 4 ? S Oct22 0:00 /usr/sbin/saslauthd -a shadow -c -m /var/run/saslauthd -n 5 
root 1833 0.0 0.1 4100 712 ? S Oct22 0:00 /usr/sbin/vsftpd 
root 1869 0.0 0.1 4076 552 ? Ss Oct22 0:03 /usr/sbin/dovecot -c /etc/dovecot/dovecot.conf 
root 1871 0.0 0.3 10160 1676 ? S Oct22 0:00 dovecot-auth 
root 1883 0.0 0.3 10160 1596 ? S Oct22 0:00 dovecot-auth -w 
daemon 1884 0.0 0.0 2296 220 ? Ss Oct22 0:00 /usr/sbin/atd 
root 1904 0.0 0.1 4436 896 ? Ss Oct22 0:00 /usr/sbin/cron 
root 1919 0.0 0.9 37812 4596 ? Ss Oct22 0:03 /usr/sbin/apache2 -k start 
www-data 1941 0.0 0.8 40484 4560 ? S Oct22 0:01 /usr/sbin/apache2 -k start 
www-data 1942 0.0 0.9 38492 4712 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
www-data 1943 0.0 0.9 40108 4916 ? S Oct22 0:01 /usr/sbin/apache2 -k start 
root 1945 0.0 0.3 13180 1844 ? Ss Oct22 0:01 /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf 
root 1947 0.0 0.0 1820 464 hvc0 Ss+ Oct22 0:00 /sbin/getty 38400 hvc0 
zgp 1978 19.6 17.8 101524 90996 ? S Oct22 267:38 ./moo -l moo.log moo.db moo.db.new 3500 
zgp 1980 0.0 0.0 4956 396 ? S Oct22 0:00 (MOO name-lookup master) 
www-data 1983 0.0 0.9 41004 4960 ? S Oct22 0:01 /usr/sbin/apache2 -k start 
zgp 1984 0.0 0.1 4912 736 pts/0 S+ Oct22 0:00 screen 
zgp 1985 0.0 0.1 5204 648 ? Ss Oct22 0:00 SCREEN 
zgp 1986 0.0 0.2 5600 1428 pts/2 Ss Oct22 0:00 /bin/bash 
root 2005 0.0 0.1 4600 980 pts/2 S Oct22 0:00 su 
root 2007 0.0 0.2 5068 1376 pts/2 S+ Oct22 0:00 bash 
www-data 2023 0.0 0.8 41116 4572 ? S Oct22 0:01 /usr/sbin/apache2 -k start 
www-data 2025 0.0 0.9 38648 4868 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
www-data 2026 0.0 0.9 38548 4692 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
www-data 2034 0.0 0.9 38372 4776 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
www-data 2035 0.0 0.8 38480 4448 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
postfix 2070 0.0 0.3 6364 2028 ? S Oct22 0:00 tlsmgr -l -t unix -u -c 
www-data 5811 0.0 0.8 38416 4512 ? S Oct22 0:00 /usr/sbin/apache2 -k start 
root 6306 0.0 0.4 8372 2376 ? Ss Oct22 0:00 sshd: root@pts/3 
root 6308 0.0 0.2 5060 1400 pts/3 Ss+ Oct22 0:00 -bash 
postfix 7824 0.0 0.2 5932 1488 ? S Oct22 0:00 pickup -l -t fifo -u -c 
zgp 8052 0.0 0.1 5108 836 ? S 00:16 0:00 (MOO name-lookup slave) 
root 8263 0.0 0.5 8528 2680 ? Ss 01:04 0:00 sshd: root@notty 
root 8265 0.0 0.3 5032 1576 ? Ss 01:04 0:00 /usr/lib/openssh/sftp-server 
root 8272 0.0 0.5 8372 2680 ? Ss 01:05 0:00 sshd: root@notty 
root 8274 0.0 0.2 5032 1352 ? Ss 01:05 0:00 /usr/lib/openssh/sftp-server 
root 8275 0.1 0.5 8372 2772 ? Ss 01:05 0:00 sshd: root@pts/4 
root 8277 0.0 0.3 5048 1748 pts/4 Ss 01:05 0:00 -bash 
root 8283 0.0 0.2 4484 1084 pts/4 R+ 01:06 0:00 ps aux

This seems to have the highest %CPU and %MEM:

zgp 1978 19.6 17.8 101524 90996 ? S Oct22 267:38 ./moo -l moo.log moo.db moo.db.new 3500

There seem to be several lines referring to "moo". Do you have any idea what that is?

I'd be interested to see the I/O (and perhaps CPU) graphs from the dashboard… the current memory usage profile suggests a beefy periodic task that allocates a lot of memory then goes away. There is some amount of "fairness" inherent to the disk scheduling, so this will impact things for a short while even after it is gone.
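
If you want to catch whatever it is in the act, something quick and dirty like this, left running in a screen session, will log the top memory users once a minute (the log path is arbitrary):

while true; do date >> /tmp/memlog.txt; ps aux --sort=-rss | head -6 >> /tmp/memlog.txt; sleep 60; done

Then look at the entries from around the times the lag (or the swap activity) shows up.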

And yes, I/O wait will make everything feel laggy, although it might be worth running "mtr" from your local machine to your Linode just to make sure you aren't having two unrelated problems.
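
For example, something like this from your local machine (the hostname is just a placeholder for your Linode's address):

mtr --report --report-wide --report-cycles 100 your-linode.example.com

That sends 100 probes to each hop and prints a summary, so packet loss or latency along the path should stand out.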

Here's some fairly optimistic sequential read testing done on one of my whipping-boy nodes, performed roughly 30 seconds after an I/O-intensive task, and then again about a minute after that:

rtucker@framboise:~$ for i in xvda xvdb xvdc xvdd; do sudo hdparm -t /dev/$i; done

/dev/xvda:
 Timing buffered disk reads:  176 MB in  3.26 seconds =  53.95 MB/sec

/dev/xvdb:
 Timing buffered disk reads:  182 MB in  3.00 seconds =  60.66 MB/sec

/dev/xvdc:
 Timing buffered disk reads:   66 MB in  3.05 seconds =  21.66 MB/sec

/dev/xvdd:
 Timing buffered disk reads:  166 MB in  3.24 seconds =  51.27 MB/sec
rtucker@framboise:~$ for i in xvda xvdb xvdc xvdd; do sudo hdparm -t /dev/$i; done

/dev/xvda:
 Timing buffered disk reads:  220 MB in  3.05 seconds =  72.05 MB/sec

/dev/xvdb:
 Timing buffered disk reads:  236 MB in  3.16 seconds =  74.60 MB/sec

/dev/xvdc:
 Timing buffered disk reads:  122 MB in  3.16 seconds =  38.57 MB/sec

/dev/xvdd:
 Timing buffered disk reads:  218 MB in  3.05 seconds =  71.51 MB/sec

For comparison, here's what I get on a Linode that does absolutely nothing all day:

rtucker@sapling:~$ for i in xvda xvdb; do sudo hdparm -t /dev/$i; done

/dev/xvda:
 Timing buffered disk reads:  464 MB in  3.01 seconds = 154.38 MB/sec

/dev/xvdb:
 Timing buffered disk reads:  256 MB in  1.66 seconds = 154.59 MB/sec

tl;dr: just because it's quiet now doesn't mean it doesn't thrash the disk when you aren't looking.
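
One caveat with the iostat output above: with no interval argument it only shows averages since boot, which hides short spikes. Running it with an interval (5-second samples here, just as an example) shows activity as it happens:

iostat -x 5
vmstat 5

Leave one of those running in a screen session while the lag is happening and the spikes should show up in the per-interval numbers (vmstat's si/so columns are the swap-in/swap-out rates).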

Yes, MOO is a game we run. It's definitely possible this is the cause of our issues, but prior to the migration we noticed that, after a Linode reboot, it started lagging even before the MOO process was started.

How can we find out for sure what task is allocating this memory and then going away? Do you guys want the graphs from the dashboard? If so, how do I paste them in here?

I did run mtr and showed the output to Linode support and it doesn't seem to be a network issue.

Right-click-saving and putting them on a publicly-accessible URL usually works. You can redact stuff as required. imgur.com is a decent image host for stuff like this, or the Public folder in a Dropbox works too. If it's a periodic thing going nuts, that will help narrow it down immensely.

munin is a handy tool for graphing a lot of things, too.
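
On Debian/Ubuntu it's roughly this to get going (package names may differ on other distros):

sudo apt-get install munin munin-node

Out of the box it graphs memory, swap, disk and CPU every five minutes, which makes hourly patterns like this much easier to spot.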

CPU: http://dl.dropbox.com/u/1930871/cpu.png

Net: http://dl.dropbox.com/u/1930871/net.png

I/O: http://dl.dropbox.com/u/1930871/io.png

Hope these help figure out the problem.

It looks like there was a spike in CPU at about 16:00, but it doesn't seem to be related to your network traffic, and your I/O chart shows its spike about four hours before that, so I doubt that's related.

Do you have a cron job that runs at around 16:00? If not, were you running any kind of task around that time?

Looks like there's something causing it to dip into swap every hour on the hour. Some sort of cron job?
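
A quick way to see what runs hourly (run as root; paths are the usual Debian ones, adjust as needed):

crontab -l
crontab -u zgp -l
ls /etc/cron.hourly/
grep -v '^#' /etc/crontab /etc/cron.d/* 2>/dev/null

Anything in there scheduled on the hour would be the first suspect.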

Hi,

We do run a backup cron script every hour. It copies one file and renames about 30 others. The files are about 60 MB in size, so I suppose this could explain a spike in I/O. However, I'm pretty positive the lag occurs more often than just on the hour. If it were only hourly, I don't think it would have been as noticeable as it is.

Does it do anything other than cp and mv? There's a lot of swap activity there, so something seems to be allocating a lot of RAM and then freeing it before you can see it…

It might be doing something stupid like "copying" and "renaming" by reading entire files into RAM at once before writing them out again.
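
If GNU time is installed, one quick way to check is to run the backup under it and look at peak memory (the script path here is just a placeholder):

/usr/bin/time -v /path/to/backup-script.sh 2>&1 | grep 'Maximum resident'

If the reported "Maximum resident set size" comes back anywhere near the size of those 60 MB files, the script is slurping them into RAM rather than streaming them.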

Spikes of up to 100 I/O blocks per second (and 19.21 blocks per second on average overall) are hardly "a lot" of swap activity. Methinks this is "normal" for the default swappiness of 60. Tuning that down to 10 should help with these "spikes".
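
If you want to try that, it only takes a couple of commands as root (10 is just the value suggested above):

cat /proc/sys/vm/swappiness                     # current value, 60 by default
sysctl -w vm.swappiness=10                      # change it immediately
echo 'vm.swappiness = 10' >> /etc/sysctl.conf   # make it stick across reboots

A lower swappiness makes the kernel prefer dropping page cache over swapping out process memory, so the hourly dips into swap should get smaller.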

Also, the CPU seems to be spiking up to 100% (out of 400%) and averaging about 26% (out of 400%). I don't see how this can be the cause of any lag.

However, if this MOO game is what I think it is, then given the network chart shown, methinks there are a lot of very small packets involved, so the possible causes for the lag could be:

1. Inappropriate buffer sizes and/or absence of TCP_NODELAY when dealing with lots of small packets

2. QoS
