[SOLVED] My Linode Dies
I've been having some problems recently with one of my Linodes (on Atlanta57).
I'll be doing something simple and boring and the whole thing will die.
Today's example: I had just started extracting a tarball when it died.
This is all I can gather from LISH:
[<c016141d>] mempool_alloc+0x2d/0xe0
[<c016141d>] mempool_alloc+0x2d/0xe0
[<c01a72ab>] bvec_alloc_bs+0x7b/0x140
[<c01a7571>] bio_alloc_bioset+0x51/0xe0
[<c0425852>] clone_bio+0x42/0x90
[<c0426a60>] __split_bio+0x370/0x3a0
[<c0426e3f>] dm_request+0xff/0x170
[<c03a6566>] generic_make_request+0xe6/0x230
[<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
[<c01825f7>] kmem_cache_alloc+0x57/0xb0
[<c016141d>] mempool_alloc+0x2d/0xe0
[<c03a78d3>] submit_bio+0x63/0xf0
[<c01a72bd>] bvec_alloc_bs+0x8d/0x140
[<c01a758b>] bio_alloc_bioset+0x6b/0xe0
[<c01a389a>] submit_bh+0xba/0xf0
[<c01a5639>] __block_write_full_page+0x1a9/0x310
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0212880>] ext3_get_block+0x0/0x100
[<c01a588a>] block_write_full_page+0xea/0x100
[<c0212880>] ext3_get_block+0x0/0x100
[<c02141b3>] ext3_ordered_writepage+0xa3/0x170
[<c0210f70>] bget_one+0x0/0x10
[<c0164c78>] __writepage+0x8/0x30
[<c016521f>] write_cache_pages
I know these things are nearly impossible to diagnose, but does anyone have any suggestions? It's quite annoying.
EDIT: forgot to mention, I'm running ArchLinux with kernel 2.6.28-linode15
EDIT 2: Here's the logs from the time it died:
Jun 25 17:12:34 platypus kernel: [IPT ISC] : IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:fe:fd:40:16:47:15:08:00 SRC=192.168.139.100 DST=192.168.255.255 LEN=243 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=138 DPT=138 LEN=223
Jun 25 17:12:34 platypus kernel: [IPT ISC] : IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:fe:fd:40:16:47:15:08:00 SRC=192.168.139.100 DST=192.168.255.255 LEN=235 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=138 DPT=138 LEN=215
Jun 25 17:24:16 dingo syslog-ng[3743]: syslog-ng starting up; version='3.0.1'
Jun 25 17:24:16 dingo kernel: Reserving virtual address space above 0xf5800000
Jun 25 17:24:16 dingo kernel: Linux version 2.6.28-linode15 (root@db1.linode.com) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #2 SMP Wed Jan 14 09:18:53 EST 2009
3 Replies
------------[ cut here ]------------
kernel BUG at drivers/block/xen-blkfront.c:243!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/dm-4/removable
Modules linked in:
Pid: 21028, comm: perl Not tainted (2.6.28-linode15 #2)
EIP: 0061:[<c03ee830>] EFLAGS: 00010046 CPU: 0
EIP is at do_blkif_request+0x2e0/0x360
EAX: 00000001 EBX: 00000000 ECX: d43a5bc0 EDX: c343edb0
ESI: d5952288 EDI: d59522c8 EBP: 000001c3 ESP: c151fe98
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Process perl (pid: 21028, ti=c151e000 task=d49d2040 task.ti=c151e000)
Stack:
00000005 d5952288 00000288 d5988028 d5956000 c420864c 00000007 0000000d
d5956000 00000002 00000006 d5952000 00000000 d43a5bc0 d2c7de0c ffffffff
d5988028 d5956000 0000000b 00000014 c03a6ca5 d5956000 c03ee8c6 00000000
Call Trace:
[<c03a6ca5>] blk_invoke_request_fn+0x95/0x100
[<c03ee8c6>] kick_pending_request_queues+0x16/0x30
[<c03eea6d>] blkif_interrupt+0x18d/0x1d0
[<c0159510>] handle_IRQ_event+0x30/0x60
[<c015b428>] handle_level_irq+0x78/0xf0
[<c010aae7>] do_IRQ+0x77/0x90
[<c03c8968>] xen_evtchn_do_upcall+0xe8/0x150
[<c0109197>] xen_do_upcall+0x7/0xc
Code: 2c 8d 54 03 40 8d 44 0e 54 b9 6c 00 00 00 e8 98 a5 fc ff 8b 44 24 3c e8 ff 92 fd ff 83 44 24 18 01 e9 40 fd ff ff 0f 0b eb fe 90 <0f> 0b eb fe 8b 44 24 20 ba 40 e5 3e c0 8b 4c 24 20 c7 04 24 0b
EIP: [<c03ee830>] do_blkif_request+0x2e0/0x360 SS:ESP 0069:c151fe98
Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------
WARNING: at kernel/smp.c:333 smp_call_function_mask+0x1cb/0x1d0()
Modules linked in:
Pid: 21028, comm: perl Tainted: G D 2.6.28-linode15 #2
Call Trace:
[<c0128adf>] warn_on_slowpath+0x5f/0x90
[<c03b8e26>] memmove+0x36/0x40
[<c03dcc5a>] scrup+0x7a/0xe0
[<c0140987>] atomic_notifier_call_chain+0x17/0x20
[<c03dccdf>] notify_update+0x1f/0x30
[<c03dcf6a>] vt_console_print+0x20a/0x2d0
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
[<c01295e0>] vprintk+0x170/0x350
[<c014a46b>] smp_call_function_mask+0x1cb/0x1d0
[<c0105fd0>] stop_self+0x0/0x30
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
[<c0561ca3>] _spin_unlock_irqrestore+0x13/0x20
[<c03dec96>] do_unblank_screen+0x16/0x130
[<c014a484>] smp_call_function+0x14/0x20
[<c0128b6e>] panic+0x4e/0x100
[<c010ac3c>] oops_end+0x8c/0xa0
[<c0109b50>] do_invalid_op+0x0/0xa0
[<c0109bcf>] do_invalid_op+0x7f/0xa0
[<c03ee830>] do_blkif_request+0x2e0/0x360
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105407>] xen_force_evtchn_callback+0x17/0x30
[<c0105cea>] check_events+0x8/0xe
[<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
[<c0561ca3>] _spin_unlock_irqrestore+0x13/0x20
[<c0561f4a>] error_code+0x72/0x78
[<c03ee830>] do_blkif_request+0x2e0/0x360
[<c03a6ca5>] blk_invoke_request_fn+0x95/0x100
[<c03ee8c6>] kick_pending_request_queues+0x16/0x30
[<c03eea6d>] blkif_interrupt+0x18d/0x1d0
[<c0159510>] handle_IRQ_event+0x30/0x60
[<c015b428>] handle_level_irq+0x78/0xf0
[<c010aae7>] do_IRQ+0x77/0x90
[<c03c8968>] xen_evtchn_do_upcall+0xe8/0x150
[<c0109197>] xen_do_upcall+0x7/0xc
---[ end trace c449499288c87a80 ]---
2.6.28.3-linode17 has a fix.
I've updated all my Linodes to 2.6.30 now.
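For anyone else who hits the same xen-blkfront BUG, here is a rough sketch for checking whether a Linode is already on a kernel at least as new as the 2.6.28.3-linode17 release mentioned above. The helper names (`parse_release`, `has_fix`) are made up for illustration. It also assumes that a higher version number, or a higher `-linodeNN` build number, means the fix is included, which this thread doesn't confirm.

```python
import platform
import re

# Release named in the reply above as containing the fix.
FIXED = "2.6.28.3-linode17"


def parse_release(release):
    """Split e.g. '2.6.28.3-linode17' into ((2, 6, 28, 3), 17)."""
    match = re.match(r"(\d+(?:\.\d+)*)(?:-linode(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    numbers = [int(part) for part in match.group(1).split(".")]
    numbers += [0] * (4 - len(numbers))  # pad so 2.6.30 compares as 2.6.30.0
    # Assumption: a missing -linodeNN suffix is treated as build 0.
    build = int(match.group(2)) if match.group(2) else 0
    return tuple(numbers[:4]), build


def has_fix(release, fixed=FIXED):
    """True if `release` sorts at or after the release said to have the fix."""
    return parse_release(release) >= parse_release(fixed)


if __name__ == "__main__":
    running = platform.release()  # same string as `uname -r`
    print(running, "at or after", FIXED, "->", has_fix(running))
```

Run it on the Linode itself. The 2.6.28-linode15 kernel from the original post comes out as older than the fixed release, while 2.6.30 comes out as newer.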