centos / apache httpd kernel panic - scheduling while atomic
CentOS 5.6 Final / 2.6.39.1-linode34
apache httpd 2.2.3-53.el5
This morning, for the second time, I woke up, went to check my mail, and found my Linode totally hung and unresponsive. Graphs showed low CPU utilization, but zero I/O for the past three hours and near-zero network traffic. LISH logview presented:
Showing last 100 lines from current boot
-----------------------------------------
unevictable:1136 dirty:5 writeback:400 unstable:0
free:21681 slab_reclaimable:10401 slab_unreclaimable:3863
mapped:4757 shmem:34 pagetables:1507 bounce:0
DMA free:2872kB min:72kB low:88kB high:108kB active_anon:0kB inactive_anon:324kB active_file:176kB inactive_file:2152kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:32kB shmem:0kB slab_reclaimable:188kB slab_unreclaimable:56kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 702 1008 1008
Normal free:79992kB min:3352kB low:4188kB high:5028kB active_anon:167652kB inactive_anon:226896kB active_file:84580kB inactive_file:84608kB unevictable:0kB isolated(anon):84kB isolated(file):52kB present:719320kB mlocked:0kB dirty:8kB writeback:2152kB mapped:8592kB shmem:4kB slab_reclaimable:41416kB slab_unreclaimable:15396kB kernel_stack:1576kB pagetables:6028kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 2444 2444
HighMem free:4848kB min:304kB low:668kB high:1032kB active_anon:125552kB inactive_anon:131192kB active_file:17500kB inactive_file:13012kB unevictable:4544kB isolated(anon):0kB isolated(file):0kB present:312932kB mlocked:4544kB dirty:12kB writeback:336kB mapped:10256kB shmem:132kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 536*4kB 27*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2872kB
Normal: 17736*4kB 803*8kB 62*16kB 37*32kB 7*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 79992kB
HighMem: 822*4kB 130*8kB 12*16kB 2*32kB 6*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4968kB
63658 total pagecache pages
12281 pages in swap cache
Swap cache stats: add 242811, delete 230530, find 5166098/5190474
Free swap = 412996kB
Total swap = 524284kB
264176 pages RAM
78850 pages HighMem
7005 pages reserved
105845 pages shared
185918 pages non-shared
------------[ cut here ]------------
kernel BUG at mm/swapfile.c:2527!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/power/state
Modules linked in:
Pid: 13288, comm: httpd Not tainted 2.6.39.1-linode34 #1
EIP: 0061:[<c01a9506>] EFLAGS: 00210246 CPU: 2
EIP is at swap_count_continued+0x176/0x180
EAX: f5792ca7 EBX: ed19c800 ECX: f5792000 EDX: 00000000
ESI: ed3d7540 EDI: 00000080 EBP: 00000ca7 ESP: cbf47e1c
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process httpd (pid: 13288, ti=cbf46000 task=cbd09400 task.ti=cbf46000)
Stack:
ebeff8c0 00007ca7 00000040 00000000 c01a9601 d268bc78 ebeff8c0 00007ca7
00000000 c01ab877 d268bc78 b738f000 cbf47f04 c019dd43 9312b045 80000008
2859b063 c0103fa5 37a64000 0000000a 000f94e0 eb89c580 ecbceba0 9312b045
Call Trace:
[<c01a9601>] ? swap_entry_free+0xf1/0x120
[<c01ab877>] ? free_swap_and_cache+0x27/0xd0
[<c019dd43>] ? zap_pte_range+0x1b3/0x470
[<c0103fa5>] ? pte_pfn_to_mfn+0xb5/0xd0
[<c019e111>] ? unmap_page_range+0x111/0x190
[<c019e2bb>] ? unmap_vmas+0x12b/0x1e0
[<c01a0481>] ? exit_mmap+0x91/0x140
[<c01324eb>] ? mmput+0x2b/0xc0
[<c0135ddf>] ? exit_mm+0xef/0x120
[<c01379d5>] ? do_exit+0x125/0x350
[<c016ebc5>] ? audit_syscall_entry+0x1a5/0x1d0
[<c0137c3c>] ? do_group_exit+0x3c/0xa0
[<c0137cb1>] ? sys_exit_group+0x11/0x20
[<c068fe51>] ? syscall_call+0x7/0xb
[<c0680000>] ? sctp_rcv_ootb+0x50/0xf0
Code: ff 89 d8 e8 cd a2 f7 ff 01 e8 8d 76 00 c6 00 00 ba 01 00 00 00 eb b2 89 f8 3c 80 0f 94 c0 e9 b9 fe ff ff 0f 0b eb fe 0f 0b eb fe <0f> 0b eb fe 0f 0b eb fe 66 90 83 ec 10 89 1c 24 89 c3 89 74 24
EIP: [<c01a9506>] swap_count_continued+0x176/0x180 SS:ESP 0069:cbf47e1c
---[ end trace 50b157e47e34d607 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: httpd/13288/0x00000001
Modules linked in:
Pid: 13288, comm: httpd Tainted: G D 2.6.39.1-linode34 #1
Call Trace:
[<c068e15e>] ? schedule+0x50e/0x6d0
[<c013487e>] ? vprintk+0x18e/0x3e0
[<c0105ae7>] ? xen_force_evtchn_callback+0x17/0x30
[<c0109350>] ? do_coprocessor_segment_overrun+0x80/0x80
[<c0109350>] ? do_coprocessor_segment_overrun+0x80/0x80
[<c0137b97>] ? do_exit+0x2e7/0x350
[<c0109350>] ? do_coprocessor_segment_overrun+0x80/0x80
[<c0109350>] ? do_coprocessor_segment_overrun+0x80/0x80
[<c010b7c1>] ? oops_end+0x71/0xa0
[<c01093cf>] ? do_invalid_op+0x7f/0x90
[<c01a9506>] ? swap_count_continued+0x176/0x180
[<c0187df2>] ? free_pcppages_bulk+0x2b2/0x2f0
[<c0105ae7>] ? xen_force_evtchn_callback+0x17/0x30
[<c01062c4>] ? check_events+0x8/0xc
[<c01062bb>] ? xen_restore_fl_direct_reloc+0x4/0x4
[<c0188d73>] ? free_hot_cold_page+0xd3/0x140
[<c0105ae7>] ? xen_force_evtchn_callback+0x17/0x30
[<c0103fa5>] ? pte_pfn_to_mfn+0xb5/0xd0
[<c06903c6>] ? error_code+0x5a/0x60
[<c012007b>] ? try_preserve_large_page+0x7b/0x340
[<c0109350>] ? do_coprocessor_segment_overrun+0x80/0x80
[<c01a9506>] ? swap_count_continued+0x176/0x180
[<c01a9601>] ? swap_entry_free+0xf1/0x120
[<c01ab877>] ? free_swap_and_cache+0x27/0xd0
[<c019dd43>] ? zap_pte_range+0x1b3/0x470
[<c0103fa5>] ? pte_pfn_to_mfn+0xb5/0xd0
[<c019e111>] ? unmap_page_range+0x111/0x190
[<c019e2bb>] ? unmap_vmas+0x12b/0x1e0
[<c01a0481>] ? exit_mmap+0x91/0x140
[<c01324eb>] ? mmput+0x2b/0xc0
[<c0135ddf>] ? exit_mm+0xef/0x120
[<c01379d5>] ? do_exit+0x125/0x350
[<c016ebc5>] ? audit_syscall_entry+0x1a5/0x1d0
[<c0137c3c>] ? do_group_exit+0x3c/0xa0
[<c0137cb1>] ? sys_exit_group+0x11/0x20
[<c068fe51>] ? syscall_call+0x7/0xb
[<c0680000>] ? sctp_rcv_ootb+0x50/0xf0
I needed to do a destroy and reboot. This is the second time I've had to do this. Google returns virtually nothing about "scheduling while atomic: httpd".
Any thoughts/suggestions before I open a ticket? httpd is the newest version, as is the kernel, as far as I can tell.
Thanks to anyone who can provide assistance.
2 Replies
I found the archive of a xen-devel thread relating to this, started by psandin:
I can't offer my node for real downtime, but if you guys want to look at logs, etc., I think root could be arranged.
I just changed kernels to 3.0.4-linode38 (sorry about "newest kernel"… I honestly forgot about the kernels coming from the Linode config instead of the repositories configured in my install) and rebooted. I'll keep an eye on it.
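For anyone triaging a similar log: the "scheduling while atomic" message appears to be a follow-on from the first oops rather than the root problem. It comes right after "Fixing recursive fault", and its trace passes through oops_end and do_exit. The more useful search terms are probably the first "kernel BUG at" line and the function named on the "EIP is at" line. Below is a minimal sketch, assuming the 32-bit oops layout shown above, that pulls those fields out of a saved log. The function name `summarize_oops` and the `SAMPLE_LOG` excerpt are my own, purely for illustration.

```python
import re

# Patterns for the lines that identify the first fault in a 32-bit oops.
# They are written against the log format in this thread and are an
# assumption for any other kernel version or architecture.
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
EIP_RE = re.compile(r"EIP is at (?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")
PID_RE = re.compile(r"Pid: (?P<pid>\d+), comm: (?P<comm>\S+)")
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] \? (?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text):
    """Collect the BUG location, faulting function, process and call chain
    from the first oops in log_text, ignoring anything after 'end trace'."""
    summary = {"bug": None, "func": None, "pid": None, "comm": None, "frames": []}
    in_trace = False
    for raw in log_text.splitlines():
        line = raw.strip()
        if line.startswith("---[ end trace"):
            break  # later messages are treated as fallout from the same fault
        if line == "Call Trace:":
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.match(line)
            if m:
                summary["frames"].append(m.group("func"))
                continue
            in_trace = False  # first non-frame line ends the trace
        m = BUG_RE.search(line)
        if m and summary["bug"] is None:
            summary["bug"] = (m.group("file"), int(m.group("line")))
        m = EIP_RE.search(line)
        if m and summary["func"] is None:
            summary["func"] = m.group("func")
        m = PID_RE.search(line)
        if m and summary["pid"] is None:
            summary["pid"] = int(m.group("pid"))
            summary["comm"] = m.group("comm")
    return summary


# Excerpt copied from the log in the first post (shortened for illustration).
SAMPLE_LOG = """
kernel BUG at mm/swapfile.c:2527!
invalid opcode: 0000 [#1] SMP
Pid: 13288, comm: httpd Not tainted 2.6.39.1-linode34 #1
EIP is at swap_count_continued+0x176/0x180
Call Trace:
[<c01a9601>] ? swap_entry_free+0xf1/0x120
[<c01ab877>] ? free_swap_and_cache+0x27/0xd0
[<c019dd43>] ? zap_pte_range+0x1b3/0x470
---[ end trace 50b157e47e34d607 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: httpd/13288/0x00000001
"""

if __name__ == "__main__":
    print(summarize_oops(SAMPLE_LOG))
```

Stopping at the "end trace" marker is a deliberate choice: it keeps the second, noisier trace out of the summary, so the fields it reports come only from the first fault.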