Linode mysteriously stops running?
Now it happened again today, 4 times. It ran for a few minutes before being powered off. I downgraded from the 2.6 to the 2.4 kernel and it's been working OK for 10 hours.
The Linode was host2/Redhat large profile/2.6 kernel. The log messages showed nothing when it stopped. The setup hadn't been changed in months.
I've seen this happen very occasionally in the past but 4 times in 2 hours is too much. Has anyone else experienced this?
1 Reply
Kernel panic - not syncing: Kernel mode fault at addr 0x8, ip 0x400fa148
EIP: 0073:[<400fa148>] CPU: 0 Not tainted ESP: 007b:bffffd88 EFLAGS: 00000246
Not tainted
EAX: 00000001 EBX: 00000000 ECX: bffffdab EDX: 00000001
ESI: 0804ad00 EDI: bffffdab EBP: bffffdb8 DS: 007b ES: 007b
Call Trace:
[<80034549>] vprintk+0xe9/0x13c
[<800421ee>] notifier_call_chain+0x1e/0x38
[<80033a9e>] panic+0x56/0xbc
[<8001d907>] segv+0x83/0x1c8
[<8001d9d6>] segv+0x152/0x1c8
[<8030f1d7>] sigemptyset+0x17/0x30
[<8001bd91>] change_signals+0x41/0x6c
[<8001dd3c>] segv_handler+0x160/0x208
[<8001dcab>] segv_handler+0xcf/0x208
[<80022d2c>] sig_handler_common_skas+0x7c/0x98
[<80022d45>] sig_handler_common_skas+0x95/0x98
[<8001de2a>] sig_handler+0x32/0x34
[<8030edfd>] __libc_longjmp+0x3d/0x50
[<8030eded>] __libc_longjmp+0x2d/0x50
[<80022eda>] do_buffer_op+0x76/0x120
[<8030eef8>] __restore+0x0/0x8
[<80023963>] chan_window_size+0x7/0x44
[<8001ebc2>] setjmp_wrapper+0x16/0x50
[<8030f1d7>] sigemptyset+0x17/0x30
[<8001bed4>] set_signals+0x6c/0x10c
[<8030f1d7>] sigemptyset+0x17/0x30
[<8001bd91>] change_signals+0x41/0x6c
[<8030f1d7>] sigemptyset+0x17/0x30
[<8030f1d7>] sigemptyset+0x17/0x30
[<8001bed4>] set_signals+0x6c/0x10c
[<801fee85>] tty_read+0xa5/0xf0
[<802043fc>] write_chan+0x0/0x29c
[<80065b5b>] vfs_read+0xc3/0xf4
[<80065d97>] sys_read+0x3b/0x68
[<800228a6>] execute_syscall_skas+0xaa/0xec
[<80025149>] winch_interrupt+0x49/0xa8
[<80017047>] handle_IRQ_event+0x27/0x6c
[<800171e0>] do_IRQ+0x68/0xc4
[<80022d84>] user_signal+0x3c/0x44
[<80021efb>] userspace+0x18f/0x1e4
[<8030eef8>] __restore+0x0/0x8
[<8030f151>] kill+0x11/0x20
I'll report this to the uml-devel mailing list. However, judging by the chan_window_size and winch_interrupt calls in the trace, I believe this is the bug that Newsome has been hunting for months, and that it was recently fixed in UML. I'll try to get a new 2.6-um kernel out in the next week or two.
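If anyone wants to check their own panic output for the same two frames, here is a minimal sketch in Python. The helper names (parse_frames, has_winch_signature) are made up for illustration, and it assumes trace lines in the "[<address>] symbol+0xoffset/0xsize" form shown above. A match only suggests the same code path; it does not confirm the same bug.

```python
import re

# Matches call-trace lines such as "[<80023963>] chan_window_size+0x7/0x44".
# Assumption: frames always use this "[<addr>] symbol+0xoff/0xsize" layout.
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\w+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)

def parse_frames(trace_text):
    """Return a list of (address, symbol, offset) tuples, in trace order."""
    frames = []
    for match in FRAME_RE.finditer(trace_text):
        addr, symbol, offset, _size = match.groups()
        frames.append((int(addr, 16), symbol, int(offset, 16)))
    return frames

def has_winch_signature(trace_text,
                        wanted=("chan_window_size", "winch_interrupt")):
    """True if every symbol in `wanted` appears somewhere in the trace."""
    symbols = {symbol for _addr, symbol, _offset in parse_frames(trace_text)}
    return all(name in symbols for name in wanted)

# Three frames copied from the trace above, used as a small sample.
sample = """\
[<80023963>] chan_window_size+0x7/0x44
[<8001ebc2>] setjmp_wrapper+0x16/0x50
[<80025149>] winch_interrupt+0x49/0xa8
"""

print(has_winch_signature(sample))
```

Paste the full panic text in place of sample to check a real trace.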
-Chris