Asterisk process randomly crashing?
I tried running Asterisk on a Linode after reading hoopycat's blog about it (quite helpful btw, thanks; I would never have found Linode otherwise).
The problem is that, after following the instructions there, I found the asterisk process mysteriously dying after a few hours and leaving nothing in the logs to indicate why. I enabled full verbose logging and still got nothing.
Has anyone else seen this? I wondered if it might have something to do with virtualized kernels not agreeing with the dahdi timing module or something …
Any thoughts?
Rick
7 Replies
I suppose the first thing I would try is looking for a core file: run "updatedb" then "locate core", or use find. If there is a core file, you can run gdb /path/to/asterisk /path/to/corefile and then "bt", and you might be able to figure out what it was doing when it choked.
Barring that, if conferencing isn't a mission-critical task, try unloading the dahdi modules and see if that helps. They're not always the culprit, but they're a big, easy target. Also, take a look at "dmesg" or /var/log/kern.log… once in a while, a related disruption gets noticed by the kernel.
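For anyone following along, the core-file hunt above could be sketched roughly like this. The function names and the example paths are made up for illustration, and note that a core file is only written if the core size limit is non-zero in the shell that starts Asterisk, so a missing core file does not prove there was no crash.

```shell
#!/bin/sh
# Rough sketch only; function names and example paths are hypothetical.

# List candidate core files under a directory, staying on one filesystem.
find_cores() {
    find "$1" -xdev -type f \( -name 'core' -o -name 'core.*' \) 2>/dev/null
}

# Print a backtrace from a core file without opening an interactive gdb session.
# $1 = path to the asterisk binary, $2 = path to the core file.
backtrace() {
    gdb -batch -ex 'bt' "$1" "$2"
}

# Example (paths are assumptions, adjust for your install):
#   ulimit -c unlimited          # allow core dumps before starting asterisk
#   find_cores /tmp
#   backtrace /usr/sbin/asterisk /tmp/core.1234
```

Searching a few likely directories with find is usually quicker than rebuilding the locate database on a small VPS.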
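A small sketch of those two checks, assuming the usual lsmod and dmesg output formats. The helper names are invented here, and the list of patterns is only a guess at what might be worth looking for.

```shell
#!/bin/sh
# Rough sketch only; helper names and the pattern list are assumptions.

# Read lsmod-style output on stdin and print loaded modules whose name
# starts with "dahdi" (the first line of lsmod output is a header).
dahdi_modules() {
    awk 'NR > 1 && $1 ~ /^dahdi/ { print $1 }'
}

# Read kernel log text on stdin and keep lines that hint at trouble.
kernel_trouble() {
    grep -iE 'soft lockup|oops|segfault|out of memory'
}

# Example:
#   lsmod | dahdi_modules                  # see what is loaded
#   sudo modprobe -r <module>              # unload one of the names printed above
#   dmesg | kernel_trouble
#   kernel_trouble < /var/log/kern.log
```

Asterisk should be stopped before removing the kernel modules, otherwise they will most likely be reported as in use.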
Good luck! Let me know what you find.
dmesg has nothing except the firewall's blocked-packet log messages since the VPS was booted.
Just to be sure I did a "sudo grep -i asterisk -r /var/log" in case I was missing something, and all I got was the package installation logs for apt-get and the asterisk boot logs I had seen.
Thanks - will try removing dahdi and see how it goes.
In general, if you can avoid using dahdi, it's probably worthwhile to do so… as far as I can tell, MeetMe is the only thing that requires it.
It normally works fine, though. Part of my testing while writing up those blog posts involved conferencing a couple phones together with some hold music overnight, and Asterisk was fine.
[503886.741604] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[503925.547530] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[503981.182304] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
[503997.327260] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[504092.864522] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[504219.562080] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[505205.206375] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[505288.966599] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[505729.410219] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
[505734.586064] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[505964.432068] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[506636.573305] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
- Asterisk PBX is not running
[597711.283405] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[598093.570451] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[598732.404967] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[598954.665000] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[598954.672638] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[599181.227992] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[600487.326731] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
[600767.142879] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[602875.977771] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[603334.463723] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[604933.363993] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[605030.990425] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
[605096.506899] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]
[605021.310349] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[605139.527156] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
[605086.806608] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]
[605276.125638] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
- Asterisk PBX is not running
My guess is that the "asterisk is not running" bits are just me trying to restart when it was already dead.
However, the 61s soft lockups are almost certainly why something kernel-dependent would crash.
Any thoughts about what to do here?
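To get a quick feel for how the lockups are spread across CPUs, the dmesg output can be tallied with something like the sketch below. The function name is made up, and it assumes the message format shown in the paste above.

```shell
#!/bin/sh
# Rough sketch only; the function name is hypothetical and the message
# format is assumed to match the dmesg lines pasted above.

# Read kernel log text on stdin and print "CPU#N count" for each CPU
# that reported a soft lockup.
lockups_per_cpu() {
    awk '/BUG: soft lockup/ {
        # Pull out the "CPU#N" token and count it.
        if (match($0, /CPU#[0-9]+/))
            count[substr($0, RSTART, RLENGTH)]++
    }
    END { for (cpu in count) print cpu, count[cpu] }' | sort
}

# Example:
#   dmesg | lockups_per_cpu
```

If every CPU shows up in the tally, as it seems to in the paste, that at least suggests the stalls are not tied to one core.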
in this thread
[69.801295] BUG: soft lockup - CPU#0 stuck for 61s! [invoke-rc.d:970]
[135.295553] BUG: soft lockup - CPU#0 stuck for 61s! [invoke-rc.d:970]
[66.572783] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]
I did a cold boot (shutdown, then boot), and although the notice still came up under Lish, ssh was reachable again.
Right now I'm ignoring this, but it does warrant further investigation. It seems to be an Ubuntu thing.