Asterisk process randomly crashing?

Hi all,

I tried running asterisk on a linode after reading hoopycat's blog about it (quite helpful btw, thanks - would never have found linode otherwise).

http://blog.hoopycat.com/2009/08/asterisk-freepbx-ubuntu-lighttpd-linode

The problem is that, after following the instructions there, I found the asterisk process mysteriously dying after a few hours without leaving anything in the logs to indicate why. I enabled full verbose logging and still got nothing.

Has anyone else seen this? I wondered if it might have something to do with virtualized kernels not agreeing with the dahdi timing module or something …

Any thoughts?

Rick

7 Replies

Asterisk randomly crashing? Ha! Never! Go ahead, pull the other one.

I suppose the first thing I would try is looking for a core file. "updatedb" then "locate core", or use find. If there is a core file, you can do gdb /path/to/asterisk /path/to/corefile and then "bt", and you might be able to figure out what it was doing when it choked.
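In case it helps, here is a minimal sketch of that search using find. The `find_cores` helper name and the directories in the usage comment are my own assumptions, not anything Asterisk defines. Whether a core file gets written at all depends on the `ulimit -c` setting and the kernel's core pattern, so an empty result doesn't prove there was no crash.

```shell
#!/bin/sh
# find_cores DIR...: list regular files named "core" or "core.<something>"
# under the given directories, without crossing filesystem boundaries.
# (Hypothetical helper; the name pattern can also match unrelated files
# such as "core.c", so check each hit before feeding it to gdb.)
find_cores() {
    find "$@" -xdev -type f \( -name core -o -name 'core.*' \) 2>/dev/null
}

# Example usage (paths are assumptions; adjust for your install):
#   find_cores /tmp /var/lib/asterisk /var/spool/asterisk
#   gdb /path/to/asterisk /path/to/corefile    # then type "bt" at the (gdb) prompt
#
# Core dumps are often disabled by default; check the current limit with:
#   ulimit -c
```

If `ulimit -c` prints 0, the process was not allowed to write a core file in the first place, which would explain finding none.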

Barring that, if conferencing isn't a mission-critical task, try unloading the dahdi modules and see if that helps. They're not always the culprit, but they're a big, easy target. Also, take a look at "dmesg" or /var/log/kern.log… once in a while, a related disruption gets noticed by the kernel.
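A rough sketch for checking which dahdi kernel modules are loaded before you unload anything. The `list_dahdi_modules` helper is a made-up name, and the module name in the usage comment is an assumption that varies by DAHDI version, so go by what the listing actually shows on your system.

```shell
#!/bin/sh
# list_dahdi_modules: read `lsmod` output on stdin and print the names of
# loaded kernel modules whose name starts with "dahdi".
# (Hypothetical helper, not part of DAHDI or Asterisk.)
list_dahdi_modules() {
    awk '$1 ~ /^dahdi/ { print $1 }'
}

# Example usage:
#   lsmod | list_dahdi_modules
#
# Then stop Asterisk and remove the listed modules, for example:
#   sudo modprobe -r dahdi_dummy    # module name is an assumption
```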

Good luck! Let me know what you find.

That was the other thing I forgot to mention - no core files either.

dmesg has nothing except firewall blocked packet log msgs since the VPS was booted.

Just to be sure I did a "sudo grep -i asterisk -r /var/log" in case I was missing something, and all I got was the package installation logs for apt-get and the asterisk boot logs I had seen.

Thanks - will try removing dahdi and see how it goes.

Ah, you're a step ahead of me :-)

In general, if you can avoid using dahdi, it's probably worthwhile to do so… as far as I can tell, MeetMe is the only thing that requires it.

It normally works fine, though. Part of my testing while writing up those blog posts involved conferencing a couple phones together with some hold music overnight, and Asterisk was fine.

The following was copied and pasted from my LISH shell.

[503886.741604] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[503925.547530] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[503981.182304] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

[503997.327260] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[504092.864522] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[504219.562080] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[505205.206375] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[505288.966599] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[505729.410219] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

[505734.586064] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[505964.432068] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[506636.573305] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

  • Asterisk PBX is not running

[597711.283405] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[598093.570451] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[598732.404967] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[598954.665000] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[598954.672638] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[599181.227992] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[600487.326731] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

[600767.142879] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[602875.977771] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[603334.463723] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[604933.363993] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[605030.990425] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

[605096.506899] BUG: soft lockup - CPU#1 stuck for 61s! [swapper:0]

[605021.310349] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[605139.527156] BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]

[605086.806608] BUG: soft lockup - CPU#3 stuck for 61s! [swapper:0]

[605276.125638] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

  • Asterisk PBX is not running

My guess is that the "asterisk is not running" bits are just me trying to restart when it was already dead.

However, the 61s lockups are almost certainly why something kernel-dependent would crash.

Any thoughts about what to do here?
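To see whether the lockups cluster on one CPU, something like the following could tally them from the kernel log. This is only a sketch: `count_lockups` is a hypothetical helper name, and it matches only the exact "BUG: soft lockup - CPU#N stuck" wording shown in the paste above.

```shell
#!/bin/sh
# count_lockups: read kernel log text on stdin and print one line per CPU
# in the form "CPU#<n> <count>", counting "soft lockup" messages.
# Lines that do not match the message format are ignored.
count_lockups() {
    sed -n 's/.*BUG: soft lockup - \(CPU#[0-9][0-9]*\) stuck.*/\1/p' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Example usage:
#   dmesg | count_lockups
```

A roughly even spread across CPUs would point at the host or kernel as a whole rather than at one busy process.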

Ah yep, I did see that at boot sometimes, but it seemed to work itself out of it after a little while. You're the only other person to mention it, as yet… hmm. I think it's a kernel-related bug, personally.

I am following a guide similar to the one linked in this thread, and I had the same problem:

[   69.801295] BUG: soft lockup - CPU#0 stuck for 61s! [invoke-rc.d:970]
[  135.295553] BUG: soft lockup - CPU#0 stuck for 61s! [invoke-rc.d:970]
[   66.572783] BUG: soft lockup - CPU#2 stuck for 61s! [swapper:0]

I did a cold boot (shutdown and boot) and though the notice came up under Lish, ssh was reachable again.

Right now I'm ignoring this but it does warrant further investigation. It seems to be an Ubuntu thing.

Are you doing it on 10.04, or still with 9.10? I haven't personally tried it on 10.04, but I was secretly hoping that would make everything better.
