My Linode hangs every month or so...

Hi,

I have used Linode for years without a single problem, but since 2013 my Linode has been hanging from time to time and I can't figure out why.

When the server hangs, I receive an email from Linode saying that my server has averaged 20% CPU usage over the last two hours. This usually happens at night, and I find the server hung in the morning.

Apache no longer responds, email does not work, svn does not work; no service works, not even SSH.

The only thing that still works is the Lish console.

Do you have any idea what could be causing this?

Thanks.

15 Replies

I think fail2ban is causing this issue. What do you think?

Is there a way to catch the problem?

@sblantipodi:

Why do you think that's the cause?

Have you looked at the /var/log/fail2ban.log files to see what it is doing?

@Dweeber:

Because it is the most CPU-intensive software I have on that Linode.

Is there something that can send me an email listing the jobs that have been using 20% of the CPU for more than an hour?

That would help me understand what happens on my server before it hangs.
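A minimal sketch of such a check, assuming a Linux `ps` from procps and a working local `mail` command (both are assumptions, and `high_cpu` is a hypothetical helper name). Note that the `pcpu` column in procps is CPU time divided by the time the process has been running, so a long-running process that shows 20% or more has averaged that over its lifetime.

```shell
#!/bin/sh
# Hypothetical helper: print processes at or above a %CPU threshold.
# Reads "PID %CPU ELAPSED COMMAND" lines on stdin and skips the header line.
high_cpu() {
  threshold=${1:-20}
  awk -v t="$threshold" 'NR > 1 && $2 + 0 >= t + 0'
}

# Example cron usage, left commented out on purpose
# (assumes procps ps and a configured `mail` command):
# report=$(ps -eo pid,pcpu,etime,comm --sort=-pcpu | high_cpu 20)
# [ -n "$report" ] && printf '%s\n' "$report" | mail -s "High CPU on $(hostname)" root
```

The `etime` column shows how long each process has been running, which helps tell a short burst apart from something that has been busy for hours.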

You could look into Longview.

Perhaps you have really huge logs that aren't being rotated? Just guessing..
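One quick way to test that guess is to list the largest files under the log directory. This is only a sketch: `big_logs` is a hypothetical helper name, and the 100 MB default is an arbitrary starting point.

```shell
# Hypothetical helper: list files under a directory that are larger
# than a given size in megabytes (defaults: /var/log, 100 MB).
big_logs() {
  dir=${1:-/var/log}
  min_mb=${2:-100}
  # -size counts in kilobyte units here, so convert MB to KB
  find "$dir" -type f -size +"$((min_mb * 1024))"k -exec ls -lh {} +
}

# Example: big_logs /var/log 100
```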

Run this program from cron every 5 minutes:

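#!/bin/ksh -p

# One log file per day: /var/tmp/srvr_stat.YYYYMMDD
LOG=/var/tmp/srvr_stat.$(date +%Y%m%d)

# Append a timestamped snapshot of load, memory, and the process list
{
  date
  uptime
  free
  ps aux
  echo
  echo
} >> "$LOG"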

This'll let you see some basics of what your machine is doing; in particular free memory (are you swapping to death?) and processes using lots of CPU. After your machine crashes you can review the log files to see what happened.
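For the "every 5 minutes" part, the crontab line would look roughly like the one built below. The script path is hypothetical; use wherever you saved the script, and make sure it is executable.

```shell
# Hypothetical location of the snapshot script
SCRIPT=/usr/local/bin/srvr_stat.sh

# Crontab fields: minute hour day-of-month month day-of-week command
CRON_LINE="*/5 * * * * $SCRIPT"
printf '%s\n' "$CRON_LINE"

# To install it for the current user (left commented out on purpose):
# ( crontab -l 2>/dev/null; printf '%s\n' "$CRON_LINE" ) | crontab -
```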

@sweh:

OK, I modified the program to write a new file every 5 minutes and to put these files in a new directory every day.

#!/bin/ksh -p

# One directory per day, one file per run
DIR=/root/log_for_crash_detect/day_$(date +%Y-%m-%d)
mkdir -p "$DIR"
LOG=$DIR/log_$(date +%Y-%m-%d-%H-%M)

# Snapshot of load, memory, and the process list
{
  date
  uptime
  free
  ps aux
  echo
  echo
} >> "$LOG"

This way it will be easier to track down the problem.
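With one file per run, reviewing a night's worth of snapshots can be sped up by pulling the busiest process out of each file. A rough sketch, assuming the files contain standard `ps aux` output and are named `log_*` as above; `top_cpu_per_snapshot` is a hypothetical helper name.

```shell
# Hypothetical helper: for each snapshot file in a directory, print the
# file name and the ps aux line with the highest %CPU (third column).
top_cpu_per_snapshot() {
  for f in "$1"/log_*; do
    [ -f "$f" ] || continue
    awk -v name="${f##*/}" '
      $1 == "USER" && $3 == "%CPU" { in_ps = 1; next }  # ps aux header line
      NF == 0 { in_ps = 0 }                             # blank line ends the ps section
      in_ps && NF >= 11 && $3 + 0 > max { max = $3 + 0; line = $0 }
      END { if (line != "") printf "%s: %s\n", name, line }
    ' "$f"
  done
}

# Example: top_cpu_per_snapshot /root/log_for_crash_detect/day_$(date +%Y-%m-%d)
```

Files in which every process shows 0.0 %CPU produce no output, so the result is a short list of the snapshots worth opening.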

I really suspect that fail2ban is the killer.

This particular Linode does not run anything very resource-intensive: it runs a mail server, an svn server, and a proxy server, and I use it for tunneling.

I think the problem is in fail2ban because I know it has many problems analyzing big files.

I rotate my maillog every week, but it can grow to 300MB, and I think this may cause problems for fail2ban.

In any case, I will keep you posted if I discover anything more.

Thanks for helping me track down the problem.

Wait a minute. Is there a way to sort by CPU usage using the ps command?

Disable fail2ban and see if the problem goes away?

@hoopycat:

That's my second option, if I'm sure that fail2ban is the killer.

@sblantipodi:

ps aux --sort '-pcpu'

sorts all processes by CPU usage, highest first

@obs:

Thanks.

In any case, when my server hangs, it isn't really hung: it stops responding on its external IP, but I can still connect to it over Lish.

I'm wondering whether Linode limits CPU usage and some sort of "resource protection" is kicking in on my Linode.

I've never heard of Linode "limiting CPU". If you're using too much, they'll tell you, but then again, I've never heard of that either. Though if you do think the host is the cause you can open a ticket to be migrated to another host.

@Nuvini:

I opened a ticket previously and they told me that they do not limit any resources.

I'll believe that, sorry for doubting :)
