Can't reach atlanta29

This has happened a couple of times in the last week or so. I've been installing new software (plone site) so I thought it was my fault, but now I'm not so sure.

I can't reach my Linode through ssh, lish or the AJAX console (which I've never actually tried before now). But everything seems fine on the Dashboard.

```
$ ssh bernied@atlanta29.linode.com
ssh: connect to host atlanta29.linode.com port 22: Connection timed out
```

I can't reach the apache or zope server either.

Last time this happened it came good after some time, maybe 30 minutes, but this is still rubbish.

I thought if it was the server (atlanta29), I'd see something on the forums, but nothing seems to be happening.

It's a Linode 360, by the way.

Anyone got any ideas, before I raise a support ticket?

Thanks

Bernie

15 Replies

A bit more info:

When I try to use AJAX, I get (using firefox on WinXP):
> Network Timeout
>
> The server at console-atlanta.linode.com is taking too long to respond.

So why does the dashboard still look ok?

My CPU usage hasn't gone above 3%, IO spiked at 800 a couple of hours ago, but settled since then, and the host is 'idle'.

Is that the problem, the server is lazy?

Do a traceroute to check the network path from your machine to your Linode.
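
If you save the output from one run during the outage and another once things recover, comparing them hop by hop shows where packets start being dropped. Here is a minimal sketch in Python for pulling hop numbers, round-trip times and lost probes out of saved traceroute text. `parse_traceroute` is a hypothetical helper name, and the sketch assumes the usual Linux/BSD layout: one hop per line, with `*` for a probe that got no reply.

```python
import re

# A hop line starts with the hop number; the header line does not.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
# Round-trip times are printed as "<number> ms".
RTT = re.compile(r"(\d+(?:\.\d+)?) ms")


def parse_traceroute(text):
    """Return a list of (hop_number, [rtt_ms, ...], lost_probe_count)."""
    hops = []
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if match is None:
            continue  # header or blank line
        hop_number = int(match.group(1))
        rest = match.group(2)
        rtts = [float(value) for value in RTT.findall(rest)]
        lost = rest.split().count("*")  # each "*" is one unanswered probe
        hops.append((hop_number, rtts, lost))
    return hops
```

A hop with lost probes is not necessarily the fault, since some routers simply don't answer traceroute probes. What matters is whether the later hops and the destination still respond.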

Right, it's back up now. So that was about 25 minutes that it was unreachable.

I was tempted to try a reboot, but I didn't. And the machine seems to have kept running:

```
$ uptime
 10:29:53 up 6 days, 20:26,  1 user,  load average: 0.00, 1.87, 5.91
```

(But that last load number looks high to me - is that the 15 minute average?)

Also, I've had a look at syslog: openvpn was talking to clients and failed at 10:18:41. I'll post logs if anyone wants, but I really don't think that this was about my linode.

Curiously, the network and CPU graphs on the dashboard now seem to have a gap for the time the machine was out of touch (it was probably there before, but I didn't notice), so although the dashboard was saying the linode was running, perhaps it only assumed this.

Does anyone know what happened here?

And how it might be prevented in the future?

Peter, thanks for being interested.

I'm guessing the traceroute info is not much use now, as we're back up?

```
$ traceroute vserver
traceroute to vserver (64.22.71.39), 30 hops max, 40 byte packets
 1  192.168.0.1 (192.168.0.1)  2.207 ms  0.998 ms  0.968 ms
 2  78.144.176.1 (78.144.176.1)  37.177 ms  38.115 ms  34.194 ms
 3  78.151.226.129 (78.151.226.129)  38.357 ms  38.293 ms  35.025 ms
 4  78.151.225.3 (78.151.225.3)  36.299 ms  39.248 ms  38.874 ms
 5  gig-10-1-rtr001.hex.opaltelecom.net (62.24.254.49)  43.634 ms  46.382 ms  44.713 ms
 6  xe-10-2-0-scr001.sov.as13285.net (78.144.1.128)  44.529 ms  44.400 ms  44.310 ms
 7  xe-10-0-0-scr010.sov.as13285.net (78.144.0.228)  46.834 ms  45.299 ms  113.758 ms
 8  195.66.226.167 (195.66.226.167)  183.445 ms  206.404 ms  206.932 ms
 9  gnax.ge2-13.br01.atl01.pccwbtn.net (63.216.31.130)  162.686 ms  279.420 ms  200.021 ms
10  atl-core-e-gi4-4.gnax.net (209.51.131.30)  139.588 ms  138.655 ms  139.534 ms
11  vserver (64.22.71.39)  138.155 ms  139.717 ms  139.264 ms
```

@bernied:

> But that last load number looks high to me - is that the 15 minute average?

Yes
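
For reference, the three figures after `load average:` are the 1-, 5- and 15-minute averages, in that order. Here is a minimal sketch in Python for pulling them out of a line of `uptime` output. The function name is illustrative, and the regular expression assumes the Linux-style layout shown earlier in the thread.

```python
import re

# Matches "load average: 0.00, 1.87, 5.91" (Linux) and also the
# space-separated "load averages: ..." form some BSD systems print.
LOAD_PATTERN = re.compile(
    r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)"
)


def parse_load_averages(uptime_line):
    """Return (one_minute, five_minute, fifteen_minute) as floats."""
    match = LOAD_PATTERN.search(uptime_line)
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    return tuple(float(value) for value in match.groups())
```

On the Linode itself, Python's `os.getloadavg()` returns the same three values without any parsing.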

So from all this, I'm guessing that this site:

https://www.linode.com/members

is not in the Atlanta data centre.

??

Which I suppose is fair enough, but why does the dashboard say my machine is running, when it doesn't actually have any contact with it?

www.linode.com is in Dallas.

When the dashboard says your system is running, that's about the same as having the power LED alight on a physical server.

@pclissold:

> @bernied:
>
> > But that last load number looks high to me - is that the 15 minute average?
>
> Yes

And is that the load for my linode, or the host?

Should I be looking for whatever caused the spike, or is it simply that the loss of contact will have caused the activity?

@bernied:

> And is that the load for my linode, or the host?

Your Linode.

My guess is that you should look for the cause of the spike - it's unlikely that the spike was caused by loss of connectivity - more likely that the spike caused the Linode to stop responding to the network.

@pclissold:

> @bernied:
>
> > And is that the load for my linode, or the host?
>
> Your Linode.
>
> My guess is that you should look for the cause of the spike - it's unlikely that the spike was caused by loss of connectivity - more likely that the spike caused the Linode to stop responding to the network.

If that is the case then I'm messing up the entire host. Lish is on the host (surely?), not the linode. So if I couldn't get through, then nobody else on atlanta29 could either.

Think I'd better raise a support ticket for this - don't want to be getting into more trouble than necessary.

It's currently unreachable. traceroute:

```
traceroute to 64.22.125.93 (64.22.125.93), 64 hops max, 40 byte packets
 1  192.168.1.1 (192.168.1.1)  3.554 ms  0.908 ms  0.824 ms
 2  10.96.32.1 (10.96.32.1)  7.712 ms  8.815 ms  7.951 ms
 3  dstswr1-vlan-2.rh.nantny.cv.net (67.83.252.161)  7.983 ms  9.258 ms  8.669 ms
 4  rtr1.ge2-15.mhe.prnynj.cv.net (67.83.252.137)  11.767 ms  10.842 ms  10.141 ms
 5  rtr4-tg11-2.wan.prnynj.cv.net (64.15.6.25)  9.998 ms  10.341 ms  11.463 ms
 6  rtr1-tg11-1.in.nwrknjmd.cv.net (64.15.0.82)  12.049 ms  12.637 ms  11.580 ms
 7  * * *
 8  69.31.95.141 (69.31.95.141)  74.329 ms  12.415 ms  11.795 ms
 9  69.31.95.146 (69.31.95.146)  13.344 ms  34.517 ms  11.550 ms
10  69.22.142.73 (69.22.142.73)  17.683 ms  16.131 ms  17.566 ms
11  * 69.22.142.50 (69.22.142.50)  37.188 ms  83.716 ms
12  69.31.135.130 (69.31.135.130)  35.110 ms  35.028 ms  35.945 ms
13  69.31.135.42 (69.31.135.42)  194.990 ms  113.049 ms  220.069 ms
14  209.51.137.98 (209.51.137.98)  34.540 ms  33.070 ms  33.403 ms
15  64.22.125.93 (64.22.125.93)  34.248 ms  35.236 ms  33.175 ms
```

ssh to the host:

```
spinoza:~/.ssh$ ssh cturner@host92.atlanta.linode.com
ssh: connect to host host92.atlanta.linode.com port 22: Operation timed out
```

And my network usage on the control panel shows a spike at midnight and a 30 minute gap around 5am EST. Currently the panel's showing network activity.

Best, Charles

@bernied:

> Anyone got any ideas, before I raise a support ticket?

My linode on atlanta35 was unresponsive on "Thu Aug 28 04:09:13 CDT 2008". I have a script that checks from another provider (Panix) every thirty minutes. You are not losing your mind.

– Jeff
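
An outside check like the one described above can be quite small. The sketch below is not the script mentioned in this post, just one way to do it in Python, assuming that a successful TCP connection to the SSH port is a good enough sign of life. The function name and the default timeout are illustrative.

```python
import socket


def port_is_reachable(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        # create_connection resolves the name and connects; the socket
        # is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures.
        return False
```

Run from cron on a machine at a different provider, a call such as `port_is_reachable("atlanta29.linode.com", 22)` with a timestamped log line on failure gives you a record of when the Linode dropped off the network, independent of what the dashboard shows.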

@jeffml:

> @bernied:
>
> > Anyone got any ideas, before I raise a support ticket?
>
> My linode on atlanta35 was unresponsive on "Thu Aug 28 04:09:13 CDT 2008". I have a script that checks from another provider (Panix) every thirty minutes. You are not losing your mind.
>
> – Jeff

Thanks for the concern over my mental health, but that (although never perfect) is not really the issue. The thing I'm concerned about is whether it's me that's done the damage, or some other person or piece of equipment.

Linode support said:

> A customer in the Atlanta DC was the target of a DDoS early this morning, and that was resolved fairly swiftly. This was most likely the cause of your network inconsistencies.

So that's good enough for me.

FWIW, in the future, you should come on IRC (#linode on irc.oftc.net). Whenever there's a severe network outage, half a dozen people immediately notice, and the admins are always there to explain what happened after it's been dealt with.
