Can't reach atlanta29
I can't reach my Linode through ssh, lish or the AJAX console (which I've never actually tried before now). But everything seems fine on the Dashboard.
$ ssh bernied@atlanta29.linode.com
ssh: connect to host atlanta29.linode.com port 22: Connection timed out
I can't reach the apache or zope server either.
Last time this happened it came good after some time, maybe 30 minutes, but this is still rubbish.
I thought if it was the server (atlanta29), I'd see something on the forums, but nothing seems to be happening.
It's a Linode 360, by the way.
Anyone got any ideas, before I raise a support ticket?
Thanks
Bernie
15 Replies
When I try to use AJAX, I get (using firefox on WinXP):
> Network Timeout
The server at console-atlanta.linode.com is taking too long to respond.
So why does the dashboard still look ok?
My CPU usage hasn't gone above 3%, IO spiked at 800 a couple of hours ago, but settled since then, and the host is 'idle'.
Is that the problem, the server is lazy?
I was tempted to try a reboot, but I didn't. And the machine seems to have kept running:
$ uptime
10:29:53 up 6 days, 20:26, 1 user, load average: 0.00, 1.87, 5.91
(But that last load number looks high to me - is that the 15 minute average?)
Also, I've had a look at syslog: openvpn was talking to clients and failed at 10:18:41. I'll post logs if anyone wants, but I really don't think this was about my linode.
Curiously, the network and CPU graphs on the dashboard now seem to have a gap for the time the machine was out of touch (it was probably there before, but I didn't notice), so although the dashboard was saying the linode was running, perhaps it only assumed this.
Does anyone know what happened here?
And how it might be prevented in the future?
I'm guessing the traceroute info is not much use now, as we're back up?
$ traceroute vserver
traceroute to vserver (64.22.71.39), 30 hops max, 40 byte packets
1 192.168.0.1 (192.168.0.1) 2.207 ms 0.998 ms 0.968 ms
2 78.144.176.1 (78.144.176.1) 37.177 ms 38.115 ms 34.194 ms
3 78.151.226.129 (78.151.226.129) 38.357 ms 38.293 ms 35.025 ms
4 78.151.225.3 (78.151.225.3) 36.299 ms 39.248 ms 38.874 ms
5 gig-10-1-rtr001.hex.opaltelecom.net (62.24.254.49) 43.634 ms 46.382 ms 44.713 ms
6 xe-10-2-0-scr001.sov.as13285.net (78.144.1.128) 44.529 ms 44.400 ms 44.310 ms
7 xe-10-0-0-scr010.sov.as13285.net (78.144.0.228) 46.834 ms 45.299 ms 113.758 ms
8 195.66.226.167 (195.66.226.167) 183.445 ms 206.404 ms 206.932 ms
9 gnax.ge2-13.br01.atl01.pccwbtn.net (63.216.31.130) 162.686 ms 279.420 ms 200.021 ms
10 atl-core-e-gi4-4.gnax.net (209.51.131.30) 139.588 ms 138.655 ms 139.534 ms
11 vserver (64.22.71.39) 138.155 ms 139.717 ms 139.264 ms
@bernied:
But that last load number looks high to me - is that the 15 minute average?
Yes
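For reference, the three figures after "load average:" are the 1-, 5- and 15-minute averages, in that order. A minimal sketch for pulling them out of an uptime line follows; the function name parse_load_averages is made up for illustration, and the sample line is the one posted above.

```python
import re


def parse_load_averages(uptime_line):
    """Return the (1-min, 5-min, 15-min) load averages from an `uptime` line.

    Hypothetical helper for illustration only; it is not part of any tool
    mentioned in this thread.
    """
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: %r" % uptime_line)
    one, five, fifteen = (float(value) for value in match.groups())
    return one, five, fifteen


# The uptime output quoted earlier in the thread.
line = "10:29:53 up 6 days, 20:26, 1 user, load average: 0.00, 1.87, 5.91"
one, five, fifteen = parse_load_averages(line)
print("1 min:", one, "| 5 min:", five, "| 15 min:", fifteen)
```

When the 15-minute figure is well above the 1-minute figure, as here, the load was higher earlier and has since dropped off. On Linux the number also counts processes waiting on disk I/O, so it is not purely a CPU measure.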
is not in the Atlanta data centre.
??
Which I suppose is fair enough, but why does the dashboard say my machine is running, when it doesn't actually have any contact with it?
When the dashboard says your system is running, that's about the same as having the power LED alight on a physical server.
@pclissold:
@bernied: But that last load number looks high to me - is that the 15 minute average?
Yes
And is that the load for my linode, or the host?
Should I be looking for whatever caused the spike, or is it simply that the loss of contact will have caused the activity?
@bernied:
And is that the load for my linode, or the host?
Your Linode.
My guess is that you should look for the cause of the spike - it's unlikely that the spike was caused by loss of connectivity - more likely that the spike caused the Linode to stop responding to the network.
@pclissold:
@bernied: And is that the load for my linode, or the host?
Your Linode.
My guess is that you should look for the cause of the spike - it's unlikely that the spike was caused by loss of connectivity - more likely that the spike caused the Linode to stop responding to the network.
If that is the case then I'm messing up the entire host. Lish is on the host (surely?), not the linode. So if I couldn't get through, then nobody else on atlanta29 could either.
Think I'd better raise a support ticket for this - don't want to be getting into more trouble than necessary.
traceroute to 64.22.125.93 (64.22.125.93), 64 hops max, 40 byte packets
1 192.168.1.1 (192.168.1.1) 3.554 ms 0.908 ms 0.824 ms
2 10.96.32.1 (10.96.32.1) 7.712 ms 8.815 ms 7.951 ms
3 dstswr1-vlan-2.rh.nantny.cv.net (67.83.252.161) 7.983 ms 9.258 ms 8.669 ms
4 rtr1.ge2-15.mhe.prnynj.cv.net (67.83.252.137) 11.767 ms 10.842 ms 10.141 ms
5 rtr4-tg11-2.wan.prnynj.cv.net (64.15.6.25) 9.998 ms 10.341 ms 11.463 ms
6 rtr1-tg11-1.in.nwrknjmd.cv.net (64.15.0.82) 12.049 ms 12.637 ms 11.580 ms
7 * * *
8 69.31.95.141 (69.31.95.141) 74.329 ms 12.415 ms 11.795 ms
9 69.31.95.146 (69.31.95.146) 13.344 ms 34.517 ms 11.550 ms
10 69.22.142.73 (69.22.142.73) 17.683 ms 16.131 ms 17.566 ms
11 * 69.22.142.50 (69.22.142.50) 37.188 ms 83.716 ms
12 69.31.135.130 (69.31.135.130) 35.110 ms 35.028 ms 35.945 ms
13 69.31.135.42 (69.31.135.42) 194.990 ms 113.049 ms 220.069 ms
14 209.51.137.98 (209.51.137.98) 34.540 ms 33.070 ms 33.403 ms
15 64.22.125.93 (64.22.125.93) 34.248 ms 35.236 ms 33.175 ms
ssh to the host:
spinoza:~/.ssh$ ssh cturner@host92.atlanta.linode.com
ssh: connect to host host92.atlanta.linode.com port 22: Operation timed out
And my network usage on the control panel shows a spike at midnight and a 30 minute gap around 5am EST. Currently the panel's showing network activity.
Best, Charles
@bernied:
Anyone got any ideas, before I raise a support ticket?
My linode on atlanta35 was unresponsive on "Thu Aug 28 04:09:13 CDT 2008". I have a script that checks from another provider (Panix) every thirty minutes. You are not losing your mind.
– Jeff
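An outside check of the kind described above could look roughly like this. It is only a sketch, not the actual script mentioned in the post: the function names, the choice of port 22, and the 10 second timeout are all assumptions. It simply tries to open a TCP connection and reports a timestamped up/unreachable line.

```python
import socket
import sys
from datetime import datetime, timezone


def port_is_reachable(host, port, timeout=10.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS lookup failures.
        return False


def check_and_log(host, port=22, timeout=10.0):
    """Return one timestamped status line for host:port (hypothetical log format)."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    status = "up" if port_is_reachable(host, port, timeout) else "UNREACHABLE"
    return "%s %s:%d %s" % (stamp, host, port, status)


if __name__ == "__main__":
    # Usage: python check.py <hostname>
    if len(sys.argv) == 2:
        print(check_and_log(sys.argv[1]))
```

Run from cron every thirty minutes on a machine at a different provider, with the output appended to a file, this gives a record to compare against the gaps in the dashboard graphs. It only shows that the port answered, not that the services behind it are healthy.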
@jeffml:
@bernied: Anyone got any ideas, before I raise a support ticket?
My linode on atlanta35 was unresponsive on "Thu Aug 28 04:09:13 CDT 2008". I have a script that checks from another provider (Panix) every thirty minutes. You are not losing your mind.
– Jeff
Thanks for the concern over my mental health, but that (although never perfect) is not really the issue. The thing I'm concerned about is whether it's me that's done the damage, or some other person or piece of equipment.
> A customer in the Atlanta DC was the target of a DDoS early this morning, and that was resolved fairly swiftly. This was most likely the cause of your network inconsistencies.
So that's good enough for me.