Fremont glitch 6/21 10am HST
All our Linodes were unreachable for 90+ seconds (per our Nagios monitor)…
For our email stuff, that's not so bad (though it did show up on our IMAP proxy),
but for our VoIP PBXs, it caused calls to drop…
So we're just wondering if anyone in the community heard anything.
17 Replies
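I checked the Linode Status page, but it has nothing on this issue (at least not so far).
Fremont seems to have all the problems! As a new Linode user, I picked Fremont because it was close, but so far I've seen: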
Backups failing every day for over two weeks
DNS Manager issues (reporting zones with errors)
Loss of connectivity today
I'm wondering if it's time to migrate to another DC. Anyone? Thoughts?
thanks,
bruce
@bbergman:
I'm wondering if it's time to migrate to another DC. Anyone? Thoughts?
Just choose the nearest one for the main audience of your site(s):
@bbergman:
I picked Fremont because it was close
Close? How so?
The Internet is a big place, and unless you're running two-way video or voice, or playing FPS-type games, what's a few more milliseconds give or take?
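One way to put a number on "a few more milliseconds" is to time TCP connections from your own location to a host in each candidate data center. This is only a minimal sketch: the function names are made up for illustration, the default port is an assumption, and you would need to supply hosts you actually control in each DC.

```python
import socket
import time
from statistics import median


def tcp_connect_ms(host, port=80, timeout=3.0):
    """Return the time in milliseconds taken to open one TCP connection."""
    start = time.perf_counter()
    # create_connection performs name resolution plus the TCP handshake.
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_connect_ms(host, port=80, samples=5, timeout=3.0):
    """Median of several attempts, to smooth out one-off spikes."""
    return median(tcp_connect_ms(host, port, timeout) for _ in range(samples))
```

Running `median_connect_ms()` against one host per data center gives a rough comparison. Connection time includes DNS lookup and the handshake, so treat it as an upper bound on round-trip latency rather than a true ping.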
@OZ:
Just choose the nearest one for the main audience of your site(s):
I think you're missing the point. The DCs are clearly NOT all the same. This Fremont location is building up a poor reputation for stability.
One would normally think to pick the closest one, and that's exactly what I did (Fremont is in my state), but now I'm thinking I should have read up on the quality and performance of the various sites before choosing.
I'm now thinking Atlanta or Dallas. Anyone?
thanks,
bruce
@vonskippy:
@bbergman:
I picked Fremont because it was close
Close? How so?
The Internet is a big place, and unless you're running two-way video or voice, or playing FPS-type games, what's a few more milliseconds give or take?
Well, the difference in ping between Britain and the USA is very noticeable. I became a Linode client only because Linode has a data center in London :)
@bbergman:
I think you're missing the point. The DCs are clearly NOT all the same.
I understood you. I offered it just as a starting point :)
And I think Dallas, because…
About stability:
Looking at the graphs I see a huge spike for incoming traffic at about that time (8am PST). I'm wondering if someone made a mistake and realized it very quickly after the fact.
@waldo:
Why does it seem that people post in the forums instead of contacting support when they have issues like this? It would be nice to hear from someone at Linode to see if it was a Linode or HE issue and with all of the recent issues in Fremont to see if Linode is doing anything, like how hard they're leaning on HE to straighten their stuff up.
Actually, I HAVE created tickets with Support for all of my issues. In fact, the Support folks probably cringe every time I create one.
I came to the forums here because on more than one occasion, the Support folks have told me to look for help here (witness my recent performance issue of about two weeks ago). I think there are some awesomely smart people here, and the more we can share this information and crowd-source solutions, the better.
But yes, Support should always be the first stop. And by all means, I would love to hear an official report from Linode on the Fremont DC. The backup problem is getting almost ridiculous, for one…
thanks,
bruce
@bbergman:
Backups failing every day for over two weeks
DNS Manager issues (reporting zones with errors)
Loss of connectivity today
A single storage pool in Fremont has had problems over the past few weeks, which we've been working to correct.
Part of the solution requires redistributing the data to more storage capacity that was to be added to the pool in question. We encountered some hardware issues which prolonged adding the additional capacity. This was unanticipated, has since been resolved, and now we're in the final stages of moving the data around (which takes a while!). I'd expect the backup service for those affected to resume normal (even better!) operation within the next day. We're working on it.
We have also been discussing our procedures and how we could have handled this situation better - including customer notifications and compensation for the backup service downtime - rest assured that we'll make it right. Again, we're working on it.
The DNS Manager 'check zone' issue is completely unrelated to Fremont and has since been resolved.
The connectivity issue that was seen today was caused by a BGP session failure, upstream from us. We're working with the Fremont facility to resolve this, but I don't have any additional information at this time.
-Chris
@bbergman:
Fremont seems to have all the problems! As a new Linode user, I picked Fremont because it was close, but so far I've seen:
Backups failing every day for over two weeks
DNS Manager issues (reporting zones with errors)
Loss of connectivity today
I'm wondering if it's time to migrate to another DC. Anyone? Thoughts?
thanks,
bruce
:shock:
Actually, we've been quite impressed with Linode. We've tried to run VoIP on other platforms (Rackspace) and it didn't work out: a mix of latency and the VE's processor.
Plus, you really have to appreciate that Linode lets you pool the bandwidth.
As for the backup, we use Amazon S3 from within our Linode. It works well, and not all your eggs are in one basket.
As for the DNS, we just run our own. Easy enough.
Besides this 90-second outage (which only really affects VoIP users)… www and email could always blame it on something else, lol. Linode has rocked for us.
@waldo:
Why does it seem that people post in the forums instead of contacting support when they have issues like this? It would be nice to hear from someone at Linode to see if it was a Linode or HE issue and with all of the recent issues in Fremont to see if Linode is doing anything, like how hard they're leaning on HE to straighten their stuff up.
Looking at the graphs I see a huge spike for incoming traffic at about that time (8am PST). I'm wondering if someone made a mistake and realized it very quickly after the fact.
We've all become specialists… HE.net, I'm sure, gets paid very well to run BGP… (off topic, but have you seen the $1/GB? That's unreal!).
We did talk to Linode, but they're only human. The community would have resources beyond Linode (like maybe someone is an XO customer and could verify the outage at Fremont)…
Just like open source, lots of minds are better than some.
@caker:
The DNS Manager 'check zone' issue is completely unrelated to Fremont and has since been resolved.
-Chris
Thanks, Chris. I made an assumption on that based on other forum posts that indicated that if you change DCs, you have to redo all your DNS Manager zones. In fact, I found much discussion around this, including a script to help in that migration. That being the case, I just naturally assumed that the DNS Manager entries were tied to the DC.
Are you saying that if I transitioned to Atlanta, say, tomorrow, that I would not have to make ANY changes to my DNS Manager zones (other than what I choose to make personally)?
Thanks!
bruce
@Alohatone:
Actually, we've been quite impressed with Linode. We've tried to run VoIP on other platforms (Rackspace) and it didn't work out: a mix of latency and the VE's processor.
For the record, and lest I sound less than happy here, I am INCREDIBLY VERY REALLY happy with Linode. It's a class act! I can't tell you how many shoddy hosts there are out there, and although I waffled on signing up with Linode, as soon as I got into the control panel and booted up my first node in 15 minutes, I have been smiling the whole time.
Seriously, the issues with Fremont just have me concerned. As a newbie hoster here, naturally I am a bit worried when I see issues like this pop up. Multiple issues. Maybe it's just coincidence that they all happened now, but it was a bit concerning.
Overall though? You'd have to pull out my fingernails before I'd move off of Linode. Just sayin'.
@Alohatone:
As for the backup, we use Amazon S3 from within our Linode. It works well, and not all your eggs are in one basket.
As for the DNS, we just run our own. Easy enough.
Yes, I have been thinking about off-site backups. I may actually do that. I haven't yet had the need to do a Linode restore, but I like the security and convenience of the service.
As for DNS, that's what I USED to do. When I moved here, I actually had my own DNS up and running, but honestly, if they can do it better (and it seems to be very simple), then why should I use my bandwidth and have to do the admin for it on the box? This way my DNS is managed at a meta-layer above my servers.
I note that my secondary DNS is still hosted with afraid.org, which is also very cool.
Mahalo a nui loa!
Bruce
@bbergman:
Are you saying that if I transitioned to Atlanta, say, tomorrow, that I would not have to make ANY changes to my DNS Manager zones (other than what I choose to make personally)?
Well, you'd have to fix any IP addresses in records for the Linode you moved, since it would get a new address in the new DC. But the DNS Manager (and DNS zone hosting) is part of your account, not a specific Linode or DC. In practice your account's zone data is then distributed among the 5 Linode nameservers (one per data center).
In terms of recent discussions and/or scripts, you're probably thinking of API scripts to perform the IP change in bulk, but that's just changing individual records, no different than you'd have to do with any DNS hosting - you don't have to recreate zones or anything.
– David
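For anyone curious what such a bulk IP change amounts to, here is a minimal sketch of the record-rewriting step only. It does not call the Linode API. The record layout (dicts with `name`, `type`, and `target` keys) and the function name `replace_ip` are assumptions made up for illustration; a real script would fetch the records from whatever DNS host you use and push the changed ones back.

```python
def replace_ip(records, old_ip, new_ip):
    """Return a new list of records with old_ip swapped for new_ip.

    Only address records (A/AAAA) are touched; everything else is
    passed through unchanged. The input list is not modified.
    """
    updated = []
    for rec in records:
        if rec["type"] in ("A", "AAAA") and rec["target"] == old_ip:
            # Copy the record so the caller's original data stays intact.
            rec = {**rec, "target": new_ip}
        updated.append(rec)
    return updated


# Example data using documentation-only address ranges (hypothetical zone).
zone = [
    {"name": "www", "type": "A", "target": "192.0.2.10"},
    {"name": "mail", "type": "MX", "target": "mail.example.com"},
]
moved = replace_ip(zone, "192.0.2.10", "198.51.100.20")
```

The zones themselves stay as they are; only the individual records that pointed at the old address change, which matches what David describes above.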