Fremont / He.net getting ddos
We had one in Newark last week, but Fremont is one every other day…
Dallas / Atlanta have been clean recently (knocking on wood)
92 Replies
all our fremont nodes 50+ are losing packets…
@Guspaz:
I'm sure they'll nullroute the affected system shortly.
how many nodes do you have on linode?
@Alohatone:
happening right now ..
all our fremont nodes 50+ are losing packets…
And going completely on and offline.
I guess this is the straw: I think I'm going to move out of Fremont … though I suppose I should wait until the net is a little stable.
Suggestions amongst Newark, Atlanta, and Dallas? Stability trumps latency for me. Is Dallas preferred for equalizing latency between the coasts?
@Alohatone:
Suggestions amongst Newark, Atlanta, and Dallas? Stability trumps latency for me. Is Dallas preferred for equalizing latency between the coasts?
Why do you assume those servers are immune?
If Linode can't keep one center working consistently, what's stopping the rest from failing?
@smparkes:
@Alohatone: happening right now ..
all our fremont nodes 50+ are losing packets…
And going completely on and offline.
I guess this is the straw: I think I'm going to move out of Fremont … though I suppose I should wait until the net is a little stable.
Suggestions amongst Newark, Atlanta, and Dallas? Stability trumps latency for me. Is Dallas preferred for equalizing latency between the coasts?
Last week Newark had a ddos attack.
thus far we are happy with Dallas and Atlanta
I haven't used Atlanta or Dallas, so I can't make any comments on those.
@Alohatone:
Last week Newark had a ddos attack.
thus far we are happy with Dallas and Atlanta
We did? I haven't noticed any issues…
@Pilate:
Why do you assume those servers are immune?
If Linode can't keep one center working consistently, what's stopping the rest from failing?
A good question. Linode hasn't actually said this is a ddos at this point. Is that certain? And if it is, is it certain that the target of the ddos is a linode customer as opposed, say, to some non-linode target at Fremont?
I know Fremont is HE. Anyone know about the other US datacenters? Is there something about dealing with HE as a network provider that makes Fremont more vulnerable/harder to defend?
@Alohatone:
Last week Newark had a ddos attack.
How do you know? By packet loss? There's nothing on status.linode.com about that …
@smparkes:
@Alohatone: Last week Newark had a ddos attack.
How do you know? By packet loss? There's nothing on status.linode.com about that …
yes, packet loss , submitted a ticket and support updated us.
@smparkes:
I know Fremont is HE. Anyone know about the other US datacenters? Is there something about dealing with HE as a network provider that makes Fremont more vulnerable/harder to defend?
None of the others are HE.
If you want proof that any server can easily fall prey to a DDoS, look at Freenode. They have hubs all around the world on different ISPs. Most of these are in Europe and North America, and we'll sometimes lose part or all of one continent. Of course, IRC is an easy attack target; this is just an example to prove my point.
A good strategy to work around this is to have different servers in different physical locations, then have your DNS setup for failover. From there, it's just a matter of keeping each location's data synchronized.
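As a rough illustration of the failover idea above, here is a minimal Python sketch that picks which location should currently receive traffic. The names `tcp_probe` and `pick_active` are hypothetical, the TCP connect check is only one possible health test, and actually publishing the chosen address (for example in a low-TTL DNS record) depends on your DNS provider and is left out.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_probe(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_active(endpoints: Sequence[str],
                is_healthy: Callable[[str], bool] = tcp_probe) -> Optional[str]:
    """Return the first healthy endpoint in priority order, or None if all fail."""
    for host in endpoints:
        if is_healthy(host):
            return host
    return None
```

A caller would list its servers in order of preference, one per datacenter, and point DNS at whatever `pick_active` returns. Passing a custom `is_healthy` function makes it possible to use an HTTP or application-level check instead of a bare TCP connect.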
@Piki:
Any server, regardless of location (physical and network), regardless of host, ISP, etc, is hard to defend, especially with DDoS.
Sure. But hard does not mean there's nothing that can be done. And we're not talking about the server that's the target of the attack here, we're talking about collateral damage. Still hard but, again, that doesn't mean something can't be done.
What I don't know is how much what Linode can do is dependent on the DC/network provider. I know there are devices like Cisco Guard but I get the impression (but don't know for sure) that that's something that the network provider would use (if they chose to). I also hear about asking network providers to null route targeted IPs. I don't know if this is something that HE does better or worse than other providers.
The question should never be what someone is doing to prevent a DDoS, it should be what they are doing to divert as much of the risk as possible, and in the event a DDoS does occur, what they are doing to come back online.
@Alohatone:
@Guspaz: I'm sure they'll nullroute the affected system shortly.
how many nodes do you have on linode?
Three, but that doesn't really change my comment. Have you not been told in the past that if you need high availability you should be spreading your linodes out over multiple datacenters? Any and every provider suffers from occasional issues like this.
@Guspaz:
@Alohatone:
@Guspaz: I'm sure they'll nullroute the affected system shortly.
how many nodes do you have on linode?
Three, but that doesn't really change my comment. Have you not been told in the past that if you need high availability you should be spreading your linodes out over multiple datacenters? Any and every provider suffers from occasional issues like this.
What do you run on your servers?
@Piki:
I didn't say it was impossible to do anything about it. The point that I want to make is that prevention is completely impossible.
Agreed. The question at hand (as I understand it) is whether something about Fremont/HE makes mitigation slower/less effective.
The status page (http://status.linode.com/) says:
> a network issue that is affecting a percentage of Linodes in Fremont.
Of course it's affecting a "percentage" of Linodes since it is affecting some Linodes – unless some Linodes aren't counted toward the total Linodes
(edited (again) to fix grammar -- apparently I'm byslexik now)
@Piki:
That depends on the staff and the tools that the parent company provides.
Right, which was my question. Seems like linode tries to be very neutral about their DC/network providers, which I guess isn't all that surprising.
But see also the discussion at
If you look up softlayer (which theplanet now is) and ddos, you can find comments about softlayer allocating Cisco Guards to mitigate attacks. So does linode have better ability to mitigate at Dallas?
I suspect linode can't/won't comment on that. And we don't even know for sure whether it was a ddos. But it seems pretty clear that Fremont/HE continue to have more trouble than the rest of the DCs (combined?)
@Piki:
> a network issue that is affecting a percentage of Linodes in Fremont.
Of course it's affecting a "percentage" of Linodes since it is affecting some Linodes – unless some Linodes aren't counted toward the total Linodes :D
I think they meant that it was affecting only a small number of the 'nodes at Fremont.
30 minute warning.
@yujb:
we received an alarm (from an external monitor) for nodes linode114737 and linode29827 at the same time about 10 minutes ago.. seems ok now.
Yep, we see fremont as having issues too.
@bjl:
@Piki:
> a network issue that is affecting a percentage of Linodes in Fremont.
Of course it's affecting a "percentage" of Linodes since it is affecting some Linodes – unless some Linodes aren't counted toward the total Linodes :D
I think they meant that it was affecting only a small number of the 'nodes at Fremont.
Yes, I know. I just find such wording amusing. Even 100% is a "percentage", so that line could mean "a network issue that is affecting the total percentage of Linodes in Fremont", or "a network issue that is affecting all Linodes in Fremont"
@Alohatone:
What do you run on your servers?
A/UX 3.1.1. What does it matter to you?
@Guspaz:
@Alohatone: What do you run on your servers?
A/UX 3.1.1. What does it matter to you?
So that's where the original Panix servers ended up…
@Alohatone:
We've gotta leave Fremont, too bad linode doesn't have another west coast data center.
How is the latency from Hawaii to Tokyo?
@Guspaz:
@Alohatone: We've gotta leave Fremont, too bad linode doesn't have another west coast data center.
How is the latency from Hawaii to Tokyo?
PING tokyo1.linode.com (106.187.33.12): 56 data bytes
64 bytes from 106.187.33.12: icmp_seq=0 ttl=54 time=177.671 ms
64 bytes from 106.187.33.12: icmp_seq=1 ttl=54 time=175.401 ms
64 bytes from 106.187.33.12: icmp_seq=2 ttl=54 time=175.462 ms
64 bytes from 106.187.33.12: icmp_seq=3 ttl=54 time=176.791 ms
64 bytes from 106.187.33.12: icmp_seq=4 ttl=54 time=175.857 ms
64 bytes from 106.187.33.12: icmp_seq=5 ttl=54 time=174.630 ms
64 bytes from 106.187.33.12: icmp_seq=6 ttl=54 time=174.882 ms
PING fremont1.linode.com (64.71.152.17): 56 data bytes
64 bytes from 64.71.152.17: icmp_seq=0 ttl=56 time=67.692 ms
64 bytes from 64.71.152.17: icmp_seq=1 ttl=56 time=67.367 ms
64 bytes from 64.71.152.17: icmp_seq=2 ttl=56 time=66.088 ms
64 bytes from 64.71.152.17: icmp_seq=3 ttl=56 time=66.321 ms
64 bytes from 64.71.152.17: icmp_seq=4 ttl=56 time=77.084 ms
PING dallas1.linode.com (69.164.200.100): 56 data bytes
64 bytes from 69.164.200.100: icmp_seq=0 ttl=52 time=92.661 ms
64 bytes from 69.164.200.100: icmp_seq=1 ttl=52 time=90.804 ms
64 bytes from 69.164.200.100: icmp_seq=2 ttl=52 time=88.173 ms
64 bytes from 69.164.200.100: icmp_seq=3 ttl=52 time=86.775 ms
64 bytes from 69.164.200.100: icmp_seq=4 ttl=52 time=91.599 ms
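For anyone who wants to compare the three runs above without eyeballing them, here is a small Python sketch that pulls the `time=` values out of ping output and summarizes them. The function name `summarize_ping` is made up for this example, and the regular expression assumes the output format shown above. Averaged this way, the runs come out at roughly 176 ms to Tokyo, 69 ms to Fremont, and 90 ms to Dallas.

```python
import re
from statistics import mean

# Matches the round-trip time in lines like "... ttl=56 time=67.692 ms"
PING_TIME = re.compile(r"time=([\d.]+) ms")


def summarize_ping(output: str) -> dict:
    """Return count, min, avg and max round-trip time (ms) from ping output."""
    times = [float(t) for t in PING_TIME.findall(output)]
    if not times:
        raise ValueError("no round-trip times found")
    return {"count": len(times), "min": min(times),
            "avg": mean(times), "max": max(times)}


# Fremont replies copied from the output above.
fremont = """\
64 bytes from 64.71.152.17: icmp_seq=0 ttl=56 time=67.692 ms
64 bytes from 64.71.152.17: icmp_seq=1 ttl=56 time=67.367 ms
64 bytes from 64.71.152.17: icmp_seq=2 ttl=56 time=66.088 ms
64 bytes from 64.71.152.17: icmp_seq=3 ttl=56 time=66.321 ms
64 bytes from 64.71.152.17: icmp_seq=4 ttl=56 time=77.084 ms
"""
stats = summarize_ping(fremont)
print(f"{stats['count']} replies, avg {stats['avg']:.1f} ms")
```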
SoftLayer has a DC in Fremont (well, San Jose), but it seems like Linode picks a different company for each DC to have better isolation. Surely there is some DC in Fremont that isn't as problematic as HE?
Date on that post is 6/21. Seems pretty clear HE hasn't figured out how to deal with this …
@Crusader:
Aside from Fremont, what would you guys suggest to be the next best datacentre for the West Coast?
I went with Dallas since it's nearest to the west coast (and makes both coasts equally good/bad) and it's theplanet/softlayer which I have some good experience with (though everybody can have issues and that could change …)
This is turning into a bad joke
With the current state of Fremont, Linode does not have a west coast location to offer right now.
A lot of my traffic is west coast and I hope Linode can offer something reliable there in the near future.
> HE is having an issue network wide right now. We're seeing all kinds of routing issues outside their DC.
> It's not just the FMT facility. This is having a severe impact in Phoenix as well.
Oh … got four pings through. But now gone again …
sigh.
Lish console (console-fremont.linode.com) is not responding either.. a quick traceroute..
TraceRoute to 66.220.1.244 [console-fremont.linode.com]
Hop (ms) (ms) (ms) IP Address Host name
1 0 0 0 206.123.64.154 jbdr2.0.dal.colo4.com
2 0 0 0 64.124.196.225 xe-4-2-0.er2.dfw2.us.above.net
3 1 0 0 206.223.118.37 10gigabitethernet3-1.core1.dal1.he.net
4 38 28 28 184.105.213.118 10gigabitethernet4-4.core1.chi1.he.net
5 74 74 74 184.105.213.86 10gigabitethernet3-2.core1.den1.he.net
6 Timed out Timed out Timed out -
7 Timed out Timed out Timed out -
8 Timed out Timed out Timed out -
From reading posts elsewhere about this incident… this may well have been avoided if HE had maintained its core routers, or used routers that could withstand this type of attack.
When you stack up the power outages, and now this… it comes back to HE's failure to maintain their infrastructure.
@Crusader:
HE seems to be back up but lagging quite a bit, wouldn't want to try to clone a node until Fremont is "stable" again.
Not for me it's not. Still timing out.
@dobie:
I'm moving all my Linodes to Dallas as soon as HE is back up. Does anyone know what Linode or HE offers in compensation for the downtime?
I'd be happy with a plan for a non-HE DC on the west coast.
I don't really blame them for HE problems but I do rely on them to pick good providers. They're too good not to know that, so I presume they have to be thinking about relocating, though I'm sure that's pretty painful/costly.
Maybe somebody can correct me, but I don't think they own their IP addresses. 64.71.152.17 shows as ASN 6939 (HE). That means a forced renumbering, right? They could keep HE and open another out here but that's probably expensive.
And FWIW, fremont1.linode.com is pinging again.
Well, it was.
And it is again (over the course of writing this and looking up the ASN).
@smparkes:
@dobie: I'm moving all my Linodes to Dallas as soon as HE is back up. Does anyone know what Linode or HE offers in compensation for the downtime?
I'd be happy with a plan for a non-HE DC on the west coast.
I would love a west coast DC (that's reliable), which is why I've waited so long to move off of Fremont. However, this is just too much. I mean, Linode, you've GOT to be embarrassed as all get out by the state of the Fremont DC?! Why don't you do something about it?
It's really pathetic.
When we're back up, I'm moving.
I had 100% uptime when in Dallas for years. Moved to Fremont because my customer is local. Thought the latency would be better. Downtime is the suck. It seems like it's every few weeks.
I'm moving all my Linodes back to Dallas.
Sorry guys. I'm out.
@xb95:
I'm moving to Amazon. The price for my 2GB Linode is just about the same as the price of a 1.7GB EC2 instance. I came to Linode from Slicehost, but I can't handle these kinds of problems.
Sorry guys. I'm out.
Rather than dump linode, migrate to another datacenter?
@xb95:
I'm moving to Amazon. The price for my 2GB Linode is just about the same as the price of a 1.7GB EC2 instance. I came to Linode from Slicehost, but I can't handle these kinds of problems.
Sorry guys. I'm out.
Amazon has had their share of major problems (days) and some of those included data loss. And when they did have issues, there was no one at support that would talk to you. And it's a different service. Ephemeral instances are different than VMs a la Linode.
And this is HE, not Linode.
But it does demonstrate why linode has to separate themselves from HE. Linode's service is stellar but they're getting tainted by HE.
But as everyone has implied, I have a hard time believing anyone will stay at HE after this so it's hard to imagine Linode not making a significant change at this point.
@smparkes:
But it does demonstrate why linode has to separate themselves from HE. Linode's service is stellar but they're getting tainted by HE.
But as everyone has implied, I have a hard time believing anyone will stay at HE after this so it's hard to imagine Linode not making a significant change at this point.
I agree COMPLETELY. Linode is awesome and I love them. But the fact that they continue to tolerate HE downtime and DO NOTHING, leads me to believe that Linode is negligent and uncaring. The best thing they could do is to dump HE completely, beg forgiveness of their customers, and move on. That's the kind of backpedaling that needs to occur.
Instead, they continue to make lame excuses, blame HE and promise that – at some point in the far future -- management will figure out a solution.
For goodness sakes, you're STILL SELLING spots in Fremont! STOP!
Linode, black marks are VERY hard to remove from your record. Every day that goes by, you're getting more and more black marks.
thanks,
bruce
I think we have to be a bit careful about the "DO NOTHING" part. We don't really know what's going on behind the scenes. This isn't something you change overnight and generally companies don't announce something until they have something solid in place.
In retrospect, it's easy to think they should have made changes sooner. But I don't know enough to know how much of this could have been foreseen.
But I agree that this has turned from being a strictly technical issue to being a brand issue, which to some extent might make the plans easier for them.
I would say to anyone thinking of leaving linode altogether to remember that they are extremely helpful compared to the rest. I have asked to be moved to Tokyo when the dust settles. Even in this situation, which must be chaotic, I got an exceptional reply from a support person called Trevor. Wow, I think he cared! It's a rare thing.
Because I've also had ISP problems, I've only just realised this issue existed. I need to set up some better monitoring.
Good luck getting to the bottom of it, linode peeps.
I would like to hear from Linode how they plan on mitigating this ugliness that is apparently becoming quite common at Fremont. Given what I'm reading here and at webhostingtalk.com, this is becoming a bit of a joke. That is, unless you have a site at HE.
1) move out of HE - expensive and painful
2) add another upstream independent of HE's infra (linode owns the IPs after all) - will add overhead to HE DC but will probably be worth it…
3) do 2 first while moving out and doing a 1.
> add another upstream independent of HE's infra
Anybody know what carriers are at FMT1/FMT2? (And which is linode? I seem to have seen both …)
HE peers with people, but I'm not sure if it's at those facilities. Not a lot of info on their website.
> linode owns the IPs after all
This I really do wonder about. fremont1.linode.com is in HE's ASN.
@yujb:
well linode can do any of these moving forward:
1) move out of HE - expensive and painful
2) add another upstream independent of HE's infra (linode owns the IPs after all) - will add overhead to HE DC but will probably be worth it…
3) do 2 first while moving out and doing a 1.
All great ideas, but to me, the travesty here is that Linode is continuing to sell bandwidth and space on the Fremont datacenter. That's kind of like saying "hey, we're going to charge you 100%, but you'll have uptime in the 90% range".
It's unethical.
If Linode was truly sorry about this situation, and was willing to fix it, they would at least stop offering Fremont as an option to NEW customers. What existing customers do (either moving to another DC or not) is a secondary concern.
@smparkes:
This I really do wonder about. fremont1.linode.com is in HE's ASN.
It could be that the existing setup with HE means that Linode's servers have fewer upstream routes than what HE has available. By putting the API hosts on HE's network range they can get more redundancy. So even if the servers themselves are inaccessible due to a DoS on the upstream(s) they're using or against Linode's known IP ranges, the app servers for the API will still be accessible.
@Alohatone:
is everyone stable now? we're seeing occasional drops, but nothing like today.
The outage periods seem to be getting less frequent and not quite as long. I wasn't time stamping pings before so I don't know how long ago the last one was but it was out for at least a few minutes.
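Timestamping each probe makes it possible to say afterwards exactly when the link was down and for how long. Here is a minimal Python sketch of that bookkeeping. The `Sample` type and `outage_windows` function are hypothetical names, the probe itself (ping, TCP connect, HTTP request) is assumed to run elsewhere, and the timestamps in the example are arbitrary.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

Sample = Tuple[datetime, bool]  # (when the probe ran, whether it got a reply)


def outage_windows(samples: Iterable[Sample]) -> List[Tuple[datetime, datetime]]:
    """Collapse consecutive failed probes into (first_failure, last_failure) windows."""
    windows = []
    start = last = None
    for when, ok in samples:
        if not ok:
            if start is None:
                start = when          # first failure of a new outage
            last = when
        elif start is not None:
            windows.append((start, last))  # a reply arrived, so the outage ended
            start = last = None
    if start is not None:             # still down at the end of the log
        windows.append((start, last))
    return windows


# Example: one probe every 10 seconds, with made-up results.
t0 = datetime(2000, 1, 1, 12, 0, 0)
results = [True, False, False, True, False]
samples = [(t0 + timedelta(seconds=10 * i), ok) for i, ok in enumerate(results)]
windows = outage_windows(samples)
for begin, end in windows:
    print(f"down from {begin:%H:%M:%S} to {end:%H:%M:%S}")
```

Each window runs from the first failed probe to the last failed probe, so the real outage is at most one probe interval longer on either side.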
> it could be that the existing setup with HE means that Linode's servers have fewer upstream routes than what HE has available. By putting the API hosts on HE's network range they can get more redundancy
I don't really understand this.
The ASN says who owns the IP address block and who is going to announce it via BGP. (I think … my understanding of BGP is high level only). As far as I could find, Linode doesn't have their own ASN so they always get their IP address ranges from their upstream provider and can't (1) move them across providers or (2) announce them to another transit provider.
But I could be wrong …
@smparkes:
> it could be that the existing setup with HE means that Linode's servers have fewer upstream routes than what HE has available. By putting the API hosts on HE's network range they can get more redundancy
I don't really understand this.
The ASN says who owns the IP address block and who is going to announce it via BGP. (I think … my understanding of BGP is high level only). As far as I could find, Linode doesn't have their own ASN so they always get their IP address ranges from their upstream provider and can't (1) move them across providers or (2) announce them to another transit provider.
But I could be wrong …
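well looks like the blocks are Linode's (http://whois.arin.net/rest/net/NET-74-207-224-0-1/pft) but they've decided to chop it up, assign some to their HE DC and announce the blocks using HE's ASN.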
@yujb:
well looks like the blocks are Linode's (http://whois.arin.net/rest/net/NET-74-207-224-0-1/pft) but they've decided to chop it up, assign some to their HE DC and announce the blocks using HE's ASN.
Yup, I missed that.
Network issue in Fremont
02:18AM (EDT): The network upstream of our Fremont facility appears to be experiencing stability issues once again at this time. We've alerted the network operation center at the Fremont facility to have them investigate. We'll provide more information once it is available.
@bbergman:
All great ideas, but to me, the travesty here is that Linode is continuing to sell bandwidth and space on the Fremont datacenter. That's kind of like saying "hey, we're going to charge you 100%, but you'll have uptime in the 90% range".
It's unethical.
If Linode was truly sorry about this situation, and was willing to fix it, they would at least stop offering Fremont as an option to NEW customers. What existing customers do (either moving to another DC or not) is a secondary concern.
While I agree with you, it's Caker's business, and it's up to him to run it to make money, which doesn't always make a good bed-fellow with ethics….
@smparkes:
Suggestions amongst Newark, Atlanta, and Dallas? Stability trumps latency for me. Is Dallas preferred for equalizing latency between the coasts?
I've not had any issues in Dallas. Not that I know about anyway
@Mr Nod:
@bbergman: All great ideas, but to me, the travesty here is that Linode is continuing to sell bandwidth and space on the Fremont datacenter. That's kind of like saying "hey, we're going to charge you 100%, but you'll have uptime in the 90% range".
It's unethical.
If Linode was truly sorry about this situation, and was willing to fix it, they would at least stop offering Fremont as an option to NEW customers. What existing customers do (either moving to another DC or not) is a secondary concern.
While I agree with you, it's Caker's business, and it's up to him to run it to make money, which doesn't always make a good bed-fellow with ethics….
You know, users who don't research have no one but themselves to blame. It's not like Linode has tried to hide the past fremont issues from the public.
@FunkyRes:
@Mr Nod:
@bbergman: All great ideas, but to me, the travesty here is that Linode is continuing to sell bandwidth and space on the Fremont datacenter. That's kind of like saying "hey, we're going to charge you 100%, but you'll have uptime in the 90% range".
It's unethical.
If Linode was truly sorry about this situation, and was willing to fix it, they would at least stop offering Fremont as an option to NEW customers. What existing customers do (either moving to another DC or not) is a secondary concern.
While I agree with you, it's Caker's business, and it's up to him to run it to make money, which doesn't always make a good bed-fellow with ethics….
You know, users who don't research have no one but themselves to blame. It's not like Linode has tried to hide the past fremont issues from the public.
When you sign up for Linode, it's not obvious which ISPs supply which locations…
It's all presented as Linode's… and support will always tell you "all data centers are equal".
I'm ashamed we stayed @ Fremont as long as we did, but honestly, I trusted Linode too much to have some pull at HE.net, or to know more about their provider than I do… since really I'm a customer of Linode, not HE.net
@xb95:
I'm moving to Amazon. The price for my 2GB Linode is just about the same as the price of a 1.7GB EC2 instance. I came to Linode from Slicehost, but I can't handle these kinds of problems.
Sorry guys. I'm out.
2GB Linode on 24-month term (safe since you can cancel at any time): $67.96
Price of 1.7GB EC2 instance with equivalent storage/bandwidth, 3-year reserved instance to be fair:
Monthly cost (reservation): $9.72
Monthly cost (instance): $21.60
Bandwidth: $96.00
Total: $127.32
In order to break even, average monthly bandwidth usage must be no more than 305 GB.
Downsides: EC2 instance has 85% the RAM, ~60%* the guaranteed CPU power, and ~13%* the burst CPU power.
Upside: EC2 instance has 200% the local storage space, potential cost savings if bandwidth usage is under 305 GB per month.
*: This is a guesstimate based on the following assumptions:
1) Linode hosts have 20GB of instance RAM, meaning 10x 2048 linodes per host machine
2) Linode host machines have two quad-core modern Xeon processors, for an equivalent 16 EC2 compute units per host, based on Amazon's estimate that a compute unit is equivalent to 1.0-1.2 GHz of a 2007 Xeon.
3) Guaranteed compute on a 2048 linode is 1.6 compute units
4) Burst compute on a 2048 linode is 8.0 compute units
Anyhow, knowing all this, I'd pick the Linode any day, but for very low bandwidth and CPU usage, the Amazon instance could come out cheaper.
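The arithmetic behind those figures can be checked with a short Python sketch. The dollar amounts and compute-unit estimates are taken from the post above. Two things are assumptions on my part: the $0.12 per GB bandwidth rate, which is what the 305 GB break-even figure implies but is not stated in the post, and the 1.0 compute unit for the EC2 instance, which is what the ~60% and ~13% ratios imply.

```python
LINODE_MONTHLY = 67.96        # 2GB Linode on a 24-month term (from the post)
EC2_RESERVATION = 9.72        # monthly share of the 3-year reservation (from the post)
EC2_INSTANCE = 21.60          # monthly instance charge (from the post)
EC2_BANDWIDTH_FULL = 96.00    # bandwidth line item (from the post)
RATE_PER_GB = 0.12            # ASSUMPTION: per-GB rate implied by the 305 GB figure

# Full EC2 cost when using the same bandwidth allowance as the Linode plan.
ec2_total = EC2_RESERVATION + EC2_INSTANCE + EC2_BANDWIDTH_FULL

# Bandwidth usage at which EC2 costs the same as the Linode.
break_even_gb = (LINODE_MONTHLY - EC2_RESERVATION - EC2_INSTANCE) / RATE_PER_GB

# Capacity ratios (EC2 relative to the Linode), using the post's estimates.
EC2_COMPUTE_UNITS = 1.0       # ASSUMPTION: implied by the ~60% / ~13% ratios
LINODE_GUARANTEED_CU = 1.6    # 16 compute units per host / 10 Linodes per host
LINODE_BURST_CU = 8.0         # burst estimate from the post
ram_ratio = 1.7 / 2.0
guaranteed_ratio = EC2_COMPUTE_UNITS / LINODE_GUARANTEED_CU
burst_ratio = EC2_COMPUTE_UNITS / LINODE_BURST_CU

print(f"EC2 total: ${ec2_total:.2f}, break-even at {break_even_gb:.0f} GB")
print(f"RAM {ram_ratio:.0%}, guaranteed CPU {guaranteed_ratio:.1%}, "
      f"burst CPU {burst_ratio:.1%}")
```

Under these assumptions the numbers agree with the post: a total of $127.32, a break-even point of about 305 GB, 85% of the RAM, and 62.5% and 12.5% of the guaranteed and burst CPU, which the post rounds to ~60% and ~13%.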
@xb95:
I'm moving to Amazon. The price for my 2GB Linode is just about the same as the price of a 1.7GB EC2 instance.
Perhaps. But performance isn't