Cross-datacenter load balancing and closest-server preference?

I'm interested in setting up a high-availability, cross-datacenter, load-balanced cluster of servers for a single website.

The goal is simple: I want to have one website (with a lot of visitors) hosted on a web server like Nginx, with multiple servers across several datacenters.

One of the issues is that all these servers need to keep the same database in sync.

Another issue is that resources should be shared evenly across these datacenters, while preferring the closest server to each visitor if resource usage is fairly balanced.

So, how would one set up such load balancing while having one synced database for high availability? (I'm thinking about solutions like Redis, CouchDB, MongoDB.)

And how would one at the same time make sure that visitors are pointed to the server that is closest to them, unless another server is using significantly fewer resources?

Has this been done before, or maybe even documented?

Any suggestions are appreciated.

Thanks.

13 Replies

@tommedema:

> high availability cross-datacenter

Just to let you know, the IP-based high availability stuff only works within the same datacenter (since our IP addresses are routed to a single facility). To accomplish cross-datacenter HA, you'd be using DNS and your servers would have different IP addresses.

Just clearing that up.

@jed:

> @tommedema:
>
> > high availability cross-datacenter
>
> Just to let you know, the IP-based high availability stuff only works within the same datacenter (since our IP addresses are routed to a single facility). To accomplish cross-datacenter HA, you'd be using DNS and your servers would have different IP addresses.
>
> Just clearing that up.

Alright, but it is possible to set up high availability across datacenters, right?

On a Linode, I think all of your options when it comes to high availability are going to involve DNS. Round-robin can reduce the impact of a server going down (4 servers, one goes down, 75% of initial requests still make it through), and low DNS TTLs can reduce how long visitors keep being sent to the downed server once you take it out of the rotation.
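The arithmetic behind that "75%" figure can be sketched in a few lines. This is only an illustration under simplifying assumptions: resolvers spread visitors evenly across all records and honor the TTL. The function names and the example numbers are hypothetical, not anything Linode-specific.

```python
def surviving_fraction(total_servers: int, down_servers: int) -> float:
    """Fraction of first-attempt requests that land on a live server,
    assuming round-robin spreads visitors evenly across all records."""
    if total_servers <= 0 or not 0 <= down_servers <= total_servers:
        raise ValueError("need total_servers > 0 and 0 <= down_servers <= total_servers")
    return (total_servers - down_servers) / total_servers


def worst_case_stale_seconds(ttl_seconds: float, time_to_update_dns_seconds: float) -> float:
    """Rough upper bound on how long a visitor may still be handed the dead
    server: the time until the record is removed, plus one full TTL for
    caches that fetched the old answer just before the update."""
    return time_to_update_dns_seconds + ttl_seconds


fraction = surviving_fraction(4, 1)             # the 4-server example above
window = worst_case_stale_seconds(300, 120)     # hypothetical: 5 min TTL, 2 min to react
```

A lower TTL shrinks the second term's companion, but it does nothing about the time it takes someone (or something) to notice the outage and update the record.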

As for load balancing, you have three options:

1) DNS round-robin. Not perfect load balancing since it doesn't take load into account, but it can help spread the load.

2) GeoDNS. Not accurate since it redirects people based on where their DNS server is, not where they are. For example, I use Google DNS from Montreal, and my ISP routes me through Toronto. I have no idea where a GeoDNS solution would think I live!

3) Load balancing redirects. The idea is that you have frontline server(s) that use some strategy to decide which application server the customer is redirected to. If you've ever seen your browser going to www1, www2, www3, etc., then they're probably doing something like this.

Of course there is always the option of load-balancing different resources to different places: an application server, a database server, a content server, etc. Keep in mind, though, that all Linodes in an account, no matter what datacenter, will share a common bandwidth pool; you don't need to load-balance for that.
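To make option 3 a bit more concrete, here is a minimal sketch of the decision a frontline server might make. It assumes a plain round-robin strategy and HTTP 302 redirects; the host names and the `RedirectBalancer` class are made up for illustration, and a real deployment would wire this into an actual web server.

```python
import itertools

# Hypothetical application servers; replace with your own host names.
APP_HOSTS = ["www1.example.com", "www2.example.com", "www3.example.com"]


class RedirectBalancer:
    """Hands out redirect targets in round-robin order."""

    def __init__(self, hosts):
        if not hosts:
            raise ValueError("at least one application host is required")
        self._hosts = itertools.cycle(list(hosts))

    def redirect_for(self, path: str):
        """Return an (HTTP status, Location header) pair for this request."""
        host = next(self._hosts)
        return 302, f"http://{host}{path}"


balancer = RedirectBalancer(APP_HOSTS)
status, location = balancer.redirect_for("/index.html")
```

Swapping the round-robin iterator for a load-aware choice is where the "some strategy" part comes in.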

@Guspaz:

> On a Linode, I think all of your options when it comes to high availability are going to involve DNS. Round-robin can reduce the impact of a server going down (4 servers, one goes down, 75% of initial requests still make it through), and low DNS TTLs can reduce how long visitors keep being sent to the downed server once you take it out of the rotation.
>
> As for load balancing, you have three options:
>
> 1) DNS round-robin. Not perfect load balancing since it doesn't take load into account, but it can help spread the load.
>
> 2) GeoDNS. Not accurate since it redirects people based on where their DNS server is, not where they are. For example, I use Google DNS from Montreal, and my ISP routes me through Toronto. I have no idea where a GeoDNS solution would think I live!
>
> 3) Load balancing redirects. The idea is that you have frontline server(s) that use some strategy to decide which application server the customer is redirected to. If you've ever seen your browser going to www1, www2, www3, etc., then they're probably doing something like this.
>
> Of course there is always the option of load-balancing different resources to different places: an application server, a database server, a content server, etc. Keep in mind, though, that all Linodes in an account, no matter what datacenter, will share a common bandwidth pool; you don't need to load-balance for that.

Thanks. How exactly does Linode limit my options though? I'm not sure why a cloud server would result in any limitations compared to a colocated server.

When I visit google.com I'm pretty sure it directs me to a nearby server, as my latency to it is always extremely low.

Do you have any idea how they accomplish this? GeoDNS?

Limited in that you don't control your own routing, so you can't do anycast or geocast (probably what Google is doing), and you're stuck with using DNS or software to do your balancing. One option that I didn't mention is that it's possible to have your DNS server direct people based on logic, so you could have a DNS server send people to the servers with the lowest load. I think that DNS caching would make that impractical, though.

To use GeoDNS, you'd need to sign up with a DNS provider that supports it.

Personally, I'd go for the following approach:

1) DNS round-robin for the initial load spreading. As mentioned, this also reduces the impact of a downed server.

2) Low TTL on DNS so that you can update DNS to take a downed server out of the rotation quickly.

3) Have the application servers themselves redirect a percentage of users if a server is overloaded, since DNS round-robin does not produce equal load balancing. You'd need custom code for this, so that each application server is aware of the load of the other servers and can decide where to redirect to.
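As a rough sketch of step 3, the "redirect a percentage of users" decision could look something like the following. This is one possible rule among many, not a tested HA design: it assumes every server already knows a load figure between 0 and 1 for each peer, and it sheds just enough traffic to bring an overloaded server back toward the average. All names are hypothetical.

```python
import random


def redirect_probability(own_load: float, mean_load: float) -> float:
    """Share of incoming users to pass on so this server drifts back
    toward the mean load. Zero when the server is at or below the mean."""
    if own_load <= 0 or own_load <= mean_load:
        return 0.0
    return (own_load - mean_load) / own_load


def choose_server(self_name: str, loads: dict, rng=random.random) -> str:
    """Return the server that should handle this user: either this server
    or, with the probability above, the least-loaded peer."""
    own = loads[self_name]
    mean = sum(loads.values()) / len(loads)
    if rng() < redirect_probability(own, mean):
        peers = {name: load for name, load in loads.items() if name != self_name}
        return min(peers, key=peers.get)
    return self_name


# Hypothetical snapshot of loads as seen by server "a".
loads = {"a": 0.9, "b": 0.3, "c": 0.6}
target = choose_server("a", loads)  # "a" most of the time, "b" roughly a third of the time
```

The `rng` parameter is only there so the random draw can be replaced in tests.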

I'm not an expert in HA systems, though.

Thanks.

Bit of a surprise though there's only one replier in this thread. :)

What I don't understand is that you need a server to assign users to the least heavily used servers. What if this assigning server goes down?

Have two.

Or even better, 4!

Two in datacenter A with IP failover between each other, two in datacenter B with the same setup, and DNS round-robin between the two datacenters.

@obs:

> Have two.
>
> Or even better, 4!
>
> Two in datacenter A with IP failover between each other, two in datacenter B with the same setup, and DNS round-robin between the two datacenters.

So how would one be assigned to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's distance to the server?
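Spelled out as code, that priority order might look like the sketch below. It is purely illustrative: the `critical_load` threshold, the `latency_ms` field as a stand-in for "distance", and the datacenter names are all assumptions, and how you would actually measure a visitor's latency is a separate problem.

```python
def pick_server(servers, critical_load=0.85):
    """Prefer the nearest server whose load is not critical; if every
    server is at or above the critical threshold, take the least loaded."""
    if not servers:
        raise ValueError("no servers to choose from")
    not_critical = [s for s in servers if s["load"] < critical_load]
    if not_critical:
        return min(not_critical, key=lambda s: s["latency_ms"])["name"]
    return min(servers, key=lambda s: s["load"])["name"]


# Hypothetical view for one visitor: Newark is nearest but overloaded.
servers = [
    {"name": "newark", "load": 0.95, "latency_ms": 12},
    {"name": "atlanta", "load": 0.40, "latency_ms": 35},
    {"name": "london", "load": 0.20, "latency_ms": 90},
]
choice = pick_server(servers)
```

Here the visitor ends up on the second-nearest datacenter because the nearest one is over the threshold.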

It doesn't have to be one server doing the appointing. All servers can do it.

DNS round-robin will go most of the way towards getting load spread. Each application server can then decide if a user should be served or passed on to a different server with a lower load.

To do this, you'll need to have the current load of each server known by each server so that each server (what a mouthful) can make the decision of "I'm overloaded, better pass off this user to another server. Hey, server C has the lowest load, I'll send him there."
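One way the "every server knows every server's load" part could be kept is a small table of load reports that ignores stale entries, so a server that has stopped reporting is not picked as a target. This is a sketch under assumptions: how reports travel between servers (HTTP pings, a shared cache, and so on) is left out, and the class and parameter names are invented.

```python
import time


class PeerLoads:
    """Latest load report per server; reports older than max_age_seconds
    are ignored, since a silent server may be down."""

    def __init__(self, max_age_seconds: float = 10.0, clock=time.monotonic):
        self._reports = {}          # server name -> (load, time received)
        self._max_age = max_age_seconds
        self._clock = clock

    def report(self, server: str, load: float) -> None:
        """Record a load figure just received from a server."""
        self._reports[server] = (load, self._clock())

    def least_loaded(self):
        """Name of the freshest-known server with the lowest load, or None."""
        now = self._clock()
        fresh = {
            name: load
            for name, (load, received) in self._reports.items()
            if now - received <= self._max_age
        }
        return min(fresh, key=fresh.get) if fresh else None


table = PeerLoads()
table.report("A", 0.8)
table.report("B", 0.5)
table.report("C", 0.2)
target = table.least_loaded()  # server C has the lowest load
```

The `clock` parameter exists so the staleness check can be exercised without waiting in real time.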

> So how would one be assigned to the correct datacenter depending on 1) its load and, if nothing is critical, 2) the user's distance to the server?

I recently found the silver bullet to this question:

http://www.ultradns.com/solutions/sitebacker.html

It's a commercial service and I've not used it yet, but from what I've read it looks like it will do what you're looking for. You can also pay more to have users directed to their local datacenters.

DynDNS also offers a similar product, I believe; I haven't looked into that one much yet.

That automates it, but doesn't necessarily offer anything you can't get out of a monitoring solution and an on-the-ball admin.

Monitoring solutions and on-the-ball admins cost money, and administering reliable customized DNS is not the world's most pleasant or trivial activity.

If it's important enough to require high availability, it's important enough to hire professionals to do the dirty work. Unless you're a professional, in which case everyone else should hire you. :-)
