how does a node balancer help me

how does a node balancer help me if the whole data center is down?

I have two virtuals on two different physical machines/nodes, and both are down right now.

10 Replies

> how does a node balancer help me if the whole data center is down?

It doesn't. I'm still trying to figure out the best way to maintain service when an entire datacenter goes down (Fremont has provided too much practice for my liking, but there's no reason other datacenters couldn't have a power outage).

I've heard mixed reviews about the performance of MySQL replication across datacenters, but that's the best option I've come up with. With a NodeBalancer in each of two datacenters and master-slave replication in MySQL, the slave DB could take over in read-only mode until the master came back up. It would require the application to have a read-only mode, but it seems like it would work.
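For reference, the my.cnf settings that kind of master-slave setup needs are pretty small; a minimal sketch (server-ids and log names here are illustrative, not from any particular deployment):

```ini
# master's my.cnf
[mysqld]
server-id = 1
log_bin   = mysql-bin     # binary logging must be on for replication

# slave's my.cnf
[mysqld]
server-id = 2
read_only = 1             # rejects writes from non-SUPER users until promoted
```

Failing over to the slave in read-only mode is then just a matter of repointing traffic; promoting it to accept writes would be `SET GLOBAL read_only = 0;`.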

Some sort of DNS failover service can help with the actual redirection of traffic, but yes, data replication is a pain in the butt.

I've thought about doing master-master replication, which would (in theory) "take care of everything," but MySQL's replication gets squirrely enough as it is. I do think it would work reasonably well, though, especially if the two masters are used in a failover arrangement (with only one of them actively accepting commits at a time).
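If both masters could ever end up accepting writes at once, the usual trick for keeping AUTO_INCREMENT keys from colliding is to interleave them; a sketch, with illustrative values:

```ini
# master A's my.cnf: generates ids 1, 3, 5, ...
auto_increment_increment = 2
auto_increment_offset    = 1

# master B's my.cnf: generates ids 2, 4, 6, ...
auto_increment_increment = 2
auto_increment_offset    = 2
```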

Master-slave with the backup site in read-only mode would do the trick too. (That's actually the scenario we run: our main web servers only have SELECT privileges on the database, so "read-only mode" is normal.)
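The SELECT-only arrangement described there boils down to grants along these lines (database, account, and host values are placeholders, not the actual setup):

```sql
-- web front-end account: reads only, so "read-only mode" is the default
GRANT SELECT ON appdb.* TO 'webapp'@'10.0.0.%';

-- the write paths use a separate, more privileged account
GRANT SELECT, INSERT, UPDATE, DELETE ON appdb.* TO 'appwriter'@'10.0.0.%';
```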

That's a good point, with a NodeBalancer in each datacenter, DNS failover would be great (assuming data replication was in place). I've thought about switching to another database entirely, like MongoDB, that appears to have better replication support, but that means rewriting the DB layer of the application and figuring out how to get MongoDB set up and running smoothly.

Do you currently use MySQL master-slave replication across datacenters? If so, have you had (m)any issues with it? From my reading it sounded like setting up a VPN or SSH tunnel was the preferred way to not have to open MySQL up and keep your data secure, but I have yet to set it up.
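For what it's worth, the SSH-tunnel variant usually takes this shape: forward a local port on the slave box to the master's mysqld, then point the slave at the local end of the tunnel. A sketch, with placeholder hosts, ports, and account names:

```shell
# on the slave box: local port 3307 now reaches the master's mysqld
ssh -f -N -L 3307:127.0.0.1:3306 tunnel@master.example.com
```

and then, in MySQL on the slave:

```sql
CHANGE MASTER TO
  MASTER_HOST = '127.0.0.1',
  MASTER_PORT = 3307,
  MASTER_USER = 'repl',
  MASTER_PASSWORD = '...';
```

That way mysqld never has to listen on a public interface at all.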

I like the idea of having read-only mode be the norm, and special-casing writes. I may just do that too :)

I've been doing MySQL master-master between datacenters over SSH tunnels for years, running a service that's mission-critical for hundreds of institutions. I use DNS for failover and load balancing. Can't recommend it enough!

Wow, that's the nirvana of uptime. Thanks for sharing!

Do you recall having any hiccups while setting it up?

A couple of things come to mind. Make sure your server-ids are distinct and well-managed, especially if you're planning on hanging backup servers off the masters. Your SSH tunnels also need to be iron-clad: you have to be 100% sure that if the connection gets interrupted for any reason, the tunnel will be re-established. Wrap it in a really good script.
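A common way to get that "iron-clad" behavior is autossh, or a dumb supervision loop around plain ssh; a sketch of the latter (host, account, and ports are placeholders):

```shell
#!/bin/sh
# Re-establish the replication tunnel whenever it drops.
while true; do
    ssh -N \
        -o ServerAliveInterval=15 -o ServerAliveCountMax=3 \
        -o ExitOnForwardFailure=yes \
        -L 3307:127.0.0.1:3306 tunnel@master.example.com
    sleep 5   # back off briefly before reconnecting
done
```

ServerAliveInterval makes the client notice a dead connection instead of hanging on it forever, and ExitOnForwardFailure makes ssh exit (and so get restarted by the loop) if the port forward can't be set up.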

@jaden:

> That's a good point, with a NodeBalancer in each datacenter, DNS failover would be great (assuming data replication was in place). I've thought about switching to another database entirely, like MongoDB, that appears to have better replication support, but that means rewriting the DB layer of the application and figuring out how to get MongoDB set up and running smoothly.

Yeah, I'm not too keen on doing that yet. :-) We primarily use Django, and although our dataset would fit nicely and efforts are afoot to bring non-relational database support to the ORM, I'm probably not going to use it until it hits trunk.

> Do you currently use MySQL master-slave replication across datacenters? If so, have you had (m)any issues with it? From my reading it sounded like setting up a VPN or SSH tunnel was the preferred way to not have to open MySQL up and keep your data secure, but I have yet to set it up.

Not yet, but I do have code and procedures in place to do it. Turns out that this particular platform is operating on Rackspace Cloud Servers, and (for reasons unknown) location is an account-wide setting. It's easier for us to deploy a new instance on Linode than it is to deploy in a different Rackspace datacenter, which is kinda my medium-term plan ;-)

To facilitate this, we store nightly mysqldumps on S3. We use InnoDB exclusively (if you're going to have a relational database, you might as well have referential integrity and transactions!) and have binlogging enabled by default, so throwing some additional options on the mysqldump gives us an exact snapshot at a specific point, along with a CHANGE MASTER TO command that points to the exact correct point to start replication:

```shell
/usr/bin/mysqldump --defaults-extra-file="/etc/mysql/debian.cnf" --single-transaction --quick --master-data=1 --all-databases --add-drop-database > <%= @localfile %>
/usr/bin/s3cmd del <%= @s3file %>.old
sleep 1
/usr/bin/s3cmd --acl-private mv <%= @s3file %> <%= @s3file %>.old
sleep 1
/usr/bin/s3cmd --acl-private put <%= @localfile %> <%= @s3file %>
sleep 1
```

(Yes, that needs better error handling. Urk.)

Deploying a new slave, wherever it is, consists of pulling this file from S3. The good news is that almost everything on the site today was prepared last night, so this morning's backup would be almost entirely complete.
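Bringing up the new slave is then roughly the following (bucket name and master coordinates are placeholders; note that the `--master-data` dump embeds a CHANGE MASTER TO with the log file and position, but not the host or credentials, so you supply those separately):

```shell
# fetch and load the latest dump
s3cmd get s3://example-bucket/backup.sql /tmp/backup.sql
mysql --defaults-extra-file=/etc/mysql/debian.cnf < /tmp/backup.sql

# fill in what --master-data couldn't know, then start replicating;
# the log file/position from the dump are preserved since CHANGE MASTER TO
# only alters the options you name
mysql -e "CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl', MASTER_PASSWORD='secret'; START SLAVE;"
```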

Speaking of server-id, we generate it based on the public IP address of the database server, which damn well better be unique.

```ruby
set_unless[:mysql][:replication][:server_id] = "#{node[:ipaddress].split('.').collect(&:to_i).pack('C*').unpack('N').first}"
```
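The same packing math, for anyone who wants to sanity-check a generated id from the shell (the address here is a documentation placeholder, not a real server):

```shell
# Derive a numeric server-id from a dotted-quad IP: each octet becomes one
# byte of a 32-bit big-endian integer, same as pack('C*').unpack('N').
ip="203.0.113.45"
IFS=. read -r a b c d <<EOF
$ip
EOF
server_id=$(( (a << 24) | (b << 16) | (c << 8) | d ))
echo "$server_id"
```

Since the result is just the IP as a 32-bit number, it always fits MySQL's server-id range and is unique as long as the public IPs are.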

Also, MySQL supports SSL natively, so you can skip the VPN/SSH stuff if you do it right. Here's some documentation on it. You're going to want to use certificates and some sort of trust architecture.
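The moving parts, roughly (all paths and names below are placeholders): the server advertises its certificates in my.cnf, the replication account is restricted to SSL, and the slave requests it in CHANGE MASTER TO:

```ini
# my.cnf on the master
[mysqld]
ssl-ca   = /etc/mysql/certs/ca.pem
ssl-cert = /etc/mysql/certs/server-cert.pem
ssl-key  = /etc/mysql/certs/server-key.pem
```

```sql
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%' REQUIRE SSL;

CHANGE MASTER TO
  MASTER_SSL = 1,
  MASTER_SSL_CA = '/etc/mysql/certs/ca.pem';
```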

> I like the idea of having read-only mode be the norm, and special-casing writes. I may just do that too :)

They tell me a "Service-Oriented Architecture" is the way to go. I think they might be right. :-) Heck, might want to do load-balancing between two or more locations for the primary read-only stuff (www.example.com), while keeping writable stuff (checkout.example.com, comments.example.com, stalking.example.com, etc) localized to the master site… hmm… maybe have the master site at Rackspace Chicago with read-only sites at Linode Fremont and Atlanta… *twirls mustache*

That's exactly what I was looking for. Thanks!

PostgreSQL 9.x has all of this built in. If you're building a new system, I'd say it's now much better to start out with Postgres than MySQL.

I just wasn't happy with the atomicity and HA solutions for MySQL, even with InnoDB, and I'm not confident they would scale.

Plus Oracle looms on the horizon ;)

Check out PostgreSQL's new replication and hot standby features in 9.x. Built-in streaming replication… yummy :)
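For comparison, the 9.0-era streaming-replication setup really is small; a sketch with placeholder values:

```ini
# postgresql.conf on the primary
wal_level = hot_standby
max_wal_senders = 3

# postgresql.conf on the standby
hot_standby = on          # allow read-only queries during recovery
```

```ini
# recovery.conf on the standby
standby_mode = 'on'
primary_conninfo = 'host=primary.example.com user=replicator'
```

The standby stays read-only and queryable, which maps directly onto the "slave in read-only mode" failover scheme discussed above.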

@Xan:

> A couple of things come to mind. Make sure your server-ids are distinct and well-managed, especially if you're planning on hanging backup servers off the masters. Your SSH tunnels also need to be iron-clad: you have to be 100% sure that if the connection gets interrupted for any reason, the tunnel will be re-established. Wrap it in a really good script.

Why do you tunnel it?
