High availability for IPv6
I've tried the basics of moving floating IPv4 addresses between Linodes - and that's easy enough. Just enable IP failover for the relevant IPs, reboot as needed, and use arping to force other hosts to update their ARP cache. It works nicely.
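For reference, the IPv4 side is just an unsolicited/gratuitous ARP from the new host - something like this (placeholder failover address, adjust interface and count to taste):

arping -c 3 -U -I eth0 198.51.100.10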
However I've had no such luck doing the same with IPv6. It's possible to transfer IPv6 pool addresses to other Linodes, but I can't find a way to get the NDP caches on other servers updated. For other Linodes on the same network this isn't too bad - their neighbour cache doesn't last that long. For external connectivity, however, it results in 20-30 minutes of traffic going to the old Linode.
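You can watch the problem from another Linode - the moved address just sits in its neighbour cache pointing at the old Linode's MAC until the entry ages out:

ip -6 neigh show dev eth0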
I've tried using arpsend/ndsend to get the neighbours' caches updated:
root@devon:~# arpsend -U -i 2a01:7e00::2:9995 eth0
18:37:13.021656 IP6 fe80::f03c:91ff:fe6e:afd5 > ff02::1: ICMP6, neighbor advertisement, tgt is 2a01:7e00::2:9995, length 32
But sadly this traffic isn't seen on other servers - probably for the same reason that ping6 ff02::1%eth0 doesn't work either.
Anyone with any experience on this?
11 Replies
This is the command I'm using:
ping6 -c 1 -w 15 -I $MY_POOL_ADDRESS fe80::1%eth0 > /dev/null
This pings the default gateway from the pool address until it gets a response or 15 seconds elapse. If 15 seconds elapse without a response, there's probably some other problem with your connectivity.
You need a version of ping that supports the -I option. On Debian, this means you need the iputils-ping package rather than the inetutils-ping package.
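If you've got the wrong one, swapping packages is just (assuming apt):

apt-get install iputils-ping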
Edit: I should mention that if you try to run ping6 immediately after adding the pool address to your interface, it might fail to bind to the pool address because DAD (duplicate address detection) hasn't completed yet.
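If you want to script around that rather than just retrying, a rough sketch (assuming the pool address gets added to eth0, and that the prefix length is right for your setup):

ip -6 addr add "$MY_POOL_ADDRESS/128" dev eth0
# wait until the address is no longer tentative, i.e. DAD has finished
while ip -6 addr show dev eth0 tentative | grep -q "$MY_POOL_ADDRESS"; do
    sleep 0.2
done
ping6 -c 1 -w 15 -I "$MY_POOL_ADDRESS" fe80::1%eth0 > /dev/null

Alternatively, adding the address with the nodad flag (ip -6 addr add ... nodad) skips DAD entirely.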
Thank you for this workaround! I can finally do some high availability without having to worry about IPv6 traffic disappearing for up to 30 minutes.
The only downside is that I'll have to flush the NDP cache of other Linodes - but at least that's a solvable problem.
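Something along these lines on each of the other Linodes should sort the stale entry out (using the pool address from above as an example):

ip -6 neigh flush to 2a01:7e00::2:9995 dev eth0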
I've got both servers with the relevant IPv6 addresses as additional IPs on the loopback interface:
iface lo inet6 loopback
    up ip -6 addr add 2a01:7e00:etc:etc/128 dev lo preferred_lft 0
Now the high availability monitor can either add the same address to eth0 on one of the servers, or add the pool address if you've got a routed subnet pointed at a pool IP.
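As a rough sketch (placeholder names, and assuming it's wired up as a keepalived notify_master script), the step on the new master boils down to:

#!/bin/sh
# hypothetical failover script - address/interface are examples, adjust to taste
FAILOVER_ADDRESS="2a01:7e00::2:9995"
ip -6 addr add "$FAILOVER_ADDRESS/128" dev eth0
# (may need to wait for DAD here, as mentioned above)
# ping the gateway from the failover address so the routers learn the new location
ping6 -c 1 -w 15 -I "$FAILOVER_ADDRESS" fe80::1%eth0 > /dev/null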
The one quirk of this is that the old server will keep responding to existing traffic until the routers' neighbour cache expires and points to the new server. A big disadvantage is that if you're doing this with pool addresses alongside other servers of your own, the NDP caches of those servers will keep sending traffic to the old server for quite a while! It won't stop until traffic to that IP has stopped flowing for a few minutes (enough for the entry to time out), or until you flush the cache for that destination IP.
This allows me to perform scheduled maintenance by turning off keepalived on the server ~30 minutes in advance. Sadly it doesn't help for real high availability: if there's unexpected downtime, traffic will eventually flow to the right server, but not immediately.
The only reason I'm tolerating this as a solution is that it should happen very rarely, IPv6 traffic is a small percentage of overall traffic, and Happy Eyeballs will hopefully favour IPv4 until the IPv6 address is reachable again.
I don't like it, and I hope that Linode will at some point take this issue seriously. IPv6 just feels like an afterthought in multiple ways.
IPv4 high availability - easy. IPv6 high availability - don't bother with Linode.
I think that Linode should remove the restriction on the link-local all-nodes address (ff02::1), as one can easily flood the broadcast domain by other methods which have to be allowed for those protocols to serve their intended purpose. Note, however, that even if they do, it won't solve the in-datacenter problem, because Linodes can only receive traffic on addresses assigned to them (the periodic router advertisements that allow SLAAC to work are the only exception to this, and it should stay that way).
The only reason you don't see these same problems with IPv4 is because it's highly unlikely that you'll have another Linode with an IP address in the same subnet as your failover IP address, so all in-datacenter consumers of your failover-protected service go through the routers to reach the failover IP, and sending the gratuitous ARP on a failover event does reach the routers just fine. If you were doing IP failover in the IPv4 private network, you'd have the same problem as with in-datacenter IPv6, and would have to use unicast unsolicited ARP replies in order for failover to work correctly in that case as well.
I ended up doing what dwfreed suggested - and created a Python/scapy script which sends NA packets to ff02::2 (all-routers) and unicast NA packets to all other servers. Seems to work reasonably well.
I'll try and get it up on GitHub and PyPI this weekend.
I was able to get external failover to work quickly on Debian buster by pinging the link-local all-routers address.
/bin/ping -6 -w 10 -c 1 -I $MY_POOL_ADDRESS ff02::2%eth0
This still isn't as fast as the IPv4 failover, but it only takes a few seconds, which is considerably better than 30 minutes. There's still a 30-40 second delay for same-datacenter failover, but that doesn't really affect my use case.
Never got around to uploading the script (sorry!), so I've quickly added it as a GitHub gist: https://gist.github.com/tomkins/c1fec82499fa273c6e1712147867bfa5
Requires Python 2.7 and scapy - I might create a proper package for it once it supports Python 3!
Give it a list of source addresses you want to send advertisements for, the interface to send the messages on, and the destination link-local IPv6 addresses of all the servers you care about - and it'll send NA packets for all of them.