SOLVED: Can't ssh from node to node:
NOTE: This is not an "Aegir" post, just giving some background in case it matters.
Background: Setting up Aegir with a remote server. The master is set up in the 1st node, the remote server in the 2nd node. This is my DEV version and I am using one domain for the 1st node and a second domain for the 2nd node.
Initially I had only one node set up with a static configuration and two IPs (one for the main site, another for the second site/client site), each with its own domain. This worked just fine. Then I decided that for my business it was best to have the main Aegir set up in the 1st node, then run the rest of the sites in separate nodes as remote servers (one per client), so each client gets his/her own node.
So the steps I took to get back to square 1:
1) Got a new Linode. (2nd node & 2nd domain that will be remote server for client)
2) Deleted the second IP I had set up in the 1st node, since it is no longer needed.
3) Reverted the 1st node to DHCP instead of static IP, thus bringing it back to the state it was in when it was originally purchased.
4) Deleted the old DNS configuration for the second IP & domain and created a new one for the new node using the new node's IP and the 2nd domain.
The Issue: Now, Aegir configuration aside, the problem I have is that I cannot seem to be able to SSH from one node to another.
I can SSH to each node individually from my local computer no problem, BUT I cannot ssh from one node to another. Every time I do, I get this error:
In 1st node shell I try:
ssh root@XXX.XXX.XXX.XXX
ssh: connect to host XXX.XXX.XXX.XXX port 22: No route to host
I check by doing netstat -an | grep "LISTEN " on my 2nd node and I see port 22 is open (Don't mind 80 is closed, part of the Aegir installation):
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN
tcp6 0 0 :::22 :::* LISTEN
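The same check can be narrowed to a single port instead of reading the whole list. A rough sketch, assuming a made-up helper name (port_listening) and the netstat column layout shown above, where the local address is the 4th column:

```shell
# Hypothetical helper: succeeds if the netstat output on stdin shows a
# socket in LISTEN state whose local port equals the first argument.
port_listening() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            # Local address is column 4; the port is whatever follows
            # the last ":" (works for 0.0.0.0:22 and for :::22).
            n = split($4, parts, ":")
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage on the 2nd node:
#   netstat -an | port_listening 22 && echo "something is listening on 22"
```

This only confirms that something is listening locally on that port; it says nothing about whether the port is reachable from the other node.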
So why can't I log in to the 2nd node from the 1st node? Any ideas? I know I am missing something, but cannot find what that is.
Thanks
2 Replies
It's tough to offer concrete suggestions without the actual addresses involved, but a few thoughts:
* Can you ping the same host you are trying to ssh to? (A failed ping could just be a firewall dropping it, but if ping works and ssh fails, it might point to network interference or a firewall rather than a local misconfiguration.)
* Similarly, is there any firewalling on either computer? I wouldn't normally expect to see a route error from that, but you might as well turn it all off when testing.
* If you look at the output of "netstat -rn", can you see an entry that covers the address you are trying to reach? There should be one if the address is part of a local network, or you should have a default route (0.0.0.0) for anything else. The output of ifconfig may also help to double-check your interface settings.
* You can also try using traceroute to see where the path to the address breaks down, but I suspect it's failing right at the local node.
* Are you trying to reach a "private" address from a different data center?
* Have you done the local configurations (such as for static addresses) for the extra IP addresses in both nodes, and then taken any steps (which may be distribution specific) to restart networking or otherwise ensure that the configuration is active?
* Have you rebooted any of the machines to which you made IP address changes in the Linode Manager? A reboot is typically needed for such changes to take effect, even if you've made the local configuration changes within the system itself.
The last three items probably aren't likely as long as you're using the same address as in the ssh command when connecting from your personal computer, but just in case you're not…
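A rough way to run the first few checks from the 1st node is sketched below. The helper name has_default_route is made up for illustration, TARGET is a placeholder for the 2nd node's address, and the awk match assumes the usual netstat -rn layout where the default route shows a destination of 0.0.0.0 (some systems print "default" instead):

```shell
# Hypothetical helper: succeeds if the `netstat -rn` output on stdin
# contains a default route (destination 0.0.0.0 or "default").
has_default_route() {
    awk '$1 == "0.0.0.0" || $1 == "default" { found = 1 }
         END { exit !found }'
}

# Placeholder address, as in the original post; substitute the real one.
TARGET="XXX.XXX.XXX.XXX"

# Checks to run by hand on the 1st node (left commented out so nothing
# touches the network until you fill in TARGET):
#   ping -c 3 "$TARGET"
#   netstat -rn | has_default_route || echo "no default route found"
#   ifconfig
#   traceroute -n "$TARGET"
```

If the default route is missing, traffic to anything outside the local network has nowhere to go, which would be consistent with a "No route to host" error showing up right away.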
If none of this helps, it'll probably be necessary to know some concrete details for the interfaces and addresses involved.
-- David
Restarting the node fixed the issue.
Thank you very much for the help.