OpenVPN stopped working suddenly?

Hi all,

From mid-November until two days ago, my connection had been running without a hitch. Suddenly though, the VPN seems to have stopped working properly. I'm running Debian 5 btw.

I can connect, the OpenVPN client shows connected, and I'm able to log in to IMAP or SSH servers, but then the connection seems to hang. I can get in via LISH happily enough, and I've updated/upgraded, rebooted, restarted all the services etc., and it just continues to happen: nothing I connect to through the VPN stays alive for more than a few seconds without stalling.

Both the machines I'm connecting from run Windows 7 and have recently had updates applied. That's the only thing I can think of that's changed. I also logged in from the second machine yesterday, before I noticed the problem; that's fairly rare, so it may have messed something up. I haven't received any email since the 21st though.

Any idea how I can diagnose this problem? I can't see anything unusual in the logs.

Many thanks for any help!

Daveo

– Some logs:

I connect to the VPN

==> daemon.log <==

Jan 23 12:57:41 [myhostname] ovpn-server[3959]: Peer Connection Initiated with [myclientip]:61036

==> syslog <==

Jan 23 12:57:41 [myhostname] ovpn-server[3959]: Peer Connection Initiated with [myclientip]:61036

I click on my mail folder

==> mail.info <==

Jan 23 12:57:58 [myhostname] imapd-ssl: LOGIN, user=[myusername], ip=[::ffff:[myclientVPNip]], port=[49463], protocol=IMAP

==> mail.log <==

Jan 23 12:57:58 [myhostname] imapd-ssl: Connection, ip=[::ffff:[myclientVPNip]]

Jan 23 12:57:58 [myhostname] imapd-ssl: LOGIN, user=[mymailusername], ip=[::ffff:[myclientVPNip]], port=[49463], protocol=IMAP

==> syslog <==

Jan 23 12:57:58 [myhostname] imapd-ssl: Connection, ip=[::ffff:[myclientVPNip]]

Jan 23 12:57:58 [myhostname] imapd-ssl: LOGIN, user=[mymailusername], ip=[::ffff:[myclientVPNip]], port=[49463], protocol=IMAP

Thunderbird hangs until it gets bored

==> mail.log <==

Jan 23 13:01:22 [myhostname] imapd-ssl: DISCONNECTED, user=[mymailusername], ip=[::ffff:[myclientVPNip]], headers=0, body=0, rcvd=67, sent=8337, time=102, starttls=1

==> syslog <==

Jan 23 13:01:22 [myhostname] imapd-ssl: DISCONNECTED, user=[mymailusername], ip=[::ffff:[myclientVPNip]], headers=0, body=0, rcvd=67, sent=8337, time=102, starttls=1

--------------------

I open an SSH session (I have not reconnected the OpenVPN client; this is the same VPN connection as above)

==> auth.log <==

Jan 23 13:07:14 [myhostname] sshd[4036]: Accepted password for [user] from [myopenVPNip] port 49475 ssh2

Jan 23 13:07:14 [myhostname] sshd[4036]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory

Jan 23 13:07:14 [myhostname] sshd[4036]: pam_unix(sshd:session): session opened for user [user] by (uid=0)

Jan 23 13:07:14 [myhostname] sshd[4038]: pam_env(sshd:setcred): Unable to open env file: /etc/default/locale: No such file or directory

==> auth.log <==

Jan 23 13:07:25 [myhostname] su[4042]: Successful su for root by [user]

Jan 23 13:07:25 [myhostname] su[4042]: + pts/0 [user]:root

Jan 23 13:07:25 [myhostname] su[4042]: pam_env(su:session): Unable to open env file: /etc/default/locale: No such file or directory

Jan 23 13:07:25 [myhostname] su[4042]: pam_unix(su:session): session opened for user root by user

I cat a few files, do a find . on /, whatever; after about 30 seconds the SSH client hangs. No further log messages.

5 Replies

@Daveo:

I can connect up, the OpenVPN client shows connected, and I'm able to log in to IMAP or SSH servers, but then the connection seems to hang. I can get in via LISH happily enough, and I've update/upgraded and rebooted, restarted all the services etc. and it just continues to happen - nothing I connect through the VPN is able to keep alive for more than a few seconds without stalling.
If I read your post correctly, it sounds like a straight SSH connection (without OpenVPN) is also hanging, right? So it's not obvious to me that this is limited to OpenVPN as opposed to being a more general problem. Out of curiosity, what kernel are you running?

It's a long shot and probably completely unrelated, but I had a 2.6.32 (latest stable paravirt) Linode (Ubuntu 8.04 LTS) I brought up earlier this month that after a few days of operation - shortly after it started receiving production traffic - ended up in a state that sounds very similar to what you describe.

I'd be able to connect without a problem, but I/O over the connection would just stop after a small number of operations. At first I thought I was getting wedged in disk I/O somehow, and that it might even be a host issue, but there were other hangs when I was clearly not waiting on disk I/O, and I caught some stats showing large outbound TCP connection queues. It was the oddest behavior, as I could keep creating new connections, just not do much for long over them. I did open a ticket, but the host was running fine.

The system was extremely close in configuration to existing Linodes I had but the first to use 2.6.32 (I had some stable 2.6.18s and one stable paravirt 2.6.31 that hadn't been rebooted into 2.6.32 yet). I was firewalled, but the ruleset had a dedicated hole in it for my home location, so very little processing going on. Though, as an aside, there was another post here about trouble getting ufw (simple firewall tool for Ubuntu) to install rules in a 2.6.32 kernel that previously worked with 2.6.31, so maybe a hint of something interesting in netfilter/iptables in 2.6.32.

In my case I couldn't spend too much time fiddling with the Linode as I needed it in production, so safety won and I just went back to 2.6.18 and no burps since. While I've intended to do some testing on a spare configuration, at this point I have no hard data that it was definitely something in 2.6.32, but there weren't a whole lot of other variables compared to my other Linodes.

– David
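The "large outbound TCP connection queues" David mentions can be spotted without extra tools by reading /proc/net/tcp on the server. Below is a minimal sketch of that idea, not anything from David's own setup. It assumes Python 3, a little-endian host (such as x86) and IPv4 sockets only; the function names are made up for illustration. A connection whose tx_queue stays large and does not drain while the client appears frozen suggests the server is sending data that never gets acknowledged.

```python
import socket
import struct


def decode_addr(hex_addr):
    """Turn '0100007F:0019' from /proc/net/tcp into '127.0.0.1:25'.

    Assumes a little-endian host, where the kernel prints the IPv4
    address bytes in reversed order.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def parse_proc_net_tcp(text):
    """Parse the contents of /proc/net/tcp into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 5:
            continue
        tx_hex, rx_hex = fields[4].split(":")
        rows.append({
            "local": decode_addr(fields[1]),
            "remote": decode_addr(fields[2]),
            "state": fields[3],  # hex state code, 01 = ESTABLISHED
            "tx_queue": int(tx_hex, 16),
            "rx_queue": int(rx_hex, 16),
        })
    return rows


def busy_senders(rows, min_tx_bytes=1):
    """Return sockets with at least min_tx_bytes queued, largest first."""
    hits = [r for r in rows if r["tx_queue"] >= min_tx_bytes]
    return sorted(hits, key=lambda r: r["tx_queue"], reverse=True)
```

To use it, pass the file contents in, e.g. busy_senders(parse_proc_net_tcp(open("/proc/net/tcp").read())), and run it a few times during a hang to see whether the same socket keeps a non-zero queue. The Send-Q column of netstat -tn shows the same figure if you'd rather not script it.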

Thanks for the reply David. I was doing SSH over the VPN for convenience, and everything that's not over the VPN (LISH, web, IMAP now that I've connected to it on the outside) is working fine.

I haven't done anything unusual to the kernel (2.6.18.8-linode22); it's been untouched and working fine for maybe 8 weeks since I last did anything to the box. The symptoms you described (connection works fine, comms work fine for a few seconds before blocking) sound very similar, though in my case it only seems to be happening inside the VPN (though I haven't done exhaustive testing outside of it).

I've tried updating OpenVPN on the client boxes now, to no avail. I may try connecting from an Ubuntu boot disk on the client side, to see if I can isolate whether the problem is at this end or that.

I'm being migrated today anyway, so I'll look again at this problem afterwards.. who knows it might just go away? [edit - can report no improvement post migration :(]

Any other thoughts gratefully received :)

D

@Daveo:

Thanks for the reply David. I was doing SSH over the VPN for convenience, and everything that's not over the VPN (LISH, web, IMAP now that I've connected to it on the outside) is working fine.
Oh sorry - I thought you said you were using SSH without the OpenVPN client running, so presumed that was a direct connection that still hung.

My only immediate next thought is that some filtering might be accidentally interfering with just the OpenVPN tunnel traffic (perhaps rate limiting that drops packets). Are there any kernel log entries related to filtering when you have issues? I believe by default on Debian they'll be in kern.log (and also on the LISH console, I think), though I guess you would probably have noticed them on LISH when the problems started. Temporarily dropping any firewall you have, to see if it changes things, would be one step to try.

Not sure why that would suddenly start happening without any changes though. So I think your thought of testing from a different client is a good one too to isolate possible client issues.

Post back if you do figure it out, and especially if it's something on the host side. Maybe it'll be related somehow to what happened to my system.

– David

Will do David :)

Nothing going on in kern.log at the time of a block. daemon.log has some DHCP messages as well as reporting my (successful) connection, but nothing when that connection hangs. Nothing interesting in syslog either. faillog looks like it has been modified recently, but I couldn't read it (empty or binary, not sure, and I'm not connected right now to check).

I temporarily disabled fail2ban and reset my iptables rules but it doesn't seem to have helped.
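Checking kern.log, daemon.log and syslog by eye for anything logged "at the time of a block" gets tedious when you repeat the test. Here is a rough sketch of a helper that keeps only the lines stamped within a window around a given moment. It assumes Python 3, the classic syslog timestamp shown in the logs above ("Jan 23 13:07:25"), and English month names; since that timestamp has no year, the year is taken from the moment you pass in. The function names are invented for this example.

```python
from datetime import datetime, timedelta


def parse_syslog_time(line, year):
    """Read the leading 'Mon DD HH:MM:SS' stamp; return None if absent."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_near(lines, when, window_seconds=60):
    """Return the log lines stamped within window_seconds of `when`."""
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in lines:
        stamp = parse_syslog_time(line, when.year)
        if stamp is not None and abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

Note the time of a hang, then call lines_near(open("/var/log/kern.log"), that_time) for each log file in turn. An empty result from every file, as in this case, at least confirms that the server logged nothing around the stall.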

It's been a long time since I looked at this problem, and it seems to be fixed now.

Repair method:

  • Wait several weeks

  • Update packages on client and server

  • See if problem has gone away

lol ;)
