✓ Solved

SSL cert renewals fail with "Connection refused", but URLs working fine

Recently seeing this error when the cert renewals run:


certbot renew --dry-run --cert-name fltk.org

Saving debug log to /var/log/letsencrypt/letsencrypt.log


Processing /etc/letsencrypt/renewal/fltk.org.conf


Cert not due for renewal, but simulating renewal for dry run
Plugins selected: Authenticator webroot, Installer None
Starting new HTTPS connection (1): acme-staging-v02.api.letsencrypt.org
Renewing an existing certificate
Performing the following challenges:
http-01 challenge for fltk.org
http-01 challenge for www.fltk.org
Waiting for verification…
Cleaning up challenges
Attempting to renew cert (fltk.org) from /etc/letsencrypt/renewal/fltk.org.conf produced an unexpected error: Failed authorization procedure. fltk.org (http-01): urn:ietf:params:acme:error:connection :: The server could not connect to the client to verify the domain :: During secondary validation: 173.230.155.139: Fetching http://fltk.org/.well-known/acme-challenge/[..hash snipped..]: Connection refused, www.fltk.org (http-01): urn:ietf:params:acme:error:connection :: The server could not connect to the client to verify the domain :: During secondary validation: 173.230.155.139: Fetching http://www.fltk.org/.well-known/acme-challenge/[..hash snipped..]: Connection refused. Skipping.
All renewal attempts failed. The following certs could not be renewed:
/etc/letsencrypt/live/fltk.org/fullchain.pem (failure)


** DRY RUN: simulating 'certbot renew' close to cert expiry
** (The test certificates below have not been saved.)


  • I've ruled out the firewall by clearing it with 'iptables -F'

  • The Apache server is definitely running, reachable, and listening
    on ports 80 and 443

  • Copying the URLs that 'certbot renew' reports as failing into a browser works.
    We put a certbot.html in the same dir to use for debugging, e.g.
    http://fltk.org/.well-known/acme-challenge/certbot.html
    This URL seems to load fine from different countries.

The certbot config has worked for many years (since 2018) and only stopped working recently (within the last month or so, June 2024).

NOTE: We have recently been under large attacks (PHP crack attempts) from a very wide range of AWS cloud server IPs, which had been /killing/ our server, causing OOM errors that killed the apache/fail2ban/mysql daemons. I was only able to get things back to normal by blocking all AWS IPs (see https://docs.aws.amazon.com/vpc/latest/userguide/aws-ip-ranges.html).

But even when those blocks are removed (iptables -F) before running the certbot tests, it still gives "Connection refused" during the run.

Not sure how else to debug this. Any pointers or suggestions? I've searched around quite a bit and am not seeing anything obvious.

3 Replies

✓ Best Answer

Regarding (1): Yes, I could confirm that SOME but not ALL connections from Let's Encrypt were reaching our server. Apparently some were blocked.

I think I found the solution though: apparently iptables -F only flushes the IPv4 rules, leaving the IPv6 AWS blocks in place.

I realized I needed to use ip6tables -F (emphasis on "6") to flush the IPv6 firewall too. Once I did that, certbot renew worked again, and I could see both IPv4 AND IPv6 addresses in our apache2 access.log during their cert validation probes.
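
To check whether a flush actually emptied both firewalls, the two rule listings can be compared side by side. This is a minimal sketch, assuming a POSIX shell run as root; the helper name `remaining_rules` is made up for illustration, and it only looks at the default filter table:

```shell
# Hypothetical helper: print how many rules each firewall still holds in its
# filter table. "-S" lists rules in iptables-save style; lines starting with
# "-A" are rules, while "-P" and "-N" lines are policies and chain definitions.
remaining_rules() {
    for fw in iptables ip6tables; do
        # grep -c exits non-zero when the count is 0, hence the "|| true"
        n=$("$fw" -S | grep -c '^-A' || true)
        echo "$fw: $n"
    done
}
```

Calling `remaining_rules` right after `iptables -F` should make any leftover IPv6 rules stand out as a non-zero count on the `ip6tables` line.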

So apparently what happened recently is two different events coincided:

  • A month or so ago "Let's Encrypt" started using multiple remotes to validate certs, to work around BGP hijacking and DNS poisoning. They call this "multi-perspective validation"; the point is that validation requests no longer originate only from their own servers.

  • That unfortunately coincided with a recent, very wide, and brutal cracker attack from AWS servers that caused /us/ to take measures to block /all/ AWS cloud servers from connecting to our server.

Apparently Let's Encrypt uses some AWS servers for their checks, and some of those servers connect over IPv6 and were thus caught by the AWS blocks in ip6tables.

SOLUTION
So our solution is to save and completely clear both firewalls (ipv4 and ipv6), run the certbot renewal, then restore. Basically:

# Save the ipv4+6 firewall configs
iptables-save     > /tmp/iptables.txt
ip6tables-save    > /tmp/ip6tables.txt

# Completely clear the ipv4+6 firewalls
iptables -F
ip6tables -F

# Run the "certbot renew" commands here, while the firewall is cleared

# Restore ipv4 + ipv6 firewalls
iptables-restore  < /tmp/iptables.txt
ip6tables-restore < /tmp/ip6tables.txt
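
One weakness of the sequence above is that the firewall stays cleared if the renewal step fails partway and the restore lines are never reached. Here is a minimal sketch of the same steps wrapped in a function that always restores the rules afterwards, assuming a POSIX shell run as root with `mktemp` available. The function name `renew_with_open_firewall` is hypothetical. Like the original commands, it only flushes the filter table and leaves default policies untouched, and it does not handle interruption by signals:

```shell
# Hypothetical helper (not part of certbot): run the given command with the
# IPv4 and IPv6 firewalls flushed, then restore both rule sets and return the
# command's exit status.
renew_with_open_firewall() {
    v4=$(mktemp) || return 1
    v6=$(mktemp) || { rm -f "$v4"; return 1; }

    # Save the IPv4 + IPv6 firewall configs; bail out if either save fails
    iptables-save > "$v4" && ip6tables-save > "$v6" || {
        rm -f "$v4" "$v6"
        return 1
    }

    # Clear both firewalls
    iptables -F
    ip6tables -F

    # Run the renewal command passed as arguments and remember its status
    "$@"
    status=$?

    # Restore both firewalls regardless of how the command exited
    iptables-restore  < "$v4"
    ip6tables-restore < "$v6"
    rm -f "$v4" "$v6"
    return "$status"
}
```

Example use: `renew_with_open_firewall certbot renew --cert-name fltk.org`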

Without knowing your exact setup and what you've already tried, I'd:

  1. Check the web server logs to confirm that the requests from Let's Encrypt definitely aren't reaching the web server.

  2. Check whether the blocked requests from AWS are being logged somewhere on your server (e.g. syslog). I use UFW to manage the server firewall, so your setup may differ. If they are logged, see whether blocked requests show up when you run the "--dry-run"s.

If you're not already using it, it's worth looking at fail2ban to monitor the web server error log and do the server firewall blocking. ModSecurity also works well with Apache to detect hacking attempts, and fail2ban can monitor that too to do the firewall-level blocking.
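
For step 1, one way to see which validation probes actually arrive is to count the ACME challenge requests in the access log, split by address family. This is a rough sketch, assuming an Apache-style access log where the client address is the first field; `count_acme_hits` is a hypothetical helper name:

```shell
# Hypothetical helper: count ACME challenge requests in an access log,
# split by whether the client address (first field) is IPv4 or IPv6.
count_acme_hits() {
    awk '
        index($0, "/.well-known/acme-challenge/") {
            # An IPv6 address contains ":"; a dotted IPv4 address does not
            if (index($1, ":")) v6++; else v4++
        }
        END { printf "ipv4=%d ipv6=%d\n", v4, v6 }
    ' "$1"
}
```

Example use: `count_acme_hits /var/log/apache2/access.log` (the log path is an assumption; adjust it to your setup). If IPv4 hits appear during a dry run but IPv6 hits do not, that points at something blocking IPv6 only.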

Your suggestion worked for me. Thank you.
