jrcs/letsencrypt-nginx-proxy-companion - HTTPS is throwing 500 Internal Server Error #959

Open
Swapratim opened this issue Jun 9, 2022 · 20 comments

Comments

@Swapratim

Swapratim commented Jun 9, 2022

Hello,

I've been stuck on this production issue for a couple of days, and it is now hurting a customer.
If anyone would be kind enough to help, I'd really appreciate it :-)

I'm using jrcs/letsencrypt-nginx-proxy-companion to generate SSL certificates for a number of apps, each running from its own sidecar docker-compose file. There are more than 100 containers, and each one is served on its own subdomain.

I pulled the latest Docker tag of jrcs/letsencrypt-nginx-proxy-companion a couple of days ago, and the issue started then.

Issue:

  1. More than half of the apps are running fine over HTTPS and show a valid Let's Encrypt certificate.
  2. A few apps (docker-compose) are not working over HTTPS; they show a "500 Internal Server Error | nginx/1.19.3" error.
  3. The same apps run fine over HTTP.
  4. Many subdomains get their certificates renewed when /app/force_renew is run, but some of them do not and fail with:
Create new order error. Le_OrderFinalize not found. {
  "type": "urn:ietf:params:acme:error:rateLimited",
  "detail": "Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: friis1.marvinxr.com: see https://letsencrypt.org/docs/rate-limits/",
  "status": 429
}

Meanwhile, some of them are auto-renewed (all of them have the same configuration):

Creating/renewal spiderman.marvinxr.com certificates... (spiderman.marvinxr.com)
[Thu Jun  9 13:53:29 UTC 2022] Domains not changed.
[Thu Jun  9 13:53:29 UTC 2022] Skip, Next renewal time is: Mon Aug  8 09:00:49 UTC 2022

I have tried restarting and running force_renew quite a few times (I know it's not best practice, but the client has a big launch and cannot wait). The issue is still not resolved.
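
For anyone hitting the same 429: the error above says the limit applies per exact set of domains (5 certificates in 168 hours), so every forced renewal of the same hostname uses up one of those five. A minimal Python sketch for estimating when the next request would be accepted. It assumes the limit behaves as a sliding 168-hour window, and the issuance timestamps in the example are made up, not taken from this server:

```python
from datetime import datetime, timedelta, timezone

# Values taken from the error message above; the sliding-window behaviour is an assumption.
LIMIT = 5
WINDOW = timedelta(hours=168)


def next_allowed_issuance(issued_at, now):
    """Return the earliest time a new certificate for the same set of
    domains could be requested, given past issuance times."""
    # Only issuances still inside the window count against the limit.
    recent = sorted(t for t in issued_at if now - t < WINDOW)
    if len(recent) < LIMIT:
        return now  # still under the limit
    # Allowed again once enough old issuances have left the window
    # that fewer than LIMIT remain.
    return recent[len(recent) - LIMIT] + WINDOW


# Hypothetical example: five forced renewals over the last 50 hours.
now = datetime(2022, 6, 9, 14, 0, tzinfo=timezone.utc)
issued = [now - timedelta(hours=h) for h in (50, 40, 30, 20, 10)]
print(next_allowed_issuance(issued, now))
```

Under that assumption, further force_renew runs before the returned time would only produce more rateLimited errors for that hostname.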

jrcs/letsencrypt-nginx-proxy-companion docker-compose:

version: '2'

services:

  nginx-proxy:
    image: jwilder/nginx-proxy
    container_name: nginx-proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - conf:/etc/nginx/conf.d
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
      - dhparam:/etc/nginx/dhparam
      - certs:/etc/nginx/certs:ro
      - /var/run/docker.sock:/tmp/docker.sock:ro
    restart: always

  letsencrypt:
    image: jrcs/letsencrypt-nginx-proxy-companion
    container_name: nginx-proxy-le
    depends_on:
      - nginx-proxy
    volumes_from:
      - nginx-proxy
    volumes:
      - certs:/etc/nginx/certs
      - acme:/etc/acme.sh
      - /var/run/docker.sock:/var/run/docker.sock:ro
    restart: always

volumes:
  conf:
  vhost:
  html:
  dhparam:
  certs:
  acme:

networks:
  default:
    external:
      name: letsencrypt

sidecar application docker-compose:

version: '3.4'

networks:
  application_network:
services:
    ar_app:
      restart: "always"
      container_name: friis1
      build:
        context: app
      environment:
         VIRTUAL_HOST: friis1.marvinxr.com
         PORT: 6000
         LETSENCRYPT_EMAIL: [email protected]
         LETSENCRYPT_HOST: friis1.marvinxr.com
      expose:
        - "6000"


networks:
  default:
    external:
      name: letsencrypt

Dockerfile for application docker-compose (from above)

FROM python:3.11.0b3-alpine3.16

WORKDIR /app

COPY requirements.txt ./

RUN pip install -r requirements.txt

COPY . /app

CMD ["python","-u","server.py"]

I can see the full certificate chain and the acme keys stored correctly.
friis1.marvinxr.com (directory)

  • friis1.marvinxr.com.conf
  • friis1.marvinxr.com.csr
  • friis1.marvinxr.com.csr.conf
  • friis1.marvinxr.com.key

friis1.marvinxr.com.chain.pem
friis1.marvinxr.com.crt
friis1.marvinxr.com.dhparam.pem
friis1.marvinxr.com.key
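
One way to double-check a listing like the one above is to verify that the top-level `.crt`, `.key`, `.chain.pem`, and `.dhparam.pem` entries actually resolve to non-empty files, since a dangling symlink still shows up in a directory listing. A rough Python sketch: the file suffixes come from the listing above, while the `/etc/nginx/certs` path (from the compose file) and the helper name are assumptions for illustration:

```python
import os

# Suffixes taken from the listing above.
SUFFIXES = (".crt", ".key", ".chain.pem", ".dhparam.pem")


def check_cert_files(certs_dir, domain):
    """Map each expected file name to 'ok', 'missing', 'dangling symlink' or 'empty'."""
    report = {}
    for suffix in SUFFIXES:
        name = domain + suffix
        path = os.path.join(certs_dir, name)
        if os.path.islink(path) and not os.path.exists(path):
            # The link itself exists, but its target does not.
            report[name] = "dangling symlink"
        elif not os.path.exists(path):
            report[name] = "missing"
        elif os.path.getsize(path) == 0:
            report[name] = "empty"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    # Hypothetical invocation, meant to be run where the certs volume is mounted.
    for name, status in check_cert_files("/etc/nginx/certs", "friis1.marvinxr.com").items():
        print(f"{name}: {status}")
```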

The certificate worked perfectly for over a month, but then it got regenerated on June 7th.
The problem started there.

Doing curl gives me this result:

curl https://friis1.marvinxr.com
curl: (60) SSL certificate problem: self signed certificate
More details here: https://curl.haxx.se/docs/sslcerts.html

curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the web page mentioned above.

Mozilla says Error code: MOZILLA_PKIX_ERROR_SELF_SIGNED_CERT
https://friis1.marvinxr.com/

The certificate is not trusted because it is self-signed.

HTTP Strict Transport Security: false
HTTP Public Key Pinning: false

Certificate chain:

-----BEGIN CERTIFICATE-----
MIIFOTCCAyGgAwIBAgIUJURatmDfv7J5KmupgfDJEbixz7cwDQYJKoZIhvcNAQEL
BQAwLDEqMCgGA1UEAwwhbGV0c2VuY3J5cHQtbmdpbngtcHJveHktY29tcGFuaW9u
MB4XDTIyMDYwODIxMzcwOFoXDTIzMDYwODIxMzcwOFowLDEqMCgGA1UEAwwhbGV0
c2VuY3J5cHQtbmdpbngtcHJveHktY29tcGFuaW9uMIICIjANBgkqhkiG9w0BAQEF
AAOCAg8AMIICCgKCAgEAwvj257m7ha/E1VTS7wkOEqunZg/yAY/VWyDGHSE9LtvX
y0xS08DL1dPpo4TJyX2dmdgQTtj3R/5CfVIPNdk9fy6VRmq7kPomKioSDN+04Eip
I/0RaSYtRyhSWxzb7JLPTeb4J4eVGUlpyghaZV3jds2Eb8EESARR81muhtN432Mb
tgu6SIf4CR541ArjlBaAnRe94tA0/dL0la4kKtzgFM7dh7qm7vQkDbB5bCMFGd9J
BculjqWxbbDU9n3bXF3uKpXeF8dJV3SFzbv3xeerBB/m28isSzPpnBjXi/JMgs4X
ZY2T91Zek9dqcGGlcPzkey9obvhazot9fwaTE89vwmuDzsReJuiR/q7vLpEgKGLb
lCEAyJAGMczPj2qF/MPI6qOK+WaN7yjis4mVeQdVL0gLqtE1rPFAMSTKXmjvacTr
Jpm/IKzPQFquqWuUR/UInOvSwp8AjiHlQTcpx6d8nLPNEvCw6w7nWSPVDAYD4qup
VxcAXSGsqgm+p2lLo9vcD7ZZvdpsQN3s2l+P+6if3CFK0yHDjTf6rIUwmXbtOIww
VH+fScVSrZLYyk9X3YpXGX/k3vg+LfkOpgHepcqW0A4WpTsVtg5vGEfUct0z6Cmu
Ia0IZvecLtjkXeoRYxJiyzk790ZdDmmKfcJ2uS/tVW6B+iQgseinR/ydikPFwoUC
AwEAAaNTMFEwHQYDVR0OBBYEFGs4WOV5xq4B6saFQKhYJDLJbu4wMB8GA1UdIwQY
MBaAFGs4WOV5xq4B6saFQKhYJDLJbu4wMA8GA1UdEwEB/wQFMAMBAf8wDQYJKoZI
hvcNAQELBQADggIBAD0V/v3+s+RfSdBgDdpp95numDNwQi+y1RuN3VBB/JTAIfz/
pj0b5MU8dAL36fZdMHmV/cwUPxydZkUYBqZ6bpdjPIVVuT/gdr6Sk81Fba2PXHFq
Tk6DtbaQpEV47t+aIiPuI4c28cTkU+Ww/jRX8fwaB53RywlbLmshOpSFLLBCRbPP
GE85NaFettEnn6Hv3+oBmeVRDpI8z2pNhibcSiptScsPQQ2vpQlnTyAPoBgJv4Ro
65g1qSkPPMfzHx/Syk3yVgCOzZ7DfmGTLnn5sWOM09K+tk3zZUS0zZ6ObJ0rcCfs
TtVZeMdHLplrmpjIJGtq64UCkMQKtS7Hn3NkXAwkulpCKptDktikJnX+yOvONftc
6bpwOSmAcSUUqsFJJOhVibPNc81MdC4peUE9L9PPwhCnC/yDBKAd3nUKtnNTeXtp
GmPtLopSTZ/6kUPdsOBcPbKprck2SlnpO7pii2LD8cZvWVLppCLUAqGfyCHCdo4n
0oiOiArcZLiI5vOgyBg5SuE8nEevP8NcdV3F4zq9jCOZ9zjoAK5PDDaqhXZe8V/2
3McpsjL0EGtzAXHiPVEgOFaalsrpPGx73oa5Crz9HsrBwdFD6y3fJu7u90Iq6oOp
qdRag9mVQ/+GrtGqN4XGGtsrugF9RHp4I/oTmOr6nV9mlLunHwVQqlbuN1Pn
-----END CERTIFICATE-----
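
A note on the certificate above: base64-decoding its second line yields the string `letsencrypt-nginx-proxy-companion` as the common name, so what the browser receives appears to be a self-signed certificate created by the companion container, not the Let's Encrypt certificate stored for friis1.marvinxr.com. That would be consistent with nginx-proxy falling back to a default certificate for a vhost it finds no usable certificate for, though this is a hypothesis to verify, not a confirmed diagnosis. A small Python sketch of the check, using only the standard library; it does a plain byte search on the decoded data instead of real X.509 parsing, and the function name is made up for illustration:

```python
import base64

# Second line of the PEM block above, copied verbatim.
PEM_LINE = "BQAwLDEqMCgGA1UEAwwhbGV0c2VuY3J5cHQtbmdpbngtcHJveHktY29tcGFuaW9u"


def pem_body_contains(pem_text, needle):
    """Return True if the base64-decoded PEM body contains the given ASCII string."""
    stripped = (line.strip() for line in pem_text.splitlines())
    # Drop blank lines and the BEGIN/END armor lines, keep only base64 data.
    body = "".join(s for s in stripped if s and not s.startswith("-----"))
    return needle.encode("ascii") in base64.b64decode(body)


print(pem_body_contains(PEM_LINE, "letsencrypt-nginx-proxy-companion"))  # True
```

The same function also accepts a complete PEM file, so it can be pointed at the `.crt` files in the certs volume to tell a real certificate from this fallback one.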

Then I checked it with https://www.ssllabs.com, which says the certificate is invalid.

image

My question is: how can I recreate the certificates and get this working ASAP?

P.S. It's an AWS server, and the inbound security rules allow traffic as described in the documentation.
I have gone through almost all the issues here and have already checked the prerequisites multiple times (including DNS, wildcard, etc.).

image

Please help. I'm quite clueless now. Maybe it's a small thing that I'm missing here.

Thanks in advance for your help.

@uacaman

uacaman commented Jun 10, 2022

I have the same problem. It looks like the certificates are generated every time the service is started, even if you use named volumes.

@Swapratim
Author

Swapratim commented Jun 10, 2022

Yes, correct. And that makes me a little concerned, because for production usage there is a rate limit of 5 certificate renewals per week. If I have 100 certs failing, how can I be sure that all of them will be renewed properly after 7 days? There's no way for me to prioritize any of them.

By the way, have you found any solution, @uacaman?

@Swapratim
Author

@JrCs and @buchdag, any ideas or suggestions, please? The client is really not happy: they ordered a custom stencil with a QR code pointing to the HTTPS link, to spray-paint it across the city, and now it is of no use to them because of this problem.

@uacaman

uacaman commented Jun 10, 2022

Post your config, maybe I can help. In my case it was a typo in the volume mappings.

@buchdag
Member

buchdag commented Jun 10, 2022

@Swapratim could you try switching to Zero SSL instead of Let's Encrypt? Zero SSL does not have rate limits. Please check the docs for instructions.

@Swapratim
Author

Thanks @buchdag. I'd prefer Let's Encrypt over Zero SSL, since Zero SSL's reviews are not very good (Trustpilot).
I have two questions:

  1. If the certificate renewal limit is 5 per week and all certs get renewed whenever letsencrypt-nginx-proxy-companion starts, how are the rest of the 100 certs working fine?
  2. The cert for friis1.marvinxr.com was renewed on June 7th. What made it a bad/invalid certificate while the others remain functional?

I compared the issued cert with other containers' certs, and the formats are all the same.

@Swapratim
Author

> I have the same problem, i looks like the certificates are generate every time the service is started, even if you use named volumes

@uacaman I have posted all configs above.

@buchdag
Member

buchdag commented Jun 10, 2022

As a temporary / emergency fix, you can still obtain a certificate from an outside source and mount it inside the certificate volume for use with nginx-proxy, following the latter's documentation.

@buchdag
Member

buchdag commented Jun 10, 2022

On a side note I'd advise you to never ever use the latest tag of any container on a production environment.

@Swapratim
Author

> On a side note I'd advise you to never ever use the latest tag of any container on a production environment.

What is the most stable container version you suggest for jrcs/letsencrypt-nginx-proxy-companion and nginx-proxy?

@buchdag
Member

buchdag commented Jun 10, 2022

> I'd rather prefer to Letsencrypt than Zero SSL

Since you've blown through rate limiting you're pretty much out of that option if you want to fix this quickly. You can set up Zero SSL on a per container basis and keep Let's Encrypt for the others.

@uacaman

uacaman commented Jun 10, 2022

Are you using compose version 2? I am not familiar with that syntax, but you have to make sure that the volumes don't get recreated every time.

In my case, I map the volumes to the host like this:

volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- /opt/nginx/certs:/etc/nginx/certs
- /opt/nginx/vhost:/etc/nginx/vhost.d
- /opt/nginx/html:/usr/share/nginx/html
- /opt/nginx/acme:/etc/acme.sh

@Swapratim
Author

Swapratim commented Jun 10, 2022

The disaster happened with version 3, so I decided to move back to version 2, which I had been running for over a year.
The volumes are mapped and are only created once.

@uacaman

uacaman commented Jun 10, 2022

If the volumes are mapped, check them: the generated certificates should be in there.

@Swapratim
Author

As described above, the certs are there, and I have provided the cert details too. But somehow this one is not working.

@uacaman

uacaman commented Jun 10, 2022

And if you connect to the Docker container and check the file system, are the certs there? Since you hit the rate limit, the only options are to find the already generated certs and use them, or to wait the 7 days.

@Swapratim
Author

@uacaman That is my problem. I have the certs, they seem legitimate, and they were renewed by Let's Encrypt 4 days ago. But the certificate shows as invalid for the Docker app when it is opened in a browser. That is what I wanted to ask about initially. Maybe I was not clear enough before.

@hadpro24

I have the same problem. Did you find a solution?

@wekaz

wekaz commented Sep 4, 2022

Does anyone have a solution yet? I thought it was just me.

@edruder

edruder commented Dec 24, 2022

I'm in a very similar situation to what's described here. acme-companion is getting a 503 error on GET /.well-known/acme-challenge/<random token>. (A static site also behind nginx-proxy is returning 404s.)

  • I'm using docker-compose, "three containers example", essentially copied from the wiki.
  • I'm using zerossl.com because I got rate-limited by letsencrypt.
  • nginx log for the acme-challenge against my app:
    "GET /.well-known/acme-challenge/UjkSifSqQtA67Decxn-jA7xn2O6dlC0OSl39I8JqtyY HTTP/1.1" 503 190 "-" "acme.zerossl.com/v2/DV90" "-"
  • I've disabled IPv6 for my domain.
  • docker run -d -p 80:80 nginx:alpine on my remote box returns the expected Nginx page.
  • https://unboundtest.com/ reports validation success.
  • My docker-compose.yml:
    version: '2'
    
    services:
      nginx-proxy:
        image: nginx:alpine
        container_name: nginx-proxy
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - conf:/etc/nginx/conf.d
          - vhost:/etc/nginx/vhost.d
          - html:/usr/share/nginx/html
          - certs:/etc/nginx/certs:ro
        network_mode: bridge
    
      nginx-proxy-gen:
        image: nginxproxy/docker-gen:latest
        container_name: nginx-proxy-gen
        command: -notify-sighup nginx-proxy -watch /etc/docker-gen/templates/nginx.tmpl /etc/nginx/conf.d/default.conf
        volumes_from:
          - nginx-proxy
        volumes:
          - ./nginx.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro
          - /var/run/docker.sock:/tmp/docker.sock:ro
        labels:
          - "com.github.jrcs.letsencrypt_nginx_proxy_companion.docker_gen"
        network_mode: bridge
    
      nginx-proxy-acme:
        image: nginxproxy/acme-companion:latest
        container_name: nginx-proxy-acme
        volumes_from:
          - nginx-proxy
        volumes:
          - certs:/etc/nginx/certs:rw
          - acme:/etc/acme.sh
          - /var/run/docker.sock:/var/run/docker.sock:ro
        environment:
          - [email protected]
          - ACME_CA_URI=https://acme.zerossl.com/v2/DV90
          - ZEROSSL_API_KEY=<something something>
        network_mode: bridge
    
    volumes:
      acme:
      certs:
      conf:
      html:
      vhost:

Any suggestions?
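
To see how many challenge requests are failing and with which status, the nginx access log can be filtered for ACME challenge paths. A small Python sketch, assuming log lines shaped like the fragment quoted above (request, status, size); the regular expression and function name are illustrative and not part of acme-companion:

```python
import re

# Matches the request/status part of an access log line like the one quoted above.
CHALLENGE_RE = re.compile(
    r'"GET /\.well-known/acme-challenge/(?P<token>[A-Za-z0-9_-]+) HTTP/[0-9.]+" (?P<status>\d{3}) '
)


def challenge_statuses(log_lines):
    """Yield (token, status) for every ACME challenge request found in the log lines."""
    for line in log_lines:
        match = CHALLENGE_RE.search(line)
        if match:
            yield match.group("token"), int(match.group("status"))


# The log fragment from the comment above, used as sample input.
sample = (
    '"GET /.well-known/acme-challenge/UjkSifSqQtA67Decxn-jA7xn2O6dlC0OSl39I8JqtyY HTTP/1.1" '
    '503 190 "-" "acme.zerossl.com/v2/DV90" "-"'
)
print(list(challenge_statuses([sample])))
```

Feeding it the output of `docker logs` for the proxy container (for example via `sys.stdin`) would show whether every challenge gets a 503 or only some hosts do.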
