This repository has been archived by the owner on Aug 7, 2020. It is now read-only.

Healthcheck makes emqx exit abnormally #143

Open
renatomotorline opened this issue May 8, 2020 · 5 comments

@renatomotorline

Expected behavior

The container executes the healthcheck and does not exit.

Actual behavior

The container exits abnormally approximately 40 seconds after start, even though the healthcheck interval is 2m.
If I remove the healthcheck, everything works.
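
Docker runs the first health probe one interval after the container starts, so with a 2m interval a container that exits after 40 seconds has probably never been probed. One way to check this is to read the state Docker recorded for the container. The sketch below is only an illustration: the function name container_state is hypothetical, and it has to be run on the node where the container was scheduled.

```shell
#!/bin/sh
# container_state is a hypothetical helper name, not part of Docker or EMQ X.
# It prints the container status, its exit code, and the health status
# ("none" when the container has no health check configured).
container_state() {
  docker inspect \
    --format '{{.State.Status}} exit={{.State.ExitCode}} health={{if .State.Health}}{{.State.Health.Status}}{{else}}none{{end}}' \
    "$1"
}
```

Calling container_state with a container name or ID prints one line. A health value of "starting" on an exited container would suggest that no probe had completed before the exit.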

Docker logs

node.max_ports=1048576
listener.tcp.external.acceptors=64
listener.ssl.external.acceptors=32
node.process_limit=2097152
node.max_ets_tables=2097152
cluster.discovery=static
cluster.discovery=static
listener.ws.external.acceptors=16
[email protected]
[email protected], [email protected], [email protected]
[email protected], [email protected], [email protected]
EMQX_LOADED_PLUGINS="emqx_management,emqx_auth_http,emqx_recon,emqx_retainer,emqx_dashboard"

=====
===== LOGGING STARTED Fri May  8 14:28:32 UTC 2020
=====
Exec: /opt/emqx/erts-10.5.6/bin/erlexec -boot /opt/emqx/releases/v4.0.5/emqx -mode embedded -boot_var ERTS_LIB_DIR /opt/emqx/erts-10.5.6/../lib -mnesia dir "/opt/emqx/data/mnesia/[email protected]" -config /opt/emqx/data/configs/app.2020.05.08.14.28.34.config -args_file /opt/emqx/data/configs/vm.2020.05.08.14.28.34.args -vm_args /opt/emqx/data/configs/vm.2020.05.08.14.28.34.args -start_epmd false -epmd_module ekka_epmd -proto_dist ekka -- console
Root: /opt/emqx
/opt/emqx
Starting emqx on node [email protected]
Start http:management listener on 8081 successfully.
Start http:dashboard listener on 18083 successfully.
Start mqtt:tcp listener on 127.0.0.1:11883 successfully.
Start mqtt:tcp listener on 0.0.0.0:1883 successfully.
Start mqtt:ws listener on 0.0.0.0:8083 successfully.
Start mqtt:ssl listener on 0.0.0.0:8883 successfully.
Start mqtt:wss listener on 0.0.0.0:8084 successfully.
EMQ X Broker 4.0.5 is running now!
Eshell V10.5.6  (abort with ^G)
([email protected])1> ['2020-05-08T14:29:04Z']:emqx exit abnormally

Test Dockerfile

# Use the official emqx image
FROM emqx/emqx:v4.0.5

# Go to config folder
WORKDIR /opt/emqx/etc

# Install curl for the health check
RUN sudo apk add --no-cache curl

# Set user
USER emqx

# Copy configurations
COPY emqx.conf .
COPY plugins/* ./plugins/

# emqx will occupy these ports:
# - 1883 port for MQTT
# - 8080 for mgmt API
# - 8083 for WebSocket/HTTP
# - 8084 for WSS/HTTPS
# - 8883 port for MQTT(SSL)
# - 11883 port for internal MQTT/TCP
# - 18083 for dashboard
# - 4369 for port mapping
# - 5369 for gen_rpc port mapping
# - 6369 for distributed node
EXPOSE 1883 8080 8081 8083 8084 8883 11883 18083 4369 5369 6369

HEALTHCHECK --interval=2m --timeout=3s --retries=3 \
  CMD curl -f --basic -u emqx-backend:public -k http://localhost:8081/api/v4/brokers || exit 1

EMQ version

emqx/emqx:v4.0.5

Docker version

Which docker-engine version?

docker -v
Docker version 19.03.8, build afacb8b

Output of docker info?

docker info
Docker version 19.03.8, build afacb8b
[root@docker1 docker-workspace]# docker info
Client:
 Debug Mode: false

Server:
 Containers: 5
  Running: 5
  Paused: 0
  Stopped: 0
 Images: 40
 Server Version: 19.03.8
 Storage Driver: overlay2
  Backing Filesystem: <unknown>
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: active
  NodeID: qlm1hzy33qg8h8b8oe18httvo
  Is Manager: true
  ClusterID: lod1cg5a3dmoc87v839s4i24u
  Managers: 1
  Nodes: 3
  Default Address Pool: 10.0.0.0/8  
  SubnetSize: 24
  Data Path Port: 4789
  Orchestration:
   Task History Retention Limit: 5
  Raft:
   Snapshot Interval: 10000
   Number of Old Snapshots to Retain: 0
   Heartbeat Tick: 1
   Election Tick: 10
  Dispatcher:
   Heartbeat Period: 5 seconds
  CA Configuration:
   Expiry Duration: 3 months
   Force Rotate: 0
  Autolock Managers: false
  Root Rotation In Progress: false
  Node Address: 192.168.16.21
  Manager Addresses:
   192.168.16.21:2377
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.18.0-147.8.1.el8_1.x86_64
 Operating System: CentOS Linux 8 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 3.692GiB
 Name: docker1.localdomain
 ID: RZZ2:XOOT:4FIB:GH4B:G6LD:6Z3K:7RDO:UMJ3:3FHV:3XC2:LEWH:NI5A
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

System

What system do you use?
CentOS Linux release 8.1.1911 (Core)

@Rory-Z
Contributor

Rory-Z commented May 9, 2020

Hi @renatomotorline, sorry for the late reply.
I suggest changing the health check address to http://127.0.0.1:8081/status and trying again.
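
As a rough sketch of that suggestion, the probe could be wrapped in a small shell function. The function name emqx_healthcheck and the 3 second limit (mirroring the --timeout=3s in the Dockerfile above) are assumptions for illustration; only the URL comes from this thread.

```shell
#!/bin/sh
# emqx_healthcheck is a hypothetical helper name.
# It returns 0 when the HTTP request succeeds and non-zero otherwise.
# -f makes curl fail on HTTP error responses, -sS hides progress output
# but still prints errors, --max-time caps the whole request at 3 seconds.
emqx_healthcheck() {
  url="${1:-http://127.0.0.1:8081/status}"
  curl -fsS --max-time 3 "$url" > /dev/null
}
```

Inside the container this could be run as emqx_healthcheck || exit 1, which is the shape Docker expects from a HEALTHCHECK command.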

@renatomotorline
Author

@zhanghongtong the problem persists, even when I set the healthcheck command to exit 0.

@Rory-Z
Contributor

Rory-Z commented May 12, 2020

@renatomotorline Can you share the contents of log/emqx.log.* and log/erlang.log.* after the health check failed?

@renatomotorline
Author

I think the failure happens before the health check runs, because the health check has an interval of 2 minutes and the container fails after approximately 40 seconds.
If I run the container directly, it works fine.

docker run emqx-backend:1.0.0

If I deploy it to Docker Swarm, the problem occurs.

docker stack deploy --with-registry-auth -c docker-compose.yml emqx-backend
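
Since the failure only shows up under Swarm, the task state and the error that Swarm records for the service may help narrow it down. The sketch below is an assumption-laden illustration: the function name swarm_task_report is hypothetical, and the service name in the usage note assumes the usual stack_service naming produced by docker stack deploy.

```shell
#!/bin/sh
# swarm_task_report is a hypothetical helper name.
# It lists every task of a Swarm service with its current state and the
# error Swarm recorded for it, then prints the last 50 log lines.
swarm_task_report() {
  service="$1"
  docker service ps --no-trunc \
    --format '{{.Name}} {{.CurrentState}} {{.Error}}' "$service"
  docker service logs --tail 50 "$service"
}
```

On a manager node this would be called as swarm_task_report emqx-backend_emqx-backend01.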

The /opt/emqx/log/emqx.log.* files contain only an unreadable character (�).

/opt/emqx/log/erlang.log.*

cat /var/lib/docker/volumes/emqx-backend_emqx-backend01-logs/_data/erlang.log.*

=====
===== LOGGING STARTED Tue May 12 09:46:43 UTC 2020
=====
Exec: /opt/emqx/erts-10.5.6/bin/erlexec -boot /opt/emqx/releases/v4.0.5/emqx -mode embedded -boot_var ERTS_LIB_DIR /opt/emqx/erts-10.5.6/../lib -mnesia dir "/opt/emqx/data/mnesia/[email protected]" -config /opt/emqx/data/configs/app.2020.05.12.09.46.45.config -args_file /opt/emqx/data/configs/vm.2020.05.12.09.46.45.args -vm_args /opt/emqx/data/configs/vm.2020.05.12.09.46.45.args -start_epmd false -epmd_module ekka_epmd -proto_dist ekka -- console
Root: /opt/emqx
/opt/emqx
Starting emqx on node [email protected]
Start http:management listener on 8081 successfully.
Start http:dashboard listener on 18083 successfully.
Start mqtt:tcp listener on 127.0.0.1:11883 successfully.
Start mqtt:tcp listener on 0.0.0.0:1883 successfully.
Start mqtt:ws listener on 0.0.0.0:8083 successfully.
Start mqtt:ssl listener on 0.0.0.0:8883 successfully.
Start mqtt:wss listener on 0.0.0.0:8084 successfully.
EMQ X Broker 4.0.5 is running now!
Eshell V10.5.6  (abort with ^G)

docker-compose.yml

version: "3.3"

services:
  emqx-backend01:
    image: emqx-backend:1.0.0
    environment:
      - "EMQX_LOADED_PLUGINS=\"emqx_management,emqx_auth_http,emqx_recon,emqx_retainer,emqx_dashboard\""
      - "EMQX_NAME=emqx"
      - "EMQX_HOST=emqx.backend01"
      - "EMQX_CLUSTER__DISCOVERY=static"
      - "[email protected], [email protected], [email protected]"
    volumes:
      - emqx-backend01-logs:/opt/emqx/log
    deploy:
      replicas: 1
      placement:
        constraints:
          # - node.role == worker
          - node.labels.worker == 1
      restart_policy:
        condition: none
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=local-network"
        # http with redirection
        - "traefik.http.routers.emqx-backend-dashboard.entrypoints=web"
        - "traefik.http.routers.emqx-backend-dashboard.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.middlewares.redirect-middleware.redirectscheme.scheme=https"
        - "traefik.http.routers.emqx-backend-dashboard.middlewares=redirect-middleware"
        # https
        - "traefik.http.routers.emqx-backend-dashboard-secure.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.routers.emqx-backend-dashboard-secure.entrypoints=websecure"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls=true"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls.certresolver=myresolver"
        - "traefik.http.routers.emqx-backend-dashboard-secure.service=service-emqx-backend-dashboard"
        - "traefik.http.services.service-emqx-backend-dashboard.loadbalancer.server.port=18083"
        # mqtts
        - "traefik.tcp.routers.emqx-backend-mqtts.entrypoints=mqtts-backend"
        - "traefik.tcp.routers.emqx-backend-mqtts.rule=HostSNI(`<HOSTNAME>`)"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls=true"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls.certresolver=myresolver"
        - "traefik.tcp.routers.emqx-backend-mqtts.service=service-mqtts-backend"
        - "traefik.tcp.services.service-mqtts-backend.loadbalancer.server.port=1883"
    networks:
      local-network:
        aliases:
        - emqx.backend01
        - emqx.backend

  emqx-backend02:
    image: emqx-backend:1.0.0
    environment:
      - "EMQX_LOADED_PLUGINS=\"emqx_management,emqx_auth_http,emqx_recon,emqx_retainer,emqx_dashboard\""
      - "EMQX_NAME=emqx"
      - "EMQX_HOST=emqx.backend02"
      - "EMQX_CLUSTER__DISCOVERY=static"
      - "[email protected], [email protected], [email protected]"
    deploy:
      replicas: 1
      placement:
        constraints:
          # - node.role == worker
          - node.labels.worker == 2
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=local-network"
        # http with redirection
        - "traefik.http.routers.emqx-backend-dashboard.entrypoints=web"
        - "traefik.http.routers.emqx-backend-dashboard.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.middlewares.redirect-middleware.redirectscheme.scheme=https"
        - "traefik.http.routers.emqx-backend-dashboard.middlewares=redirect-middleware"
        # https
        - "traefik.http.routers.emqx-backend-dashboard-secure.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.routers.emqx-backend-dashboard-secure.entrypoints=websecure"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls=true"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls.certresolver=myresolver"
        - "traefik.http.routers.emqx-backend-dashboard-secure.service=service-emqx-backend-dashboard"
        - "traefik.http.services.service-emqx-backend-dashboard.loadbalancer.server.port=18083"
        # mqtts
        - "traefik.tcp.routers.emqx-backend-mqtts.entrypoints=mqtts-backend"
        - "traefik.tcp.routers.emqx-backend-mqtts.rule=HostSNI(`<HOSTNAME>`)"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls=true"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls.certresolver=myresolver"
        - "traefik.tcp.routers.emqx-backend-mqtts.service=service-mqtts-backend"
        - "traefik.tcp.services.service-mqtts-backend.loadbalancer.server.port=1883"
    networks:
      local-network:
        aliases:
        - emqx.backend02
        - emqx.backend

  emqx-backend03:
    image: emqx-backend:1.0.0
    environment:
      - "EMQX_LOADED_PLUGINS=\"emqx_management,emqx_auth_http,emqx_recon,emqx_retainer,emqx_dashboard\""
      - "EMQX_NAME=emqx"
      - "EMQX_HOST=emqx.backend03"
      - "EMQX_CLUSTER__DISCOVERY=static"
      - "[email protected], [email protected], [email protected]"
    deploy:
      replicas: 1
      placement:
        constraints:
          # - node.role == worker
          # - node.labels.mongo.replica == 3
          - node.labels.worker == 3
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=local-network"
        # http with redirection
        - "traefik.http.routers.emqx-backend-dashboard.entrypoints=web"
        - "traefik.http.routers.emqx-backend-dashboard.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.middlewares.redirect-middleware.redirectscheme.scheme=https"
        - "traefik.http.routers.emqx-backend-dashboard.middlewares=redirect-middleware"
        # https
        - "traefik.http.routers.emqx-backend-dashboard-secure.rule=Host(`<HOSTNAME>`)"
        - "traefik.http.routers.emqx-backend-dashboard-secure.entrypoints=websecure"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls=true"
        - "traefik.http.routers.emqx-backend-dashboard-secure.tls.certresolver=myresolver"
        - "traefik.http.routers.emqx-backend-dashboard-secure.service=service-emqx-backend-dashboard"
        - "traefik.http.services.service-emqx-backend-dashboard.loadbalancer.server.port=18083"
        # mqtts
        - "traefik.tcp.routers.emqx-backend-mqtts.entrypoints=mqtts-backend"
        - "traefik.tcp.routers.emqx-backend-mqtts.rule=HostSNI(`<HOSTNAME>`)"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls=true"
        - "traefik.tcp.routers.emqx-backend-mqtts.tls.certresolver=myresolver"
        - "traefik.tcp.routers.emqx-backend-mqtts.service=service-mqtts-backend"
        - "traefik.tcp.services.service-mqtts-backend.loadbalancer.server.port=1883"
    networks:
      local-network:
        aliases:
        - emqx.backend03
        - emqx.backend

networks:
  local-network:
    external: true

volumes:
  emqx-backend01-logs:

@Rory-Z
Contributor

Rory-Z commented May 14, 2020

@renatomotorline Sorry, I don't have a Docker Swarm cluster, so I can't test the docker-compose.yml file you gave me. However, after I deleted the deploy sections from the file and set the network to driver: bridge, the cluster deployed successfully with docker-compose and the health check passed. I therefore recommend reviewing your own Docker Swarm deployment.
