[BUG] Internal SMTP connection fails with App Mesh #572

Closed
dhilgarth opened this issue Jul 1, 2023 · 1 comment
Labels
bug Something isn't working

Describe the bug
Service A tries to send emails via an SMTP server running in Service B. The connection fails with "Unexpected socket close".
Removing the Envoy container and the proxy configuration from the SMTP server task fixes the issue.

Platform
ECS Fargate

To Reproduce

  • Create task definitions for the app task that sends the mail and for the SMTP server task (I've used mailhog/mailhog:latest for this). Both task definitions have to include the Envoy container definition as well as the proxy configuration.
  • Create virtual services, virtual routers and virtual nodes for both. Add the SMTP server as a backend to the app virtual node.
  • When everything is started, try to send an email from the app to the SMTP server. It will fail. The Envoy log of the app task shows that it tries to connect to the SMTP task, but the Envoy log of the SMTP task shows nothing about an incoming connection.
  • Remove only the proxy configuration and the Envoy container definition from the SMTP server task definition, leaving the App Mesh configuration untouched, and it will start working.
  • The HTTP connection from the app task to the Mailhog UI on port 8025 always works
  • The security group allows all inbound traffic from the VPC
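
To separate the failure from the application's mail library, the connection attempt in the steps above can be approximated with a bare TCP probe. This is a minimal sketch and not part of the original reproduction: `probe_smtp_banner` is a hypothetical helper name, and it relies only on the fact that an SMTP server sends a `220` greeting line before the client sends anything.

```python
import socket


def probe_smtp_banner(host: str, port: int, timeout: float = 5.0) -> str:
    """Open a TCP connection and return the first line the server sends.

    SMTP is a server-first protocol: the server sends a "220 ..." greeting
    before the client writes anything, so this probe only reads.
    Raises OSError if the connection fails and socket.timeout if the
    server stays silent for longer than `timeout` seconds.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.settimeout(timeout)
        data = b""
        # Read until the greeting line is complete or the peer closes.
        while not data.endswith(b"\n"):
            chunk = conn.recv(1024)
            if not chunk:  # peer closed before finishing a greeting line
                break
            data += chunk
        return data.decode("ascii", errors="replace").strip()
```

Run inside the app container, it could be called as `probe_smtp_banner(os.environ["MAIL_HOST"], int(os.environ["MAIL_PORT"]))`, using the two environment variables from the app task definition. An empty return value means the peer closed the socket without sending a greeting.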

Expected behavior
The connection should succeed even with the Envoy sidecar.

Config files, and API responses

Virtual Node App

meshName: dhtesting-local_mesh
virtualNodeName: app-vn
spec:
  backends:
    - virtualService:
        virtualServiceName: mailhog.dhtesting-local.svc.cluster.local
  listeners:
    - portMapping:
        port: 8000
        protocol: http
  logging:
    accessLog:
      file:
        path: /dev/stdout
  serviceDiscovery:
    awsCloudMap:
      namespaceName: dhtesting-local.svc.cluster.local
      serviceName: app

Virtual Node Mailhog

meshName: dhtesting-local_mesh
virtualNodeName: mailhog-vn
spec:
  backends: []
  listeners:
    - portMapping:
        port: 8025
        protocol: http
    - portMapping:
        port: 1025
        protocol: tcp
  logging:
    accessLog:
      file:
        path: /dev/stdout
  serviceDiscovery:
    awsCloudMap:
      namespaceName: dhtesting-local.svc.cluster.local
      serviceName: mailhog

Virtual Router Mailhog

meshName: dhtesting-local_mesh
virtualRouterName: mailhog-vr
spec:
  listeners:
    - portMapping:
        port: 8025
        protocol: http
    - portMapping:
        port: 1025
        protocol: tcp

Route Mailhog 1025

meshName: dhtesting-local_mesh
routeName: mailhog_route_1025
virtualRouterName: mailhog-vr
spec:
  tcpRoute:
    action:
      weightedTargets:
        - port: 1025
          virtualNode: mailhog-vn
          weight: 1
    match:
      port: 1025

Route Mailhog 8025

meshName: dhtesting-local_mesh
routeName: mailhog_route_8025
virtualRouterName: mailhog-vr
spec:
  httpRoute:
    action:
      weightedTargets:
        - port: 8025
          virtualNode: mailhog-vn
          weight: 1
    match:
      port: 8025
      prefix: /

Virtual Service Mailhog

meshName: dhtesting-local_mesh
virtualServiceName: mailhog.dhtesting-local.svc.cluster.local
spec:
  provider:
    virtualRouter:
      virtualRouterName: mailhog-vr

Task definition App

{
    "family": "dhtesting-local_app",
    "containerDefinitions": [
        {
            "name": "app",
            "image": "...",
            "cpu": 0,
            "portMappings": [
                {
                    "name": "app-8000-tcp",
                    "containerPort": 8000,
                    "hostPort": 8000,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [
                {
                    "name": "MAIL_HOST",
                    "value": "mailhog.dhtesting-local.svc.cluster.local"
                },
                {
                    "name": "MAIL_PORT",
                    "value": "1025"
                }
            ],
            "dependsOn": [
                {
                    "containerName": "envoy",
                    "condition": "HEALTHY"
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/dhtesting/local/services/app",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream-prefix": "_"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -f http://localhost:8000/ || exit 1"
                ],
                "interval": 30,
                "timeout": 5,
                "retries": 3
            }
        },
        {
            "name": "envoy",
            "image": "public.ecr.aws/appmesh/aws-appmesh-envoy:v1.25.4.0-prod",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "environment": [
                {
                    "name": "APPMESH_RESOURCE_ARN",
                    "value": "arn:aws:appmesh:eu-central-1:XXX:mesh/dhtesting-local_mesh/virtualNode/app-vn"
                },
                {
                    "name": "ENVOY_LOG_LEVEL",
                    "value": "debug"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "user": "1337",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/dhtesting/local/services/app",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream-prefix": "_"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
                ],
                "interval": 5,
                "timeout": 2,
                "retries": 3,
                "startPeriod": 10
            }
        }
    ],
    "taskRoleArn": "arn:aws:iam::XXX:role/dhtesting-local_ecs-task-role",
    "executionRoleArn": "arn:aws:iam::XXX:role/dhtesting-local_ecs_execution_role",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "2048",
    "memory": "4096",
    "proxyConfiguration": {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [
            {
                "name": "ProxyIngressPort",
                "value": "15000"
            },
            {
                "name": "AppPorts",
                "value": "8000"
            },
            {
                "name": "EgressIgnoredIPs",
                "value": "169.254.170.2,169.254.169.254"
            },
            {
                "name": "IgnoredUID",
                "value": "1337"
            },
            {
                "name": "ProxyEgressPort",
                "value": "15001"
            }
        ]
    }
}

Task definition Mailhog

{
    "family": "dhtesting-local_mailhog",
    "containerDefinitions": [
        {
            "name": "mailhog",
            "image": "mailhog/mailhog:latest",
            "cpu": 0,
            "portMappings": [
                {
                    "containerPort": 8025,
                    "hostPort": 8025,
                    "protocol": "tcp"
                },
                {
                    "containerPort": 1025,
                    "hostPort": 1025,
                    "protocol": "tcp"
                }
            ],
            "essential": true,
            "environment": [],
            "mountPoints": [],
            "volumesFrom": [],
            "dependsOn": [
                {
                    "containerName": "envoy",
                    "condition": "HEALTHY"
                }
            ],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/dhtesting/local/services/mailhog",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream-prefix": "_"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "wget -qO- http://localhost:8025/ > /dev/null 2>&1 || exit 1"
                ],
                "interval": 30,
                "timeout": 5,
                "retries": 3
            }
        },
        {
            "name": "envoy",
            "image": "public.ecr.aws/appmesh/aws-appmesh-envoy:v1.25.4.0-prod",
            "cpu": 0,
            "portMappings": [],
            "essential": true,
            "environment": [
                {
                    "name": "APPMESH_RESOURCE_ARN",
                    "value": "arn:aws:appmesh:eu-central-1:XXX:mesh/dhtesting-local_mesh/virtualNode/mailhog-vn"
                },
                {
                    "name": "ENVOY_LOG_LEVEL",
                    "value": "debug"
                }
            ],
            "mountPoints": [],
            "volumesFrom": [],
            "user": "1337",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-create-group": "true",
                    "awslogs-group": "/dhtesting/local/services/mailhog",
                    "awslogs-region": "eu-central-1",
                    "awslogs-stream-prefix": "_"
                }
            },
            "healthCheck": {
                "command": [
                    "CMD-SHELL",
                    "curl -s http://localhost:9901/server_info | grep state | grep -q LIVE"
                ],
                "interval": 5,
                "timeout": 2,
                "retries": 3,
                "startPeriod": 10
            }
        }
    ],
    "taskRoleArn": "arn:aws:iam::XXX:role/dhtesting-local_ecs-task-role",
    "executionRoleArn": "arn:aws:iam::XXX:role/dhtesting-local_ecs_execution_role",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "FARGATE"
    ],
    "cpu": "256",
    "memory": "512",
    "proxyConfiguration": {
        "type": "APPMESH",
        "containerName": "envoy",
        "properties": [
            {
                "name": "ProxyIngressPort",
                "value": "15000"
            },
            {
                "name": "AppPorts",
                "value": "8025,1025"
            },
            {
                "name": "EgressIgnoredIPs",
                "value": "169.254.170.2,169.254.169.254"
            },
            {
                "name": "IgnoredUID",
                "value": "1337"
            },
            {
                "name": "ProxyEgressPort",
                "value": "15001"
            }
        ]
    }
}
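
One thing worth ruling out with a setup like this is a mismatch between the `AppPorts` property of the proxy configuration and the listener ports of the virtual node. The following is a rough sketch of such a check, assuming the task definition and the virtual node spec have already been loaded into Python dicts shaped like the JSON and YAML above; the function names are hypothetical and not part of any AWS tooling.

```python
def app_ports(task_definition: dict) -> set[int]:
    """Return the ports listed in the AppPorts proxy property."""
    props = task_definition.get("proxyConfiguration", {}).get("properties", [])
    for prop in props:
        if prop["name"] == "AppPorts":
            return {int(p) for p in prop["value"].split(",") if p.strip()}
    return set()


def listener_ports(virtual_node_spec: dict) -> set[int]:
    """Return the ports of all listeners in a virtual node spec."""
    return {
        listener["portMapping"]["port"]
        for listener in virtual_node_spec.get("listeners", [])
    }


def missing_app_ports(task_definition: dict, virtual_node_spec: dict) -> set[int]:
    """Listener ports that are not covered by AppPorts."""
    return listener_ports(virtual_node_spec) - app_ports(task_definition)
```

For the Mailhog configuration in this report the result is an empty set: both 8025 and 1025 appear in the listeners and in `AppPorts`, so this particular mismatch does not explain the failure.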

Log from app-envoy

[2023-07-01 09:35:58.512][55][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:211] [C197] new tcp proxy session
[2023-07-01 09:35:58.512][55][trace][connection] [source/common/network/connection_impl.cc:362] [C197] readDisable: disable=true disable_count=0 state=0 buffer_length=0
[2023-07-01 09:35:58.512][55][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:383] [C197] Creating connection to cluster cds_egress_dhtesting-local_mesh_mailhog-vn_tcp_1025
[2023-07-01 09:35:58.512][55][trace][connection] [source/common/network/connection_impl.cc:423] [C197] raising connection event 2
[2023-07-01 09:35:58.512][55][trace][filter] [source/common/tcp_proxy/tcp_proxy.cc:670] [C197] on downstream event 2, has upstream = false
[2023-07-01 09:35:58.512][55][debug][conn_handler] [source/extensions/listener_managers/listener_manager/active_tcp_listener.cc:147] [C197] new connection from 10.0.1.55:46644
[2023-07-01 09:35:58.512][55][trace][connection] [source/common/network/connection_impl.cc:568] [C197] socket event: 2
[2023-07-01 09:35:58.512][55][trace][connection] [source/common/network/connection_impl.cc:679] [C197] write ready
[2023-07-01 09:35:58.513][55][trace][connection] [source/common/network/connection_impl.cc:362] [C197] readDisable: disable=false disable_count=1 state=0 buffer_length=0
[2023-07-01 09:35:58.513][55][debug][filter] [source/common/tcp_proxy/tcp_proxy.cc:748] [C197] TCP:onUpstreamEvent(), requestedServerName:
[2023-07-01 09:35:58.513][55][trace][connection] [source/common/network/connection_impl.cc:568] [C197] socket event: 2
[2023-07-01 09:35:58.513][55][trace][connection] [source/common/network/connection_impl.cc:679] [C197] write ready
[2023-07-01 09:36:13.518][55][trace][filter] [source/common/tcp_proxy/tcp_proxy.cc:697] [C197] upstream connection received 0 bytes, end_stream=true
[2023-07-01 09:36:13.518][55][trace][connection] [source/common/network/connection_impl.cc:483] [C197] writing 0 bytes, end_stream true
[2023-07-01 09:36:13.518][55][trace][connection] [source/common/network/connection_impl.cc:568] [C197] socket event: 2
[2023-07-01 09:36:13.518][55][trace][connection] [source/common/network/connection_impl.cc:679] [C197] write ready
[2023-07-01 09:36:13.518][55][trace][connection] [source/common/network/connection_impl.cc:568] [C197] socket event: 2
[2023-07-01 09:36:13.518][55][trace][connection] [source/common/network/connection_impl.cc:679] [C197] write ready
[2023-07-01 09:36:13.519][55][trace][connection] [source/common/network/connection_impl.cc:568] [C197] socket event: 3
[2023-07-01 09:36:13.519][55][trace][connection] [source/common/network/connection_impl.cc:679] [C197] write ready
[2023-07-01 09:36:13.519][55][trace][connection] [source/common/network/connection_impl.cc:608] [C197] read ready. dispatch_buffered_data=0
[2023-07-01 09:36:13.519][55][trace][connection] [source/common/network/raw_buffer_socket.cc:24] [C197] read returns: 0
[2023-07-01 09:36:13.519][55][trace][filter] [source/common/tcp_proxy/tcp_proxy.cc:623] [C197] downstream connection received 0 bytes, end_stream=true
[2023-07-01 09:36:13.519][55][debug][connection] [source/common/network/connection_impl.cc:656] [C197] remote close
[2023-07-01 09:36:13.519][55][debug][connection] [source/common/network/connection_impl.cc:250] [C197] closing socket: 0
[2023-07-01 09:36:13.519][55][trace][connection] [source/common/network/connection_impl.cc:423] [C197] raising connection event 0
[2023-07-01 09:36:13.519][55][trace][filter] [source/common/tcp_proxy/tcp_proxy.cc:670] [C197] on downstream event 0, has upstream = true
[2023-07-01 09:36:13.519][55][trace][conn_handler] [source/extensions/listener_managers/listener_manager/active_stream_listener_base.cc:111] [C197] connection on event 0
[2023-07-01 09:36:13.519][55][debug][conn_handler] [source/extensions/listener_managers/listener_manager/active_stream_listener_base.cc:120] [C197] adding to cleanup list

IP 10.0.1.55 is the IP address of the App Task
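
The log above shows a gap between the moment Envoy creates the upstream connection and the moment the upstream side ends the stream. A small sketch for measuring that gap from the log lines follows. It assumes the timestamp format and the two message strings exactly as they appear in this log; the function names are made up for illustration.

```python
import re
from datetime import datetime

# Leading "[YYYY-MM-DD HH:MM:SS.mmm]" timestamp of an Envoy log line.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def parse_timestamp(line: str) -> datetime:
    match = TIMESTAMP.match(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")


def seconds_until_upstream_close(lines):
    """Seconds between 'Creating connection to cluster' and the first
    following 'upstream connection received 0 bytes, end_stream=true'.
    Returns None if either line is missing."""
    start = None
    for line in lines:
        if start is None:
            if "Creating connection to cluster" in line:
                start = parse_timestamp(line)
        elif "upstream connection received 0 bytes, end_stream=true" in line:
            return (parse_timestamp(line) - start).total_seconds()
    return None
```

For connection C197 above, the two relevant lines are stamped 09:35:58.512 and 09:36:13.518, so the upstream side ended the stream roughly 15 seconds after the connection was created, without any payload bytes being logged in between.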

dhilgarth added the bug label Jul 1, 2023

dhilgarth (Author) commented

Recreated in the proper repo: aws/aws-app-mesh-roadmap#468
