running Galene behind haproxy ? #45

Closed
nlienard opened this issue Jan 22, 2021 · 18 comments

Comments

@nlienard

nlienard commented Jan 22, 2021

This question is about haproxy.

Galene is installed successfully, but users can't see each other in a room.

I only see my own face (other users also see only their own face).

Configuration
Galene runs in a container with only a private IP (192.168.10.111).
The public IP is on another container dedicated to haproxy (192.168.10.252 plus the public IP), which sends the 443 traffic to the Galene container on port 8443.
There are also some iptables redirects for 1194/UDP and UDP ports 10000-65535 from the haproxy container to the Galene container.

When starting Galene:

# ./galene -turn PUBLIC_IP:1194
2021/01/22 11:51:01 Starting built-in TURN server
2021/01/22 11:51:21 Relay test failed: timeout
2021/01/22 11:51:21 Perhaps you didn't configure a TURN server?
2021/01/22 11:53:31 client: read tcp 192.168.10.111:8443->192.168.10.252:35850: read: connection reset by peer

As we can see, the log shows the haproxy IP (192.168.10.252) instead of the user's IP (a missing X-Forwarded-For?).

Does Galene support haproxy?

Thanks

@nlienard nlienard changed the title from "running galiene behind haproxy ?" to "running Galene behind haproxy ?" Jan 22, 2021
@jech
Owner

jech commented Jan 22, 2021

It looks like haproxy is breaking the WebSocket connection (that's what the "client:...connection reset by peer" message implies). Please check the haproxy documentation, and make sure that haproxy is configured to proxy WebSocket connections to /ws with a timeout of at least 60 seconds (and preferably more, since Galène will time out idle WebSocket connections on its own).
Please summarise your configuration here once you get it working.
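
A minimal sketch of what such a haproxy setup might look like, assuming an HTTP-mode frontend shared with other services and routing by host name. The section names (fe_https, be_galene), the host name galene.example.org and the certificate path are placeholders, not values from this thread; the backend address is the private IP and port mentioned in the report above. Once a connection has been upgraded to WebSocket, haproxy applies `timeout tunnel` in place of the client and server timeouts, so that is the value to keep well above 60 seconds.

```haproxy
frontend fe_https
    # Placeholder bind line and certificate path; adjust to the real setup.
    bind :443 ssl crt /etc/haproxy/example.pem
    mode http
    timeout client 90s
    # Placeholder host name: send all Galene traffic, including the
    # WebSocket endpoint at /ws, to the Galene backend.
    acl is_galene hdr(host) -i galene.example.org
    use_backend be_galene if is_galene

backend be_galene
    mode http
    timeout connect 5s
    timeout server 90s
    # Idle timeout for upgraded (WebSocket) connections, both directions.
    timeout tunnel 1h
    # "ssl verify none" assumes Galene itself serves HTTPS with a
    # certificate that haproxy is not asked to validate.
    server galene 192.168.10.111:8443 ssl verify none
```

In `mode http`, haproxy handles the WebSocket upgrade itself, so switching the shared frontend to `mode tcp` should not be required for this to work.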

@nlienard
Author

In the end I assigned a public IP directly to the container, and it works.
But on a phone, I get a wss error.
Anything to do for mobile support?

@jech
Owner

jech commented Jan 22, 2021

> Anything to do for mobile support?

Are you using a TLS certificate that's signed by an authority recognised by the mobile?

@nlienard
Author

In the end, it works on mobile. I'm using a Let's Encrypt certificate, but I was migrating DNS to the dedicated IP that I put on the container.
While waiting for DNS propagation, I was testing directly with the public IP (https://IP), and the error was indeed due to the SSL certificate not matching.

Once DNS was OK, everything worked! Thanks!

@nlienard
Author

As for haproxy, if you have any clues about the configuration, I would be interested, because I would prefer to have Galene behind it. Anyway, great work: it is so simple to set up compared to BBB (I tried it just before and it was just insane).

@jech
Owner

jech commented Jan 23, 2021

Glad you've solved your immediate problem. Please do summarise your findings when you manage to get Galène to work behind haproxy.

@jech jech closed this as completed Jan 23, 2021
@nlienard
Author

nlienard commented Jan 25, 2021

I did some new tests with haproxy. For now it seems to work, but it automatically disconnects users who have been idle for too long.

On the frontend, I added:

    # Galene START OPTIONS
    timeout connect 0ms
    timeout client 0ms
    timeout server 0ms
    option  http-server-close
    # Galene END OPTIONS

I'm still using "mode http" because this frontend is shared with many other services, but I guess "mode tcp" would be better for WebSocket.

@jech
Owner

jech commented Jan 25, 2021 via email

@nlienard
Author

I set the timeout to 90 seconds, but I still get these errors when a user is disconnected abruptly:

Jan 26 09:26:30 atxovh-vis500 galene[16489]: 2021/01/26 09:26:30 PushConn: client is dead
Jan 26 09:26:30 atxovh-vis500 galene[16489]: 2021/01/26 09:26:30 client: read tcp 192.168.10.112:443->192.168.10.252:34536: read: connection reset by peer

@nlienard
Author

nlienard commented Jan 26, 2021

[image]

This is the message shown in the browser when it kicked me out.

At the same time, Galene logs this:

Jan 26 09:45:04 atxovh-vis500 galene[16489]: 2021/01/26 09:45:04 client: read tcp 192.168.10.112:443->192.168.10.252:51796: read: connection reset by peer

In tcpdump it looks like this:

10:09:05.761661 IP (tos 0x0, ttl 64, id 2427, offset 0, flags [DF], proto TCP (6), length 52)
    atxovh-ha002.57126 > atxovh-vis500.https: Flags [R.], cksum 0x96e3 (incorrect -> 0xcd84), seq 1523, ack 1105, win 501, options [nop,nop,TS val 1794802304 ecr 194367539], length 0

@nlienard
Author

I was focused on changing the FRONTEND timeouts, but it turns out there are also timeouts on the BACKEND side.
After overriding the defaults there, it no longer seems to disconnect.

In the backend:

timeout connect 600s
timeout server 600s
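
One likely explanation, going by haproxy's documented behaviour: `timeout connect` and `timeout server` only take effect in `defaults`, `listen` and `backend` sections, so copies placed in a `frontend` are ignored (haproxy prints a warning for them at startup). The sketch below shows where each timeout belongs; the section names are placeholders, the 600s values are the ones used in this thread, and the `timeout tunnel` line is an addition assumed here, not something from the original configuration.

```haproxy
frontend fe_example
    mode http
    # Client-side inactivity timeout: valid in a frontend.
    timeout client 600s

backend be_example
    mode http
    # Server-side timeouts: only honoured in defaults, listen or backend.
    timeout connect 600s
    timeout server 600s
    # Assumed addition: explicit idle timeout for upgraded WebSocket
    # connections, which replaces the client and server timeouts once the
    # connection has become a tunnel.
    timeout tunnel 600s
```

Running `haproxy -c -f /etc/haproxy/haproxy.cfg` (adjust the path to the real configuration file) checks the configuration and reports directives that will be ignored.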

@nlienard
Author

nlienard commented Jan 26, 2021

Now that haproxy is working, I turned off the STUN server JSON in order to use only the internal one, but it is not working:

Jan 26 12:25:42 atxovh-vis500 galene[18949]: 2021/01/26 12:25:42 Perhaps you didn't configure a TURN server?
Jan 26 12:26:02 atxovh-vis500 galene[18949]: 2021/01/26 12:26:02 SetRemoteDescription(offer): ICE Agent can not be restarted when gathering
Jan 26 12:26:02 atxovh-vis500 galene[18949]: 2021/01/26 12:26:02 Deleting unknown down connection
Jan 26 12:26:08 atxovh-vis500 galene[18949]: 2021/01/26 12:26:08 SetRemoteDescription(offer): ICE Agent can not be restarted when gathering
Jan 26 12:26:08 atxovh-vis500 galene[18949]: 2021/01/26 12:26:08 Deleting unknown down connection
Jan 26 12:26:14 atxovh-vis500 galene[18949]: 2021/01/26 12:26:14 Deleting unknown down connection
Jan 26 12:26:32 atxovh-vis500 galene[18949]: 2021/01/26 12:26:32 SetRemoteDescription(offer): ICE Agent can not be restarted when gathering
Jan 26 12:26:32 atxovh-vis500 galene[18949]: 2021/01/26 12:26:32 Deleting unknown down connection
Jan 26 12:26:48 atxovh-vis500 galene[18949]: 2021/01/26 12:26:48 SetRemoteDescription(offer): ICE Agent can not be restarted when gathering
Jan 26 12:26:48 atxovh-vis500 galene[18949]: 2021/01/26 12:26:48 Deleting unknown down connection
Jan 26 12:27:02 atxovh-vis500 galene[18949]: 2021/01/26 12:27:02 SetRemoteDescription(offer): ICE Agent can not be restarted when gathering
Jan 26 12:27:02 atxovh-vis500 galene[18949]: 2021/01/26 12:27:02 Deleting unknown down connection

On the haproxy container, I have iptables rules that redirect port 1195 (TCP and UDP) to the Galene container.

@nlienard
Author

OK, I had asymmetric routing; now it works!

@jech
Owner

jech commented Jan 26, 2021 via email

@nlienard
Author

nlienard commented Jan 26, 2021

The default gateway of the Galene container was wrong, with the side effect that its outgoing traffic reached the internet from a different public IP than the one configured for incoming traffic (the haproxy container has multiple public IPs).
Now that the network configuration is correct, everything is working fine!

HAPROXY : OK
INTERNAL STUN SERVER: OK

IPTABLES (ON HAPROXY CONTAINER)

-A PREROUTING -d A.B.C.D/32 -p tcp -m tcp --dport 1195 -m comment --comment "//visio.xxxx.net" -j DNAT --to-destination 192.168.10.112:1195
-A PREROUTING -d A.B.C.D/32 -p udp -m udp --dport 1195 -m comment --comment "//visio.xxxx.net" -j DNAT --to-destination 192.168.10.112:1195

HAPROXY

frontend frontend_atx_http
        bind A.B.C.D:80 name A.B.C.D:80 ecdhe secp384r1
        bind A.B.C.D:443 name A.B.C.D:443 ssl crt-list /etc/haproxy/ovh1.crt
        bind 192.168.10.252:80 name 192.168.10.252:80 ecdhe secp384r1
        bind 192.168.10.252:443 name 192.168.10.252:443 ssl crt-list /etc/haproxy/ovh1.crt
        mode                    http
        log                     global
        option                  http-keep-alive
        option                  forwardfor

        # Galene START OPTIONS
        timeout connect 600s
        timeout client 600s
        timeout server 600s
        timeout http-keep-alive 600s
        option http-keep-alive
        option http-pretend-keepalive
        option http-server-close
        # Galene END OPTIONS

BACKEND

# atx_visio_stg
backend atx_visio_stg
  mode http
  log                   global
  # option httpchk
  # Galene Timeout
  timeout connect       600s
  timeout server        600s
  retries               3
  server atxovh-vis500 192.168.10.112:443 check ssl verify none inter 5s

GALENE CONFIG

root@atxovh-vis500:/data/galene/groups# cat /etc/systemd/system/galene.service
 [Unit]
    Description=Galene
    After=network.target

 [Service]
    Type=simple
    WorkingDirectory=/data/galene
    User=galene
    Group=galene
    ExecStart=/data/galene/galene -turn A.B.C.D:1195 -http 192.168.10.112:443
    LimitNOFILE=65536
    AmbientCapabilities=CAP_NET_BIND_SERVICE

 [Install]
    WantedBy=multi-user.target

@jech
Owner

jech commented Jan 26, 2021 via email

@nlienard
Author

I still get some errors like this:

Jan 26 14:16:28 atxovh-vis001 galene[558]: turn ERROR: 2021/01/26 14:16:28 error when handling datagram: failed to handle Allocate-request from 82.64.236.146:42213: relay alr
for 5-TUPLE
Jan 26 14:16:29 atxovh-vis001 galene[558]: turn ERROR: 2021/01/26 14:16:29 error when handling datagram: failed to handle Allocate-request from 82.64.236.146:42213: relay alr
for 5-TUPLE
Jan 26 14:16:30 atxovh-vis001 galene[558]: turn ERROR: 2021/01/26 14:16:30 error when handling datagram: failed to handle Allocate-request from 82.64.236.146:42213: relay alr
for 5-TUPLE
Jan 26 14:16:30 atxovh-vis001 galene[558]: turn ERROR: 2021/01/26 14:16:30 error when handling datagram: failed to handle Allocate-request from 82.64.236.146:40939: relay alr
for 5-TUPLE
Jan 26 14:16:30 atxovh-vis001 galene[558]: turn ERROR: 2021/01/26 14:16:30 error when handling datagram: failed to handle Allocate-request from 82.64.236.146:44369: relay alr
for 5-TUPLE

Could it be because I'm using two devices on my wifi network?

Thanks

@jech
Owner

jech commented Jan 26, 2021

This is probably nothing to worry about. Please see pion/turn#197.
