WIP: Swarm services #186

Open. Wants to merge 7 commits into master.

Conversation

@ehazlett (Owner) commented Jul 9, 2016

This adds support for Docker 1.12 services. There is an example doc showing how it works. All container labels that were used to configure Interlock should be supported using Service labels.
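
For example, a minimal sketch of a service using those labels (reusing the interlock.hostname/interlock.domain labels that appear later in this thread, not copied from the example doc):

# sketch: the labels formerly set on containers, applied to a service instead
docker service create \
    --name demo \
    --publish 8080 \
    --label interlock.hostname=demo \
    --label interlock.domain=local \
    ehazlett/docker-demo:latest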

This also switches from the dockerclient Docker Go lib to the official docker/engine-api client. This adds a few fixes and should improve stability.

This also pulls in the InfluxDB backend for the beacon extension.

Closes #178

@tpbowden commented Jul 9, 2016

As this runs globally and uses docker.sock, how do you handle worker nodes which can't give information about swarm services? Does it need to be restricted to managers only?

@curtismitchell

@tpbowden I imagine the config.toml file could still use the remote API via TCP, as seen here: https://github.com/ehazlett/interlock/blob/master/docs/examples/nginx-swarm/config.toml. Would that handle your concern?
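
A minimal sketch of what that could look like, reusing the config keys from the examples later in this thread (the manager address is a placeholder, and the TLS certificate settings are omitted):

# sketch: point Interlock at a manager's TCP endpoint instead of the local socket
INTERLOCK_CONFIG='ListenAddr = ":8080"
DockerURL = "tcp://<manager-ip>:2376"
PollInterval = "2s"
'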

@ehazlett (Owner, author) commented Jul 9, 2016

Yes, or a proxy. I have a proxy that works and comes with TLS that could be used on the workers.


@ehazlett (Owner, author) commented Jul 9, 2016

I have been thinking of building this into Interlock and running it as a global service to work with the worker nodes.


@curtismitchell

Interesting idea - how is access to the proxy secured?

@ehazlett (Owner, author) commented Jul 9, 2016

TLS


@curtismitchell

Whoops, let me clarify. I understand it uses TLS to talk to the remote API. Does it also use TLS between the Docker client and itself? And does it require any authentication?

@tpbowden commented Jul 9, 2016

@curtismitchell How would config.toml work with both docker.sock and TLS if it's a global environment variable? Surely putting certs/keys with access to the master nodes onto every worker is a security issue, and also a fair amount of effort?

@ehazlett have you got an example of this proxy working? It sounds like an interesting idea; I'm not sure how it would work security-wise.

@curtismitchell

@tpbowden it would only need to use TLS on the swarm manager.

@curtismitchell

@ehazlett the docs do not seem to show the nginx.conf file being shared between the nginx service and the Interlock service. Is sharing it (putting it on a named volume) necessary for this to work?

@tpbowden sorry, it looks like I was mistaken in my previous comment. Per the docs, Interlock should only need access to the Docker socket on the swarm manager. I'm testing this out now.

@ehazlett (Owner, author)

@curtismitchell No, there is no need to share the config; Interlock will configure each instance.

@curtismitchell

@ehazlett Thanks for the speedy reply. It works! It just took a little longer than I expected based on the 2s polling interval.

@curtismitchell

@ehazlett It's inconsistent. I followed the steps in your documentation with one exception: I used the dockercloud/hello-world image to test it out instead of your image. Furthermore, I'm using a 3 node cluster with one manager.

With nginx and interlock running as global services and the hello-world image running as a non-global service with 1 replica (or whatever the default is), my request to http://demo.local/ did not work right away. I waited a few minutes, then posted my previous question about the use of a named volume.

It took some time (minutes maybe? I left the house and came back.) before I was able to get the hello world message on the screen with a request to http://demo.local/. That's when I posted my last comment.

Within minutes, it stopped working again. I hadn't made any changes. I scaled the hello service up to 5 replicas in the hope that Docker would spread the additional tasks across all the nodes. It did. I now have Refresh Monkey, the Chrome extension, refreshing http://demo.local every 30s. The response is either "hello world" or "Welcome to nginx!", and it appears to be random.

BTW, is this the right place for this feedback? Since it doesn't appear that this PR has been merged yet, I didn't know where else to offer these observations.

@ehazlett (Owner, author)

Yes, there is currently a limitation in that you can only run Interlock on managers or you will run into this issue. For the interim, you can either pin Interlock to the manager (no global service) or run all nodes as managers. I'm working on a fix.
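
A sketch of that interim workaround, mirroring the service command used further down in this thread (assumes INTERLOCK_CONFIG is defined as in that example):

# sketch: pin Interlock to manager nodes instead of running it as a global service
docker service create \
    --name interlock \
    --constraint 'node.role == manager' \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    --env INTERLOCK_CONFIG="$INTERLOCK_CONFIG" \
    ehazlett/interlock:swarm-services -D run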


@ehazlett (Owner, author)

Yes, this is the right place for feedback. This will be the PR that adds Swarm mode support.

If you just want to chat for help or to debug, you can ping me on IRC. Maybe I'll set up a channel for this. Thoughts?


@curtismitchell

Oh! Again, I misunderstood something that was mentioned earlier. Thanks for the explanation.

@ehazlett (Owner, author)

No problem :)


@curtismitchell

I think IRC or gitter.im would be great if there were more participants. So far it has only been the three of us, and I think my questions are answered for now.

@virtuman

I was actually unable to get a hello world at all, but I did get the default "Welcome to nginx" page at some point.

My thoughts were that the nginx-service steps guide included the flag writable=true, and my swarm manager complained about this flag for the --mount options. I looked through Docker's source and found that there isn't really a writable attribute; instead a readonly attribute is available in the mount options, so I tried readonly=false, but it still didn't work. I'm assuming that since the socket is probably not writable by default, my nginx instances never get notified that a new service joined and all the nginx configs look like defaults. Is there something I need to do to be able to use that writable=true flag, or is the issue something different altogether?

Also, when a new service joins, does the container running the proxy extension (haproxy or nginx) actually get restarted? That's what it looked like when I was testing the non-services option. Is there a reason why an nginx or haproxy reload is not used, i.e. nginx -s reload? I thought it was odd that a restart was required; it seems like it would drop all active connections and throw a not available/not found error if someone was waiting for a response from the service on open connections.

So far I was completely unable to get the services version working, with one exception. I did the following:

  • started interlock, nginx and the demo container on the same node, publishing port 82 on nginx and updating config.toml to reflect that port
  • modified /etc/hosts to point the test host name at this specific instance's IP address
  • was able to connect to the test domain locally (although it displayed "Welcome to nginx" instead of hello world from the demo container that was running)
  • was never able to connect from a remote connection; Docker never exposed the public port, and netstat -tulpn | grep 82 shows that port 82 is bound to IPv6 only

Since it works for somebody but is not working for me, my guess is that I'm not doing something right, and any pointers are very much appreciated.

Using CentOS 7, Docker 1.12-rc4, 2x NICs (one for the local server network and one for the public internet).

@superseb

It's kind of working for me: if I set up nginx + interlock + the demo app and have 1 task running, only the nginx instance on the node with the task will reload nginx and update nginx.conf; the other one will return a 503 if I hit it by reloading the page.

Review comment by Luc Vieillescazes on docs/examples/nginx-services/README.md:

docker service create \
--mode global \
--name interlock \
--mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock,writable=true \

According to moby/moby#24053, writable=true is the default behavior and the flag has been removed.
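
If that is right, a sketch of the corrected command would simply drop the flag (bind mounts are read-write by default):

# sketch: same mount without the removed writable flag
docker service create \
    --mode global \
    --name interlock \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    ehazlett/interlock:swarm-services -D run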

@ehazlett (Owner, author)

Thanks. I'll re-vendor and update.


@Richard-Mathie (Contributor)

So I got this working to a certain extent in 1.12.1 on a cluster of nodes.

hosts setup:

for node in master1 node1 node2; do docker-machine create $node; done
eval $(docker-machine env master1)
MANAGER_IP=$(docker-machine ip master1)
SWARM_PORT=2377
docker swarm init --advertise-addr $MANAGER_IP
WORKER_TOKEN=$(docker $(docker-machine config master1) swarm join-token -q worker)
for node in node1 node2; do
    docker $(docker-machine config $node) swarm join --token $WORKER_TOKEN $MANAGER_IP:$SWARM_PORT
done

service setup:

docker service create \
    --name nginx \
    --mode global \
    --constraint 'node.role == manager' \
    --publish "80:80" \
    --label 'interlock.ext.name=nginx' \
    nginx \
    nginx -g "daemon off;" -c /etc/nginx/nginx.conf

INTERLOCK_CONFIG='ListenAddr = ":8080"
DockerURL = "unix:///var/run/docker.sock"
PollInterval = "2s"

[[Extensions]]
Name = "nginx"
ConfigPath = "/etc/nginx/nginx.conf"
PidPath = "/var/run/nginx.pid"
TemplatePath = ""
MaxConn = 1024
Port = 80
'

docker service create \
    --name interlock \
    --mode global \
    --constraint 'node.role == manager' \
    --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
    --env INTERLOCK_CONFIG="$INTERLOCK_CONFIG" \
    ehazlett/interlock:swarm-services -D run

docker service create \
    --name demo \
    --publish 8080 \
    --replicas 10 \
    --env SHOW_VERSION=1 \
    --label interlock.hostname=demo \
    --label interlock.domain=local \
    ehazlett/docker-demo:latest

Then add demo.local to /etc/hosts:

echo $MANAGER_IP demo.local | sudo tee -a /etc/hosts

However

We only ever see one container on the demo app. This could be to do with how the Docker guys have changed host discovery in 1.12.1.

I notice that the upstream server listed in /etc/nginx/nginx.conf is the ingress-endpoint of the ingress network for the host on which the interlock container resides. Curling this from inside a container on the network indeed only pings one of the containers:

root@0ae4955447f8:/# curl 10.255.0.3:30000/ping                                                                                                                                                                                              
{"instance":"893624ec1f64","version":"0.1"}
root@0ae4955447f8:/# curl 10.255.0.3:30000/ping
{"instance":"893624ec1f64","version":"0.1"}
root@0ae4955447f8:/#

Perhaps load balancing is not happening the way it used to in 1.12. Also, a new feature has been added to the DNS for service discovery, so you can do:

root@0ae4955447f8:/# nslookup tasks.demo
Server:     127.0.0.11
Address:    127.0.0.11#53

Non-authoritative answer:
Name:   tasks.demo
Address: 10.255.0.45
Name:   tasks.demo
Address: 10.255.0.38
.....
Name:   tasks.demo
Address: 10.255.0.41
Name:   tasks.demo
Address: 10.255.0.42

I guess this would ideally be translated into the contents of the nginx config file.

Incidentally, should we be putting the config in /etc/nginx/conf.d/interlock instead of overwriting the main config file? We could even create a config for each service....
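
A rough sketch of that per-service idea, assuming Interlock wrote one file per service under conf.d and resolved backends via the tasks.<service> DNS name mentioned above (the file layout and include line are hypothetical, not current behavior):

# hypothetical layout: one generated vhost file per service, resolved via swarm DNS
mkdir -p /etc/nginx/conf.d/interlock
cat > /etc/nginx/conf.d/interlock/demo.conf <<'EOF'
upstream demo_local {
    server tasks.demo:8080;
}
server {
    listen 80;
    server_name demo.local;
    location / {
        proxy_pass http://demo_local;
    }
}
EOF
# nginx.conf would then only need: include /etc/nginx/conf.d/interlock/*.conf;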

@Richard-Mathie (Contributor)

Strange: after a machine restart and building the cluster from scratch, this now works as expected...

oh well.

It's a bit annoying that Interlock and nginx have to reside on the same node and that the node has to be a manager, as this means the publicly facing nodes end up being managers and you are limited to as many nginx containers as you have managers. Am I right in thinking this is true?

@Richard-Mathie (Contributor)

Hmm, OK, this does seem to break on killing or scaling the service down; we then get intermittent failures, as it looks like Docker is keeping hold of the stale IPs for the service.

After scaling with docker service scale demo=5:

:/# curl 10.255.0.3:30000/ping
^C
:/# curl 10.255.0.3:30000/ping
^C
:/# curl 10.255.0.3:30000/ping
{"instance":"668505eeae8c","version":"0.1"}

The DNS records still have 10 entries:

:/# nslookup tasks.demo_ui | grep Address:
Address:    127.0.0.11#53
Address: 10.255.0.52
Address: 10.255.0.47
Address: 10.255.0.51
Address: 10.255.0.49
Address: 10.255.0.53
Address: 10.255.0.54
Address: 10.255.0.48
Address: 10.255.0.55
Address: 10.255.0.50

and I presume this is why the connection to the ingress endpoint is intermittent, as Docker is dialling into those stale IPs.

Related: moby/moby#25130, moby/moby#25219, moby/moby#23855

Also, it looks like Interlock doesn't work with other overlay networks.

@ehazlett (Owner, author)

Yes, this branch is still WIP. Obviously you don't want it just on managers, as said in other comments. There are some networking fixes in 1.12.1 that may help. I'll do some more testing. Thanks!


@ehazlett changed the title from "Swarm services" to "WIP: Swarm services" on Aug 25, 2016
@Richard-Mathie (Contributor)

seems like moby/moby#25962 fixes the scaling problem.

@vincentlepot

Thanks for this work, really interesting. But is there a way to avoid exposing ports for the other services, since I don't necessarily want them to be accessible? (I know I can block them with iptables, but it would be even easier not to expose them.)

My suggestion, taking your example: instead of defining server X.Y.Z.T:30000 as the upstream backend, why not use demo:8080?

This would maybe need some declaration on the docker service create command with a label, for instance interlock.port=8080.

The command declaring the demo service would be something like:

docker service create \
    --name demo \
    --env SHOW_VERSION=1 \
    --label interlock.hostname=demo \
    --label interlock.domain=local \
    --label interlock.port=8080 \
    ehazlett/docker-demo:latest

What do you think?

@pascalandy

I'm ready to test this as well for Swarm 1.12+. Can't wait to make it happen :)

@vincentlepot commented Oct 5, 2016

@ehazlett I posted PR #204 to describe more precisely what I have in mind, if you want to test.

Edit: this will only work when using Docker networks, of course.

@ehazlett (Owner, author) commented Oct 5, 2016

Commented, thanks!


@jcmcote mentioned this pull request Sep 1, 2017

Successfully merging this pull request may close these issues.

Feature request: Integration with the new Docker 1.12 services
9 participants