helm/2517 - Deis on OpenShift #5

Open
Cryptophobia opened this issue Mar 20, 2018 · 22 comments

Comments

@Cryptophobia
Member

From @kingdonb on August 24, 2017 3:38

I tried to launch Workflow on OpenShift and I have followed from helm/helm#2517 where it was explained how Helm can be used on OpenShift

kingdonb$ helm install --tiller-namespace myproject -n deis --namespace myproject workflow/
Error: release deis failed: User "system:serviceaccount:myproject:default" cannot create daemonsets.extensions in project "myproject"

Is this an issue with OpenShift or with Deis? My reading is that there is a resource kind that does not exist on OpenShift yet, that Deis leveraged on a high version of Kubernetes...

Copied from original issue: deis/workflow#856

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 16:49

Here's what I've done:

minishift start
...
oc login -u system:admin
kubectl -n myproject create sa tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=myproject:tiller
helm --tiller-namespace myproject --service-account tiller init
oc policy add-role-to-user admin -z tiller -n myproject
helm install --tiller-namespace myproject -n deis --namespace myproject workflow/

which is the most permissive role I can imagine, still only to get

Error: release deis failed: User "system:serviceaccount:myproject:tiller" cannot create daemonsets.extensions in project "myproject"

I followed this, because I understood that OpenShift role based security has essentially merged with standard kube RBAC, so I guessed that either set of permissions would work. I don't really want to set the most permissive possible though, I'd love to understand why this just doesn't work.

https://gist.github.com/mgoodness/bd887830cd5d483446cc4cd3cb7db09d

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 17:30

I get the same results with https://docs.bitnami.com/kubernetes/how-to/configure-rbac-in-your-kubernetes-cluster/

$ helm install -n deis --namespace myproject workflow/
Error: release deis failed: User "system:serviceaccount:kube-system:tiller" cannot create daemonsets.extensions in project "myproject"

@Cryptophobia
Member Author

From @bacongobbler on August 24, 2017 17:32

Did you grant tiller RBAC permissions to create objects in the "myproject" namespace?

That gist you posted earlier is only relevant if tiller is installed in the kube-system namespace. It looks like from your commands you deployed tiller in the "myproject" namespace.

@Cryptophobia
Member Author

From @bacongobbler on August 24, 2017 17:45

openshift/origin#8242

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 17:56

Oh!

The OpenShift devs helped me out, two things I'm doing wrong...

  1. When a tutorial says to use -z in oc policy, it's a shortcut for system:serviceaccount:[-n]:ARG, but this is new in 3.5, and I had an old oc client (3.4.1.2) hanging around in my path. Why I thought it would be OK to drop the -z I don't know, but in the end I sometimes got errors telling me I couldn't even list the namespace for which I was supposedly granting myself the admin role. The permission had not been applied to the actual serviceaccount properly.

I could see this was the case by looking at oc policy who-can ... but it was not so easy to figure this out in the documentation for OpenShift, unfortunately, and I didn't realize until later I had an old oc binary.

  2. What you just found there... giving the SA an admin role is not enough, because admin role is not allowed to handle daemonsets for some reason. You can give the cluster-admin role though:

So this didn't work: oc adm policy add-cluster-role-to-user cluster-admin tiller -n myproject

but this did work:

oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:myproject:tiller -n myproject
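
For anyone stuck on an oc client that predates -z, the expansion can be written out by hand. This is a minimal sketch that assumes only the system:serviceaccount:NAMESPACE:NAME format quoted above; the helper name sa_user is hypothetical and is not part of oc. It prints the grant instead of running it, so it can be tried without a cluster.

```shell
# sa_user is a hypothetical helper, not an oc subcommand.
# It expands a namespace and service account name into the full user name.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Dry run: print the grant that worked in the comment above.
echo "oc adm policy add-cluster-role-to-user cluster-admin $(sa_user myproject tiller)"
```

Comparing the printed name against the output of oc policy who-can is one way to confirm the binding landed on the service account rather than on a regular user called tiller.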

@Cryptophobia
Member Author

From @bacongobbler on August 24, 2017 18:01

Did you make sure to install Workflow with RBAC permissions? You'll need to add --set global.use_rbac=true in your helm install invocation to install Workflow with RBAC permissions. See #812.
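
Combined with the install command used earlier in this thread, that would look roughly like the following. This is a sketch only: the namespace, release name, and chart path are the ones from the earlier comments, the HELM_ARGS variable is just for illustration, and the command is echoed rather than executed.

```shell
# Flags copied from the earlier install attempt, plus the RBAC switch.
HELM_ARGS="install --tiller-namespace myproject -n deis --namespace myproject --set global.use_rbac=true workflow/"

# Dry run: print the command; drop the echo to run it for real.
echo helm $HELM_ARGS
```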

@Cryptophobia
Member Author

From @bacongobbler on August 24, 2017 18:02

Not sure about the other issues. We don't officially support Workflow on OpenShift, but please continue to post here so others can learn.

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 18:02

You have been so helpful. Thank you! I did not do that.

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 18:07

This big "MISSING" at the bottom looks kind of ominous... those are my RBAC roles, and I'm guessing that Missing means they're using a resource kind that I don't have on my cluster. Maybe I would have better luck with a newer OpenShift version...

$ helm install --tiller-namespace myproject -n deis --namespace myproject workflow/
NAME:   deis
LAST DEPLOYED: Thu Aug 24 14:03:09 2017
NAMESPACE: myproject
STATUS: DEPLOYED

RESOURCES:
==> v1/Secret
NAME                   TYPE    DATA  AGE
minio-user             Opaque  2     12s
deis-router-dhparam    Opaque  1     12s
objectstorage-keyfile  Opaque  2     12s

==> v1/ConfigMap
NAME                  DATA  AGE
dockerbuilder-config  2     12s
slugbuilder-config    2     12s
slugrunner-config     1     12s

==> v1/ServiceAccount
NAME                   SECRETS  AGE
deis-builder           2        12s
deis-controller        2        12s
deis-database          2        12s
deis-logger-fluentd    2        12s
deis-logger            2        12s
deis-minio             2        12s
deis-monitor-telegraf  2        12s
deis-nsqd              2        12s
deis-registry          2        12s
deis-router            2        12s
deis-workflow-manager  2        12s

==> v1/Service
NAME                    CLUSTER-IP      EXTERNAL-IP                    PORT(S)                                                   AGE
deis-builder            172.30.114.189  <none>                         2222/TCP                                                  11s
deis-controller         172.30.141.36   <none>                         80/TCP                                                    11s
deis-database           172.30.152.48   <none>                         5432/TCP                                                  11s
deis-logger             172.30.117.93   <none>                         80/TCP                                                    11s
deis-minio              172.30.27.13    <none>                         9000/TCP                                                  11s
deis-monitor-grafana    172.30.184.206  <none>                         80/TCP                                                    11s
deis-monitor-influxapi  172.30.160.104  <none>                         80/TCP                                                    11s
deis-monitor-influxui   172.30.71.72    <none>                         80/TCP                                                    11s
deis-nsqd               172.30.41.90    <none>                         4151/TCP,4150/TCP                                         10s
deis-logger-redis       172.30.182.56   <none>                         6379/TCP                                                  10s
deis-registry           172.30.100.247  <none>                         80/TCP                                                    10s
deis-router             172.30.141.156  172.29.232.160,172.29.232.160  80:31300/TCP,443:30075/TCP,2222:30158/TCP,9090:32519/TCP  10s
deis-workflow-manager   172.30.208.185  <none>                         80/TCP                                                    10s

==> v1beta1/DaemonSet
NAME                   DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE-SELECTOR  AGE
deis-logger-fluentd    0        0        0      0           0          <none>         10s
deis-monitor-telegraf  0        0        0      0           0          <none>         10s
deis-registry-proxy    0        0        0      0           0          <none>         10s

==> v1beta1/Deployment
NAME                   DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
deis-builder           1        1        1           0          10s
deis-controller        1        0        0           0          10s
deis-database          1        1        1           0          10s
deis-logger            1        1        1           0          9s
deis-minio             1        1        1           0          9s
deis-monitor-grafana   1        1        1           1          9s
deis-monitor-influxdb  1        1        1           0          9s
deis-nsqd              1        1        1           0          9s
deis-logger-redis      1        1        1           0          9s
deis-registry          1        1        1           0          9s
deis-router            1        1        1           0          8s
deis-workflow-manager  1        1        1           1          8s

==> MISSING
KIND                   NAME
clusterroles           deis:deis-builder
clusterroles           deis:deis-controller
clusterroles           deis:deis-logger-fluentd
clusterroles           deis:deis-router
clusterrolebindings    deis:deis-builder
clusterrolebindings    deis:deis-controller
clusterrolebindings    deis:deis-logger-fluentd
clusterrolebindings    deis:deis-router
roles                  deis-builder
roles                  deis-monitor-telegraf
roles                  deis-router
rolebindings           deis-builder
rolebindings           deis-monitor-telegraf
rolebindings           deis-router

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 18:37

... It is true, the OpenShift 3.4 security model had not yet started to merge with K8S RBAC because K8S was not fully baked when 3.4 was developed and released. That doesn't seem to be my problem anymore though.

In 3.5 and 3.6, this should work fine. I'll test it out and add notes when I find out. If someone needs to use 3.4, there is probably a way to apply the permissions to the service accounts, but in my case it looks like the only reason I'm having problems is that I have an old oc OpenShift client binary.

@Cryptophobia
Member Author

From @kingdonb on August 24, 2017 19:00

Almost... there is a bridge for roles (OpenShift has its own role kinds that predate Kubernetes formal implementation of RBAC), but the connector goes the wrong way for what I'm trying to do:

in 3.6, we go openshift -> rbac, *not* rbac to openshift. If you create an RBAC object, it will be stomped away

So if you have OpenShift roles implemented, and you want to access them as Kubernetes roles, you can do it, but if you try the reverse you will know it didn't work because helm very helpfully tells you the state of these resources you created after install is MISSING. It means just what it says.
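
To check for this without scrolling through the whole report, the MISSING section can be filtered out of the text. A rough sketch, assuming the output keeps the "==> MISSING" header and two-column KIND/NAME layout shown in the install output earlier in this thread; list_missing is a hypothetical helper name.

```shell
# list_missing reads helm's resource report on stdin and prints the
# KIND/NAME rows that follow the "==> MISSING" header as kind/name.
list_missing() {
  awk '
    /^==> MISSING/ { in_missing = 1; next }   # start of the MISSING section
    /^==> /        { in_missing = 0 }         # any other section header ends it
    in_missing && NF == 2 && $1 != "KIND" { print $1 "/" $2 }
  '
}
```

Usage would be something like helm status deis --tiller-namespace myproject | list_missing; an empty result suggests none of the release's resources were stomped away.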

There is supposedly a raw and fairly untested tool for doing the conversion of those resources and we are going to try it out against the output of helm --debug

@Cryptophobia
Member Author

From @kingdonb on August 25, 2017 12:54

these roles/bindings:
https://gist.github.com/enj/078cc916365a7c3db6da581ea4485bd6

and some combination of these scc settings:

# (for tiller in kube-system namespace, cluster-admin because daemonsets)
oc policy add-role-to-user edit system:serviceaccount:kube-system:default
oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:kube-system:default

# for your deis roles, you don't need to use system:authenticated
# but I was having trouble making sure my policies were actually
# getting applied where I thought they were
oc adm policy add-scc-to-group anyuid system:authenticated
oc adm policy add-scc-to-group privileged system:serviceaccounts:NAMESPACE

will get you most of the way there, in the worst possible way... this still didn't quite work for me, most of the containers came up, but a few still didn't. It's a bit mysterious, but I'll try again today.

The controller, for instance, was complaining about not being able to create a directory /app/data, but why wasn't clear. (The anyuid policy should have fixed that; after adding that setting I was able to start pods and see them come up as root via kubectl. Another security feature of OpenShift is that containers are not allowed to start as root unless you specifically say they are allowed to.)

I'm saying this is the worst possible way because it's basically turning off all of the security, but I'll work on tightening it up for the PR once I figure out what's still missing!

@Cryptophobia
Member Author

From @kingdonb on August 25, 2017 13:31

Hm. Today on a clean minishift, it works! I believe with no more information than I already had posted here. Waiting for Builder to come up...

NAME                                     READY     STATUS    RESTARTS   AGE
deis-builder-904459970-rwms8             0/1       Running   1          2m
deis-controller-198627737-jqq9m          1/1       Running   3          2m
deis-database-4120584427-kp6l8           1/1       Running   0          2m
deis-logger-2717637750-l9g60             1/1       Running   2          2m
deis-logger-fluentd-zcz1d                1/1       Running   0          2m
deis-logger-redis-1281502559-srbsz       1/1       Running   0          2m
deis-minio-2522384958-kc60z              1/1       Running   0          2m
deis-monitor-grafana-1204143830-q77cx    1/1       Running   0          2m
deis-monitor-influxdb-2842921881-91s7x   1/1       Running   0          2m
deis-monitor-telegraf-dt6w3              1/1       Running   0          2m
deis-nsqd-1225249577-v1lf1               1/1       Running   0          2m
deis-registry-3907297837-msgxm           1/1       Running   2          2m
deis-registry-proxy-74ljd                1/1       Running   0          2m
deis-router-3453227723-x352n             1/1       Running   0          2m
deis-workflow-manager-3757950027-s2d00   1/1       Running   0          2m
$ deis register deis.deis:32423
username: yebyen
password:
password (confirm):
email: [email protected]
Registered yebyen
Logged in as yebyen
Configuration file written to /Users/kingdonb/.deis/client.json

... there are lots of things to work out after that, but it looks like I can use Deis Workflow on MiniShift now! That's great news, I love interchangeable parts.

@Cryptophobia
Member Author

From @kingdonb on August 29, 2017 1:58

I have now deployed Workflow on OpenShift Origin as well as Minishift, and tackled some of the additional challenges that entails.

I think mainly, the struggle came because this time I did not grant any privileges to ambiguous any-user 'system:authenticated' and made it a point to trace down issues individually through each of the logs, so I would know what permissions I was granting, and what wouldn't work until they were granted. (Most things just worked on minishift as I granted the more permissive settings broadly.)

They were all some subset of these serviceaccounts that required additional 'scc' permissions, which is another OpenShift thing.

... bad example deleted

My search history seems to indicate I also needed

setenforce 0

at some point, or for some reason.

These pods all needed cluster-admin I think mostly because on OpenShift, you can only enumerate namespaces if you have cluster-admin. (Edit: Perhaps there is a different endpoint that users can query to get all of their own namespaces, or projects as they're sometimes called in OpenShift)

I'm not sure how we can provide CI validation for this for a release.

I have cleaned my rough work up further into almost draft shape for a pull request, I am just running the deployment again onto my OpenShift/CentOS VM now to ensure the chart with rewritten roles, and a new value in values.yaml is still working as expected after I have adjusted it to fit within the template.

Maybe a ten-minute YouTube video that starts from vagrant destroy and skipping up to the part where you install docker onto your CentOS 7.3 machine, will suffice for the purposes of validating the release as "Experimental OpenShift Support" with prominent notice marking it as such. Not sure if you will want to merge this at all when it's done, actually I foresee probably not...

I predict a fork that is committed to supporting Deis Workflow on OpenShift and tracking maintenance as new Kube releases are tracked against OpenShift 😁 the one thing I have not yet figured out is what will allow telegraf to collect metrics, as I was able to communicate with InfluxDB from the logged-in grafana, and read the basic InfluxDB stats, and the stats from router; the interface worked but no CPU or Memory statistics were gathered, and I gather that is telegraf's job.

@Cryptophobia
Member Author

From @kingdonb on August 29, 2017 2:22

I packed the changes up into one repo so they can be reviewed at once, but

https://github.com/kingdonb/openshift-deis/commits/openshift-deis-v2.17

I will unpack them into PRs on the affected deis/* repos if there's any interest in merging them back. I'll have to tear down CentOS VM and do from scratch to ensure I got all of the oadm policy things in. Presumably there is also some kind of declarative way I can set those, though maybe not via helm.

I'll also need to touch base with someone else who has helm working on OpenShift realistically to get some kind of approval or review on this branch, since that is also not especially trivial or well documented. But it works! This can deploy Deis Workflow into a scratch project on OpenShift.

I'll try hooking it up to #857 next, since I personally will need both.

@Cryptophobia
Member Author

From @kingdonb on September 16, 2017 17:44

After a few weeks of upstream development, I went in to review this stuff and try porting it forward to the latest releases of everything (minishift, Deis, OpenShift Origin, all have had releases since 19 days ago...)

I have created a new branch for openshift-deis-v2.18 so you can try release-2.17 and release-2.18, but in 2.17 I'm sure you may need to loosen the permissions model because I hadn't worked out most of the details when I put those changes in 3 weeks ago.

It definitely "worked, mostly." I've figured out a little more specifically what privileges are needed in the 2.18 branch.

@Cryptophobia
Member Author

From @kingdonb on September 16, 2017 20:04

Yep. The key is to make sure you get the order right. I've got almost everything coming up now, and I'm closing in on the last of the seemingly new problems I maybe created by upgrading (tl;dr: you'll see if you follow these instructions all the way down to the bottom that they're incomplete, controller and deis-monitor-telegraf both can't seem to come up properly even with all of the permissions I granted and scc rules I thought I relaxed... or did I?)

These instructions "work" on a macOS machine with Homebrew, installing Docker client v1.12.3 via DVM, with Minishift release v1.5.0 (which currently installs OpenShift v3.6.0 by default, which is roughly equivalent to Kubernetes v1.6.1):

First, get the right Docker client version (one that matches the Docker provided by Minishift). This is important for reasons I do not fully understand; perhaps take it up with Minishift:

brew install dvm
# ... follow the instructions to add dvm to your bash profile, restart your shell, and then
dvm install 1.12.3
dvm use 1.12.3

Then, start OpenShift via Minishift and plug Docker into it:

git clone -b openshift-deis-v2.18 https://github.com/kingdonb/openshift-deis workflow
minishift start --vm-driver=virtualbox --memory=4GB --cpus=4
eval $(minishift docker-env)

Create a project for Deis to live in, and get Helm up and running:

oc login -u system:admin
oc new-project deis
oc create sa tiller
oc adm policy add-cluster-role-to-user cluster-admin -z tiller

helm init --service-account tiller --tiller-namespace deis
export TILLER_NAMESPACE=deis

Grant a bunch of additional privileges that Deis needs on OpenShift...

Give cluster-admin role to all of these deployments that have their own ServiceAccounts, since Deis will create namespaces, as well as manage and monitor all of the cluster's namespaces (projects):

oc adm policy add-cluster-role-to-user cluster-admin -z deis-router
oc adm policy add-cluster-role-to-user cluster-admin -z deis-builder
oc adm policy add-cluster-role-to-user cluster-admin -z deis-controller
oc adm policy add-cluster-role-to-user cluster-admin -z deis-logger-fluentd
oc adm policy add-cluster-role-to-user cluster-admin -z deis-workflow-manager
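
The same five grants can be generated from a single list, which makes it easier to trim later when working toward least privilege. A sketch only: DEIS_SAS and grant_cmds are hypothetical names, the service accounts are the ones listed above, and the function prints the commands instead of running them.

```shell
# Service accounts named in the commands above.
DEIS_SAS="deis-router deis-builder deis-controller deis-logger-fluentd deis-workflow-manager"

# grant_cmds prints one cluster-admin grant per service account (dry run).
# Pipe the output to sh, while logged in to the deis project, to apply it.
grant_cmds() {
  for sa in $DEIS_SAS; do
    echo "oc adm policy add-cluster-role-to-user cluster-admin -z $sa"
  done
}

grant_cmds
```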

Many Deis containers currently depend on being able to start with a root UID (and from what I could tell at first glance, some of them don't appear to have their own ServiceAccounts assigned yet):

oc adm policy add-scc-to-group anyuid system:serviceaccounts:deis
oc adm policy add-scc-to-group privileged system:serviceaccounts:deis

At this stage because you haven't created any system:serviceaccounts:deis yet, you should get no report back from these commands... it was at this moment the Buddha got enlightened. (ed; put a pin in this, I will come back later to make it better)

Router wants to be able to punch a hole to the outside network and open a listening port on the host, and OpenShift does not allow this without another scc... the hostnetwork scc:

oc adm policy add-scc-to-user hostnetwork -z deis-router

Controller seems to need to be able to mount a HostPath volume and get access to /var/run/docker.sock in order to do its job... this sounds like something highly inadvisable, and to my knowledge this can only be allowed on the latest OpenShift release if the global scc restricted rules are modified (is this permissions model quite as complex and variegated as SELinux yet?)

So take care of that, and then grant your developer user some control and visibility into the project:

#oc edit scc restricted   # at this point change allowHostDirVolumePlugin to "true"
oc policy add-role-to-user admin developer -n deis
#oc adm policy add-cluster-role-to-user cluster-admin developer
helm install -n deis --namespace deis workflow/

At this point you'd already get an error if tiller wasn't started. Next, log into the OpenShift dashboard at the URL you got when minishift start finished. If you haven't already, log in as the developer user (the developer username will accept any password on minishift), and watch as pods come up.

fixme below

There need to be some eyes on all of what I've done above to ensure that it's granting the least required privilege.

I also wasn't able to figure out how to put any of the above into YAML files which we could gate similarly to the modified roles and clusterroles we have already through the use_openshift_rbac flag in values.yaml... not yet anyway...

If you run into problems and want to do helm install again, you'll need to wipe out the secrets that were generated for you first, so the next helm install can generate some more:

kubectl -n deis delete secret builder-key-auth builder-ssh-private-keys database-creds deploy-hook-key django-secret-key logger-redis-creds
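
If this cleanup gets repeated a lot, it may be worth wrapping. The sketch below is an assumption-laden convenience, not something from the chart: reset_deis_secrets, DEIS_SECRETS, and RUN are hypothetical names, and it defaults to a dry run that only prints the command. The --ignore-not-found flag keeps kubectl from failing when some of the secrets were never created.

```shell
# Secrets listed in the command above.
DEIS_SECRETS="builder-key-auth builder-ssh-private-keys database-creds deploy-hook-key django-secret-key logger-redis-creds"

# RUN defaults to echo (dry run); export RUN= (empty) to really delete.
RUN="${RUN-echo}"

# reset_deis_secrets [NAMESPACE] removes the generated secrets so the
# next helm install can create fresh ones. NAMESPACE defaults to deis.
reset_deis_secrets() {
  $RUN kubectl -n "${1:-deis}" delete secret --ignore-not-found $DEIS_SECRETS
}

reset_deis_secrets deis
```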

Scratch space below...

#oc adm policy add-scc-to-group privileged system:serviceaccount:deis
#oc adm policy add-scc-to-group anyuid system:serviceaccount:deis
#oc adm policy add-scc-to-user anyuid -z deis-database
#oc adm policy add-scc-to-user anyuid -z deis-minio
#oc adm policy add-scc-to-user anyuid -z deis-nsqd
#oc adm policy add-scc-to-user anyuid -z default
#oc adm policy add-scc-to-user anyuid -z deis-router
#oc adm policy add-scc-to-user anyuid -z deis-controller




@Cryptophobia
Member Author

From @kingdonb on September 16, 2017 20:10

The result is now (running tally of unresolved errors looks like):

Deis Controller deploy logs

system information:
Django Version: 1.11.4
Python 3.5.2
mkdir: cannot create directory '/app/data': Permission denied

Deis Monitor Telegraf daemonset logs

Node Name set (null)
jq: error (at <stdin>:13): Cannot index string with string "hostIP"

Deis Builder

It probably is just failing its readiness check because the controller service is not answering yet...


I'm going to go revisit all of the previous comments I left on this issue and delete any that can be subsumed by what I'm documenting here today. Then I think I'll go back and try with the previous versions, see if something has changed with how Minishift is handling HostDir volumes (since it looks like maybe it has? More likely I think than Deis added breaking changes in v2.18, but we'll see!)

@Cryptophobia
Member Author

From @kingdonb on September 16, 2017 21:10

Rolling the stack back to Minishift v1.4.1 and Deis v2.17, I had much better results:

# the syntax for --memory has changed
minishift start --vm-driver=virtualbox --memory=4096 --cpus=4

#... (everything else the same as above, but against the prior release of Deis Workflow)
cd workflow/
git checkout openshift-deis-v2.17

# Make sure you set platform_domain in the root setting file: workflow/values.yaml
$EDITOR values.yaml
cd ..

#create the Deis project and setup helm, as above
...
#grant the cluster-admin roles, just as shown above
...
#grant the anyuid and/or privileged scc policies to the system:serviceaccounts:deis group as above
...
#grant the hostnetwork scc to the deis-router serviceaccount as above
...
# pull up the dashboard and grant developer user admin on your Deis namespace/project as above
helm install -n deis --namespace deis workflow/
# ... again, just as above

... wait a while, then ...

See that all of the health checks are OK, across the board.

Add a route to deis-router through the GUI so you can reach deis.[basedomain], register a user, create an app... pop into the Service definition and find out what nodeport has been assigned, so you can correct the [deis] remote in .git/config (the nodeport that points to TCP port 2222 is the one that you want)...
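
Picking the right nodeport out of the service's PORT(S) column can be done with a small filter. A minimal sketch: node_port_for is a hypothetical helper, and the sample string is the deis-router row from the helm output earlier in this thread, so your port numbers will differ.

```shell
# node_port_for PORT "PORT(S) string" prints the NodePort mapped to PORT.
# Each entry in the string looks like servicePort:nodePort/PROTOCOL.
node_port_for() {
  printf '%s\n' "$2" | tr ',' '\n' | awk -F'[:/]' -v p="$1" '$1 == p { print $2 }'
}

# Builder's git-over-SSH port is 2222; find the nodeport it is exposed on.
node_port_for 2222 "80:31300/TCP,443:30075/TCP,2222:30158/TCP,9090:32519/TCP"
```

The number it prints is the one to put in the [deis] remote URL in .git/config in place of 2222.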

Add your keys with deis keys:add

Do git push deis master

Pop over to the Monitoring dashboard of OpenShift and watch your build!

Add another route to allow traffic into your app via OpenShift's router and Deis's router (or add a wildcard route for *.[basedomain], or short-circuit any of these if necessary... since OpenShift's router does not support non-HTTP traffic you will be stuck with some short-circuiting) and enjoy!

Unfortunately, the deis-monitor-telegraf pod is still spewing errors in its log. Although it appears to be passing the health checks, I believe it is not collecting any of the host logs, and it seems unlikely that it will get pod logs either.

Otherwise I think everything still worked at this point (with the same stack I used 19 days ago.) Unfortunately this time I tried bringing up an example app (the ruby-sinatra buildpack example) and noticed that sadly, it does not come up. (Curious as it worked last time.)

mkdir: cannot create directory '/app/objectstore': Permission denied
/bin/get_object: line 9: /app/objectstore/minio/builder-bucket: No such file or directory
2017/09/16 21:06:12 open /app/objectstore/minio/builder-bucket: no such file or directory

Likely owing to RBAC differences in OpenShift that we discovered a few weeks ago, that I was much more aggressively tearing down and short-circuiting in order to see if I could make it work. You could easily work around this by disabling all of the nice security features. It's going to take at least another look or two in order to figure out the "least required privilege" settings and implement it properly.

I don't know who to blame yet, but it obviously can't be Deis as running on OpenShift was never a supported configuration... amirite? Next step for me is to try the newer Deis release on the older MiniShift and vice versa, and see what else is different.

@Cryptophobia
Member Author

From @liggitt on September 16, 2017 23:48

FYI, in OpenShift 3.7, you'll be able to use kube RBAC objects directly. Prior to that, OpenShift RBAC objects control authorization policy.

@Cryptophobia
Member Author

From @kingdonb on September 18, 2017 14:27

@liggitt Thanks! I see there's a new Minishift release an hour ago this A.M. too.

I'd imagine we're always going to need to use OpenShift objects if we want to assign special scc privileges, like anyuid or hostnetwork, unless Kubernetes proper has (or grows) a similar protection that can be mapped to these scc's.

I will go ahead and try with a 3.7.0-alpha build as well, if this can be simplified in newer versions that's great but I'm pretty sure there will remain enough differences that we're going to need more adjustments (and probably will always need a use_openshift_rbac mode) to make everything work.

@Cryptophobia
Member Author

From @kingdonb on September 18, 2017 16:09

My first problem was that I had omitted the use_openshift_rbac setting this time (in my repo's openshift-deis-v2.18 branch, which is where you should start if you're trying to use this). That obviously resulted in all manner of things failing, and it has been corrected in the master branch there now.

Second problem was I wasn't understanding how roles and scc/securityContext settings interact: first the SA needs to have securityContextConstraint or scc relaxed, then the container that is going to get that securityContext needs to have it assigned within the pod container spec.

I added to my deployment and daemonset files for controller, fluentd, monitor-telegraf, and registry-proxy, inside of use_openshift_rbac blocks:

securityContext:
  privileged: true

Then, from the top:

dvm use 1.12.3
git clone -b openshift-deis-v2.18 https://github.com/kingdonb/openshift-deis workflow
minishift start --vm-driver=virtualbox --memory=4GB --cpus=4
eval $(minishift docker-env)

oc login -u system:admin
oc new-project deis

oc create sa tiller
oc adm policy add-cluster-role-to-user cluster-admin -z tiller
helm init --service-account tiller --tiller-namespace deis
export TILLER_NAMESPACE=deis

# I think this part can be done inside of the YAML, but not sure how yet...
oc adm policy add-cluster-role-to-user cluster-admin -z deis-router
oc adm policy add-cluster-role-to-user cluster-admin -z deis-builder
oc adm policy add-cluster-role-to-user cluster-admin -z deis-controller
oc adm policy add-cluster-role-to-user cluster-admin -z deis-logger-fluentd
oc adm policy add-cluster-role-to-user cluster-admin -z deis-workflow-manager

# ...There is no way to assign SCC's through the API yet, per claytonc
oc adm policy add-scc-to-group anyuid system:serviceaccounts:deis
oc adm policy add-scc-to-user hostnetwork -z deis-router
oc adm policy add-scc-to-user privileged -z deis-controller
oc adm policy add-scc-to-user privileged -z deis-logger-fluentd
oc adm policy add-scc-to-user privileged -z deis-monitor-telegraf
oc adm policy add-scc-to-user privileged -z deis-registry-proxy

oc policy add-role-to-user admin developer -n deis
helm install -n deis --namespace deis workflow/
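
Since the per-account SCC grants are the part most in need of a least-privilege audit, it may help to keep them in one table. This is a sketch under stated assumptions: SCC_GRANTS and scc_cmds are hypothetical names, the pairs are copied from the add-scc-to-user commands above, the group-wide anyuid grant is deliberately left out, and the commands are printed rather than executed.

```shell
# scc:serviceaccount pairs taken from the add-scc-to-user commands above.
SCC_GRANTS="hostnetwork:deis-router privileged:deis-controller privileged:deis-logger-fluentd privileged:deis-monitor-telegraf privileged:deis-registry-proxy"

# scc_cmds prints one add-scc-to-user command per pair (dry run).
scc_cmds() {
  for pair in $SCC_GRANTS; do
    scc=${pair%%:*}   # text before the first colon
    sa=${pair#*:}     # text after the first colon
    echo "oc adm policy add-scc-to-user $scc -z $sa"
  done
}

scc_cmds
```

Removing a pair from the list and reinstalling is then a quick way to find out whether a given grant was really needed.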

Everything appears to come online now, almost without issue! You can assign the cluster-admin cluster-role to your developer user if you want to watch your app come up in its own project on the OpenShift dashboard.

I haven't assigned any blanket privileged or cluster-admin permissions, which is better.

I may have granted some privileges unnecessarily (I'll go back and clean it up... I'm not sure). I am still figuring out RBAC and will harden this up, but it should probably not be necessary to grant anyuid to every serviceaccount in the system:serviceaccounts:deis group, including the default serviceaccount there.

This is probably the worst thing that I've still done here, but it gets a little worse as you deploy your first app...

A few problems to still address:

  • deis-registry-proxy does not have any serviceAccount, so of the oadm policy statements above, the one granting the privileged scc to serviceaccount deis-registry-proxy does not work. I don't know what registry-proxy does but I'm sure it's important somehow. (Edit: fixed this in kingdonb/openshift-deis:openshift-deis-v2.18)
  • I have not tried deploying anything with this yet, but I am speculating it will fail because at least the first buildpack example I tried (ruby-sinatra-example) expected to be run with UID 0. I have not found a better way to allow this yet than oc adm policy add-scc-to-group anyuid system:authenticated, which you may not want to do, but all of the solutions I can think of either involve changes to the internal permissions model of Deis, or are just as permissive.
  • Telegraf pods try unsuccessfully to reach the readonly API for metrics at 10.0.2.15:10255. The pods for telegraf still pass health checks but they are not collecting metrics from the cluster. Unfortunately 10255 is not an option in OpenShift and this will probably need to be adjusted on the telegraf side to use a serviceaccount and the kubernetes.default API endpoint instead.
  • I think I have hardcoded some "deis" namespaces into my changes for use_openshift_rbac and that's also not right. I shouldn't assume you're deploying into the deis namespace, I should use whatever namespace that Deis actually gets deployed into, I'm not sure how yet...

For now, make sure helm deploys the workflow chart into the deis namespace if you want it to work, or adjust the changes I've made accordingly for whatever namespace you're deploying into.

The good news is, all of this works with each of the latest stable releases, of minishift and Deis Workflow. You can start Deis, register a user, deploy an app, and upgrade it. Yahoo!
