
Questions regarding Docker Image Composition #183

Closed
theAkito opened this issue Nov 12, 2022 · 13 comments

Comments

@theAkito

I've noticed that the Docker image stack used in this scenario uses Nginx in some way that I cannot quite discern.

https://github.com/grocy/grocy-docker/blob/main/Containerfile-frontend

  1. Is nginx in that Docker image pretty much just a proxy, or is it doing something else?
  2. What kind of TLS certificates are used there? Is it possible to use e.g. Let's Encrypt certificates, instead?
  3. Or are these certificates only for internal use?
@jayaddison
Contributor

Hi @theAkito - answering your questions:

  1. Of the two containers here (frontend and backend), nginx is only used as part of the frontend container, primarily to serve static content. It does also pass php-related requests to the backend container.
  2. The frontend container currently uses a self-signed certificate, generated at build-time -- see Enable certificate provisioning via letsencrypt #62 for some discussion and recommendations about potential Let's Encrypt integration
  3. Clients generally won't (and shouldn't) trust self-signed certificates by default -- they are not suitable for production use

See also grocy/docs#7.
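The split described above can be illustrated with a minimal nginx server block -- a simplified sketch, not the project's actual configuration; the paths and the `backend` hostname are assumptions:

```nginx
# Simplified sketch of the frontend nginx role -- illustrative only.
server {
    listen 8080;
    root /var/www/grocy/public;
    index index.php;

    # Static content (HTML, CSS, JS) is served directly from the container.
    location / {
        try_files $uri $uri/ /index.php;
    }

    # PHP requests are handed off to the backend (php-fpm) container.
    location ~ \.php$ {
        include fastcgi_params;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass backend:9000;
    }
}
```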

@theAkito
Author

theAkito commented Nov 12, 2022

  1. Of the two containers here (frontend and backend), nginx is only used as part of the frontend container, primarily to serve static content. It does also pass php-related requests to the backend container.

I see. Well, when I was trying to figure out how I would set this server up, I was wondering whether I could replace this Nginx instance with an external one.

If I set up this server, it will be deployed via a self-made Helm Chart on a single-node stock Kubernetes cluster, which already has a freely configurable Nginx Ingress Controller -- making the provided Nginx instance redundant, provided the external one is set up properly.

I'm just not sure about this. https://github.com/grocy/grocy-docker/blob/main/Containerfile-frontend#L57-L58

  1. The frontend container currently uses a self-signed certificate, generated at build-time

Ah yes, that's what I feared could be the case. Though if the backend does not actively expect this certificate, I guess an external Nginx would still work in a manual custom composition.

  1. Clients generally won't (and shouldn't) trust self-signed certificates by default -- they are not suitable for production use

Exactly. Especially if the cluster already has this set up anyway. No need for another Nginx.

See also grocy/docs#7.

That's a good one. Would welcome progress on this.


Thank you very much for the quick and thorough reply. It helps in understanding this server scenario.

@jayaddison
Contributor

No problem, thanks! - your use-case makes sense, I think -- let me check:

You'd like to deploy the Grocy frontend and backend containers into a Kubernetes cluster, and to use the cluster's Nginx Ingress Controller to select the relevant container for incoming requests.

If so: yes, the PHP proxy-forwarding that the frontend nginx is doing becomes redundant.

However: the Nginx Ingress Controller isn't intended to serve content (HTML, CSS, JS) -- and some of those requests (I think, if I understand correctly) will be for those resources -- so the frontend container will require a webserver of some kind.

Closely related to that: the yarn install command collects a list of additional JavaScript dependencies (packages) and writes them into the container's filesystem, allowing nginx to respond with those.
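The routing being discussed could be sketched as a standard Kubernetes Ingress -- a rough sketch only; the hostname, service names, ports, and the cert-manager annotation are placeholders, not part of this project:

```yaml
# Rough sketch -- names and the cert-manager annotation are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grocy
  annotations:
    # TLS handled at the cluster edge, e.g. by cert-manager + Let's Encrypt.
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  ingressClassName: nginx
  tls:
    - hosts: [grocy.example.org]
      secretName: grocy-tls
  rules:
    - host: grocy.example.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grocy-frontend
                port:
                  number: 8080
```

The frontend container behind the `grocy-frontend` service would still run its own webserver to serve the static content, as noted above; the Ingress only terminates TLS and routes requests.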

@theAkito
Author

You'd like to deploy the Grocy frontend and backend containers into a Kubernetes cluster, and to use the cluster's Nginx Ingress Controller to select the relevant container for incoming requests.

Yes, at least the backend image -- and the frontend too, if that is necessary, though I wasn't sure about it, as outlined previously.

However: the Nginx Ingress Controller isn't intended to serve content (HTML, CSS, JS) -- and some of those requests (I think, if I understand correctly) will be for those resources -- so the frontend container will require a webserver of some kind.

I guess that's the case. No way around having a container like this.

Though the least that can be done is to not fiddle with TLS in this scope, I suppose.
TLS is handled by the cluster's Let's Encrypt management already.

I think the conclusion from this is that the two images that are already there are pretty much usable in Kubernetes, though the TLS configuration should be made optional. Is that the case?

@jayaddison
Contributor

I think the conclusion from this is that the two images that are already there are pretty much usable in Kubernetes, though the TLS configuration should be made optional. Is that the case?

That seems reasonable, yep - something along the lines of #184? (I'm not sure conditional behaviour or include-from-file statements are valid within Containerfile / Dockerfiles, so support is initially made optional by duplicating the contents. Those files could potentially fall out-of-sync, though.)

@theAkito
Author

theAkito commented Nov 13, 2022

#184 looks very reasonable. (Except the Dockerfile duplication.)

the support is initially made optional by duplicating the contents. those files could potentially fall out-of-sync though

This could easily be circumvented by using a Dockerfile composition tool, though I'm not sure whether that would be out of scope, i.e. too much work just for the sake of making TLS optional.

That said, it's also possible to use stock Docker methods for making something optional. For example, ARG & ONBUILD may help. Though I do not yet have a clear idea of how it would be done this way. It would probably be a bit cumbersome.
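For illustration, an `ARG`-based build-time toggle might look roughly like this -- the `ENABLE_SELF_SIGNED_TLS` name and the paths are hypothetical, not taken from the project's Containerfiles:

```dockerfile
# Hypothetical build-time toggle -- the ARG name and paths are illustrative.
ARG ENABLE_SELF_SIGNED_TLS=true

# Generate a self-signed certificate only when the build-arg is enabled;
# disable with: docker build --build-arg ENABLE_SELF_SIGNED_TLS=false .
RUN if [ "$ENABLE_SELF_SIGNED_TLS" = "true" ]; then \
        openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
            -subj '/CN=grocy.home' \
            -keyout /etc/ssl/private/grocy.key \
            -out /etc/ssl/certs/grocy.crt; \
    fi
```

Note that only `RUN` steps can branch like this; directives such as `EXPOSE` cannot be made conditional inside a Dockerfile.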

Maybe the middle path between the two solutions would be a configuration file in a volume mounted into the image, which is read on start or first initialisation. If an option in that configuration file tells the server to use a self-signed certificate, it generates one on the fly; if not, it leaves the container untouched.
At least that's more or less the way I remember it from other dockerised server applications.

Instead of the configuration file, an ENV could be used; this is common across Docker images anyway.
Either way, there usually has to be some initialisation during runtime to be able to configure such things in a meaningful way.

Most importantly, all these proposed solutions are better than having two Dockerfiles for the same stuff but with different options. That just complicates the server scenario in a bad way.
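That runtime-initialisation idea could be sketched as a small entrypoint script -- purely illustrative; the `GROCY_SELF_SIGNED_TLS` variable, the paths, and the certificate parameters are assumptions, not anything the grocy-docker images actually implement:

```shell
#!/bin/sh
# Illustrative entrypoint sketch only: the GROCY_SELF_SIGNED_TLS variable,
# the paths, and the certificate parameters are assumptions.
set -eu

CERT_DIR="${CERT_DIR:-/etc/grocy/tls}"
CERT="${CERT_DIR}/server.crt"
KEY="${CERT_DIR}/server.key"

need_self_signed_cert() {
    # Generate only when explicitly requested and no certificate exists yet.
    [ "${GROCY_SELF_SIGNED_TLS:-false}" = "true" ] && [ ! -f "$CERT" ]
}

if need_self_signed_cert; then
    mkdir -p "$CERT_DIR"
    openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=${GROCY_HOSTNAME:-grocy.home}" \
        -keyout "$KEY" -out "$CERT"
fi

# Hand over to the container's main process.
exec "$@"
```

The fail-early property discussed here falls out naturally: if the ENV is set but certificate generation fails, the container exits immediately at startup rather than misbehaving later.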

@jayaddison
Contributor

Thanks for the additional ideas - I've been thinking about these and experimenting a bit.

I like the certificate-on-demand approach - it does sound slightly complicated/risk-prone though: the co-ordination of a volume, configuration file, and runtime event handling (adding logic to detect during a request that a cert doesn't exist, and then generate one with the appropriate options). That would require careful design and planning.

From some research, caddy can manage a lot of that HTTPS certificate setup itself, so re-using the design and implementation from there could be an option. Some migration would still be required, but it would offload much of the complexity.

I did also take a look into making the container build conditional on env/arg parameters - it's possible, with the exception of the EXPOSE command (maybe not critical -- that's mostly a documentation/informational command).

However: personally I quite like that the containerfiles don't contain any conditional logic - they are easier to understand when they perform a predictable series of steps. The GROCY_VERSION parameter does add a bit of a dynamic element, but only for the content to bake into the container.

Duplicating the file content definitely isn't ideal, but I think the cost of that would primarily be a maintenance burden -- and my guess is that not many people would use it. If that's true, then the duplicate file could be removed after a few months.

@jayaddison
Contributor

An attempt to be a bit more precise about a design using caddy to provide TLS provisioning:

* Add the [`caddy` Alpine package](https://pkgs.alpinelinux.org/packages?name=caddy&branch=v3.16) to the container packages
* Create a _static_ `caddy` configuration file (aka Caddyfile) into the container
* Use the [Caddyfile's environment variable support](https://caddyserver.com/docs/caddyfile-tutorial#environment-variables) to dynamically determine the hostname to serve
* Provide a default hostname of `grocy.home` -- this is a [reserved private domain name](https://www.rfc-editor.org/rfc/rfc6762#appendix-G) and will not conflict with the public web
* Enable [`on-demand-tls`](https://caddyserver.com/docs/automatic-https#on-demand-tls) for that domain in the static `caddy` configuration

I think that this would allow a fairly straightforward way to build a static container that serves local traffic with TLS for most users in a zero-configuration manner.

However - an important question is: what ACME server would the running container contact in order to request updated certificate material? For a private domain, this could be challenging. Worth noting: caddy includes an internal ACME server implementation (see caddyserver/caddy#3021).
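As a rough sketch of what such a static Caddyfile could look like -- the `GROCY_HOSTNAME` variable name, the paths, and the `backend` upstream are assumptions; only the `{$VAR:default}` placeholder syntax, `tls internal`, and `php_fastcgi` are standard caddy v2 features:

```
{$GROCY_HOSTNAME:grocy.home} {
    # Serve the static frontend content.
    root * /var/www/grocy/public
    file_server

    # Hand PHP requests to the backend container.
    php_fastcgi backend:9000

    # Use caddy's built-in internal CA instead of a public ACME server --
    # one option for a private domain.
    tls internal
}
```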

@jayaddison
Contributor

what ACME server would the running container contact in order to request updated certificate material? For a private domain, this could be challenging.

Self-reply: yep: for a private/internal domain name, this implies that only an internal/private certificate authority is an option, I think.

Provide a default hostname of grocy.home

A detail note: this could produce some unexpected results currently: both caddy v2.5.2 (Alpine 3.16 current version) and v2.6.2 (latest tagged release) depend on certmagic, and the corresponding versions of that library would consider grocy.home to be a non-internal (and therefore valid public) name (https://github.com/caddyserver/certmagic/blob/049e60556bde3cd434ee3619db746abcfd89ad9f/certificates.go#L393-L399, https://github.com/caddyserver/certmagic/blob/2e8dd4496aaa09347eef2b71f9f259d9a161eb81/certificates.go#L398-L404).

(so the request would be considered valid to request a cert from an ACME server -- public or local. a public ACME server shouldn't be able to validate a non-public domain like grocy.home -- and even if it did, forwarding that to a CA would likely cause additional issuance validity policy to be applied: cert issuance should (must?) fail)

Back to practical matters, though: in the user-at-home case, the built-in smallstep issuer would be the way to go, probably. And clients using that instance could add it as a trust root (annoying, but one-time, in theory. is it likely to be a stable and reliable cert issuer over time?).

@theAkito
Author

I like the certificate-on-demand approach - it does sound slightly complicated/risk-prone though: the co-ordination of a volume, configuration file, and runtime event handling (adding logic to detect during a request that a cert doesn't exist, and then generate one with the appropriate options). That would require careful design and planning.

However: personally I quite like that the containerfiles don't contain any conditional logic - they are easier to understand when they perform a predictable series of steps. The GROCY_VERSION parameter does add a bit of a dynamic element, but only for the content to bake into the container.

The assumption usually is, in the case of such servers, that there is already a configuration method implemented at the application level, and adding a certificate is a matter of just adding another option, which might be configured either via file or via ENV. In this scenario it is perhaps a bit different, as you already outlined, since, as far as I understand now, you try to streamline the setup & make it as easy & quick as possible.

It's also usually the case that the risk-prone aspect of that approach is minimised with fail-early-and-fast errors, which make the user solve the instantly appearing errors so quickly that it feels more or less like a streamlined experience. The lack of streamlinedness kicks in when something errors really late in the setup process, and the error does not tell the user what is actually wrong -- e.g. "invalid path" instead of "Configuration file not found. This server needs a config.json to run properly. Please check out https://docs.grocy.info/configuration#example-configuration for more information." -- and when documentation is lacking. This is the biggest issue with lots of projects: they don't offer enough documentation, and/or -- if there is any -- it is really outdated, badly written, incomplete or otherwise insufficient.

From some research, caddy can manage a lot of that HTTPS certificate setup itself, so re-using the design and implementation from there could be an option. Some migration would still be required, but it would offload much of the complexity.

Does not look bad, though I have no experience with that software yet, so I cannot give a qualified comment on it.

I did also take a look into making the container build conditional on env/arg parameters - it's possible, with the exception of the EXPOSE command (maybe not critical -- that's mostly a documentation/informational command).

We can scratch that one. It does not seem like the right solution in this scenario, because it would require the conditional approach referred to previously anyway.

Duplicating the file content definitely isn't ideal, but I think the cost of that would primarily be a maintenance burden -- and my guess is that not many people would use it. If that's true, then the duplicate file could be removed after a few months.

I get that. I initially thought about keeping it "in sync" indefinitely and maintaining double the amount of Dockerfiles. However, if you plan this as a phasing-out method anyway, then it's not too bad an idea. It's just that if, for whatever reason, the second Dockerfile never becomes expendable, then we would be at the same stage of progress as now. 😄

An attempt to be a bit more precise about a design using caddy to provide TLS provisioning:

* Add the [`caddy` Alpine package](https://pkgs.alpinelinux.org/packages?name=caddy&branch=v3.16) to the container packages

* Create a _static_ `caddy` configuration file (aka Caddyfile) into the container

* Use the [Caddyfile's environment variable support](https://caddyserver.com/docs/caddyfile-tutorial#environment-variables) to dynamically determine the hostname to serve

* Provide a default hostname of `grocy.home` -- this is a [reserved private domain name](https://www.rfc-editor.org/rfc/rfc6762#appendix-G) and will not conflict with the public web

* Enable [`on-demand-tls`](https://caddyserver.com/docs/automatic-https#on-demand-tls) for that domain in the static `caddy` configuration

I think that this would allow a fairly straightforward way to build a static container that serves local traffic with TLS for most users in a zero-configuration manner.

Sounds reasonable, though, as explained earlier, I am unable to respond with a qualified comment regarding that software. I would first need to look into it & research.

However - an important question is: what ACME server would the running container contact in order to request updated certificate material? For a private domain, this could be challenging. Worth noting: caddy includes an internal ACME server implementation (see caddyserver/caddy#3021).

Heard that one often. There are a couple of people who would like this, though as far as I know it's not really possible "the official way".
Some quick results I found regarding this:
https://stackoverflow.com/a/64214458/7061105
https://justmarkup.com/articles/2018-05-31-https-valid-certificate-local-domain/

Self-reply: yep: for a private/internal domain name, this implies that only an internal/private certificate authority is an option, I think.

Indeed. I have never dived into that topic too deeply, though this seems like it is really the case.

A detail note: this could produce some unexpected results currently: both caddy v2.5.2 (Alpine 3.16 current version) and v2.6.2 (latest tagged release) depend on certmagic, and the corresponding versions of that library would consider grocy.home to be a non-internal (and therefore valid public) name (https://github.com/caddyserver/certmagic/blob/049e60556bde3cd434ee3619db746abcfd89ad9f/certificates.go#L393-L399, https://github.com/caddyserver/certmagic/blob/2e8dd4496aaa09347eef2b71f9f259d9a161eb81/certificates.go#L398-L404).

(so the request would be considered valid to request a cert from an ACME server -- public or local. a public ACME server shouldn't be able to validate non-public domain like grocy.home -- and even if it did, forwarding that to a CA would likely cause additional issuance validity policy to be applied: cert issuance should (must?) fail)

It must fail, because the domain would have to be actually publicly reachable at a URL like yxyxxyyxxyxyxyxyxxyxxxyyxyxyxy.grocy.home/acme/challenge-blabla/xyyxyxyxyxyxyxxyxyyxyxyxxyxyxyxyxy, which is impossible for every single Grocy user out there who gets a domain like that "for free", i.e. privately.

Back to practical matters, though: in the user-at-home case, the built-in smallstep issuer would be the way to go, probably. And clients using that instance could add it as a trust root (annoying, but one-time, in theory. is it likely to be a stable and reliable cert issuer over time?).

If it works the way shown in the justmarkup article I linked previously, then it should be fairly easy & reliable. The problem is the one-time setup. Whether that's annoying/cumbersome or not depends on the user.
Some people find it extremely annoying; others don't even notice it happens, because they just do it once and instantly forget about it.

@jayaddison
Contributor

And clients using that instance could add it as a trust root (annoying, but one-time

An aside: it's annoying (and seems like a potential cause of Internet-of-Things problems -- including but not limited to vendor attempts to create lock-in) that in 2022, as far as I know, it's not possible to generate a certificate for a local device/application that browsers will display as trustworthy by default. I realize the technical reasons why -- and for a private domain like that there would have to be other well-defined and very-difficult-or-impossible-to-impersonate constraints that mainstream browsers would need to respect in order for the negotiation between a non-public device and a public issuer to result in trustworthy (in a human, practical sense) certificates... but it just feels like it should (again, must?) be possible somehow. Device attestation might be part of a solution, but it's tricky to figure out how that works in the context of elastically-scalable clusters.

@jayaddison
Contributor

Some more rambling ideas:

  • Vendor lock-in, either at an issuer or ecosystem level, could perhaps be mitigated by allowing/requiring multiple issuers to be involved in a certificate's production
  • Issuance of certificates for private environments could perhaps use some kind of mixture of device-level, network-level and peer-shared-confidential data. That derived mixture could be passed to the issuers, and should provide a second verifiable factor on a certificate (not just: the entity you're talking to has the private key for this certificate, but also: it was issued under this context by an environment that has the same properties). I'm not sure about this, maybe what I'm saying is redundant -- or perhaps already solved by certificate transparency logs.
  • In the case of web software trust, it's also potentially relevant that the application content being delivered can be verified - something to do with bill-of-materials or supply chain security. Basically you want to know not only that the data you're being provided with hasn't been modified, and that it's from the source you expect, but also that you can inspect the source of that code. Maybe it'd seem weird to reissue TLS certificates per-release -- but maybe not, maybe it makes complete sense (especially if it could include links to the source code).

@jayaddison
Contributor

jayaddison commented Nov 14, 2022

@theAkito I'm closing this issue since I think the questions raised have been answered - I've also opened a separate issue to track experimentation with caddy as a replacement webserver (I'm also unfamiliar with it so far, so it'll be a learning experience for me too). Let me know and/or open a separate issue if there's anything else you'd like to discuss.
