Revisit container repository #160

Open
7 tasks
SISheogorath opened this issue Feb 18, 2021 · 9 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@SISheogorath
Contributor

Currently there are various PRs and issues piling up in the repository. The README has aged quite a lot in the past few years and could use a rewrite from scratch. We should also check the assumptions the containers were built with against the current state of the art.

Various steps have already taken place thanks to @hugopeixoto, but we still have some work to do. I would be open to suggestions and people willing to take the challenge :)

Topics I would like to put on the table:

  • Rewriting README, remove boilerplate content, maybe add a screenshot, link to the new docs and web page.
  • Dropping the alpine image (it doesn't provide much advantage over the debian image, but is unnecessarily inconvenient to work with due to musl-lib and potentially undisclosed security issues)
  • Rework local uploads. Various people voiced concerns over the default image upload location, and the current way of handling local uploads is not great.
  • Adding a healthcheck
  • Maybe add sqlite support
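
A minimal sketch of what a healthcheck probe could look like, written in Python purely for illustration. The port `3000`, the `/status` route, and the names `is_healthy` and `main` are assumptions made for this example, not something this issue specifies; the same check could equally be written in Node, which the image already ships.

```python
import urllib.error
import urllib.request

# Assumed default target; adjust host, port, and path to the actual deployment.
DEFAULT_URL = "http://localhost:3000/status"


def is_healthy(url, timeout=3.0):
    """Return True if the URL answers with an HTTP 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status all count as unhealthy.
        return False


def main(argv):
    """Return a process exit code: 0 when healthy, 1 otherwise."""
    url = argv[0] if argv else DEFAULT_URL
    return 0 if is_healthy(url) else 1
```

A container healthcheck command would run this as a script that exits with `main(sys.argv[1:])`, so the runtime sees exit code 0 for healthy and 1 for unhealthy.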

There are two things that are probably solved better upstream, but are relevant in the container, mostly:

SISheogorath added the enhancement and help wanted labels on Feb 18, 2021.
@davidmehren
Member

I'd like to take this opportunity to propose merging this repo into hedgedoc/hedgedoc. This has many advantages:

  • We can reduce confusion about the documentation as we don't need to maintain two docs anymore. The container documentation can be integrated into docs.hedgedoc.org
  • Both the container and HedgeDoc can have unified release notes.
  • We do not need to juggle issues between repos anymore (sometimes an issue reported here is actually an issue of the app and vice-versa).
  • New container images can be built on a new git tag automatically, no need to remember to update this repo after a release.
  • Nightly container images can be easily integrated in our CI pipeline.

@SISheogorath
Contributor Author

No, I strongly advocate against merging this into the source code repository, because it'll cause more confusion, not less. If we integrate it, we would copy source code from inside the repository instead of cloning a release. That means we have to do a release to update the node version of the container, or looking at the current setup, on every new build of the base images.

We should avoid mixing up deployment and development in non-rolling-release software.

@davidmehren
Member

davidmehren commented Feb 22, 2021

> That means we have to do a release to update the node version of the container, or looking at the current setup, on every new build of the base images.

I'm pretty sure we can teach our CI to regularly rebuild tagged releases on top of the latest base image without having to make a new release.

> We should avoid mixing up deployment and development in non-rolling-release software.

Well, having both repos separate has caused real confusion in the past (people not reading release notes, inconsistent upgrade instructions etc).

@hugopeixoto
Member

There's a lot going on here. Not sure how to go about this, so I'll comment on each point separately. We could use the 2.0 release to make some breaking changes here as well.

> Rewriting README, remove boilerplate content, maybe add a screenshot, link to the new docs and web page.

If the idea is to move all the docs to docs.hedgedoc.org, I think this makes sense. Having docs spread out across GitHub repositories is not ideal. The resulting README would become mostly empty, with only a link to the docs / license info.

> Dropping the alpine image (it doesn't provide much advantage over the debian image, but is unnecessarily inconvenient to work with due to musl-lib and potentially undisclosed security issues)

Do you think this would make our life easier, or is it mostly a question of not agreeing with alpine's security policies?

> Maybe add sqlite support

What does this mean, in practice? Adapting the compose.yml to have a commented section for sqlite3 support? This is how we support mysql, right? Our current testing suite only tests postgresql, from what I understand. It would be nice if we could extend that to test those configs as well. Any idea on how to do this?

Database configuration by env-var without CMD_DB_URL

I'm not a fan of having multiple ways of configuring the same thing, it only makes it harder for us to keep track of them. Having a config.json and loading things from the ENV is already too much for me, tbh. We'll end up having to deal with issues of precedence and overloads.

Our docker-entrypoint.sh could use some work as well:

  • using dockerize to wait for the db means we are duplicating the logic to parse database args and it doesn't support things like default ports
  • if there's any failure the script doesn't stop
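
To illustrate the two points above, here is a rough sketch of waiting for the database by parsing the same URL the application uses, so the argument-parsing logic is not duplicated and default ports are filled in. It is written in Python only for illustration and is not the current entrypoint. The names `db_endpoint` and `wait_for_db`, the retry counts, and the port table are assumptions for this example. Raising on failure is what lets a calling script stop instead of carrying on.

```python
import socket
import time
from urllib.parse import urlsplit

# Conventional default ports per URL scheme (an assumption of this sketch).
DEFAULT_PORTS = {"postgres": 5432, "postgresql": 5432, "mysql": 3306, "mariadb": 3306}


def db_endpoint(db_url):
    """Return (host, port) for a database URL, falling back to the default port."""
    parts = urlsplit(db_url)
    if parts.hostname is None:
        raise ValueError(f"no host in database URL: {db_url!r}")
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"no port given and no default known for {parts.scheme!r}")
    return parts.hostname, port


def wait_for_db(db_url, attempts=30, delay=1.0):
    """Retry a TCP connection to the database; raise if it never succeeds."""
    host, port = db_endpoint(db_url)
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=2):
                return
        except OSError:
            time.sleep(delay)
    raise TimeoutError(f"database at {host}:{port} not reachable")
```

With a URL such as `postgres://hedgedoc:password@database/hedgedoc` (example credentials, not real ones), `db_endpoint` resolves the host `database` and the default PostgreSQL port without a separate set of wait arguments.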

merging into hedgedoc/hedgedoc

When I first started using hedgedoc, I also found it a bit confusing that it was a separate repository, but we want to iterate on the container scripts without bumping hedgedoc versions, so that's something that needs to be solved. Even if we merge them, we could continue cloning it instead of copying the source from inside the repository, but it might be more confusing to be iterating on the container scripts on the main branch and having it fetch the code from a vX.Y.Z tag.

From what I understand, the plan for 2.0 is to split it into backend/frontend. If this is the case, we will already have a problem when coordinating releases and routing GitHub issues, so I don't think that merging this into hedgedoc will solve many problems. If the plan is to merge react-client into the hedgedoc repository, merging container may be beneficial, since we'll end up with a single repo.

Merging them would have the potential drawback that new hedgedoc releases would now have a dependency on the container scripts, meaning you'd have to wait for them to be approved / tested / updated before releasing a new version. As things are right now the container scripts can be released independently.

@davidmehren
Member

I started moving the container docs to the main repo and encountered some issues. Please have a look at hedgedoc/hedgedoc#984 (comment).

Another, more fundamental problem I noticed while doing that: The current instructions "clone the container repo and pull to update" mean that we either

  • Can never make backwards-incompatible changes to the docker-compose.yml, as this repo is not versioned. We already failed at that after we changed the default database name with the 1.7 release.
  • Have to separately version the container repository and make the user read two changelogs before upgrading/running git pull.

I see these solutions:

  • Do not make the user clone the repo. It should be sufficient to download/copy-paste the docker-compose.yml once and then just increase the image version after that.
  • Merge into the main repo. This makes upgrades simple: check out the new tag and re-run docker compose. This also makes changelogs simple: there is only one. Breaking changes to the docker images can only be made with a HedgeDoc major release, but they at least can be made without the fear that the user didn't read all changelogs.

@SISheogorath
Contributor Author

We should not merge with the main repository. It's all bonkers. It's breaking container image updates. We currently release a new container image version every time the base images are updated. We cannot wait for the main repository to keep up with that. An OS release every half year with no updates in between just doesn't work.

You can have your container images in your main repository on a rolling-release deployment. But we don't do that. We have strong versioning for the main repository. But we can't have that with the container images. The container images need continuous updates.

And before you start with "then run apt-get update && apt-get upgrade as part of the container image", that works only for packaged software. But that's not how the node containers work. They download the node binaries from the upstream project, and those are statically linked. So there is no update for OpenSSL or the like that way.

I mean, we sure can decide that all this shouldn't bother us and either start to build or install all of it ourselves, but then we lose all the benefits you get from using the official base images. And we would still have to make sure we rebuild images on a regular basis. Also there exist some container-specific things, like the handling of local volumes and the like, that don't really fit well in the main repository schema. Not even to mention potential problems from distributions like Debian or Alpine that enjoy renaming a package for no reason, breaking the build process and therefore pushing us to release a new HedgeDoc version for no other reason than 3-6 letter changes in a Dockerfile.

@davidmehren
Member

> It's breaking container image updates.

It isn't.

> We currently release a new container image version every time the base images are updated.

We can keep doing that. What stops us from having a CI-job that rebuilds the latest tag with the latest base image every night?

> Not even to mention potential problems from distributions like Debian or Alpine that enjoy renaming a package for no reason, breaking the build process and therefore pushing us to release a new HedgeDoc version for no other reason than 3-6 letter changes in a Dockerfile.

I don't see why we can't do a patch-release to fix build-issues in our Dockerfile.

I think it's very important to ensure we don't have breaking changes in our docker image and the recommended deployment of it without a corresponding major HedgeDoc release. Merging the repos seems to be the simplest way to achieve that. Maybe you could help to find another way instead of only stating why you think merging is bad.

What I could imagine, is to remove all specific HedgeDoc versions from the container repo, so we don't have to update the repo after a new release. The CI would just build the latest tag of the main repo.
We can then create 1.x and 2.x branches and make sure to not do anything backwards-incompatible in the 1.x branch. All the new stuff could happen in the 2.x branch and then released with 2.0 and documented in its release notes.
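
As a rough sketch of the "build the latest tag" idea, a CI job could pick the highest stable tag within one major series, so that a 1.x branch only ever builds 1.x releases. The function name `latest_tag`, the tag pattern, and the example tag list below are assumptions for illustration; how the tags are actually fetched from the main repo is left out.

```python
import re

# Matches plain x.y.z tags with an optional "v" prefix; pre-release tags are skipped.
TAG_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def latest_tag(tags, major):
    """Return the highest stable x.y.z tag within one major series, or None."""
    candidates = []
    for tag in tags:
        match = TAG_PATTERN.match(tag)
        if match and int(match.group(1)) == major:
            # Compare numerically so that 1.10.0 sorts above 1.9.9.
            version = tuple(int(part) for part in match.groups())
            candidates.append((version, tag))
    return max(candidates)[1] if candidates else None


# Hypothetical tag list, for illustration only.
example_tags = ["1.7.2", "1.9.9", "1.10.0", "2.0.0-alpha.1"]
print(latest_tag(example_tags, 1))
```

Comparing version numbers as integer tuples rather than strings avoids building an older release just because its tag sorts later alphabetically.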

@InnayTool
Member

Maybe we should use a .env file so that users no longer have to change things in the docker compose file. This would ensure that the file can be updated safely.

@davidmehren
Member

Let's update a few things:

4 participants