Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deployment keys may no longer be deterministic when using destDir option #675

Open
mbrgm opened this issue Jun 1, 2017 · 8 comments
Open

Comments

@mbrgm
Copy link
Member

mbrgm commented Jun 1, 2017

CC @deepfire, @ip1981, @domenkozar

I'm not 100% sure about this, so please correct me if I'm wrong.

Prior to 8f4a67c (see #664), when keys were removed from the declaration, they stayed on the target as long as /run was mounted (which at least meant that they were gone on the next reboot). Now, we allow people to create keys on persistent storage, which is a good thing for having persistent keys without putting them into the store, so basically I consider this a welcome enhancement. But it is also very easy to 'forget' keys on a machine now, as people are used to the declarative nature of nix and nixops, which would mean that the key is removed from the machine once you remove it from the declaration (and redeploy).

So basically, key declarations are no longer deterministic. I have a proposal, which should fix this:

Goals

  1. Allow creating keys os persistent storage, so they can be made persistent across reboots without using storeKeysOnMachine/persistKeysOnMachine, which would mean having them world-readable in the store (see Stop storing keys in the nix store for storeKeysOnMachine. #650).
  2. Key declarations should be actual declarations, i.e. when a key is removed from the declaration, it should also be removed from the target.

Idea

  • Remove destDir option
  • Add a deployment.keys.<K>.persist boolean property (default false)
    • For all keys where persist == true, create the keys on persistent storage, but outside the nix store (e.g. /var/lib/nixops-keys/)
    • For all keys where persist == false, create the keys on volatile storage ('/run/keys/`)
  • On every run of deploy, remove all persistent and volative keys, not just single ones, then create the declared keys

What are your thoughts on this?

@ip1981
Copy link
Contributor

ip1981 commented Jun 2, 2017

So far I remove keys in custom activation script:

{ config, lib, pkgs, ... }:
let

  inherit (builtins) attrNames;
  inherit (lib) concatMapStringsSep escape;
  inherit (config.deployment) keys;

  runkeys = "/run/keys";

in {
  config = {
    system.activationScripts.cleanKeys = ''
      ${pkgs.findutils}/bin/find ${runkeys} -type f \
        -not -name 'done' \
        ${concatMapStringsSep " " (k: "-not -name '${escape [ "[" ] k}'") (attrNames keys)} \
        -printf 'deleting %p\n' -delete
    '';
  };
}

And I don't think more options (persist) and logic are good ideas :)

@ip1981
Copy link
Contributor

ip1981 commented Jun 2, 2017

I was fiddling with NixOps patch as well:

diff --git a/nixops/backends/__init__.py b/nixops/backends/__init__.py
index ef5c108..26099d7 100644
--- a/nixops/backends/__init__.py
+++ b/nixops/backends/__init__.py
@@ -188,6 +188,13 @@ class MachineState(nixops.resources.ResourceState):
         self.warn("machine ‘{0}’ doesn't have a rescue"
                   " system.".format(self.name))

+    def delete_old_keys(self):
+        key_names = self.get_keys().keys()
+        self.run_command("find /run/keys -type f "
+                         + "-not -name done "
+                         + "".join(["-not -name '{0}' ".format(k) for k in key_names])
+                         + "-printf 'deleting %p\n' -delete"
+                        )
     def send_keys(self):
         if self.state == self.RESCUE:
             # Don't send keys when in RESCUE state, because we're most likely
@@ -213,6 +220,7 @@ class MachineState(nixops.resources.ResourceState):
                 chmod.format(opts['permissions'])
             ]))
             os.remove(tmp)
+        self.delete_old_keys()
         self.run_command("touch /run/keys/done")

     def get_keys(self):

@mbrgm
Copy link
Member Author

mbrgm commented Jun 2, 2017

@ip1981 As you mentioned, you'll have to create a custom activation script. The NixOps patch also only removes keys in /run/keys and doesn't take into account the ones put somewhere else by destDir, right?

I'm absolutely on your side when it's about introducing extra options or replacing current logic -- keep it as simple as possible. :) Nevertheless, at the moment, goal (2), which I mentioned above, is not satisfied at all -- although it is one of the main claims of Nix{,Ops,Os}.

My fear is that in the case of destDir and without having a custom activation script like you do, NixOps is (in respect to keys) behaving like any other 'pseudo-declarative' solution, say Saltstack, Chef etc. Why? As I wrote, when I remove something from the declaration/system definition, I expect it to be gone on the target as well. Especially for keys/secrets, this could become pretty important, i.e. I want to be able to tell which keys are still on the system and which ones are not. When I remove a module from the declaration, it's no longer active and is cleaned up when GC'ing, but for keys, they just stay there (possibly on persistent storage when put there via destDir).

@mbrgm
Copy link
Member Author

mbrgm commented Jun 2, 2017

@ip1981 Looking at your patch again, this is what I was thinking about for the second part of my proposal. The first part is basically not an additional option but to replace destDir with a boolean flag, which can be used to put keys in one known persistent location, not in arbitrary directories where we forget about them once they're removed. I think this is nothing new logic-wise -- just the same as for /run/keys keys, but in a persistent location, and your patch on top for both locations, so we can be sure old keys are removed.

@ip1981
Copy link
Contributor

ip1981 commented Jun 2, 2017

Well, the keys specifically are not to worry about. There are a lot of options you shoudn't or can't change after initial deploy, e. g. mysql data dir or innodb log file size.

The point of keys' destDir to me is to have another keys dir, not to change it once in a while. It will be trivial to amend activation script.

Though you are right that keys are not perfectly declarative. I'm fine with that, treating them as data, not as code. If you change mysql data dir, old data is left intact.

@mbrgm
Copy link
Member Author

mbrgm commented Jun 2, 2017

You're right about the mysql data dir -- I was thinking about the same thing when I tried to judge whether the current keys behavior is good or bad. For MySQL data, I would expect the 'data stays' behavior. For keys, to be honest, I would expect them to behave in a declarative way. I think both opinions are valid but also subjective. So I would appreciate other people's feedback on this point.

The point of keys' destDir to me is to have another keys dir, not to change it once in a while.

This leads me to the question why we don't narrow that functionality down to having one known other keys dir, which we can then work on, e.g. by removing keys when they're no longer declared.

To explain the motivation for sticking to my opinion... I'm open to changing my mind on this, once I understand

  • in which way an individual destDir per key is benificial to having a common persistent storage location, which we know about
  • why we should accept the keys mechanism working in a non-declarative fashion although there are ways to address this
  • why we should leave it to users to take care of removing keys when we could implement it as a feature of NixOps (IIRC you already proposed your script as a PR)

If you change mysql data dir, old data is left intact.

I think the reason for MySQL data being non-declarative is that it would be really complicated to implement. However, the main claim of Nix{,Ops,OS) is being 'functional', so I think when there is a solution, which would be relatively easy to implement, we should try to keep up to that claim instead of letting things slide just because this is already the case for other parts.

Maybe there are other good solutions for this problem. One thing I could think of is removing the key when stopping its related service. This would allow to keep the destDir option, keep the key itself out of the store and still remove the key when it's not declared anymore. But I think if we see the destDir option needs to be removed, it should be taken care of before it's released.

@coretemp
Copy link

Given the below description:

NixOps is a tool for deploying sets of NixOS Linux machines, either to real hardware or to virtual machines. It extends NixOS’s declarative approach to system configuration management to networks and adds provisioning.

I was unpleasantly surprised to learn about this particular behavior.

@domenkozar
Copy link
Member

Happy to accept a PR with a fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants