Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serializing with "$(printf %q "$VARIABLE")" #20

Open
stuartpb opened this issue Jan 6, 2015 · 18 comments
Open

Serializing with "$(printf %q "$VARIABLE")" #20

stuartpb opened this issue Jan 6, 2015 · 18 comments

Comments

@stuartpb
Copy link

stuartpb commented Jan 6, 2015

This is one of those holes that can open up if you're not being conscious of how you're outputting your strings. If it could have spaces, quotes, or dollar signs (and you should always treat strings as potentially having these things), and you're putting it somewhere where it will be treated as part of a string representing an argument list (to be deserialized using xargs -x, see #17), you should output it with printf %q rather than just naively echoing it (which should also be done with printf, see #18).

@progrium
Copy link
Owner

progrium commented Jan 6, 2015

Interesting. Where can I get the specifics of how this works?

@stuartpb
Copy link
Author

stuartpb commented Jan 6, 2015

Of printf %q? The bash manual just says "causes printf to output the corresponding argument in a format that can be reused as shell input": in my experience, this can be summed up as:

  • Prints '' for the null/empty string.
  • Prints the string normally if it has no spaces, using backslashes to escape shell-recognized characters like quotes or semicolons.
  • Prints the string in single-quotes ("shell quoting") if it has spaces, using '\'' to escape single-quotes.
  • Prints the string in a $'...' "ANSI C" style string, with backslashed-escaped command sequences, if it has special/non-printing characters like newlines.

For the exact specifics:

@progrium
Copy link
Owner

Cool. This sounds like a good tip that some people might want to wrap in a more explanatory function if they use it a fair amount in their script.

@progrium
Copy link
Owner

Actually, can you give some examples where it's useful to be serializing... including the xargs example, but specifically other cases.

@stuartpb
Copy link
Author

Saving variables in a generated script, or a list of environment variables like an app's .profile.d/app-env.sh in buildstep/herokuish (which reminds me of another tip/trick for the High Codex).

@progrium
Copy link
Owner

I think I understand what you mean, but some real examples to point me at would be helpful.

@stuartpb
Copy link
Author

I showed you yesterday how it's used in Pluchu to preserve argument quoting when reproducing the input arguments for ssh's command on the server: https://github.com/plushu/pluchu/blob/master/pluchu#L57

As you could probably guess, Plushu uses it when saving config variables: https://github.com/plushu/plushu-config/blob/master/subcommands/set#L65

Plushu's startup plugin also uses it to export the environment when creating its Upstart unit (but not its systemd unit, I think because systemd doesn't have the same quoting rules as the shell): https://github.com/plushu/plushu-startup-deploy-all/blob/master/install#L24-26

@progrium
Copy link
Owner

Yeah, I get the second two, but I was pretty sure it's unnecessary for the first case as long as it's quoted "$@"

@stuartpb
Copy link
Author

Unfortunately, it is, because ssh takes a string for your command, and approximates multiple argument handling through concatenation. When your command arguments go over the wire, if you provide multiple arguments, they are effectively sent as "$*" (ie. unquoted and concatenated with single spaces), then passed via $SHELL -c (where $SHELL is the user's login shell) to be interpreted as a single-string command line. That's why pluchu serializes "$@" (so sh -c will perform proper deserialization).

You can test this yourself:

ssh [email protected] "printf 'arg: %s\n'" foo "bar baz" 'beep bop' 'these "two' 'together" wrong'

will output:

arg: foo
arg: bar
arg: baz
arg: beep
arg: bop
arg: these
arg: two together
arg: wrong

because it will hit the server as:

bash -c "printf 'arg: %s\n' foo bar baz beep bop these \"two together\" wrong"

@progrium
Copy link
Owner

So that is a special case for SSH wonkiness.

@stuartpb
Copy link
Author

Or really any situation where you're going to pass a command on to a shell (such as docker run ubuntu sh -c).

@progrium
Copy link
Owner

Alright, I'll have to try this more to internalize it. I believe you're right that it is the right way to universally serialize command line arguments though.

@progrium
Copy link
Owner

So I tried replacing this line:
https://github.com/gliderlabs/herokuish/blob/master/include/procfile.bash#L21

With:

unprivileged /bin/bash -c "$(printf '%q' "$@")" 

And it was unable to run the commands. Why didn't it work?

@stuartpb
Copy link
Author

I only took a very brief look at its usage, but isn't the argument to that function already a single serialized command line string (per line 11)?

"$(printf '%q ' "$@")" (also, note the space after the %q needed to separate each argument) is to serialize a list (ie. either $@ or another list constructed using parentheses) to a string. If it's already a single string, it's already in the form you use for passing to sh -c.

@stuartpb
Copy link
Author

Also, I've never had to actually do this within a script (I've only ever passed command strings to xargs to launch something else), but I think you can convert a command line string to a list using:

IFS= read -rd '' -a listname < <(xargs printf '%s\0' <<<"$string")

@progrium
Copy link
Owner

Yes regarding line 11, though not always since that function can be called directly as a subcommand.

@stuartpb
Copy link
Author

So yeah, there's your answer, line 21 should really read:

unprivileged /bin/bash -c "$1"

and then its callers should be passing in lists that have been serialized eg. via printf '%q ' (at some point). When procfile-parse reads a line from the Procfile, it's already going to be quoted and escaped correctly (or it'll have a syntax error, which is the author's fault).

The JS analogy for sh -c would be a function that takes a JSON string. You can pass it "\"null\"" to pass the string "null", or pass it "null" to pass an actual null. If somebody passes you "null" as your value, you need to understand whether you need to serialize (JSON.stringify), deserialize (JSON.parse), or pass it on directly unchanged, or you're going to change the value.

@stuartpb
Copy link
Author

So line 87 in herokuish.bash is the one that should be changed, to this:

        /exec)      procfile-exec "$(printf '%q ' "$@")";;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants