Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"/": upload/download files vs. directories #9

Open
1fish2 opened this issue Aug 20, 2019 · 0 comments
Open

"/": upload/download files vs. directories #9

1fish2 opened this issue Aug 20, 2019 · 0 comments
Labels
bug Something isn't working

Comments

@1fish2
Copy link
Collaborator

1fish2 commented Aug 20, 2019

Currently it's possible to trip up by downloading a GCS "foo/" entry as a file "foo", which then breaks attempts by Sisyphus or by the worker task itself to create files or subdirectories in that "directory." Then one can spend a lot of time debugging.

The Step and the Command together specify each input and output path:

  1. Step foo, Command foo: upload/download a file
  2. Step foo/, Command foo: illegal combination
  3. Step foo, Command foo/: upload/download an entire directory tree
  4. Step foo/, Command foo/: illegal combination

The Workflow builder always constructs the Step path without a trailing /, so those workflows are fine.

In Sisyphus

  • cloud/download-blob! is overly-forgiving about whether the remote or local path ends with /.
  • cloud/download-tree! assumes it's given a key that doesn't end with / which names a directory entry that does end with /. If the key does end with / it will find the same directory entry (and perhaps more) and then throw an IndexOutOfBoundsException (IIRC) trying to strip off the key + '/' "preamble".
  • I think the calling code distinguishes which one of these to call by whether the local path ends with /.

I don't know how to reproduce today's nasty case with the Sisyphus code:

sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata")
ERROR: file unavailable to download sisyphus:data/jerry/20190817.114027/metadata
nil
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata")
download data/jerry/20190817.114027/metadata/ data/jerry/20190817.114027/metadata/ to /tmp/metadata
true ; downloaded just the metadata/ directory, as a directory
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata/")
ERROR: file unavailable to download sisyphus:data/jerry/20190817.114027/metadata
nil
sisyphus.core=> (cloud/download! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata/")
download data/jerry/20190817.114027/metadata/ data/jerry/20190817.114027/metadata/ to /tmp/metadata/
true ; downloaded just the metadata/ directory, as a directory

sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata")
nil ; downloaded the tree
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata")

Execution error (StringIndexOutOfBoundsException) at java.lang.String/substring (String.java:1931).
String index out of range: -1
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata" "/tmp/metadata/")
nil ; downloaded the tree
sisyphus.core=> (cloud/download-tree! storage "sisyphus" "data/jerry/20190817.114027/metadata/" "/tmp/metadata/")

Execution error (StringIndexOutOfBoundsException) at java.lang.String/substring (String.java:1931).
String index out of range: -1

Proposals:

  • Consider simplifying the above rule so the local and remote path endings must match, that is, either both end with / or neither ends with /. That's easier to remember and makes Steps clear about which paths are directories, just like Commands are. Manage the transition.
  • Both Gaia and Sisyphus should reject illegal combinations as early as they can, giving a helpful error message so people can quickly fix the problem.
@1fish2 1fish2 added the bug Something isn't working label Aug 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant