Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fwd slash #1089

Open
wants to merge 4 commits into
base: generic-storage
Choose a base branch
from
Open

Fwd slash #1089

wants to merge 4 commits into from

Conversation

hkethi002
Copy link
Contributor

@hkethi002 hkethi002 commented Feb 28, 2018

Fixes #1071

Changes

  • Any file uploaded via the engine will have the ':' converted to '/' which on download will translate to folders
    • Example: if the engine uploads MyFolder:MyFile.exe, it's name will be MyFolder/MyFile.exe in Flywheel. On download, MyFile.exe will be a file inside the folder MyFolder.
  • Engine upload will match on the basename of the filename, i.e. when uploading MyFolder/MyFile.exe, the filename should be MyFolder/MyFile.exe in the metadata, and the file name in Flywheel will be MyFolder/MyFile.exe
  • When downloading a single file using the three dot menu, the downloaded file will only have its basename.

Note: Only applies to engine uploads

Breaking Changes

  • file info endpoint changed to .../files/info/{FileName}

Review Checklist

  • Tests were added to cover all code changes
  • Documentation was added / updated
  • Code and tests follow standards in CONTRIBUTING.md

@kofalt
Copy link
Contributor

kofalt commented Feb 28, 2018

Can you explain your thoughts around the choice of escape character?

@codecov-io
Copy link

codecov-io commented Feb 28, 2018

Codecov Report

Merging #1089 into generic-storage will increase coverage by <.01%.
The diff coverage is 100%.

@@                 Coverage Diff                 @@
##           generic-storage    #1089      +/-   ##
===================================================
+ Coverage            91.05%   91.05%   +<.01%     
===================================================
  Files                   49       49              
  Lines                 7067     7068       +1     
===================================================
+ Hits                  6435     6436       +1     
  Misses                 632      632

@hkethi002
Copy link
Contributor Author

I'm not attached to it at all, I used because when I rename a file using the Finder on my mac, it uses a ':', but just shows it as a '/' in the Finder.

@kofalt
Copy link
Contributor

kofalt commented Feb 28, 2018

Is there a need for that artificial separator? If we're supporting folders on containers now, the engine could send you the whole path.

I'd be worried about unintentionally triggering this. If nothing else we'd need to document it in the gear spec, but gears by design don't have this external concept so it'd be a little awkward.

@hkethi002
Copy link
Contributor Author

hkethi002 commented Feb 28, 2018

@kofalt Would it require significant changes to the engine to send the desired path in the metadata?

@kofalt
Copy link
Contributor

kofalt commented Feb 28, 2018

I don't think so, but we'd want to tell the engine to turn that behavior on. Maybe we could add a parameter to the job ticket response that indicates support for the feature.

@ehlertjd
Copy link
Contributor

ehlertjd commented Mar 5, 2018

We will have to update the resolver (on the client) to handle slashes in filenames. My suggestion would be to allow backslash escaped filenames, which means we would also have to allow escaped backslashes.

@ehlertjd ehlertjd added the SDK label Mar 5, 2018
@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

I was about to agree, but that will get really awkward if there are folders on any non-leaf container.
What would you say to the following:

  1. If there are no non-folder paths to walk, walk a folder path.
  2. If there are non-folder paths (another container name, or anything), walk that.
  3. In the second case, escaping the slash will walk the folder path.

Given an ambiguous non-leaf container A, with children containers & path-embedded files:

├── A
│   │
│   ├── B
│   │   ├── one.txt
│   │
│   ├── C
│   │   ├── two.txt
│   │
│   ├── three.txt
│   ├── B/four.txt
│   ├── D/five.txt

The following resolves could occur:

Input Array Succeeds
A ['A'] Y
A/three.txt ['A', 'three.txt'] Y
A/B/one.txt ['A', 'B', one.txt'] Y
A/B/four.txt ['A', 'B', 'four.txt'] N
A/B\/four.txt ['A', 'B\/four.txt'] Y
A/D/five.txt ['A', 'D', 'five.txt'] Y

This includes static non-folder paths. For example the string collections when that is added.
Thoughts?

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

We'll have enough information to know if a node is ambiguous when displaying it, so we can even output the escaped versions preemptively so copy & paste works.

@ehlertjd
Copy link
Contributor

ehlertjd commented Mar 5, 2018

@kofalt I think that works. Instead of backslashes, we could also just allow segments to be quoted when referring to a file, e.g. A/"B/four.txt". But I agree that we should do the right thing when we can, without requiring any escaping.

I think in any scenario, we would unescape path segments by the time they reach the server.

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

Yeah, we could do either or both. Maybe instead of sending (from my example) ['A', 'B\/four.txt'] to the API, we just send ['A', 'B/four.txt'], because splitting is the client's problem.

In command-line usage, stuff will probably be quoted because spaces. So it might be weird to say "group/weird name/a/'B/four.txt'". I would vote \/, less neat but easier to read. And keep that convention when interacting API <--> other tools.

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

This is also weird for exports. Up till now scitran maps to a tree. I really am not a big fan of breaking that.

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

What if, for tree-mapping purposes, files are always under files/, and containers cannot be named / labeled files? PTAL @nagem

@nagem
Copy link
Contributor

nagem commented Mar 5, 2018

What if, for tree-mapping purposes, files are always under files/, and containers cannot be named / labeled files? PTAL @nagem

It is the intention with the new download endpoint work that certain container names will be off limits, including files, analyses, projects, subjects, sessions, acquisitions. We could move some of that work up to support this change. Here is the a rough version of the hierarchy for the metadata download:

proj_label
├── proj_label.flywheel.json
├── ANALYSES
│   └── ana_label
│       ├── ana_label.flywheel.json
│       ├── INPUT
│       └── OUTPUT
├── FILES
│   ├── filename.ext
│   └── filename.ext.flywheel.io
└── SUBJECTS
    └── subj_label
        ├── subj_label.flywheel.json
        ├── ANALYSES
        ├── FILES
        └── SESSIONS
            └── sess_label
                ├── sess_label.flywheel.json
                ├── ANALYSES
                ├── FILES
                └── ACQUISITIONS
                    └── acq_label
                        ├── acq_label.flywheel.json
                        └── FILES

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

Okay, the only difference I can see is that yours is a forcibly-unambiguous tree, which you'd need for a download generator. Does anyone object to that differing from the resolver tree?

@nagem
Copy link
Contributor

nagem commented Mar 5, 2018

I don't think they need to be exactly the same, but they could use similar features like explicitly calling out sub documents that deviate from the normal hierarchy like files and analyses.

@kofalt
Copy link
Contributor

kofalt commented Mar 5, 2018

Perfect. Let's go with that.

@hkethi002
Copy link
Contributor Author

hkethi002 commented Mar 7, 2018

So I'm not sure if I understand this correctly, but with this case

 ___A
│   ├── B/four.txt

If the name of the file is 'B/four.txt' in the file form, the metadata would have a list ['A', 'B/four.txt'] as the filename?

Copy link
Contributor

@kofalt kofalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I am 👎 on fake-folders in filenames. It's an interesting idea, but we seem to be slowly re-implementing a half-backed file system / VCS (#878, #1082), and the idea of mapping the entire set of files back together to figure out the folders is pretty strange. I vote back to the drawing board on this one.

@gsfr gsfr removed the request for review from nagem May 11, 2018 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants