This repository has been archived by the owner on Mar 22, 2021. It is now read-only.
The proposed directory structure stores the same data in multiple locations, because each dataset is duplicated into one or more input folders. Though the input folders add clarity to the workflow, this approach could use up a lot of disk space very quickly (unless symbolic links are used, though I doubt that those would work across different platforms).
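To make the symbolic-link concern concrete, here is a minimal Python sketch of one possible workaround: try to create a symlink in the input folder and fall back to a plain copy where linking is not permitted (for example, on Windows without the required privilege). The function name `link_or_copy` and the folder layout are hypothetical, not part of the proposed structure, and the sketch handles single files only.

```python
import os
import shutil
from pathlib import Path


def link_or_copy(source: Path, target: Path) -> str:
    """Place `source` at `target` as a symlink, or as a copy if linking fails.

    Hypothetical helper: returns "symlink" or "copy" so the caller can
    report how much data was actually duplicated. Files only.
    """
    target.parent.mkdir(parents=True, exist_ok=True)

    # Remove a stale file or link from a previous run.
    if target.is_symlink() or target.exists():
        target.unlink()

    try:
        # An absolute link target keeps the link valid regardless of the
        # working directory, but it breaks if the project folder is moved.
        os.symlink(source.resolve(), target)
        return "symlink"
    except (OSError, NotImplementedError):
        # Linking is unavailable or not permitted: duplicate the file.
        shutil.copy2(source, target)
        return "copy"
```

Note the trade-off with the portability argument in this thread: a linked input folder saves disk space, but it cannot be zipped and handed to someone else as easily as a folder of real copies.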
The key is to keep different pipeline stages portable, i.e., you can work on the analysis once I have prepped the dataset. I know that for the main guy on this project, this does mean a lot of duplicate files. I am more or less fine with this because disk space is cheap. If you can find a solution, let me know.
Another issue: for this minimal example, we could host a zip with the raw data on TilburgScienceHub, since we strictly need to avoid teaching that you can store your data on GitHub. Does that make sense? Downloading the data via an R script is platform-independent...
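As an illustration of the download step, here is a minimal sketch of a platform-independent fetch-and-unzip script. The thread suggests an R script; this version uses Python purely to show the idea, and the function name `fetch_raw_data`, the archive name `raw_data.zip`, and any URL passed to it are assumptions rather than actual project details.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path


def fetch_raw_data(url: str, dest_dir: Path) -> list[str]:
    """Download a zip archive from `url` and unpack it into `dest_dir`.

    Hypothetical helper: skips the download when the archive is already
    present, and returns the names of the files inside the archive.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    archive = dest_dir / "raw_data.zip"  # assumed local file name

    if not archive.exists():
        # Stream the response to disk instead of holding it in memory.
        with urllib.request.urlopen(url) as response, open(archive, "wb") as out:
            shutil.copyfileobj(response, out)

    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_dir)
        return zf.namelist()
```

Because the script relies only on the standard library and `pathlib` paths, the same call should behave the same way on Windows, macOS, and Linux, which is the property the thread is after.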
Portability definitely is a good point. I'll try to implement this in the example some time soon, but one thing is a bit unclear to me from the site (though I might just have missed this part). What is the best approach to keep the input folders up-to-date with upstream changes? Should this be done by upstream, or downstream?
Good idea regarding the raw data. I'll try adding the zip to the page through a PR, and will update the example.