Hi everyone, following YODA principles, I regularly run into the following "issue": To keep datasets modular, I usually add them as subdatasets in an `inputs` directory while they also exist at the project-directory level. Here is an example: I have a BIDS DataLad dataset (`bids`) that I add as a subdataset to my `fmriprep` DataLad dataset:
Now when I run fMRIPrep, I give it `./fmriprep/inputs/bids` as the input path. But this requires running `datalad get` to actually retrieve the files of the BIDS dataset into that location. To speed this up, I usually configure a local DataLad sibling for `./fmriprep/inputs/bids` like this: `datalad siblings add -s local --url ../../../bids`. Then `datalad get` can retrieve the data from `local`. But then I have the full BIDS dataset in two locations, which takes up additional disk space. Of course, I could `datalad drop` the files again, but (and here comes the idea) maybe there is a way to adjust the path such that the data does not have to be retrieved and copied again, while still staying in line with YODA principles.
I am not even sure whether this is something that can or should be handled on the DataLad side, but maybe you know other nice workarounds for this? Thanks!
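For reference, the workflow described above can be sketched as a short shell script. This is only an illustration under assumptions: it assumes the script is started from the project directory that contains both `bids` and `fmriprep`, and the `run` helper and `DRY_RUN` switch are hypothetical additions that only print each command unless `DRY_RUN` is set to `0`.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints each command and only executes it
# when DRY_RUN is set to 0.
DRY_RUN=1
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Work inside the subdataset so the relative URL resolves as in the post.
run cd fmriprep/inputs/bids
# Register the project-level BIDS dataset as a sibling named "local".
run datalad siblings add -s local --url ../../../bids
# Retrieve file content from that sibling (this creates the second copy).
run datalad get -s local .
# ... run fMRIPrep with ./fmriprep/inputs/bids as the input path ...
# Free the duplicated content again once it is no longer needed.
run datalad drop .
```

The `get` and `drop` steps are exactly the extra copy and cleanup that the question hopes to avoid.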
Isn't `--reckless=ephemeral` mode exactly what you need? There, `.git/annex/objects` would be shared from the original repository, so you would not need to actually "get" any load. Related issues worth reviewing/chiming in
What struck me afterwards is that `--reckless=ephemeral` wouldn't work with datalad-containers without ad-hoc adjustments to bind-mount some higher-level folder that `.git/annex` would be symlinked from, so it wouldn't be a complete solution as is.
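A minimal sketch of the `--reckless=ephemeral` suggestion, assuming the layout from the question (`bids` and `fmriprep` side by side in the project directory) and assuming `inputs/bids` has not been cloned yet. The `run` helper and `DRY_RUN` switch are hypothetical and only print the commands unless `DRY_RUN` is set to `0`.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints each command and only executes it
# when DRY_RUN is set to 0.
DRY_RUN=1
run() {
    printf '+ %s\n' "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Start from the project directory and enter the fmriprep dataset.
run cd fmriprep
# Clone the local BIDS dataset as a subdataset whose annex is a symlink
# to the original dataset's annex, so no file content is copied.
run datalad clone -d . --reckless ephemeral ../bids inputs/bids
# Inspect the annex location; it should point into the original dataset.
run ls -l inputs/bids/.git/annex
```

Because the annex is a symlink to a location outside the clone, a container run would also need that target location bind-mounted, which is the caveat raised above.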