From 01025da63297468a3801f8fe43f97a140f2f9551 Mon Sep 17 00:00:00 2001
From: Remi Gau
Date: Mon, 22 Apr 2024 16:51:31 +0200
Subject: [PATCH] [ENH] Add warning about deidentification when sharing sourcedata (#1769)

* reset

* Apply suggestions from code review

Co-authored-by: Oscar Esteban

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* semantic line break

* Update src/common-principles.md

---------

Co-authored-by: Oscar Esteban
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Chris Markiewicz
---
 src/common-principles.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/src/common-principles.md b/src/common-principles.md
index ac05174e59..9b10ba19f2 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -309,6 +309,28 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 }
 ```
 
+!!! danger "Caution"
+
+    Sharing source data may help amend errors and missing data discovered
+    only with the reuse of the raw dataset in practice.
+    Therefore, from an Open Science perspective, it is RECOMMENDED to share
+    the source data whenever it is possible.
+
+    However, more stringent sharing limitations may apply to the source data
+    than those applicable to the raw data.
+    For example, human data almost always requires deidentification
+    before they can be redistributed,
+    or the subjects' consent form did not explicitly state that the source files
+    would be shared after deidentification.
+    Further examples in which sharing source data may not be possible
+    include original data formats that are not redistributable
+    as per the acquisition device's license.
+
+    As for raw data, all regulatory, ethical, and legal aspects SHOULD
+    be carefully considered before sharing data
+    through the `sourcedata/` directory mechanism.
+    In the case of source data, these aspects are likely more stringent.
+
 ### Storage of derived datasets
 
 Derivatives can be stored/distributed in two ways: