From 301009a647755bad1a11582f809958230efaf2bf Mon Sep 17 00:00:00 2001
From: Remi Gau
Date: Thu, 11 Apr 2024 14:58:59 +0200
Subject: [PATCH 1/5] reset

---
 src/common-principles.md | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/common-principles.md b/src/common-principles.md
index 461842ddfc..a8dc5756aa 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -309,6 +309,11 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 }
 ```
 
+!!! warning "non-anonymised data in `sourcedata`"
+
+    Source files requiring anonymization MAY be stored in this subfolder.
+    However, be aware that this could make it harder to share the dataset.
+
 ### Storage of derived datasets
 
 Derivatives can be stored/distributed in two ways:

From a3b88426736557013df82895dcdc93902102a8c8 Mon Sep 17 00:00:00 2001
From: Remi Gau
Date: Fri, 12 Apr 2024 10:37:44 +0200
Subject: [PATCH 2/5] Apply suggestions from code review

Co-authored-by: Oscar Esteban
---
 src/common-principles.md | 24 ++++++++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index a8dc5756aa..4f69068d30 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -309,10 +309,26 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 }
 ```
 
-!!! warning "non-anonymised data in `sourcedata`"
-
-    Source files requiring anonymization MAY be stored in this subfolder.
-    However, be aware that this could make it harder to share the dataset.
+!!! danger "Caution"
+
+    Sharing source data may help amend errors and missing data discovered
+    only with the reuse of the raw dataset in practice.
+    Therefore, from an Open Science perspective, it is RECOMMENDED to share
+    the source data whenever it is possible.
+    
+    However, more stringent sharing limitations may apply to
+    the source data than those applicable to the raw data.
+    For example, human data almost always requires anonymization before they
+    can be redistributed, or the subjects' consent form did not explicitly
+    stated that the source files would be shared after anonymization.
+    Further examples in which sharing source data may not be possible
+    include original data formats that are not redistributable as per the
+    acquisition device's license.
+    
+    As for raw data, all regulatory, ethical, and legal aspects SHOULD
+    be carefully considered before sharing data through the `sourcedata/` folder
+    mechanism.
+    In the case of source data, these aspects are likely more stringent.
 
 ### Storage of derived datasets
 

From 97fee7f8c773f0a13147e9690ddc88b158297236 Mon Sep 17 00:00:00 2001
From: "pre-commit-ci[bot]" <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Date: Fri, 12 Apr 2024 08:38:29 +0000
Subject: [PATCH 3/5] [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
---
 src/common-principles.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index 4f69068d30..735ea2d7c7 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -315,7 +315,7 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
     only with the reuse of the raw dataset in practice.
     Therefore, from an Open Science perspective, it is RECOMMENDED to share
     the source data whenever it is possible.
-    
+
     However, more stringent sharing limitations may apply to
     the source data than those applicable to the raw data.
     For example, human data almost always requires anonymization before they
@@ -324,7 +324,7 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
     Further examples in which sharing source data may not be possible
     include original data formats that are not redistributable as per the
     acquisition device's license.
-    
+
     As for raw data, all regulatory, ethical, and legal aspects SHOULD
     be carefully considered before sharing data through the `sourcedata/` folder
     mechanism.

From a211b58521c4e28aaa8b5acc4a17391f4e313ab6 Mon Sep 17 00:00:00 2001
From: Remi Gau
Date: Mon, 15 Apr 2024 11:51:55 +0200
Subject: [PATCH 4/5] semantic line break

---
 src/common-principles.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index 735ea2d7c7..6f3fddcdbc 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -316,18 +316,19 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
     Therefore, from an Open Science perspective, it is RECOMMENDED to share
     the source data whenever it is possible.
 
-    However, more stringent sharing limitations may apply to
-    the source data than those applicable to the raw data.
-    For example, human data almost always requires anonymization before they
-    can be redistributed, or the subjects' consent form did not explicitly
-    stated that the source files would be shared after anonymization.
+    However, more stringent sharing limitations may apply to the source data
+    than those applicable to the raw data.
+    For example, human data almost always requires anonymization
+    before they can be redistributed,
+    or the subjects' consent form did not explicitly state that the source files
+    would be shared after anonymization.
     Further examples in which sharing source data may not be possible
-    include original data formats that are not redistributable as per the
-    acquisition device's license.
+    include original data formats that are not redistributable
+    as per the acquisition device's license.
 
     As for raw data, all regulatory, ethical, and legal aspects SHOULD
-    be carefully considered before sharing data through the `sourcedata/` folder
-    mechanism.
+    be carefully considered before sharing data
+    through the `sourcedata/` directory mechanism.
     In the case of source data, these aspects are likely more stringent.
 
 ### Storage of derived datasets

From f9fd169af98b1232d28db835fae4f443cc5a8e23 Mon Sep 17 00:00:00 2001
From: Chris Markiewicz
Date: Mon, 22 Apr 2024 09:49:45 -0400
Subject: [PATCH 5/5] Update src/common-principles.md

---
 src/common-principles.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/common-principles.md b/src/common-principles.md
index 6f3fddcdbc..a2bf5fed15 100644
--- a/src/common-principles.md
+++ b/src/common-principles.md
@@ -318,10 +318,10 @@ field in `dataset_description.json` of each subdirectory of `derivatives` to:
 
     However, more stringent sharing limitations may apply to the source data
     than those applicable to the raw data.
-    For example, human data almost always requires anonymization
+    For example, human data almost always requires deidentification
     before they can be redistributed,
     or the subjects' consent form did not explicitly state that the source files
-    would be shared after anonymization.
+    would be shared after deidentification.
     Further examples in which sharing source data may not be possible
     include original data formats that are not redistributable
     as per the acquisition device's license.