Merge pull request #176 from Remi-Gau/remi-dev
update doc general organization
Remi-Gau authored Oct 16, 2020
2 parents 5fb9350 + d25e047 commit 2e3576e
Showing 13 changed files with 186 additions and 133 deletions.
3 changes: 2 additions & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
@@ -10,4 +10,5 @@ branches:
before_script:
- npm install `cat npm-requirements.txt`
script:
- remark . --frail
- remark ./docs/ --frail
- remark ./inputs/ --frail
7 changes: 6 additions & 1 deletion docs/10-motivations.md
@@ -47,20 +47,25 @@ For example, a researcher would have to write or rewrite some aspects of the
methods used when

1. when preparing a pre-registration describing the planned study

1. during the data curation process that usually involves adding metadata
elements that relate to the details of the data acquisition,

1. when actually working on the methods and results sections, where we have to
go back and forth between the code we used to run the experiment and the
dataset to make sure important details are accurately reported,

1. when sharing raw or derived data which also usually involves adding a
minimum of methods-related metadata if the shared data is to be meaningfully
reusable.

Another source of inefficiency is the time lost trying to figure out:

- what the authors of a paper actually did

- when we would like to compare our results to theirs
- when reviewing papers
- when reviewing papers

- what we actually did 6 months ago but forgot to make a note of it.

So a potential side effect of using a checklist to systematically capture how
5 changes: 5 additions & 0 deletions docs/20-goals.md
@@ -20,14 +20,18 @@ We envision that using checklists to report methods and results can:
we want an app to help **document** pipelines to improve the reproducibility
of our work and to reduce inefficiencies and frictions when trying to build
on each other's work.

1. facilitate the creation and preparation of pre-registration and registered
reports by reminding of future analysis steps that we might otherwise
overlook or forget about: in other words we want an app to help us think
about and **create** pipelines before we start collecting data.

1. help make peer-review more objective: we want an app to help us **check**
pipelines.

1. facilitate systematic literature reviews and meta-analyses (use the app to
**read** pipelines)

1. facilitate data sharing (use the app to **standardize** the report of
information on a pipeline)

@@ -37,6 +41,7 @@ The implementation of this project should remain flexible enough to:

- accommodate the inclusion of new items in the checklist as new methods
mature (e.g. new multivariate analysis, high-resolution MRI...),

- easily fork the project and convert it to create a checklist-website for a
different field.

6 changes: 6 additions & 0 deletions docs/21-short-term.md
@@ -11,10 +11,13 @@ So far the common short goals of all the versions of the app (for MRI, PET...)
are:

- Create a set of tools and a proof of concept web-app that can:

- convert a set of spreadsheet of items into a schema that represents all

- from this schema generate a checklist to be clicked through by users,

- outputs a set of JSON-LD files once the user is done,

- generate a method section using these JSON-LD files and some boilerplate
template of a method section where the content of the JSON-LD files
could be reinjected.
@@ -25,6 +28,7 @@ For the spreadsheets that represent the recommendation guidelines, the initial
curation process must:

- identify high-priority items for each checklist,

- ensure that those high-priority items have been properly atomized (meaning
that each is made of only a single question) and curated (define an item name,
a question, the type of response expected and a possible list of response
@@ -60,8 +64,10 @@ The main short term goals for the MEEG version are:
both versions by extracting the common parts into standalone spreadsheets:
for example there could be one common spreadsheet for participant sample
description.

- Consolidate the other items of the spreadsheet, as it is still missing a lot
of information

- Identify high-priority items in the checklist (similar to Carp 2012 for
fMRI, e.g.
[Luck & Gaspelin 2015](https://doi.org/10.1111/psyp.12639))
13 changes: 7 additions & 6 deletions docs/22-mid-term.md
@@ -23,15 +23,16 @@ they used this technique in a previous section of the checklist.

### Improving the wording

The questions of the checklist must be as unambiguous as possible. This
should be improved through early user feedback.
The questions of the checklist must be as unambiguous as possible. This should
be improved through early user feedback.

## Extended checklists

Right now, several of the [prototypes](https://github.com/Remi-Gau/eCobidas/tree/master/README.md#prototypes) contain only a
subset of all the questions from the reports they came from. For example, the
MRI checklist only contains the items corresponding to the metadata of a
collection of results uploaded on [Neurovault](https://neurovault.org/).
Right now, several of the
[prototypes](https://github.com/Remi-Gau/eCobidas/tree/master/README.md#prototypes)
contain only a subset of all the questions from the reports they came from. For
example, the MRI checklist only contains the items corresponding to the metadata
of a collection of results uploaded on [Neurovault](https://neurovault.org/).

In the near future, we want to be able to extend those checklists so they
include **all** the items listed in their guidelines.
1 change: 1 addition & 0 deletions docs/23-long-term.md
@@ -50,6 +50,7 @@ standards for data and results like:

1. the brain imaging data structure ([BIDS](http://bids.neuroimaging.io/)) used
for that study,

1. the [NIDM results](http://nidm.nidash.org/specs/nidm-results_130.html) of
any mass-univariate analysis performed for this study.

194 changes: 97 additions & 97 deletions docs/30-general-organization.md
@@ -1,97 +1,117 @@
# General organization

<!-- TODO
- mention OSF
- mention Zotero
-->
The general workflow of this project is the following:

<!-- ```
.
├── activities
├── communication
├── docs
├── inputs
├── node_modules
├── protocols
├── response_options
├── schema
├── scripts
└── tests
``` -->


This repository hosts the work that will turn the report published by the
Committee on Best Practices in Data Analysis and Sharing (COBIDAS) of the
organization for human brain mapping (OHBM) into a practical tool for improving
methods and results reporting in (f)MRI, (i)EEG, MEG.
- turning the recommendation guidelines into spreadsheets

There are 3 repositories behind this checklist:
- turning the spreadsheets into a "schema" representation

1. this
[COBIDAS_chckls repository](https://github.com/Remi-Gau/eCobidas/tree/master)
where you are currently reading this. It contains:
- the [neurovault spreadsheet](https://github.com/Remi-Gau/eCobidas/tree/master/inputs/csv/cobidas_neurovault.csv)
- the python [script](https://github.com/Remi-Gau/eCobidas/tree/master/scripts/convert_csv_to_schema.py) to turn that
spreadsheet into a Repronim schema (basically a bunch hierarchically
organized json files that link to each other).
1. Reproschema
1. the ui
- using a "front-end" user-interface that will read those schema and serve a
web-app to the user.

the ui repository that does
the actual rendering of the checklist app by reading the schema hosted by the
previous repository.
To execute that work, this project is organized around several "repositories":

There is a general explanation of how the app works in this
issue
- the [eCOBIDAS repository](https://github.com/Remi-Gau/eCobidas) centralises
most of the information and workflow to convert the guidelines into a
checklist webapp,

- the [Reproschema user interface](https://github.com/ReproNim/reproschema-ui)
contains the "front-end" code of the user interface to render the checklist
webapp,

- the [ReproSchema](https://github.com/ReproNim/reproschema) repository
contains the formal "definition" of the terms used to describe the content
of the checklist as a schema,

ohbm/cobidas repo
- the repositories containing a) an instance of the user interface and b) a
schema to serve a specific checklist:

add info on reproschema template about what to modify
- the [one for the MRI version](https://github.com/ohbm/cobidas), based on
the Neurovault metadata "checklist", hosted on the OHBM GitHub
organization and serving this
[checklist](https://ohbm.github.io/cobidas/#/),

## Spreadsheet content and organization
- the one for the
[PET imaging version](https://github.com/Remi-Gau/cobidas-PET) that
serves this [checklist](https://remi-gau.github.io/cobidas-PET/#/),

See the dedicated [document](./spreadsheet-content.md)
- the
[google drive](https://drive.google.com/drive/folders/1wg5k-6pSB3mQm_a30abX6qb-lzTn_S-Y?usp=sharing)
where we work synchronously on the
[spreadsheets](https://drive.google.com/drive/folders/1ydwALHDzl21dcef3qhkju8JKKAT3Y72V?usp=sharing),

## How is the Reproschema organized
- an associated
[zotero library](https://www.zotero.org/groups/2349772/cobidas_checklist) to
keep track of references related to this project,

- a [project on the open-science framework](https://osf.io/anvqy/) that allows
us to "connect" all those elements together in one place.

## the eCOBIDAS repository

<!--
This repository hosts the workflow that will turn the reports published by the
Committee on Best Practices in Data Analysis and Sharing (COBIDAS) of the
Organization for Human Brain Mapping (OHBM) into checklists for improving
methods and results reporting in (f)MRI, (i)EEG, MEG.

By extension, this workflow can also be used on other types of guidelines (like
the ones for PET imaging and eyetracking).

TODO link to reproschema doc
```text
.
├── .github <-- continuous integration "scripts"
├── activities <-- schema of the different "sections" of the checklists with their items
├── communication <-- abstracts and presentations about the project
├── docs <-- content of the documentation
├── inputs <-- checklists spreadsheets as CSV files, boilerplate for method section generation
├── protocols <-- schema for the checklists putting several "sections" together
├── response_options <-- contains the pre-set list of response options to some checklist items
├── schema <-- obsolete: ignore this
├── scripts <-- python scripts to convert the CSV spreadsheets into schemas
└── tests <-- python script to test that the schema files are valid JSON-LD
```

### Spreadsheet content and organization

https://www.repronim.org/reproschema/
The first step of the workflow involves taking the recommendation guidelines and
converting that into a spreadsheet that contains all the items of the future
checklist.

https://www.repronim.org/reproschema/98_FAQ/
This step is by far the most labor-intensive and has its own dedicated page in
the [documentation](./40-spreadsheets.md).

https://github.com/ReproNim/reproschema/pull/399
### Converting the spreadsheet into a schema

https://github.com/Remi-Gau/reproschema/tree/remi-documentation/docs
Most of that is covered in the section on
[how the checklist is rendered](./50-how-to-render-the-checklist.md) and in the
README in the `scripts` folder.

-->
## How is the Reproschema organized

The first step to create the checklist involves taking a spreadsheet that
contains all the items and turning that into a representation that can
The first step of the workflow involves taking a spreadsheet that contains all
the items of the checklist and turning that into a representation that can
efficiently link the metadata about each item to the data entered by the user.
Basically it means turning your 'dumb' spreadsheet into an equivalent but
'smarter' representation of it: in this case a bunch hierarchically organized
json files that link to each other.
We are using the [ReproSchema](https://github.com/ReproNim/reproschema)
initiative from [ReproNim](http://www.repronim.org/) to do this. Basically, it
means turning your 'dumb' spreadsheet into an equivalent but 'smarter'
representation of it: a bunch of hierarchically organized json files that link
to each other.

In terms of choice of representation we are using the
[reproschema](https://github.com/ReproNim/reproschema) initiative from
[ReproNim](http://www.repronim.org/) to do this. On top of the inherent
On top of the inherent
[advantages](https://github.com/ReproNim/reproschema#30-advantages-of-current-representation)
of this schema representation:

- its use simplifies the rendering of the checklist by using the
[schema-ui](https://github.com/ReproNim/schema-ui) made for it,
[reproschema-ui](https://github.com/ReproNim/reproschema-ui) made for it,

- this representation allows specification of user interface option that can
simplify the user experience: it allows us to specify a `branching logic`
that will prevent users to be presented with items that are not relevant to
them (e.g answer PET related when they have only run an fMRI study).
simplify the user experience: it allows us to specify the conditions that
make certain items visible or not, and thus prevents users from being
presented with items that are not relevant to them (e.g. answering PET-related
questions when they have only run an fMRI study).
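
As an illustration of the visibility conditions mentioned above, here is a minimal Python sketch. It is not the ReproSchema format or the reproschema-ui code: the function `visible_items`, the item names and the way conditions are stored are all assumptions made for this example.

```python
# Hypothetical sketch: none of these names come from ReproSchema itself.


def visible_items(items, answers):
    """Return the names of the items whose visibility condition is met.

    `items` maps an item name to a condition: a function that takes the
    answers collected so far and returns True or False.
    """
    return [name for name, is_visible in items.items() if is_visible(answers)]


# Illustrative items: each modality-specific question is only shown
# when the user has declared the matching modality.
items = {
    "modality": lambda answers: True,
    "pet_tracer": lambda answers: answers.get("modality") == "PET",
    "mri_field_strength": lambda answers: answers.get("modality") == "fMRI",
}

# A user who only ran an fMRI study never sees the PET question.
fmri_only = visible_items(items, {"modality": "fMRI"})
print(fmri_only)
```

Keeping each condition next to the item it controls means a new item can be added to the checklist without touching the logic that decides what to display.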

The repronim schema is organized in a hierarchical manner with 3 levels.
The reproschema is organized in a hierarchical manner with several levels, the
main ones being:

1. The lowest level is the `item level` where there is one question for each
item with an expected format for the user interface: is this yes / no
@@ -100,41 +120,21 @@ The repronim schema is organized in a hierarchical manner with 3 levels.
1. The second level is the `activity level` that contains a set of items. In
the original repronim project this would constitute usually a questionnaire:
like all the items of the Edinburgh handedness inventory would constitute
one activity. In the COBIDAS case, it seems that we will most likely use
this level to define some 'big' section of a method section (e.g.
preprocessing, design, participants...)

1. The highest level is the `activity_set` or protocol level that originally
define a set of activities to be included in a given study. At the moment
this level is underused in the COBIDAS checklist but could be used to define
activity sets for different use case: fMRI, MEEG, pre-registration...

So far we have a [script](https://github.com/Remi-Gau/eCobidas/tree/master/scripts/create_ecobidas_schema.py) to turn the
neurovault [list of required inputs](https://github.com/Remi-Gau/eCobidas/tree/master/inputs/csv/cobidas_neurovault.csv) into a
schema that can then be render with the schema-ui.
one activity. In our case, we are using it to break a checklist into
sub-sections of a method section like preprocessing, design,
participants...

1. The highest level is the `protocol level` that contains a set of activities.
At the moment this level is under-used in our checklist but could be used to
define activity sets for different use cases: fMRI, MEEG, pre-registration...
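
The item, activity and protocol levels can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the project's conversion script nor the actual JSON-LD layout used by ReproSchema: the column names, the `build_protocol` function and the output keys are all hypothetical.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical spreadsheet: the column names are assumptions for this sketch.
SPREADSHEET = """activity,item,question,response_type
participants,sample_size,How many participants were included?,integer
participants,handedness,Was handedness assessed?,boolean
preprocessing,smoothing,Was spatial smoothing applied?,boolean
"""


def build_protocol(csv_text, protocol_name):
    """Group spreadsheet rows into a protocol -> activities -> items tree."""
    activities = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Item level: one question with its expected response type.
        item = {
            "name": row["item"],
            "question": row["question"],
            "response_type": row["response_type"],
        }
        # Activity level: items are grouped by section of the checklist.
        activities[row["activity"]].append(item)
    # Protocol level: a named set of activities.
    return {
        "protocol": protocol_name,
        "activities": [
            {"name": name, "items": items} for name, items in activities.items()
        ],
    }


protocol = build_protocol(SPREADSHEET, "fMRI")
print(json.dumps(protocol, indent=2))
```

In the real schema each level lives in its own JSON file that links to the others; the single nested dictionary here only shows how the three levels contain one another.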

## Implementation

<!-- TODO -->

The first step of the implementation involves taking a spreadsheet that contains
all the items of the checklist and turning that into a representation that can
efficiently link the metadata about each item to the data imputed by the user.
We are currently using the
[ReproSchema](https://github.com/ReproNim/reproschema) initiative from
[ReproNim](http://www.repronim.org/) to do this. Basically, it means turning
your 'dumb' spreadsheet into an equivalent but 'smarter' representation of it: a
bunch of hierarchically organized json files that link to each other.
If you want to know more about Reproschema, we suggest you have a look at the
documentation:

On top of the inherent
[advantages](https://github.com/ReproNim/reproschema#30-advantages-of-current-representation)
of this schema representation:

- its use simplifies the rendering of the checklist by using the
[schema-ui](https://github.com/ReproNim/schema-ui) made for it,
- [main documentation](https://www.repronim.org/reproschema/)
- [FAQ](https://www.repronim.org/reproschema/98_FAQ/)

- this representation allows specification of user interface options that can
simplify the user experience: it allows us to specify a branching logic that
will prevent users to be presented with items that are not relevant to them
(e.g. answer PET-related questions when they have only run an fMRI study).
We are also trying to extend the content of the documentation of the
reproschema. You can keep track of this in this
[pull request](https://github.com/ReproNim/reproschema/pull/399) and here on
[GitHub](https://github.com/Remi-Gau/reproschema/tree/remi-documentation/docs).