From 549a513c166d43a8400e0c0a66ebe519c1aa48f5 Mon Sep 17 00:00:00 2001 From: "Matthew N. White" Date: Wed, 9 Oct 2024 16:35:45 -0400 Subject: [PATCH] Added descriptive content to README, moved details The technical details of using REMARK are now in HowToUseREMARK.md. Material added to the top of the README was taken from "Illustrated Guide to REMARK", which has no illustrations. --- HowToUseREMARK.md | 108 +++++++++++++++++++++++++++++++++++++++ README.md | 126 +++++++++------------------------------------- 2 files changed, 131 insertions(+), 103 deletions(-) create mode 100644 HowToUseREMARK.md diff --git a/HowToUseREMARK.md b/HowToUseREMARK.md new file mode 100644 index 0000000..9116d33 --- /dev/null +++ b/HowToUseREMARK.md @@ -0,0 +1,108 @@ +# How to Use REMARK + +This document contains technical instructions for how to add new projects to the REMARK +archive, and how to work with and maintain the project repository itself. + +## For Authors + +Each project lives in its own repository. To make a new REMARK, you can start with a skeleton +[REMARK-starter-example](https://github.com/econ-ark/REMARK-starter-example) and add to it, +or from an example of a complete project using the toolkit, [BufferStockTheory](https://github.com/econ-ark/BufferStockTheory), whose content (code, text, figures, etc) you can replace with +your own. + +REMARKs should adhere to the [REMARK Standard](https://github.com/econ-ark/REMARK/blob/master/STANDARD.md). + +## For Editors + +The REMARK catalog and Econ-ARK website configuration will be maintained by Editors. + +Editorial guidelines are [here](https://github.com/econ-ark/REMARK/blob/master/EDITORIAL.md). 
+ +## For Maintainers + +**Command Line Interface** `cli.py` + +`cli.py` is an automated tool that facilitates: +- cloning of REMARK repositories +- linting (detection of missing files from a given REMARK) +- building conda environments/docker images + - uses `conda`/`repo2docker` under the hood +- executing `reproduce.sh` scripts within the built environments. + +All artifacts generated by `cli.py` are stored in a newly created `_REMARK` folder. + +1. Once you clone a REMARK you'll be able to find its contents inside of `_REMARK/repos/…` +2. Once you build/execute a REMARK you'll be able to find a corresponding log +file from that process inside of `_REMARK/logs/…` + +`cli.py` has built-in parallelization specified by the `-J` flag for many actions. + +### Requirements + +- Python 3.9 or newer. +- contents of `requirements.txt` + +### Action + +**Clone/Pull** + +Pulling REMARKs (these are populated in the `_REMARK` folder) + +```bash +python cli.py pull --all # git clone all REMARKs +python cli.py pull {remark_name} # git clone one or more REMARK(s) +``` + +**Lint** + +Shows what files are missing from given REMARK(s). The linter uses the +file-tree printout from STANDARD.md and compares it to the files found +in the currently cloned REMARK(s). + +```bash +python cli.py lint --all # detect missing files from all REMARKs +python cli.py lint {remark_name} # detect missing files from one or more REMARK(s) +``` + +**Build** + +Building conda environments and/or docker images.
+ +```bash +python cli.py build conda --all # build conda environments for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo) +python cli.py build docker --all # build docker images for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo) +python cli.py build conda {remark_name} # build conda environments for one or more REMARK(s) +python cli.py build docker {remark_name} # build docker image(s) for one or more REMARK(s) +``` + +The primary difference between `conda` and `docker` for builds is that `docker` will be more flexible for multi-language REMARKs. It leverages +repo2docker (the same tool that mybinder uses) to create docker images from repositories. + +**Execute** + +Automated execution within built conda environments/docker containers. + +```bash +python cli.py execute conda --all # execute reproduce.sh via conda for all REMARKs +python cli.py execute docker --all # execute reproduce.sh via docker for all REMARKs +python cli.py execute conda {remark_name} # execute reproduce.sh via conda for one or more REMARK(s) +python cli.py execute docker {remark_name} # execute reproduce.sh via docker for one or more REMARK(s) +``` + +*Both the build and execute subcommands have an optional --jobs argument to +specify the number of jobs to run in parallel when building/executing.* + +**Logs/Summarize** + +```bash +python cli.py logs # view most recent logs for all previous building/executing commands +``` + +**Clean/Remove** + +```bash +python cli.py clean conda --all # remove all built conda environments +python cli.py clean docker --all # remove all built docker images +python cli.py clean conda {remark_name} # remove conda environment(s) from specified REMARK(s) +python cli.py clean docker {remark_name} # remove docker images built from specified REMARK(s) +``` diff --git a/README.md b/README.md index 08cf40b..962e9eb 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,28 @@ # R[eplications/eproductions] and Explorations Made
using ARK (REMARK) +The Economics profession has fallen behind other fields in the adoption of modern software development and reproducibility practices. +Economists generally lack robust software engineering training and resort to writing new software from scratch. +This practice of 'reinventing the wheel' increases the likelihood of errors and undermines the reproducibility of computational results. + +Poor software development practices also slow down the advancement of Economics as a science. +When researchers cannot reproduce the results of a paper, they are unable to build on the work of others, which leads to wasted time and effort and slows down progress. +Additionally, when researchers cannot reproduce the results of a paper, they are unable to verify the validity of said results. +This leads to a lack of trust in the research, which can have a negative impact on the influence of the field. + +Other disciplines have adapted to modern software development methods, and it is time for Economics to catch up. + +To address the problem discussed above, Econ-ARK has been working on developing a set of standards and tools for reproducibility in our own work in Economics. Our reproducibility initiative is called REMARK, which stands for "R[eplications/eproductions] and Explorations Made using ARK". The term REMARK is used to represent the standard, as well as any project that follows the standard. The objective of REMARKs is to be self-contained and complete projects, whose contents should be executable by anyone on any computer that meets a minimal set of requirements and software. + +The REMARK standard is focused on 3 key principles: + +- **Reproduction**: The ability to reproduce the results of a project using the same data and code on a different computer. +- **Archiving**: Storing the project in a way that it can be accessed and used in the future. +- **Publishing**: Making the project available to the public and incentivizing the sharing of code and data. 
+ +Detailed technical instructions for using or contributing to REMARK can be found in HowToUseREMARK.md in this directory. + +## Description + REMARKs are self-contained and complete projects, whose content here should be executable by anyone with a suitably configured computer or using [nbreproduce](https://econ-ark.github.io/nbreproduce/). Types of content include (see below for elaboration): @@ -11,21 +34,6 @@ Types of content include (see below for elaboration): 1. Reproductions * Code that reproduces ALL of the results of some paper that was originally written using the toolkit -## For Authors - -Each project lives in its own repository. To make a new REMARK, you can start with a skeleton -[REMARK-starter-example](https://github.com/econ-ark/REMARK-starter-example) and add to it, -or from an example of a complete project using the toolkit, [BufferStockTheory](https://github.com/econ-ark/BufferStockTheory), whose content (code, text, figures, etc) you can replace with -your own. - -REMARKs should adhere to the [REMARK Standard](https://github.com/econ-ark/REMARK/blob/master/STANDARD.md). - -## For Editors - -The REMARK catalog and Econ-ARK website configuration will be maintained by Editors. - -Editorial guidelines are [here](https://github.com/econ-ark/REMARK/blob/master/EDITORIAL.md). - ## REMARK Catalog A catalog of all REMARKs is available under the `REMARK` tab at [econ-ark.org](https://econ-ark.org/materials). @@ -59,91 +67,3 @@ Jupyter notebook(s) should: The key difference with the contents of the [DemARK](https://github.com/econ-ark/DemARK) repo is that REMARKs are allowed to rely on the existence of local files and subdirectories (figures; data) at a predictable filepath relative to the location of the root.
-## For Maintainers - -**Command Line Interface** `cli.py` - -`cli.py` is an automated tool that facilitates: -- cloning of REMARK repositories -- linting (detection of missing files from a given REMARK) -- building conda environments/docker images - - uses `conda`/`repo2docker` under the hood -- executing `reproduce.sh` scripts within the built environments. - -All artifacts generated by `cli.py` are stored in a newly created `_REMARK` folder. - -1. Once you clone a REMARK you'll be able to find its contents inside of `_REMARK/repos/…` -2. Once you build/execute a REMARK you'll be able to find a corresponding log -file from that process inside of `_REMARK/logs/…` - -`cli.py` has built-in parallelization specified by the `-J` flag for many actions. - -### Requirements - -- python 3.9 or newer. -- contents `requirements.txt` - -### Action - -**Clone/Pull** - -pulling REMARKs (these are populated in the `_REMARKS` folder) - -```bash -python cli.py pull --all # git clone all REMARKS -python cli.py pull {remark_name} # git clone one or more REMARK(s) -``` - -**Lint** - -Shows what files are missing from given REMARK(s). The linter uses the -file-tree print out from STANDARD.md and compares it to the current files found -in the currently cloned REMARK(s). - -```bash -python cli.py lint --all # detect missing files from all REMARKs -python cli.py lint {remark_name} # detect missing files from one or more REMARK(s) -``` - -**Build** - -Building conda environments and/or docker images. 
- -```bash -python cli.py build conda --all # build conda environments for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo) -python cli.py build docker --all # build docker images for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo) -python cli.py build conda {remark_name} # build conda environments for one or more REMARK(s) -python cli.py build docker {remark_name} # build docker image(s) for one or more REMARK(s) -``` - -The primary difference between `conda` and `docker` for builds are that `docker` will be more flexible for multilanguage REMARKs. It leverages -repo2docker (same tool that mybinder uses) to create docker images from repositories. - -**Execute** - -Automated execution within built conda environments/docker containers. - -```bash -python cli.py execute conda --all # execute reproduce.sh via conda for all REMARKs -python cli.py execute docker --all # execute reproduce.sh via docker for all REMARKs -python cli.py execute conda {remark_name} # execute reproduce.sh via conda for one or more REMARK(s) -python cli.py execute docker {remark_name} # execute reproduce.sh via docker for one or more REMARK(s) -``` - -*Both the build and execute subcommands have an optional --jobs argument to -specify the number of jobs to run in parallel when building/executing.* - -**Logs/Summarize** - -```bash -python cli.py logs # view most recent logs for all previous building/executing commands -``` - -**Clean/Remove** - -```bash -python cli.py clean conda --all # remove all built conda environments -python cli.py clean docker --all # remove all build docker images -python cli.py clean conda {remark_name} # remove conda environment(s) from specified REMARK(s) -python cli.py clean docker {remark_name} # remove docker images built from specified REMARK(s) -```