Added descriptive content to README, moved details
The technical details of using REMARK are now in HowToUseREMARK.md. Material added to the top of the README was taken from "Illustrated Guide to REMARK", which has no illustrations.
mnwhite committed Oct 9, 2024
1 parent 720de7b commit 549a513
Showing 2 changed files with 131 additions and 103 deletions.
108 changes: 108 additions & 0 deletions HowToUseREMARK.md
@@ -0,0 +1,108 @@
# How to Use REMARK

This document contains technical instructions for adding new projects to the REMARK
archive and for working with and maintaining the project repository itself.

## For Authors

Each project lives in its own repository. To make a new REMARK, you can start from the skeleton
[REMARK-starter-example](https://github.com/econ-ark/REMARK-starter-example) and add to it,
or from a complete project built with the toolkit, [BufferStockTheory](https://github.com/econ-ark/BufferStockTheory), whose content (code, text, figures, etc.) you can replace with
your own.

REMARKs should adhere to the [REMARK Standard](https://github.com/econ-ark/REMARK/blob/master/STANDARD.md).

## For Editors

The REMARK catalog and Econ-ARK website configuration will be maintained by Editors.

Editorial guidelines are [here](https://github.com/econ-ark/REMARK/blob/master/EDITORIAL.md).

## For Maintainers

**Command Line Interface** `cli.py`

`cli.py` is an automated tool that facilitates:
- cloning of REMARK repositories
- linting (detection of missing files from a given REMARK)
- building conda environments/docker images
  - uses `conda`/`repo2docker` under the hood
- executing `reproduce.sh` scripts within the built environments

All artifacts generated by `cli.py` are stored in a newly created `_REMARK` folder.

1. Once you clone a REMARK you'll be able to find its contents inside of `_REMARK/repos/…`
2. Once you build/execute a REMARK you'll be able to find a corresponding log
file from that process inside of `_REMARK/logs/…`
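
To check what a run has left behind, the two locations above can be listed with a few lines of Python. This is a minimal sketch and not part of `cli.py`: the helper name `list_artifacts` is hypothetical, and it assumes only that the `repos` and `logs` folders sit directly under `_REMARK`, as described above.

```python
from pathlib import Path


def list_artifacts(root="_REMARK"):
    """Return the entry names found under <root>/repos and <root>/logs.

    Hypothetical helper, not part of cli.py. A folder that does not
    exist yet yields an empty list rather than an error.
    """
    found = {}
    for kind in ("repos", "logs"):
        folder = Path(root) / kind
        if folder.is_dir():
            found[kind] = sorted(p.name for p in folder.iterdir())
        else:
            found[kind] = []
    return found


if __name__ == "__main__":
    for kind, names in list_artifacts().items():
        print(f"{kind}: {len(names)} entries")
        for name in names:
            print(f"  {name}")
```

Run from the directory that contains `_REMARK`, this prints one line per cloned repository and per log file without assuming anything about how they are named.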

`cli.py` has built-in parallelization specified by the `-J` flag for many actions.

### Requirements

- Python 3.9 or newer
- the packages listed in `requirements.txt`

### Action

**Clone/Pull**

Pulling REMARKs (the clones are placed in the `_REMARK` folder).

```bash
python cli.py pull --all # git clone all REMARKs
python cli.py pull {remark_name} # git clone one or more REMARK(s)
```

**Lint**

Shows which files are missing from the given REMARK(s). The linter takes the
file tree printed in STANDARD.md and compares it against the files present
in the cloned REMARK(s).

```bash
python cli.py lint --all # detect missing files from all REMARKs
python cli.py lint {remark_name} # detect missing files from one or more REMARK(s)
```

**Build**

Building conda environments and/or docker images.

```bash
python cli.py build conda --all # build conda environments for all REMARKs (stored as a `condaenv` folder inside the cloned REMARK repo)
python cli.py build docker --all # build docker images for all REMARKs
python cli.py build conda {remark_name} # build conda environments for one or more REMARK(s)
python cli.py build docker {remark_name} # build docker image(s) for one or more REMARK(s)
```

The primary difference between `conda` and `docker` builds is that `docker` is more flexible for multi-language REMARKs. It leverages
repo2docker (the same tool that mybinder uses) to create docker images from repositories.

**Execute**

Automated execution within built conda environments/docker containers.

```bash
python cli.py execute conda --all # execute reproduce.sh via conda for all REMARKs
python cli.py execute docker --all # execute reproduce.sh via docker for all REMARKs
python cli.py execute conda {remark_name} # execute reproduce.sh via conda for one or more REMARK(s)
python cli.py execute docker {remark_name} # execute reproduce.sh via docker for one or more REMARK(s)
```

*Both the `build` and `execute` subcommands accept an optional `--jobs` argument that
sets the number of jobs to run in parallel when building/executing.*
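
For readers unfamiliar with this kind of option, the idea of a bounded number of parallel jobs can be sketched with Python's standard library. This is not how `cli.py` is known to implement `--jobs`; the function name `run_parallel` and the stand-in commands are assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, jobs=2):
    """Run each command (a list of arguments) with at most `jobs` at once.

    Returns the exit codes in the same order as the input commands.
    """
    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    # Stand-in commands; a real run would launch one build or execute per REMARK.
    demo = [[sys.executable, "-c", f"print({i})"] for i in range(4)]
    print(run_parallel(demo, jobs=2))
```

Threads are sufficient here because each worker only waits on an external process, so the work itself runs outside the Python interpreter.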

**Logs/Summarize**

```bash
python cli.py logs # view most recent logs for all previous building/executing commands
```

**Clean/Remove**

```bash
python cli.py clean conda --all # remove all built conda environments
python cli.py clean docker --all # remove all built docker images
python cli.py clean conda {remark_name} # remove conda environment(s) from specified REMARK(s)
python cli.py clean docker {remark_name} # remove docker images built from specified REMARK(s)
```
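
The actions above are typically run in sequence: pull, lint, build, then execute. The following sketch chains them from Python using only the command forms shown in this document. The names `cli_command` and `run_pipeline` are hypothetical, and two points are assumptions: that several REMARK names are passed as separate arguments, and that `cli.py` exits with a non-zero status when a step fails.

```python
import subprocess
import sys

BACKEND_ACTIONS = {"build", "execute", "clean"}  # followed by conda or docker
PLAIN_ACTIONS = {"pull", "lint"}                 # followed by names or --all


def cli_command(action, names=None, backend=None):
    """Return the argument list for one cli.py invocation."""
    if action in BACKEND_ACTIONS:
        if backend not in ("conda", "docker"):
            raise ValueError(f"{action} needs backend 'conda' or 'docker'")
        head = [action, backend]
    elif action in PLAIN_ACTIONS:
        head = [action]
    else:
        raise ValueError(f"unknown action: {action}")
    # Assumption: several REMARK names are passed as separate arguments.
    target = list(names) if names else ["--all"]
    return [sys.executable, "cli.py", *head, *target]


def run_pipeline(names=None, backend="conda"):
    """Pull, lint, build and execute in order.

    check=True raises CalledProcessError if a step exits with a
    non-zero status, which stops the remaining steps.
    """
    for action in ("pull", "lint", "build", "execute"):
        needs_backend = action in BACKEND_ACTIONS
        cmd = cli_command(action, names, backend if needs_backend else None)
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the directory that contains `cli.py` would process every REMARK with conda; passing `backend="docker"` switches the build and execute steps to docker.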
126 changes: 23 additions & 103 deletions README.md
@@ -1,5 +1,28 @@
# R[eplications/eproductions] and Explorations Made using ARK (REMARK)

The Economics profession has fallen behind other fields in the adoption of modern software development and reproducibility practices.
Economists generally lack robust software engineering training and resort to writing new software from scratch.
This practice of 'reinventing the wheel' increases the likelihood of errors and undermines the reproducibility of computational results.

Poor software development practices also slow down the advancement of Economics as a science.
When researchers cannot reproduce the results of a paper, they are unable to build on the work of others, which leads to wasted time and effort and slows down progress.
Additionally, when researchers cannot reproduce the results of a paper, they are unable to verify the validity of said results.
This leads to a lack of trust in the research, which can have a negative impact on the influence of the field.

Other disciplines have adapted to modern software development methods, and it is time for Economics to catch up.

To address the problems discussed above, Econ-ARK has been developing a set of standards and tools for reproducibility in our own work in Economics. Our reproducibility initiative is called REMARK, which stands for "R[eplications/eproductions] and Explorations Made using ARK". The term REMARK refers both to the standard and to any project that follows it. REMARKs are meant to be self-contained and complete projects whose contents can be executed by anyone on any computer that meets a minimal set of requirements and has the necessary software.

The REMARK standard is focused on 3 key principles:

- **Reproduction**: The ability to reproduce the results of a project using the same data and code on a different computer.
- **Archiving**: Storing the project in a way that it can be accessed and used in the future.
- **Publishing**: Making the project available to the public and incentivizing the sharing of code and data.

Detailed technical instructions for using or contributing to REMARK can be found in `HowToUseREMARK.md` in this directory.

## Description

REMARKs are self-contained and complete projects, whose content here should be executable by anyone with a suitably configured computer or using [nbreproduce](https://econ-ark.github.io/nbreproduce/).

Types of content include (see below for elaboration):
@@ -11,21 +34,6 @@ Types of content include (see below for elaboration):
1. Reproductions
* Code that reproduces ALL of the results of some paper that was originally written using the toolkit

## REMARK Catalog

A catalog of all REMARKs is available under the `REMARK` tab at [econ-ark.org](https://econ-ark.org/materials).
@@ -59,91 +67,3 @@ Jupyter notebook(s) should:

The key difference with the contents of the [DemARK](https://github.com/econ-ark/DemARK) repo is that REMARKs are allowed to rely on the existence of local files and subdirectories (figures; data) at a predictable filepath relative to the location of the root.
