Remove comments, add missing citation
mmore500 authored Oct 20, 2023
1 parent 50f53ce commit ceb3048
Showing 2 changed files with 15 additions and 122 deletions.
11 changes: 11 additions & 0 deletions paper.bib
@@ -267,3 +267,14 @@ @article{10.1145/3185517
}

@article{empirical_2020, title={devosoft/Empirical}, DOI={10.5281/zenodo.2575606}, publisher={Zenodo}, author={Charles Ofria and Matthew Andres Moreno and Emily Dolson and Alex Lalejini and rodsan0 and Jake Fenton and perryk12 and Steven Jorgensen and hoffmanriley and grenewode and et al.}, year={2020}, month={Oct}}

@article{woelfle2011open,
title={Open science is a research accelerator},
author={Woelfle, Michael and Olliaro, Piero and Todd, Matthew H},
journal={Nature Chemistry},
volume={3},
number={10},
pages={745--748},
year={2011},
publisher={Nature Publishing Group UK London}
}
126 changes: 4 additions & 122 deletions paper.md
@@ -41,37 +41,6 @@ bibliography: paper.bib
breaks: false
---

<!-- [Scraps document here](https://hackmd.io/@DSAbJHksRtqYhjQ4NWzmGA/S1z3oQDWv) -->
<!-- @AML (2021-06-12): Doing a comment clean up, moving important ones (e.g., old meeting notes) into the scraps document. -->

<!-- INSTRUCTIONS
- The paper should be between 250-1000 words.
Paper should include:
- A summary describing the high-level functionality and purpose of the software for a diverse, non-specialist audience.
- A clear Statement of Need that illustrates the research purpose of the software.
- A list of key references, including to other software addressing related needs.
- Mention (if applicable) a representative set of past or ongoing research projects using the software and recent scholarly publications enabled by it.
- Acknowledgement of any financial support.
-->

<!-- RESOURCES
- Empirical coding guidelines: https://github.com/devosoft/Empirical/wiki/Coding-Guidelines
- Empirical README: https://github.com/devosoft/Empirical/blob/master/README.md
-->

<!-- SUMMARY
JOSS: "Begin your paper with a summary of the high-level functionality of your software for a non-specialist reader. Avoid jargon in this section."
i.e., What does this software do?
-->

# Summary

Empirical is a C++ library designed to promote open science and facilitate the development of scientific software that is efficient, reliable, and easily distributable to researchers and non-experts alike.
@@ -88,14 +57,10 @@ In addition to many helpful utilities to improve the scientific programming expe
2. implementations of general-purpose data structures and algorithms,
3. integrated, end-to-end frameworks for data and configuration management, and
4. object-oriented bindings for Emscripten/WebAssembly GUI elements.

# Statement of Need
<!-- STATEMENT OF NEED
JOSS: illustrates the research purpose of the software.
JOSS reviewer checklist: "clearly state what problems the software is designed to solve and who the target audience is"
-->

Modern web-based interfaces give computational research the unique potential to embody open science objectives: they can make the scientific process more transparent with auditable and extensible code, clear and replicable methodologies, and production of accessible results [CITE osf.io].
<!-- [@world_academy_of_science_engineering_and_technology_2019] -->
Modern web-based interfaces give computational research the unique potential to embody open science objectives: they can make the scientific process more transparent with auditable and extensible code, clear and replicable methodologies, and production of accessible results [@woelfle2011open].
In practice, however, many scientific software applications are difficult to obtain, install, or use, and produce data in proprietary formats.

High quality open-science tools encourage researchers to follow effective software development practices by simplifying development and helping them improve code quality, scientific rigor, and ease of replication or extension.
@@ -112,27 +77,6 @@ Empirical's debugging suite helps protect against common C++ programming pitfall
Bundled algorithms and data structures provide optimized, well-tested drop-in implementations for common scientific computing tasks.
Throughout, library design obviates trade-offs between performance and safety; compile-time switches toggle safety checks for undefined or incorrect behavior.

<!-- However, many scientists do not know how to create browser-based GUIs and lack the resources or incentives to adequately invest in learning.
Empirical addresses these challenges by providing tools for implementing browser-based GUIs using intuitive C++, and without requiring a substantial time investment.
-->
<!-- @CAO: The below is a critical point, but should probably going in the Availability section below OR in a future paper; I don't think it goes in statement of need. -->
<!-- Even with GUI domain knowledge, maintaining a GUI-based, public-facing code base separately from a command-line, research code base is a burden that often leads to the GUI version lagging behind the research version.
In this circumstance, the research version of the software is no longer truly available to the world.
To avoid this pitfall, Empirical provides tools to implement GUI applications by _wrapping_ (instead of replicating) the research version of the software, making it easier for researchers to keep the most recent version of scientific software widely available.
-->
<!-- To conduct computationally expensive experiments, researchers often need to run scientific software without incurring overhead associated with GUI integration. In addition, the overhead associated with GUIs (particularly cross-platform or in-browser execution of core code components) must be minimized to be able to realize the full capabilities of scientific software. -->
<!-- Empirical fulfills these requirements by:
* allowing software to be compiled with or without the GUI,
* building the GUI off of highly-efficient Web Assembly, and
* computing everything possible at compile time to increase execution efficiency.
-->
<!-- Even with an easily available web-based interface, scientific software cannot fulfill open science objectives if its credibility is compromised by undiscovered errors.
Determining the presence of subtle runtime errors in native executables generated using C++ can be difficult, and debugging compiled WASM output from Emscripten can be nearly impossible.
Tools to support researchers in detecting and correcting such errors are critical to open science objectives. -->
<!-- To improve researchers' ability to debug their software, Empirical provides a number of improved debugging functionalities, including wrappers around standard library containers that include extra safety checks, smart pointers in debug mode that convert to raw pointers in release mode, and improved assertions.
These safeguards can be fully disabled by a compiler flag in order to ensure maximum efficiency in performance-critical contexts.
-->

# Empirical Features

## Facilitating Better Code for Scientific Software
@@ -177,10 +121,8 @@ Empirical amplifies the potential of Emscripten by fleshing out its rudimentary
At the lowest level, Empirical provides tools for reciprocal data transfer between C++ code and the browser.
DOM elements (such as `<button>`, `<div>`, and `<canvas>`) are given corresponding C++ objects (`emp::Button`, `emp::Div`, and `emp::Canvas`) and can be easily used from within C++ code.
With these tools, users no longer need to manage JavaScript resources, and thus need much less preexisting web-programming knowledge.
<!-- This facility relieves users of bookkeeping for JavaScript resources (particularly useful for those without web programming domain expertise). -->
At a higher level of abstraction, Empirical packages pre-configured, pre-styled collections of DOM elements as prefabricated widgets (e.g., configuration managers, collapsible read-outs, modal messages, etc.).
Empirical's tools aim to make generating a mobile-friendly, web-based GUI for existing software so trivial that the practice becomes ubiquitous.
<!-- In particular, we are focused on lowering the barrier to entry for developers without domain knowledge in HTML, CSS, and JavaScript by abstracting these matters away behind a C++ interface. -->

Below, we give an example of Empirical's DOM interface in action.
This example creates a button that increments an on-screen counter every time the button is clicked.
@@ -214,15 +156,8 @@ HTML source:
<script type="text/javascript" src="main.js"></script>
```

<!-- @MAM: add a code snippet with a brief demo and a screenshot of the resulting webpage -->

A live demo of more sophisticated Empirical widgets, presented alongside their source C++ code, is available on our [prefab demos page](https://devosoft.github.io/empirical-prefab-demo/empirical-prefab-demo).

<!-- # Empirical Development Practices -->
<!-- @mmore500 moved to documentation -->
<!-- @AML: talk here about testing/coverage setup, cookiecutter template, etc? maybe cookiecutter could go in re-invent wheel section? -->


## Facilitating Runtime Efficiency

WebAssembly's runtime efficiency is a major driver of its increasing popularity for web app development.
@@ -243,38 +178,28 @@ Benchmark-informed development practices ensure that optimizations translate int
At a more fundamental level, Empirical's header-only design prioritizes ease of use and runtime performance at the cost of somewhat longer compilation times.

## Facilitating Debugging
<!-- @MAM: help you make your own code reliable -->

Identifying and correcting incorrect program behavior consumes a large fraction of developer hours for any software project.
Software bugs that slip through into production can inflict even greater costs, especially in scientific contexts where the validity of generated data and analyses is paramount.

In conjunction with unit tests and integration tests, runtime safety checks are commonly used to flag potential bugs.
Assert statements typify runtime safety checks.
These statements abort program execution at the point of failure with a helpful error message if an expected runtime condition is not met.
<!-- Users can write `assert` statements into their own code to ensure that program behavior matches expectations.
-->
Runtime safety checks like `assert` need not impose a performance cost in production: the asserted condition can be evaluated only in debug mode and skipped in production mode to maximize performance.

Indeed, the C++ standard library's `assert` macro follows this paradigm.
Empirical provides an extended `emp_assert` macro that prints custom error messages with current values of specified expressions, and dispatches a UI alert when triggered in a web environment.
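
The debug-only pattern described above can be sketched with the standard `NDEBUG` switch. This is a minimal illustration, not Empirical's `emp_assert`: the names `SKETCH_ASSERT`, `describe_failure`, and `safe_divide` are hypothetical, and the sketch throws an exception rather than aborting so that the failure is easy to observe.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of Empirical): builds a diagnostic message
// that reports the failed expression and the current value of one variable.
template <typename T>
std::string describe_failure(const char* expr, const char* name, const T& value) {
  std::ostringstream out;
  out << "Assertion failed: " << expr << " (" << name << " = " << value << ")";
  return out.str();
}

#ifdef NDEBUG
  // Release mode: the check compiles away and the condition is never evaluated.
  #define SKETCH_ASSERT(cond, var) ((void)0)
#else
  // Debug mode: evaluate the condition and report the offending value.
  // Assumption: throwing stands in for the abort a real assert would perform.
  #define SKETCH_ASSERT(cond, var)                                        \
    do {                                                                  \
      if (!(cond)) {                                                      \
        throw std::logic_error(describe_failure(#cond, #var, (var)));     \
      }                                                                   \
    } while (false)
#endif

// Example use: the check guards against division by zero in debug builds only.
// In release builds a zero denominator is not caught (undefined behavior).
int safe_divide(int numerator, int denominator) {
  SKETCH_ASSERT(denominator != 0, denominator);
  return numerator / denominator;
}
```

Compiling with `-DNDEBUG` removes the check entirely, so `safe_divide(10, 2)` returns 5 in either mode but only the debug build pays for the comparison.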
<!-- These features help compensate for the limited tooling currently available in the Empirical web runtime. -->

In addition to user-defined asserts, most programming languages (Java, Python, Ruby, Rust, etc.) provide built-in support to detect common runtime violations, such as out-of-bounds indexing or bad type conversions.
<!-- These built-in protections against runtime violations are considered so critical that many programming languages do not provide a mechanism to disable them for speedups in production code. -->
C++ does not, in an effort to maximize performance.
<!-- provide any standard mechanisms for safety-checking library features. -->
However, standard library vendors --- like [GCC's `libstdc++`](https://web.archive.org/web/20210118212109/https://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode_using.html), [Clang's `libc++`](https://web.archive.org/web/20210414014331/https://libcxx.llvm.org/docs/DesignDocs/DebugMode.html), and [Microsoft's `stl`](https://web.archive.org/web/20210121201948/https://docs.microsoft.com/en-us/cpp/standard-library/checked-iterators?view=msvc-160) --- do provide some proprietary support for such safety checks.
In addition to user-defined asserts, most programming languages (Java, Python, Ruby, Rust, etc.) provide built-in support to detect common runtime violations, such as out-of-bounds indexing or bad type conversions. C++ does not, in an effort to maximize performance.
However, standard library vendors --- like [GCC's `libstdc++`](https://web.archive.org/web/20210118212109/https://gcc.gnu.org/onlinedocs/libstdc++/manual/debug_mode_using.html), [Clang's `libc++`](https://web.archive.org/web/20210414014331/https://libcxx.llvm.org/docs/DesignDocs/DebugMode.html), and [Microsoft's `stl`](https://web.archive.org/web/20210121201948/https://docs.microsoft.com/en-us/cpp/standard-library/checked-iterators?view=msvc-160) --- do provide some proprietary support for such safety checks.
This support, however, is limited and poorly documented[^1].
Empirical supplements vendors' runtime safety checking by providing drop-in replacements for `std::array`, `std::optional`, and `std::vector` with stronger runtime safety checks, but only while in debug mode.
In addition, Empirical furnishes a safety-checked pointer wrapper, `emp::Ptr`, that identifies memory leaks and invalid memory access in debug mode while retaining the full speed of raw pointers in release mode.

[^1]: For example, neither GCC 10.3 nor Clang 12.0.0 detects `std::vector` iterator invalidation when appending to a `std::vector` happens to fall within existing allocated buffer space ([GCC live example](https://perma.cc/6WDU-3C8X); [Clang live example](https://perma.cc/6SU9-CUKY)).
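
To illustrate the idea of a drop-in container whose extra checks exist only in debug mode, the following is a minimal sketch. It is not Empirical's implementation: `checked_vector` is a hypothetical name, and using `NDEBUG` as the switch is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical illustration (not Empirical's container): a std::vector
// stand-in whose operator[] is bounds-checked unless NDEBUG is defined.
template <typename T>
class checked_vector : public std::vector<T> {
  using base_t = std::vector<T>;

public:
  using base_t::base_t;  // inherit all std::vector constructors

  T& operator[](std::size_t pos) {
#ifndef NDEBUG
    // Debug mode only: reject out-of-range indices before touching memory.
    if (pos >= this->size()) {
      throw std::out_of_range("checked_vector: index out of range");
    }
#endif
    return base_t::operator[](pos);
  }

  const T& operator[](std::size_t pos) const {
#ifndef NDEBUG
    if (pos >= this->size()) {
      throw std::out_of_range("checked_vector: index out of range");
    }
#endif
    return base_t::operator[](pos);
  }
};
```

Because the class inherits the constructors and remaining member functions of `std::vector`, code such as `checked_vector<int> v{1, 2, 3};` works unchanged, and in a release build the indexing operators reduce to the unchecked standard ones.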

<!-- todo add more explanation of emp::Ptr and its rationale -->

Because of poor support for built-in runtime safety checks, C++ developers typically use an external toolchain to detect and diagnose runtime violations.
Popular tools include Valgrind, GDB, and runtime sanitizers.
<!-- (Perhaps, to some degree, this rich toolchain ecosystem enables the ongoing lack of support for such checks within the standard language.) -->
Although this tooling is very mature and quite powerful, there are fundamental limitations to the runtime violations it can detect.
For example, Clang 12.0.0's sanitizers cannot detect the iterator invalidation described above ([live example](https://godbolt.org/z/z6ocqn87W)).
Additionally, most of this tooling is not available when debugging WASM code compiled with Emscripten --- a core use case targeted by the Empirical library.
@@ -296,17 +221,10 @@ To this end, we look forward to welcoming new collaborations and supporting a wi

# Related Software Packages

<!-- JOSS:(including to other software addressing related needs. a representative set of past or ongoing research projects using the software and recent scholarly publications enabled by it.) -->

## Software Addressing Related Needs
<!-- JOSS: Do the authors describe how this software compares to other commonly-used packages?-->


There are many existing software platforms that provide functionalities overlapping with Empirical.
However, most are not in C++, and there is value in this functionality being easily available to C++ programmers.
<!-- being easily available to programmers who are most comfortable in C++ -->
<!-- TODO C++ as a high-efficiency language -->
<!-- Therefore, here we focus only on software platforms that support development in C++. -->
See the Non-C++ Comparable Software section for citations to software platforms that provide some of Empirical's functionality in different languages.

### RepastHPC
@@ -327,23 +245,13 @@ Emscripten is available at <https://emscripten.org/> [@zakai2011emscripten].
It provides cross-compilation from C++ to WebAssembly and we use it in Empirical.
Empirical's tools build abstractions from Emscripten intrinsics tailored to visualization and interactive control of scientific simulations.


### Cheerp

Cheerp, another C++ to WebAssembly compiler, is available at <https://leaningtech.com/cheerp/>.
Like Emscripten, Cheerp provides primarily low-level APIs for interaction with browser GUI elements.

### Non-C++ Comparable Software

<!-- Not going to include discussion, just citations -->

<!--
these are more agent-based simulations which aren't the focus of this paper
* FlameGPU [@richmond2010high]
* NetLogo [@tisue2004netlogo]
* Cell Collective [@helikar2012cell]
-->

* [TinyGo](https://tinygo.org/) <!-- in-browser web interface compiler for go -->
* [WebIO](https://juliagizmos.github.io/WebIO.jl/latest/) <!-- in-browser web interface library for julia -->
* [GWT](http://www.gwtproject.org/) <!-- in-browser web interface compiler for java -->
@@ -369,32 +277,6 @@ these are more agent-based simulations which aren't the focus of this paper
* [Model of cancer evolution on an oxygen gradient](http://emilydolson.github.io/memic_model/web/memic_model.html)
* A companion model to a series of wet lab experiments on cancer evolution in spatially heterogeneous environments

<!-- # Packages used by Empirical -->

<!--Citations to entries in paper.bib should be in
[rMarkdown](http://rmarkdown.rstudio.com/authoring_bibliographies_and_citations.html)
format.
If you want to cite a software repository URL (e.g. something on GitHub without a preferred
citation) then you can do it with the example BibTeX entry below for @fidgit.
For a quick reference, the following citation commands can be used:
- `@author:2001` -> "Author et al. (2001)"
- `[@author:2001]` -> "(Author et al., 2001)"
- `[@author1:2001; @author2:2001]` -> "(Author1 et al., 2001; Author2 et al., 2002)"
# Figures
Figures can be included like this:
![Caption for example figure.\label{fig:example}](figure.png)
and referenced from text using \autoref{fig:example}.
Fenced code blocks are rendered with syntax highlighting:
```python
for n in range(10):
yield f(n)
```
-->
# Acknowledgements

This research was supported in part by NSF grants DEB-1655715 and DBI-0939454 as well as by Michigan State University through the computational resources provided by the Institute for Cyber-Enabled Research.