Programmable, reproducible, interactive documents
Stencila is a platform for creating and publishing dynamic, data-driven content. Our aim is to lower the barriers to creating truly programmable documents, and to make it easier to publish them as beautiful, interactive, and semantically rich articles and applications. Our roots are in scientific communication, but our tools are useful well beyond it.
This is `v2` of Stencila, a rewrite in Rust focussed on the synergies between three recent and impactful innovations and trends:
- Conflict-free replicated data types (CRDTs) for decentralized collaboration and version control.
- Large language models (LLMs) for assisting in writing and editing prose and code.
- The blurring of the lines between documents and applications, as seen in tools such as Notion and Coda.
We are embarking on a rewrite because CRDTs will now be the foundational synchronization and storage layer for Stencila documents. This requires fundamental changes to most other parts of the platform. Furthermore, a rewrite allows us to bake in, rather than bolt on, new modes of interaction between authors and LLM assistants, and to add mechanisms that mitigate the risks associated with using LLMs (e.g. by recording the actor, human or LLM, that made each change to a document). Much of the code in the `v1` branch will be reused (after some tidy-ups and refactoring), so `v2` is not a complete rewrite.
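One way to picture the "record the actor" mitigation is a change log in which every change carries the identity and kind of its author. The following is a minimal Python sketch under that assumption. The names `ActorKind`, `Change`, and `Document` are hypothetical, chosen for illustration only, and are not part of the Stencila Schema or SDKs.

```python
from dataclasses import dataclass, field
from enum import Enum


class ActorKind(Enum):
    """Hypothetical classification of who authored a change."""
    HUMAN = "human"
    LLM = "llm"


@dataclass(frozen=True)
class Change:
    """A single edit, tagged with the actor that made it."""
    path: str          # which part of the document was changed
    value: str         # the new value at that path
    actor: str         # an identifier for the author
    actor_kind: ActorKind


@dataclass
class Document:
    """A toy document that keeps its content and a full change history."""
    content: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def apply(self, change: Change) -> None:
        # Apply the change and remember who made it
        self.content[change.path] = change.value
        self.history.append(change)

    def changes_by(self, kind: ActorKind) -> list:
        # Filter the history by the kind of actor
        return [c for c in self.history if c.actor_kind is kind]


doc = Document()
doc.apply(Change("title", "Draft", "alice", ActorKind.HUMAN))
doc.apply(Change("abstract", "A generated summary.", "assistant-1", ActorKind.LLM))

# Paths that were last touched by an LLM and may deserve human review
llm_paths = [c.path for c in doc.changes_by(ActorKind.LLM)]
```

Keeping the actor on each change, rather than on the document as a whole, is what would let a reviewer later ask which parts of a document were written by an assistant.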
Our general strategy is to iterate horizontally across the feature set, rather than fully developing features sequentially. This will better enable early user testing of workflows and reduce the risk of finding ourselves painted into an architectural corner. So expect initial iterations to have limited functionality and be buggy.
We'll be making alpha and beta releases of `v2` early and often across all products (e.g. CLI, desktop, SDKs). We're aiming for a `2.0.0` release by the end of Q3 2024.
🟢 Stable • 🔶 Beta • 🚧 In progress • 🧠 Planned • ❔ Maybe
The Stencila Schema is the data model for Stencila documents (definition here, generated reference documentation here). Most of the schema is well defined but some document node types are still marked as under development. A summary by category:
Category | Description | Status |
---|---|---|
Works | Types of creative works (e.g. `Article`, `Figure`, `Review`) | 🟢 Stable (mostly based on schema.org) |
Prose | Types used in prose (e.g. `Paragraph`, `List`, `Heading`) | 🟢 Stable (mostly based on HTML, JATS, Pandoc etc) |
Code | Types for executable (e.g. `CodeChunk`) and non-executable code (e.g. `CodeBlock`) | 🔶 Beta (may change) |
Math | Types for math symbols and equations (e.g. `MathBlock`) | 🔶 Beta (may change) |
Data | Fundamental data types (e.g. `Number`) and validators (e.g. `NumberValidator`) | 🔶 Beta (may change) |
Flow | Types for control flow within a document (e.g. `If`, `For`, `Call`) | 🔶 Beta (may change) |
Style | Types for styling parts of a document (`Span` and `Division`) | 🔶 Beta (may change) |
Edits | Types related to editing a document (e.g. `InstructionBlock`, `DeleteInline`) | 🔶 Beta (may change) |
In `v2`, documents can be stored as binary Automerge CRDT files, branched and merged, and imported from and exported to various formats. Collaboration, including real-time collaboration, is made possible by exchanging fine-grained changes to the CRDT over the network. In addition, we want to enable interoperability with a Git-based workflow.
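To give a feel for why CRDTs suit branching, merging, and change exchange, here is a deliberately tiny Python sketch of a last-writer-wins map. It is a toy and not how Automerge works internally: Automerge uses far richer data structures. The function names `lww_set` and `lww_merge` are made up for this example, and it assumes every write carries a unique `(clock, replica)` tag.

```python
def lww_set(state, key, value, clock, replica):
    """Return a new state with `key` written at logical time `clock`."""
    updated = dict(state)
    updated[key] = (clock, replica, value)
    return updated


def lww_merge(a, b):
    """Merge two replica states; for each key the latest write wins.

    Ties on `clock` are broken by replica id, so the result does not
    depend on the order in which states are merged.
    """
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry[:2] > merged[key][:2]:
            merged[key] = entry
    return merged


# Two collaborators start from the same document state...
base = lww_set({}, "title", "Untitled", clock=1, replica="a")

# ...and then edit independently, without coordinating
alice = lww_set(base, "title", "CRDTs in practice", clock=2, replica="a")
bob = lww_set(base, "abstract", "Work in progress", clock=2, replica="b")

# Merging in either order converges on the same state
merged_ab = lww_merge(alice, bob)
merged_ba = lww_merge(bob, alice)
```

The property that matters is convergence: replicas that have seen the same set of changes end up in the same state, whatever order the changes arrived in.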
Functionality | Description | Status |
---|---|---|
Documents read/write-able | Able to write a Stencila document to an Automerge binary file and read it back in | |
Documents import/export-able | Able to import or export document as alternative formats, using tree diffing to generate CRDT changes | |
Documents fork/merge-able | Able to create a fork of a document in another file and then later merge with the original | 🧠 Planned |
Documents diff-able | Able to view a diff, in any of the supported formats, between versions of a document and between a document and another file | 🧠 Planned |
Git merge driver | CLI can act as a custom Git merge driver | 🧠 Planned |
Relay server | Documents can be synchronized by exchanging changes via a relay server | 🧠 Planned |
Rendezvous server | Documents can be synchronized by exchanging changes peer-to-peer using TCP or UDP hole punching | ❔ Maybe |
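The import row above mentions using tree diffing to generate CRDT changes. As a rough illustration of the idea, the Python sketch below compares two document trees (plain dicts and lists) and emits a list of small operations, which it can then replay. This is a naive, position-based diff written for this example only. It is not Stencila's algorithm, and a practical implementation would align list items by similarity rather than by index. The node shapes are simplified stand-ins for Stencila Schema nodes.

```python
import copy


def diff(old, new, path=()):
    """Return a list of (op, path, value) tuples that turn `old` into `new`."""
    if type(old) is not type(new):
        return [("replace", path, new)]
    if isinstance(old, dict):
        ops = []
        for key in sorted(old.keys() - new.keys()):
            ops.append(("remove", path + (key,), None))
        for key in sorted(new.keys() - old.keys()):
            ops.append(("insert", path + (key,), new[key]))
        for key in sorted(old.keys() & new.keys()):
            ops.extend(diff(old[key], new[key], path + (key,)))
        return ops
    if isinstance(old, list):
        ops = []
        common = min(len(old), len(new))
        for index in range(common):
            ops.extend(diff(old[index], new[index], path + (index,)))
        # Remove surplus items from the end so earlier indices stay valid
        for index in range(len(old) - 1, common - 1, -1):
            ops.append(("remove", path + (index,), None))
        for index in range(common, len(new)):
            ops.append(("insert", path + (index,), new[index]))
        return ops
    return [] if old == new else [("replace", path, new)]


def apply_ops(tree, ops):
    """Replay operations from `diff` on a copy of `tree`."""
    tree = copy.deepcopy(tree)
    for op, path, value in ops:
        if not path:  # an empty path means the root itself is replaced
            tree = copy.deepcopy(value)
            continue
        parent = tree
        for step in path[:-1]:
            parent = parent[step]
        last = path[-1]
        if op == "remove":
            del parent[last]
        elif op == "insert" and isinstance(parent, list):
            parent.insert(last, copy.deepcopy(value))
        else:  # "replace", or "insert" of a new dict key
            parent[last] = copy.deepcopy(value)
    return tree


def paragraph(text):
    return {"type": "Paragraph", "content": [{"type": "Text", "value": text}]}


stored = {"type": "Article", "content": [paragraph("Hello")]}
imported = {"type": "Article", "content": [paragraph("Hello, world"), paragraph("Added later")]}

ops = diff(stored, imported)
```

Here `ops` contains just two operations (one text replacement and one paragraph insertion), which is the kind of fine-grained change that can be recorded in a CRDT instead of overwriting the whole document.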
Interoperability with existing formats has always been a key feature of Stencila. We are bringing over codecs (a.k.a. converters) from the `v1` branch and porting other functionality from `encoda` to Rust.
Format | Encoding | Decoding | Coverage | Notes |
---|---|---|---|---|
JSON | 🟢 | 🟢 | | |
JSON5 | 🟢 | 🟢 | | |
JSON-LD | 🟢 | 🟢 | | |
CBOR | 🟢 | 🟢 | | |
CBOR+Zstandard | 🟢 | 🟢 | | |
YAML | 🟢 | 🟢 | | |
Plain text | 🔶 | - | | |
HTML | 🚧 | 🧠 | | |
JATS | 🚧 | 🚧 | | Planned for completion. Port decoding and tests from `encoda`. |
Markdown | | | | |
R Markdown | 🧠 | 🧠 | | Relies on Markdown; `v1` |
Myst Markdown | 🚧 | 🚧 | | In progress; PR |
Jupyter Notebook | 🧠 | 🧠 | | Relies on Markdown; `v1` |
Scripts | 🧠 | 🧠 | | Relies on Markdown; `v1` |
Pandoc | 🧠 | 🧠 | | Planned. `v1` |
LaTeX | 🧠 | 🧠 | | Relies on Pandoc; `v1`; discussion |
Org | 🧠 | 🧠 | | Relies on Pandoc; PR |
Microsoft Word | 🧠 | 🧠 | | Relies on Pandoc; `v1` |
ODT | 🧠 | 🧠 | | Relies on Pandoc |
Google Docs | 🧠 | 🧠 | | Planned `v1` |
| | 🧠 | 🧠 | | Planned, relies on HTML; `v1` |
Codec Plugin API | 🧠 | 🧠 | | An API allowing codecs to be developed as plugins in Python, Node.js, and other languages |
Kernels are what execute the code in Stencila `CodeChunk`s and `CodeExpression`s, as well as in control flow document nodes such as `ForBlock` and `IfBlock`. In addition, there are kernels for rendering math (e.g. `MathBlock`) and styling (e.g. `StyledInline`) nodes.
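As a rough mental model of what a kernel does, the Python sketch below keeps a namespace of variables, runs statements (as a code chunk would) while capturing their printed output, and evaluates expressions (as a code expression would) against the same namespace. `ToyKernel` and its methods are hypothetical names for this illustration and do not reflect Stencila's actual kernel interface, which is implemented in Rust.

```python
import contextlib
import io


class ToyKernel:
    """A toy, in-process kernel for Python code (illustration only)."""

    def __init__(self):
        # Variables persist between executions, as in a notebook session
        self.variables = {}

    def execute(self, code):
        """Run statements and return anything they printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.variables)
        return buffer.getvalue()

    def evaluate(self, expression):
        """Evaluate a single expression and return its value."""
        return eval(expression, self.variables)


kernel = ToyKernel()

# A "code chunk": statements whose side effects and output matter
printed = kernel.execute("x = 6 * 7\nprint('x is', x)")

# A "code expression": a value to be shown inline in the document
value = kernel.evaluate("x + 1")
```

Note that `exec` and `eval` offer no sandboxing, so this sketch is only suitable for trusted code.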
Kernel | Purpose | Status |
---|---|---|
Bash | Execute Bash code | 🔶 Beta |
Zsh | Execute Zsh code | ❔ Maybe; `v1` |
Python | Execute Python code | 🔶 Beta |
R | Execute R code | |
QuickJs | Execute JavaScript in embedded sandbox | 🔶 Beta |
Node.js | Execute JavaScript in a Node.js env | 🔶 Beta |
Deno | Execute TypeScript code | ❔ Maybe; `v1` |
SQLite | Execute SQL code | 🧠 Planned; `v1` |
Jupyter kernels | Execute code in Jupyter kernels | 🚧 In progress; PR |
Rhai | Execute a sandboxed, embedded language | 🔶 Beta |
AsciiMath | Render AsciiMath symbols and equations | 🔶 Beta |
TeX | Render TeX math symbols and equations | 🔶 Beta |
Graphviz | Render Graphviz DOT to SVG images | |
Jinja | Interpolate document variables into styling and other code | |
Style | Transpile Tailwind and CSS for styling | 🔶 Beta |
HTTP | Interact with RESTful APIs | ❔ Maybe; `v1` |
[!TIP] Run `stencila kernels` (or `cargo run -p cli kernels` in development) for an up-to-date list of kernels, including those available through plugins.
Tools are what we call the self-contained Stencila products you can download and use locally on your machine to interact with Stencila documents.
Environments | Purpose | Status |
---|---|---|
CLI | Manage documents from the command line and read and edit them using a web browser | |
Desktop | Manage, read and edit documents from a desktop app | |
VSCode extension | Manage, read and edit documents from within VSCode |
Stencila's software development kits (SDKs) enable developers to create plugins that extend Stencila's core functionality, or to build other tools on top of it. At this stage we are planning to support Python, Node.js and R, but more languages may be added if there is demand.
Language | Description | Status | Coverage |
---|---|---|---|
Python | Types and function bindings for using Stencila from Python | ||
TypeScript | JavaScript classes and TypeScript types for the Stencila Schema | ||
Node.js | Types and function bindings for using Stencila from Node.js |
Making sure Stencila `v2` is well tested, fast, secure, and accessible is important. Here's what we're doing towards that:
What | Description | Status |
---|---|---|
Property-based testing | Establish property-based (a.k.a. generative) testing for Stencila documents | 🟢 Done |
Round-trip testing | Establish property-based tests of round-trip conversion to/from supported formats and reading/writing from/to Automerge CRDTs | 🟢 Done here and here |
Coverage reporting | Report coverage by feature (e.g. by codec) to give developers better insight into the status of each | 🟢 Done Codecov |
Dependency audits | Add dependency audits to continuous integration workflow. | 🟢 Done |
Accessibility testing | Add accessibility testing to continuous integration workflow. | 🟢 Done here |
Performance monitoring | Establish continuous benchmarking | 🟢 Done here |
Security audit | External security audit sponsored by NLNet. | 🧠 Planned Q2 2023 (after most `v2` functionality added and before public beta) |
Accessibility audit | External accessibility audit sponsored by NLNet. | 🧠 Planned Q3 2023 (before `v2.0.0` release) |
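The property-based round-trip tests in the table above follow a simple pattern: generate many random documents, convert each to a format and back, and check that nothing changed. The Python sketch below shows that pattern for JSON using only the standard library. It is an illustration and not Stencila's test suite, which is written in Rust. The generator functions are hypothetical and the node shapes are simplified stand-ins for Stencila Schema nodes.

```python
import json
import random


def random_text(rng):
    # Include characters that often trip up encoders: quotes, backslashes,
    # newlines, accented letters and emoji
    alphabet = "abc xyz éß 🙂\"\\\n"
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, 12)))


def random_inline(rng):
    return {"type": "Text", "value": random_text(rng)}


def random_block(rng):
    inlines = [random_inline(rng) for _ in range(rng.randint(0, 3))]
    if rng.random() < 0.3:
        return {"type": "Heading", "level": rng.randint(1, 6), "content": inlines}
    return {"type": "Paragraph", "content": inlines}


def random_article(rng):
    blocks = [random_block(rng) for _ in range(rng.randint(0, 5))]
    return {"type": "Article", "content": blocks}


def round_trip(doc):
    """Encode a document to JSON text and decode it again."""
    return json.loads(json.dumps(doc))


def check_round_trip(cases=200, seed=42):
    """Return every generated document that did not survive the round trip."""
    rng = random.Random(seed)  # seeded so that failures are reproducible
    failures = []
    for _ in range(cases):
        doc = random_article(rng)
        if round_trip(doc) != doc:
            failures.append(doc)
    return failures


failures = check_round_trip()
```

For lossy formats the property has to be weakened, for example by comparing only the parts of a document that the format is able to represent.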
At this stage, documentation for `v2` is mainly reference material, much of it generated:
More reference docs, as well as guides and tutorials, will be added over the coming months. We will be bootstrapping the publishing of all docs (i.e. using Stencila itself to publish HTML pages) and expect to have an initial published set in.
Although `v2` is in early stages of development, and functionality may be limited or buggy, we are releasing alpha versions of the Stencila CLI and SDKs. Doing so allows us to get early feedback and monitor what impact the addition of features has on build times and distribution sizes.
Windows
To install the latest release, download `stencila-<version>-x86_64-pc-windows-msvc.zip` from the latest release and place it somewhere on your `PATH`.
MacOS
To install the latest release in `/usr/local/bin`,
curl --proto '=https' --tlsv1.2 -sSf https://stencila.dev/install.sh | sh
To install a specific version, append `-s vX.X.X`. Or, if you'd prefer to do it manually, download `stencila-<version>-x86_64-apple-darwin.tar.gz` from one of the releases and then,
tar xvf stencila-*.tar.gz
cd stencila-*/
sudo mv -f stencila /usr/local/bin # or wherever you prefer
Linux
To install the latest release in `~/.local/bin/`,
curl --proto '=https' --tlsv1.2 -sSf https://stencila.dev/install.sh | sh
To install a specific version, append `-s vX.X.X`. Or, if you'd prefer to do it manually, download `stencila-<version>-x86_64-unknown-linux-gnu.tar.gz` from one of the releases and then,
tar xvf stencila-*.tar.gz
mv -f stencila ~/.local/bin/ # or wherever you prefer
Docker
The CLI is also available in a Docker image you can pull from Docker Hub,
docker pull stencila/stencila
and use locally like this for example,
docker run -it --rm -v "$PWD":/work -w /work --network host stencila/stencila --help
The same image is also published to the GitHub Container Registry if you'd prefer to use that,
docker pull ghcr.io/stencila/stencila
Python
Use your favorite package manager to install Stencila's SDK for Python:
python -m pip install stencila
[!NOTE] If you encounter problems with the above command, you may need to upgrade Pip using `pip install --upgrade pip`.
poetry add stencila
Node
Use your favorite package manager to install `@stencila/node`:
npm install @stencila/node
yarn add @stencila/node
pnpm add @stencila/node
TypeScript
Use your favorite package manager to install `@stencila/types`:
npm install @stencila/types
yarn add @stencila/types
pnpm add @stencila/types
This repository is organized into the following modules. Please see their respective READMEs, where available, for guides to contributing to each.
- `schema`: YAML files which define the Stencila Schema, an implementation of, and extensions to, schema.org, for programmable documents.
- `json`: A JSON Schema and JSON-LD `@context`, generated from Stencila Schema, which can be used to validate Stencila documents and transform them to other vocabularies.
- `rust`: Several Rust crates implementing core functionality and a CLI for working with Stencila documents.
- `python`: A Python package, with classes generated from Stencila Schema and bindings to Rust functions, so you can work with Stencila documents from within Python.
- `ts`: A package of TypeScript types generated from Stencila Schema so you can create type-safe Stencila documents in the browser, Node.js, Deno etc.
- `node`: A Node.js package, using the generated TypeScript types and bindings to Rust functions, so you can work with Stencila documents from within Node.js.
- `prompts`: Prompts used to instruct AI assistants in different contexts and for different purposes.
- `docs`: Documentation, including reference documentation generated from `schema` and the CLI tool.
- `examples`: Examples of documents conforming to Stencila Schema, mostly for testing purposes.
- `scripts`: Scripts used for making releases and during continuous integration.
Several GitHub Actions workflows are used for testing and releases. All products (i.e. CLI, Docker image, SDKs) are released at the same time with the same version number. To create and release a new version:
bash scripts/bump-version.sh <VERSION>
git push && git push --tags
Workflow | Purpose | Status |
---|---|---|
`test.yml` | Run linting, tests and other checks. Commit changes to any generated files. | |
`pages.yml` | Publish docs, JSON-LD, JSON Schema, etc to https://stencila.dev hosted on GitHub Pages | |
`version.yml` | Trigger the `release.yml` workflow when a version tag is created. | |
`release.yml` | Create a release, including building and publishing CLI, Docker image and SDKs. | |
`install.yml` | Test installation and usage of CLI, Docker image and SDKs across various operating systems and language versions. | |
Stencila is built on the shoulders of many open source projects. Our sincere thanks to all the maintainers and contributors of those projects for their vision, enthusiasm and dedication, and most of all for their hard work! The following open source projects in particular have an important role in the current version of Stencila. We sponsor these projects where, and to the extent, possible through GitHub Sponsors and Open Collective.
Link | Summary | |
---|---|---|
Automerge | A Rust library of data structures for building collaborative applications. | |
Clap | A Command Line Argument Parser for Rust. | |
NAPI-RS | A framework for building pre-compiled Node.js addons in Rust. | |
PyO3 | Rust bindings for Python, including tools for creating native Python extension modules. | |
Rust | A multi-paradigm, high-level, general-purpose programming language which emphasizes performance, type safety, and concurrency. | |
Serde | A framework for serializing and deserializing Rust data structures efficiently and generically. | |
Similar | A Rust library of diffing algorithms including Patience and Hunt–McIlroy / Hunt–Szymanski LCS. | |
Tokio | An asynchronous runtime for Rust which provides the building blocks needed for writing network applications without compromising speed. |
We wouldn’t be doing this without the support of these forward-looking organizations.
Thank you to all our contributors (not just the ones that submitted code!). If you made a contribution but are not listed here please create an issue, or PR, like this.