Adding sphinx boilerplate (taken from Xarray-Beam). I've added an index page and a "Why Xee" section. This PR also includes an outline of what the remaining documentation will look like. We include other project clean-up work here, too. PiperOrigin-RevId: 571189219
Showing 10 changed files with 296 additions and 25 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Minimal makefile for Sphinx documentation | ||
# | ||
|
||
# You can set these variables from the command line, and also | ||
# from the environment for the first two. | ||
SPHINXOPTS ?= | ||
SPHINXBUILD ?= sphinx-build | ||
SOURCEDIR = . | ||
BUILDDIR = _build | ||
|
||
# Put it first so that "make" without argument is like "make help". | ||
help: | ||
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) | ||
|
||
.PHONY: help Makefile | ||
|
||
# Catch-all target: route all unknown targets to Sphinx using the new | ||
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). | ||
%: Makefile | ||
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
index.md |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
# API docs | ||
|
||
```{eval-rst} | ||
.. currentmodule:: xee | ||
``` | ||
|
||
## Core extension | ||
|
||
```{eval-rst} | ||
.. autosummary:: | ||
:toctree: _autosummary | ||
EarthEngineBackendEntrypoint | ||
EarthEngineStore | ||
EarthEngineBackendArray | ||
``` | ||
|
||
## Utility functions | ||
|
||
```{eval-rst} | ||
.. autosummary:: | ||
:toctree: _autosummary | ||
geometry_to_bounds | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# Configuration file for the Sphinx documentation builder. | ||
# | ||
# This file only contains a selection of the most common options. For a full | ||
# list see the documentation: | ||
# https://www.sphinx-doc.org/en/master/usage/configuration.html | ||
|
||
# -- Path setup -------------------------------------------------------------- | ||
|
||
# If extensions (or modules to document with autodoc) are in another directory, | ||
# add these directories to sys.path here. If the directory is relative to the | ||
# documentation root, use os.path.abspath to make it absolute, like shown here. | ||
# | ||
# import os | ||
# import sys | ||
# sys.path.insert(0, os.path.abspath('.')) | ||
|
||
# Print Python environment info for easier debugging on ReadTheDocs | ||
|
||
import sys | ||
import subprocess | ||
import xee # verify this works | ||
|
||
print('python exec:', sys.executable) | ||
print('sys.path:', sys.path) | ||
print('pip environment:') | ||
subprocess.run([sys.executable, '-m', 'pip', 'list']) # pylint: disable=subprocess-run-check | ||
|
||
print(f'xee: {xee.__version__}, {xee.__file__}') | ||
|
||
# -- Project information ----------------------------------------------------- | ||
|
||
project = 'Xee' | ||
copyright = '2023, Google LLC' # pylint: disable=redefined-builtin | ||
author = 'The Xee authors' | ||
|
||
|
||
# -- General configuration --------------------------------------------------- | ||
|
||
# Add any Sphinx extension module names here, as strings. They can be | ||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom | ||
# ones. | ||
extensions = [ | ||
'sphinx.ext.autodoc', | ||
'sphinx.ext.autosummary', | ||
'sphinx.ext.napoleon', | ||
'myst_nb', | ||
] | ||
|
||
# Add any paths that contain templates here, relative to this directory. | ||
templates_path = ['_templates'] | ||
|
||
# List of patterns, relative to source directory, that match files and | ||
# directories to ignore when looking for source files. | ||
# This pattern also affects html_static_path and html_extra_path. | ||
exclude_patterns = ['_build', '_templates', 'Thumbs.db', '.DS_Store'] | ||
|
||
intersphinx_mapping = { | ||
'xarray': ('https://xarray.pydata.org/en/latest/', None), | ||
} | ||
|
||
# -- Options for HTML output ------------------------------------------------- | ||
|
||
# The theme to use for HTML and HTML Help pages. See the documentation for | ||
# a list of builtin themes. | ||
# | ||
html_theme = 'sphinx_rtd_theme' | ||
|
||
# Add any paths that contain custom static files (such as style sheets) here, | ||
# relative to this directory. They are copied after the builtin static files, | ||
# so a file named "default.css" will overwrite the builtin "default.css". | ||
html_static_path = ['_static'] | ||
|
||
# -- Extension config | ||
|
||
autosummary_generate = True | ||
|
||
# https://myst-nb.readthedocs.io/en/latest/use/execute.html | ||
jupyter_execute_notebooks = 'cache' | ||
# https://myst-nb.readthedocs.io/en/latest/use/formatting_outputs.html#removing-stdout-and-stderr | ||
nb_output_stderr = 'remove-warn' | ||
|
||
# https://stackoverflow.com/a/66295922/809705 | ||
autodoc_typehints = 'description' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Xee: A Google Earth Engine extension for Xarray | ||
|
||
Xee is an Xarray extension for Google Earth Engine. It aims to help users view | ||
Earth Engine's [data catalog](https://developers.google.com/earth-engine/datasets) | ||
through the lens of arrays. | ||
|
||
In this documentation, we assume readers have some familiarity with | ||
[Earth Engine](https://earthengine.google.com/), [Xarray](https://xarray.dev/), | ||
and Python. Here, we'll dive into core concepts related to the integration | ||
between these tools. | ||
|
||
## Contents | ||
|
||
<!-- TODO(#38): Documentation Plan | ||
- Why Xee? | ||
- Core features | ||
- `open_dataset()` | ||
- `open_mfdataset()` | ||
- Projections & Geometry | ||
- Xarray slicing & indexing 101 | ||
- Combining ee.ImageCollection and Xarray APIs. | ||
- Plotting | ||
- Lazy Evaluation & `load()` | ||
- Advanced projections | ||
- Performance tuning: A tale of two chunks | ||
- Walkthrough: calculating NDVI | ||
- Integration with Xarray-Beam | ||
- Integration with ML pipeline clients --> | ||
|
||
```{toctree} | ||
:maxdepth: 1 | ||
why-xee.md | ||
api.md | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
@ECHO OFF | ||
|
||
pushd %~dp0 | ||
|
||
REM Command file for Sphinx documentation | ||
|
||
if "%SPHINXBUILD%" == "" ( | ||
set SPHINXBUILD=sphinx-build | ||
) | ||
set SOURCEDIR=. | ||
set BUILDDIR=_build | ||
|
||
if "%1" == "" goto help | ||
|
||
%SPHINXBUILD% >NUL 2>NUL | ||
if errorlevel 9009 ( | ||
echo. | ||
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx | ||
echo.installed, then set the SPHINXBUILD environment variable to point | ||
echo.to the full path of the 'sphinx-build' executable. Alternatively you | ||
echo.may add the Sphinx directory to PATH. | ||
echo. | ||
echo.If you don't have Sphinx installed, grab it from | ||
echo.http://sphinx-doc.org/ | ||
exit /b 1 | ||
) | ||
|
||
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
goto end | ||
|
||
:help | ||
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% | ||
|
||
:end | ||
popd |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# doc requirements | ||
Jinja2==3.1.2 | ||
myst-nb==0.17.2 | ||
myst-parser==0.18.1 | ||
sphinx_rtd_theme==1.2.1 | ||
sphinx==5.3.0 | ||
scipy==1.10.1 | ||
|
||
# xee requirements | ||
xee[examples] @ git+https://github.com/google/xee.git |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
# Why Xee? | ||
|
||
We noticed two clusters of users working with climate and weather data at | ||
Google Research: Some were [Xarray](https://xarray.dev) (and | ||
[Zarr](https://zarr.dev/)) centric and others, Google Earth Engine centric. Xee | ||
came about as an effort to bring these two groups of developers closer together. | ||
|
||
## Goals | ||
|
||
Primary Goals: | ||
|
||
- Make [EE-curated data](https://developers.google.com/earth-engine/datasets) | ||
accessible to users in the Xarray community and to the wider scientific Python | ||
ecosystem. | ||
- Make it trivial to avoid quota limits when computing pixels from Earth Engine. | ||
- Provide an easy way for scientists and ML practitioners to coalesce Earth data | ||
at different scales into a common resolution. | ||
|
||
Secondary Goals: | ||
|
||
- Provide a succinct interface for querying Earth Engine data at scale (i.e. via | ||
[Xarray-Beam](https://xarray-beam.readthedocs.io/)). | ||
- Make it trivial to quickly [export Earth Engine data to Zarr](https://github.com/google/xee/tree/main/examples#export-earth-engine-imagecollections-to-zarr-with-xarray-beam). | ||
- Provide a compelling alternative to exporting Zarr in the first | ||
place (e.g. during the ML training process). | ||
|
||
## Approach | ||
|
||
With the addition of Earth Engine's [Pixel API](https://medium.com/google-earth/pixels-to-the-people-2d3c14a46da6), | ||
it became possible to easily get NumPy array data from `ee.Image`s. In building | ||
tools atop this, we noticed that the best practices for managing data were | ||
Xarray-shaped. For example: | ||
|
||
- Our codebases involved many similar LOC to translate between Earth Engine and | ||
arrays: Users typically thought in NumPy and molded EE's Python client to fit | ||
those idioms. | ||
- We often needed to page `computePixels()` requests in a way that's strikingly | ||
similar to Dask/Xarray's concept of [`chunks`](https://docs.xarray.dev/en/stable/user-guide/dask.html#what-is-a-dask-array). | ||
- Users were wrapping NumPy arrays within dataclasses to associate metadata and | ||
labels with data. | ||
|
||
In an attempt to group these disparate solutions into a singular interface, we | ||
experimented with wrapping `computePixels()` into | ||
[Xarray's standard mechanism for defining backends](https://docs.xarray.dev/en/stable/internals/how-to-add-new-backend.html). The result of this effort is Xee. | ||
|
||
|
||
## An array by any other name? (Xee vs Zarr) | ||
|
||
[Zarr](https://zarr.dev/) has been growing in relevance to the world of [cloud-based scientific data](https://doi.org/10.1109/MCSE.2021.3059437). | ||
Members of the open source community have [demonstrated](https://www.youtube.com/watch?v=0bqpxX3Nn_A) | ||
that Zarr is more of a data protocol than a data format. In many ways, | ||
Xee is inspired by this work. To this end, we'd like to point out some | ||
similarities and differences between Zarr-backed and Earth Engine-backed data in | ||
Xarray. | ||
|
||
Similarities: | ||
- **Xarray-compatible**: Of course, this library proves that both types of data | ||
stores can be compatible with Xarray. [Zarr](https://docs.xarray.dev/en/stable/user-guide/io.html#zarr) | ||
reading and writing is deeply integrated into Xarray as well. | ||
- **Optimal IO Chunks**: Ultimately, cloud-based data stores will inherently | ||
involve networking overhead. There are similarities in the best way to page | ||
data across a network into a local context: the optimal Zarr chunk | ||
size is around [10-100 MB](https://esipfed.github.io/cloud-computing-cluster/optimization-practices.html#chunk-size). With Earth Engine's backend, the maximum chunk size possible | ||
is 48 MB. | ||
|
||
Differences: | ||
- **Quota vs No Quota**: Since Earth Engine is API-based, there are quota | ||
restrictions that limit IO, namely a 100 QPS limit on data requests. Readers | ||
all need to be authenticated and tied to a GCP project quota. Zarr, on the | ||
other hand, has a lower-level access pattern. Reading is delegated to basic | ||
permissions on cloud buckets. | ||
- **On the fly vs up-front data shaping**: In Zarr, the representation of data | ||
at rest fundamentally influences performance at query time. For this reason, | ||
[rechunking](https://xarray-beam.readthedocs.io/en/latest/rechunking.html) and | ||
projecting is a common routine performed up front on Zarr when data does not | ||
quite fit the problem at hand. Earth Engine provides a more flexible interface | ||
than this. Since datasets are pyramided (either at [ingestion](https://developers.google.com/earth-engine/help_collection_criteria) or server-side), users are free to request the | ||
resolution and projection of the data during dataset open. Similarly, while | ||
Earth Engine's datasets do follow an internal chunking scheme, the chunking | ||
schemes available to users are a lot more fungible. | ||
|
||
We hope that this comparison provides the user with a set of useful precedents | ||
for working with cloud-based datasets. | ||
|
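The "Optimal IO Chunks" point in the Xee vs Zarr comparison above can be made concrete with a little arithmetic. The following is a minimal sketch and not part of this commit or of Xee itself: `chunk_nbytes`, `iter_chunk_slices`, the grid shape, and the chunk shape are all hypothetical, and the 48 MB ceiling quoted in the text is assumed here to mean 48 MiB. It estimates the size of one chunk and enumerates the index ranges that would be paged, one request per chunk.

```python
import itertools
import math

# Assumption: the "48 MB" maximum chunk size quoted in the text is read as 48 MiB.
MAX_REQUEST_BYTES = 48 * 1024 * 1024


def chunk_nbytes(chunk_shape, itemsize):
    """Return the size in bytes of one chunk, given bytes per element."""
    return math.prod(chunk_shape) * itemsize


def iter_chunk_slices(array_shape, chunk_shape):
    """Yield one tuple of slices per chunk, covering the whole array.

    Chunks on the trailing edge are clipped to the array bounds.
    """
    ranges = [
        range(0, size, step) for size, step in zip(array_shape, chunk_shape)
    ]
    for starts in itertools.product(*ranges):
        yield tuple(
            slice(start, min(start + step, size))
            for start, step, size in zip(starts, chunk_shape, array_shape)
        )


# Hypothetical grid: 24 time steps over a 2048 x 4096 pixel image,
# stored as 4-byte floats.
array_shape = (24, 2048, 4096)
chunk_shape = (8, 1024, 1024)

size = chunk_nbytes(chunk_shape, itemsize=4)
pages = list(iter_chunk_slices(array_shape, chunk_shape))

assert size <= MAX_REQUEST_BYTES, "chunk would exceed the assumed request limit"
print(f"{size // 2**20} MiB per chunk, {len(pages)} requests")
```

With these hypothetical shapes, each chunk is 32 MiB and the grid splits into 24 requests. Growing the time dimension of the chunk to 12 would land exactly on the assumed 48 MiB ceiling, which is the kind of trade-off between request count and request size that the comparison describes.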