This package adds drivers for the Intake library that use the Satpy library to read data files. It also depends on intake-xarray, which defines the Xarray container type (xarray Dataset) that these Satpy-based drivers produce.
To add this package to an existing pip-based environment, run:
pip install intake-satpy
Or, if you have a conda-based environment, install it from the conda-forge channel:
conda install -c conda-forge intake-satpy
This package currently supplies only one Intake driver, named satpy.
As with any Intake driver, the satpy driver can be used in a couple of different ways. A few examples are shown below.
Once the intake-satpy package is installed, you can use this driver by calling intake.open_satpy. At the time of writing, it is best to give Satpy as much configuration as you can by passing scene_kwargs and load_kwargs.
import intake
from glob import glob
data_source = intake.open_satpy(
    glob("/data/satellite/abi/*.nc"),
    scene_kwargs={"reader": "abi_l1b"},
    load_kwargs={"wishlist": ["C01"]},
)
dataset = data_source.read_chunked()
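One thing to watch for in the example above: glob returns an empty list when the pattern matches nothing, and that may only surface later as a less obvious error. The sketch below is one possible way to fail early. The helper names find_files and open_abi_source are hypothetical and not part of intake-satpy; only the intake.open_satpy call and its arguments are taken from the example above.

```python
from pathlib import Path


def find_files(base_dir, pattern="*.nc"):
    """Return sorted file paths under base_dir that match pattern.

    Raises FileNotFoundError when nothing matches, so an empty list
    never reaches the driver.
    """
    paths = sorted(str(p) for p in Path(base_dir).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {base_dir}")
    return paths


def open_abi_source(base_dir):
    """Build the same source as the example above from a checked file list."""
    # Imported here so find_files can be used without intake installed.
    import intake

    return intake.open_satpy(
        find_files(base_dir),
        scene_kwargs={"reader": "abi_l1b"},
        load_kwargs={"wishlist": ["C01"]},
    )
```

With this in place, open_abi_source("/data/satellite/abi") would raise immediately if the directory holds no NetCDF files.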
The read_chunked method returns an xarray Dataset object containing the products that Satpy was able to create, with the data represented as dask arrays underneath. The data_source.to_dask() method produces the same result. The data_source.read() method returns the same xarray Dataset object, but with the data loaded into memory as numpy arrays. Take care with read(): the large satellite formats read by Satpy can quickly fill up your system's memory when loaded this way.
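To get a feel for how quickly memory fills up, a rough back-of-the-envelope estimate can help before calling read(). The sketch below is only an illustration: the function name estimate_nbytes is hypothetical, and the 10848 x 10848 grid and 4-byte (float32) element size are assumed values, not something reported by Satpy.

```python
import math


def estimate_nbytes(shape, itemsize=4, n_products=1):
    """Rough in-memory size, in bytes, of n_products arrays of this shape."""
    return math.prod(shape) * itemsize * n_products


# Assumed example: one product on a 10848 x 10848 grid stored as float32.
one_product = estimate_nbytes((10848, 10848), itemsize=4)
print(f"{one_product / 1e9:.2f} GB for one product")

# The total grows linearly with the number of products loaded.
sixteen_products = estimate_nbytes((10848, 10848), itemsize=4, n_products=16)
print(f"{sixteen_products / 1e9:.2f} GB for sixteen products")
```

For a real dataset returned by read_chunked(), xarray's dataset.nbytes attribute gives a comparable figure without loading the data into memory.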
By default, if wishlist is not provided as a load keyword argument (see above), all available "reader"-level products are loaded. These are the products that can be read directly from the files; Satpy "composites" are not included.
Also by default, the loaded dataset is "resampled" using Satpy's "native" resampler to the finest resolution of the loaded products. This allows all products to exist in a single xarray Dataset object. This behavior can be customized by passing resample_kwargs when creating the source (the open_satpy call).
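Conceptually, bringing a coarser product onto a finer grid by a whole-number factor amounts to repeating each pixel along both axes. The pure-Python sketch below illustrates that idea only; it is not Satpy's implementation, and the function name replicate is hypothetical.

```python
def replicate(grid, factor):
    """Repeat every pixel `factor` times along both axes."""
    out = []
    for row in grid:
        # Widen the row by repeating each value `factor` times.
        wide = [value for value in row for _ in range(factor)]
        # Then repeat the widened row `factor` times.
        out.extend(list(wide) for _ in range(factor))
    return out


coarse = [[1, 2],
          [3, 4]]
fine = replicate(coarse, 2)  # 2x2 grid becomes 4x4
for row in fine:
    print(row)
```

Because every coarse pixel becomes factor x factor fine pixels, products upsampled this way take correspondingly more memory once loaded.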
The satpy driver can also be used in a catalog definition. See the examples/local_abi_l1b.yaml catalog definition file for an example. With a catalog like this, you could then do:
import intake
cat = intake.open_catalog("examples/local_abi_l1b.yaml")
source = cat.abi_l1b(base_dir="/data/satellite/abi")
dataset = source.read_chunked()
A wishlist of products to load can be provided to the source when creating it:
cat = intake.open_catalog("examples/local_abi_l1b.yaml")
source = cat.abi_l1b(
    base_dir="/data/satellite/abi",
    load_kwargs={"wishlist": ["C01"]},
)
dataset = source.read_chunked()
As with the inline usage, if wishlist is not provided, all reader-level products will be loaded.
Some of Satpy's readers can also read data from remote storage such as S3 buckets. An example catalog is included in the examples/ directory of the intake-satpy repository.
Note that reading S3 files with Satpy is currently very slow, although this is being worked on. It is likely not suitable for loading data from outside the network that hosts the S3 storage (AWS in this example) until future updates to Satpy and NetCDF are made.
import intake
cat = intake.open_catalog("examples/aws_abi_l1b_20220101_18.yaml")
source = cat.abi_l1b()
dataset = source.read_chunked()