✂️ calculate polygon mask for netCDF/GRIB/raster

nzahasan/pyscissor

pyscissor

Supported Version Action: Build publish License: MIT
A Python 3 module for extracting data from a netCDF file within a shapefile region.

Installation

pyscissor can be installed using the following commands

$ git clone https://github.com/nzahasan/pyscissor.git
$ cd pyscissor
$ python3 setup.py install

or using pip

$ pip install pyscissor

Using pyscissor

import fiona
import numpy as np
from netCDF4 import Dataset
from shapely.geometry import shape
from pyscissor import scissor 


# read shapefile
sf = fiona.open('data/shape.geojson')
shapely_shp = shape(sf.get(0)['geometry'])


# read netcdf
nf = Dataset('data/sample_2.nc','r')
lats = nf.variables['lat'][:]
lons = nf.variables['lon'][:]

# read one time step of the variable to average
# (assumes `tmin` has dimensions (time, lat, lon), as in the command line example)
var = nf.variables['tmin'][0]

# create scissor object
pys = scissor(shapely_shp,lats,lons)

weight_grid = pys.get_masked_weight() #=> returns a masked array containing weights

# get weighted average
avg = np.average(var,weights=weight_grid)

# if only intersection mask with shape is needed use `weight_grid.mask`
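
The weighted average above covers a single 2-D field. As a rough sketch of how the same idea extends to a time series, the example below uses synthetic data in place of a real netCDF variable. The array `var`, the grid `weights`, and the helper `weighted_series` are illustrative names, not part of pyscissor, and the sketch assumes that cells outside the shape are masked in the weight grid.

```python
import numpy as np

# Synthetic stand-ins (assumptions): a (time, lat, lon) variable and a
# 2-D masked weight grid shaped like the output of get_masked_weight().
var = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
weights = np.ma.masked_equal(
    np.array([[0.0, 0.5, 1.0],
              [0.0, 0.25, 0.25]]), 0.0)


def weighted_series(var3d, weight_grid):
    """Return one weighted average per time step (hypothetical helper)."""
    cell_mask = np.ma.getmaskarray(weight_grid)
    series = []
    for step in var3d:
        # Hide cells outside the shape so they cannot affect the result.
        masked_step = np.ma.array(step, mask=cell_mask)
        series.append(float(np.ma.average(masked_step, weights=weight_grid)))
    return series


series = weighted_series(var, weights)
print(series)
```

`np.ma.average` is used here instead of `np.average` only to make the handling of masked cells explicit. Either call should give the same value when the masked cells are the ones with no overlap.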

A detailed use case can be found in the following Jupyter notebooks

Using nc2ts_by_shp.py

This package includes the nc2ts_by_shp.py script, a command line tool for quickly extracting a reduced (min/max/average/weighted average) time series from a netCDF file using a shapefile.

# with 3d array [data/sample_2.nc] general case
$ nc2ts_by_shp.py -nc=sample_2.nc -nci='Y=lat;X=lon;T=time;V=tmin;' -sf=shape_esri.zip \
		-sfp='ADM2_EN;ADM3_EN' -r=avg -o=test2.csv

# with 4d array [data/sample_1.nc]
$ nc2ts_by_shp.py -nc=sample_1.nc -nci='Y=lat;X=lon;T=time;V=temperature;slicer=[:,0,:,:]' -sf=shape_esri.zip \
		-sfp='ADM2_EN;ADM3_EN' -r=wavg -o=test1.csv

Options:

-nc  = netcdf file

-nci = netcdf variable and dimension information
		available options:
		X = x dimension variable name,
		Y = y dimension variable name,
		T = time dimension variable name,
		V = variable name,
		slicer = slicing index for obtaining 3d array [optional]
				
		note: `slicer` is required if the variable has more than three dimensions

-sf  = shapefile (can be a zipped shapefile, a shapefile or a geojson file)

-sfp = shapefile properties
		only required when shapefile contains multiple records

-r   = reducer, default is average
		Available options: min,max,avg,wavg

-o   = output file name
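
To make the `-nci` format more concrete, here is a minimal sketch of how such a string could be split into its parts and how a `slicer` value could be turned into an array index. The functions `parse_nci` and `parse_slicer` are hypothetical and written only for illustration. They are not taken from nc2ts_by_shp.py, which may parse its arguments differently.

```python
def parse_nci(nci):
    """Split 'KEY=value;KEY=value;' into a dict, skipping empty fields."""
    info = {}
    for field in nci.split(';'):
        field = field.strip()
        if not field:
            continue
        # Split on the first '=' only, so the value may contain anything else.
        key, _, value = field.partition('=')
        info[key.strip()] = value.strip()
    return info


def parse_slicer(text):
    """Turn a string such as '[:,0,:,:]' into a tuple usable as an index."""
    index = []
    for part in text.strip().strip('[]').split(','):
        part = part.strip()
        if ':' in part:
            # An empty bound means "no limit", as in normal Python slicing.
            bounds = [int(p) if p.strip() else None for p in part.split(':')]
            index.append(slice(*bounds))
        else:
            index.append(int(part))
    return tuple(index)


info = parse_nci('Y=lat;X=lon;T=time;V=temperature;slicer=[:,0,:,:]')
index = parse_slicer(info['slicer'])
print(info['V'], index)
```

Applying `index` to a 4-D array (for example `var4d[index]`) fixes the second dimension at position 0 and leaves a 3-D array, which is the reduction that `slicer` is described as performing.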

Causes of erroneous output

- the shapefile and the netCDF file use different projections
- the shapefile doesn't lie fully within the netCDF bounds
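
The bounds condition can be checked before any extraction is attempted. The sketch below assumes the shape bounds come as a `(min_x, min_y, max_x, max_y)` tuple, which is the order shapely's `.bounds` attribute uses, and that `lats` and `lons` are 1-D coordinate sequences. The function name `shape_within_grid` is hypothetical and not part of pyscissor. It compares against cell-centre coordinates, so it is slightly conservative, and it does not detect projection mismatches or differing longitude conventions (0 to 360 versus -180 to 180).

```python
def shape_within_grid(shape_bounds, lats, lons):
    """Return True if the shape's bounding box lies inside the grid extent."""
    min_x, min_y, max_x, max_y = shape_bounds
    return (min(lons) <= min_x and max_x <= max(lons)
            and min(lats) <= min_y and max_y <= max(lats))


# Synthetic grid coordinates (assumption): 1-degree spacing.
lats = [20.0, 21.0, 22.0, 23.0]
lons = [88.0, 89.0, 90.0, 91.0]

inside = shape_within_grid((88.5, 20.5, 90.5, 22.5), lats, lons)
outside = shape_within_grid((87.0, 20.5, 90.5, 22.5), lats, lons)
print(inside, outside)
```

With a shapely geometry, the call would look like `shape_within_grid(shapely_shp.bounds, lats, lons)`.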