Process & display the Circum-Arctic permafrost and ground ice map #41
This is a great idea. I'm curious how that relates to the two other general permafrost layers we have --
It seems like we could also show the Obu layers (soil temperature and permafrost probability) on our imagery viewer.
Update: Started staging the file
Staging has been running on Datateam for almost a full day and is at >604,000 staged files. I will continue to let it run while I take care of other higher-priority tasks.
Workflow update
I set up a parsl job on Datateam to execute rasterization and web-tiling in parallel, since rasterization was going very slowly without parallelization. After running over the weekend, rasters were produced for the staged tiles for z-11 (the highest z-level I set in the config), but a parsl error occurred during rasterization for z-level 10: …
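For reference, a minimal sketch of this kind of parsl setup, assuming a single-node HighThroughputExecutor; `rasterize_batch` is a hypothetical stand-in for the real rasterization call, and the Datateam-specific settings are omitted:

```python
import parsl
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.app.app import python_app

# one worker per core on the local node
parsl.load(Config(executors=[HighThroughputExecutor(cores_per_worker=1)]))

@python_app
def rasterize_batch(staged_paths):
    # hypothetical wrapper: each worker would rasterize one batch of
    # staged tiles by calling into the viz-raster package here
    return len(staged_paths)

# toy batches standing in for lists of staged tile paths
batches = [["tileA.gpkg", "tileB.gpkg"], ["tileC.gpkg"]]
futures = [rasterize_batch(b) for b in batches]
print([f.result() for f in futures])  # block until all batches complete
```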
I started a new parsl job to pick up where the workflow left off: restarting z-10, then the lower z-levels, then web-tiling.
Color palette based on an attribute
Visualized the web tiles for just z-11 and z-10 (the only z-levels I've been able to produce on Datateam so far without running into an OOM error, even with parallelization), using "coverage" for the statistic since this is just a first pass. I think it makes more sense to instead use an attribute of the data, such as "EXTENT". The documentation explains what the codes for this attribute mean (C = continuous, D = discontinuous, S = sporadic, I = isolated patches): …
47% of the observations have … We could also use the attribute "RELICT", which only contains "yes" or missing values.
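To sketch the direction this implies, the workflow config would point the statistic at the attribute rather than at coverage. The key names below follow the general shape of PDG viz-workflow configs but are assumptions here, not the exact pdgstaging schema:

```python
# hypothetical config excerpt: drive the palette with the EXTENT attribute
config = {
    "statistics": [
        {
            "name": "permafrost_extent",
            "property": "EXTENT",         # attribute of interest (assumed key name)
            "aggregation_method": "max",  # assumed reducer for overlapping polygons
            "palette": ["#e3d9f5", "#b59ee0", "#7b61c2", "#4d2e83"],
        }
    ]
}
```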
Update on dataset processing and assigning color palette to attribute
Over the weekend, the script staged and rasterized all tiles for all z-levels with no OOM errors. However, the web-tiling failed with: …

Debugging revealed that the issue was within `to_image()`:

```python
def to_image(self, image_data):
    """
    Create a PIL image from the pixel values.

    Parameters
    ----------
    image_data : pandas.DataFrame
        A dataframe with a pixel_row, pixel_col, and values column.

    Returns
    -------
    PIL.Image
        A PIL image.
    """
    min_val = self.min_val
    max_val = self.max_val
    nodata_val = self.nodata_val
    image_data = image_data.copy()
    rgba_list = self.rgba_list
    height = self.height
    width = self.width
    image_data = image_data.astype(float)
    no_data_mask = image_data == nodata_val
    # set the nodata value to np.nan
    if len(no_data_mask):
        image_data[no_data_mask] = np.nan
    ...
```

There will be more aspects of the viz-workflow to tweak before this dataset is complete, such as the palette. Hopefully, this is one step closer to adjusting the workflow to enable styling the 3D tiles and web tiles based on an attribute of interest (see viz-workflow issue #9).
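For context on what this step has to get right, here is a minimal sketch of the kind of nodata handling `to_image()` performs: a min-max normalization into RGBA with nodata pixels rendered transparent. The ramp below is illustrative only, not the pdgraster implementation:

```python
import numpy as np
from PIL import Image

def to_rgba_array(values, min_val, max_val, nodata_val):
    # flag nodata as NaN so it can be masked after normalization
    values = np.asarray(values, dtype=float)
    values[values == nodata_val] = np.nan
    norm = (values - min_val) / (max_val - min_val)
    rgba = np.zeros(values.shape + (4,), dtype=np.uint8)
    rgba[..., 0] = np.nan_to_num(norm) * 255         # simple red ramp
    rgba[..., 3] = np.where(np.isnan(norm), 0, 255)  # transparent nodata
    return rgba

# 2x2 pixel grid where -9999 marks missing data
img = Image.fromarray(to_rgba_array([[1, -9999], [3, 4]], 1, 4, -9999), "RGBA")
img.save("tile_demo.png")
```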
Removed Invalid Geometries
The invalid geometries that displayed like rings around the Arctic Circle, as shown above, resulted from the conversion to the CRS of the TMS during staging, and only from the geometries that intersected the antimeridian. There were 6 intersecting geometries, and removing them before starting the workflow resulted in the following tiles: …
The following script, clean_data.py, was used to process the data before executing the workflow.
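The script is not shown inline here; below is a minimal sketch of the step it performs, based on the detection logic used later in this thread (the hard-coded index of the prime-meridian polygon is dataset-specific):

```python
# clean_data.py sketch: drop the geometries that intersect the antimeridian
import geopandas as gpd

perm = gpd.read_file("/home/jcohen/permafrost_ground_layer/data/permaice.shp")
# keep only the polygons of interest
perm.dropna(subset = ["EXTENT"], inplace = True)

# polygons whose bounds span x = 0 cross either the antimeridian or the
# prime meridian (x = 0 corresponds to longitude 180 in this projection)
crosses = (perm.geometry.bounds["minx"] < 0) & (perm.geometry.bounds["maxx"] > 0)
am_polys = perm[crosses].drop([60])  # index 60 crosses the prime meridian instead

# remove the 6 antimeridian crossers and save the cleaned input
perm.drop(am_polys.index, inplace = True)
perm.to_file("permaice_clean.gpkg", driver = "GPKG")
```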
Next Steps
The final details to work out are: …
Note: a hex alpha of 66 (≈ 40% opacity) looks good, considering we want users to overlay IWP on this layer and still see the terrain.
Layer with blue palette
Layer with the default 40% opacity and IWP overlaid: …
When this layer was first uploaded to demo, the incorrect legend order revealed a bug, described in this issue. A fix was made in the XML so that the value of each legend item determines the order (1, 2, 3, 4), rather than the string label. The data for this layer are archived at the ADC at …
The web tiles are in: …
To do: …
@julietcohen, just exploring the CRS from this data...

```python
perm.crs.name
# 'Sphere_ARC_INFO_Lambert_Azimuthal_Equal_Area'
perm.crs.to_epsg()
# None
```
More CRS Info
```python
perm.crs.to_json_dict()
# output:
{'$schema': 'https://proj.org/schemas/v0.7/projjson.schema.json',
'type': 'ProjectedCRS',
'name': 'Sphere_ARC_INFO_Lambert_Azimuthal_Equal_Area',
'base_crs': {'name': 'GCS_Sphere_ARC_INFO',
'datum': {'type': 'GeodeticReferenceFrame',
'name': 'D_Sphere_ARC_INFO',
'ellipsoid': {'name': 'Sphere_ARC_INFO', 'radius': 6370997}},
'coordinate_system': {'subtype': 'ellipsoidal',
'axis': [{'name': 'Longitude',
'abbreviation': 'lon',
'direction': 'east',
'unit': {'type': 'AngularUnit',
'name': 'Degree',
'conversion_factor': 0.0174532925199433}},
{'name': 'Latitude',
'abbreviation': 'lat',
'direction': 'north',
'unit': {'type': 'AngularUnit',
'name': 'Degree',
'conversion_factor': 0.0174532925199433}}]}},
'conversion': {'name': 'unnamed',
'method': {'name': 'Lambert Azimuthal Equal Area (Spherical)',
'id': {'authority': 'EPSG', 'code': 1027}},
'parameters': [{'name': 'Latitude of natural origin',
'value': 90,
'unit': {'type': 'AngularUnit',
'name': 'Degree',
'conversion_factor': 0.0174532925199433},
'id': {'authority': 'EPSG', 'code': 8801}},
{'name': 'Longitude of natural origin',
'value': 180,
'unit': {'type': 'AngularUnit',
'name': 'Degree',
'conversion_factor': 0.0174532925199433},
'id': {'authority': 'EPSG', 'code': 8802}},
{'name': 'False easting',
'value': 0,
'unit': 'metre',
'id': {'authority': 'EPSG', 'code': 8806}},
{'name': 'False northing',
'value': 0,
'unit': 'metre',
'id': {'authority': 'EPSG', 'code': 8807}}]},
'coordinate_system': {'subtype': 'Cartesian',
'axis': [{'name': 'Easting',
'abbreviation': '',
'direction': 'east',
'unit': 'metre'},
{'name': 'Northing',
'abbreviation': '',
'direction': 'north',
   'unit': 'metre'}]}}
```

This seems to be a projection without an EPSG code. If you try setting it to EPSG:9820 (…). I'm not sure, but I have a feeling that some of the projection information needed is not handled properly by geopandas (or, more accurately, by the PROJ library that geopandas uses under the hood). I wonder if it's possible to re-project in ArcMap first. Also, I don't know if this might help in splitting the polygons, but there is some prime meridian info in the CRS object: …
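One way to pull out that prime meridian info, assuming the standard pyproj API (geopandas stores the CRS as a pyproj.CRS, so its properties are available directly):

```python
# inspect the prime meridian of the dataset's CRS via pyproj properties
pm = perm.crs.prime_meridian
print(pm.name, pm.longitude, pm.unit_name)
```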
Thanks for the insight, @robyngit! That helps clarify the confusing outputs and info (and lack of info) that I found online about this projection as well. I think you're right that there's information that geopandas can't handle, because an error is also returned when reprojecting an antimeridian line in WGS84 to the CRS of the input data.

Reproduce error
```python
import geopandas as gpd
from shapely.geometry import LineString

# raw input data
input = "/home/jcohen/permafrost_ground_layer/data/permaice.shp"
perm = gpd.read_file(input)
# drop rows that have missing value for extent attribute
# (subsets the data to only the polygons of interest)
perm.dropna(subset = ['EXTENT'], inplace = True)
# create new gdf of just the polygons that have a neg min longitude and a pos max longitude
# this indicates that the polygon crosses the antimeridian or prime meridian (7 polygons)
meridian_polys = (perm['geometry'].bounds['minx'] < 0 ) & (perm['geometry'].bounds['maxx'] > 0)
# subset the data to just polygons that cross either of the merdidians
prob_polys = perm[meridian_polys].copy()
# subset the prob_polys to those that cross antimeridian, not prime meridian
# the only polygon that crosses the prime meridian is at index 60
polys_AM = prob_polys.drop([60], inplace = False)
# create a LineString from the north pole to south pole along antimeridian
line = LineString([(180, 90), (180, -90)])
# Create a GeoDataFrame with the line
line = gpd.GeoDataFrame({'geometry': [line]}, crs = 'EPSG:4326')
# reproject to the crs of input data
line.to_crs(polys_AM, inplace=True)
```
Output:
```
ValueError: The truth value of a GeoDataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
```

I will look into your suggestion to reproject this shapefile in ArcMap.
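As a side note on this particular traceback (an observation about the call, not a fix for the underlying projection problem): `to_crs` expects a CRS or EPSG code rather than a GeoDataFrame, so passing the `.crs` attribute avoids this specific ValueError, though whether the reprojection then succeeds is a separate question:

```python
# pass the CRS object itself, not the GeoDataFrame that carries it
line = line.to_crs(polys_AM.crs)
```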
Update: splitting polygons to enable clean transformation to EPSG:4326
Converting the original data to EPSG:4326 in QGIS did not work because the uploaded file does not have a known EPSG; the Lambert projection is labeled as a custom CRS. QGIS differs from ArcMap (although the specific differences are unknown to me), so there is a chance this may still work in ArcMap, but I do not have access to ArcMap at the moment (maybe I can get access through a coworker at NCEAS). Instead of doing that CRS conversion with that software, I continued trying to split the polygons with a buffer at the antimeridian.

Attempted programmatic approaches
In order to split the polygons that cross the antimeridian, I tried to use …

Buffer AM line
```python
import geopandas as gpd
from shapely.geometry import LineString
from shapely.ops import split

# read in polygons that cross the AM
polys = gpd.read_file("~/permafrost_ground_layer/split_AM_polys/AM_polygons.gpkg")
# first create a line gdf at the 180th longitude,
# which is where the center is in this dataset according to the metadata,
# but the units are meters, not degrees, so use 20,000,000 instead of 180
line_geometry = LineString([(0, -20000000), (0, 20000000)])
am = gpd.GeoDataFrame(geometry = [line_geometry])
# set the CRS to that of the polygons that need to be split
am.set_crs(polys.crs, inplace = True)
buffer = 600
am_buffered = am.buffer(buffer)
# create empty lists to store the split polygons and their attributes
geoms_all = []
atts_all = []
# iterate over each geometry that crosses the antimeridian
for index, row in polys.iterrows():
    # define the geometry and attributes separately
    geom = row['geometry']
    atts = row.drop('geometry')
    # append the atts to atts_all so we can append them to geoms after the loop
    atts_all.append(atts)
    # subset the buffered line to the part that intersects this geometry
    line = am_buffered.loc[am_buffered.geometry.intersects(geom)]
    # split the single geometry at the buffered antimeridian,
    # outputting multiple geometries stored within a "geometrycollection" per row
    split_polys = split(geom, line.geometry.iloc[0])
    # convert the split polygons to a gdf in order to concatenate with the attributes
    #split_polys_gdf = gpd.GeoDataFrame(geometry = split_polys)
    #polys_with_atts = pd.concat([split_polys, atts], axis = 1)
    # add split polygons to the output list
    geoms_all.append(split_polys)
# create a GeoDataFrame from the split polygons
geoms_all_gdf = gpd.GeoDataFrame(geometry = geoms_all)
geoms_all_gdf.set_crs(polys.crs, inplace = True)
geoms_all_gdf_exploded = geoms_all_gdf.explode().reset_index(drop = True)
split_84 = geoms_all_gdf_exploded.to_crs(epsg = "4326", inplace = False)
```
Unfortunately, an error resulted: …

Next I tried to split the polygons with the original (not buffered) antimeridian:
```python
for split_geom in split_polys.geoms:
    if split_geom.intersects(line.geometry.iloc[0]):
        buffered_split = split_geom.buffer(buffer_distance, single_sided = True)
        geoms_all.append(buffered_split)
```
The output does not look buffered when mapped, even when the buffer distance is huge.

QGIS approach
While I have not 100% given up hope that … Converting these geoms to EPSG:4326 shows no wrapping around the world: …
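In hindsight, a likely explanation for the error from the buffered-split attempt, assuming shapely's documented behavior: `shapely.ops.split` only accepts line-type splitters for polygons, so a buffered (polygon) splitter is rejected, whereas taking the `difference` with a thin band is supported; that difference-based approach is what the cleaning script below ends up using:

```python
from shapely.geometry import LineString, Polygon
from shapely.ops import split

poly = Polygon([(-10, -10), (10, -10), (10, 10), (-10, 10)])
line = LineString([(0, -20), (0, 20)])

halves = split(poly, line)  # fine: a GeometryCollection of two polygons
band = line.buffer(1)       # a polygon splitter
# split(poly, band) raises an error, since polygon splitters are unsupported;
# subtracting the band instead yields a MultiPolygon with a thin gap
pieces = poly.difference(band)
print(len(halves.geoms), len(pieces.geoms))  # 2 2
```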
I am not sure how to define an object or a …
I have come up with a new cleaning script that does the following steps programmatically:
1. remove NA values from the extent attribute
2. add a dummy coding for the extent attribute
3. split polygons that intersect the antimeridian
clean_perm_v3.py
```python
# Clean permaice data from Brown et al. 2002
# Data source: https://nsidc.org/data/ggd318/versions/2
# Author: Juliet Cohen
# Date: 2024-05-01

# Cleaning steps include:
# 1. remove NA values from extent attribute
# 2. add a dummy coding for the extent attribute
# 3. split polygons that intersect the antimeridian

# conda env: viz_3-10_local
import geopandas as gpd
import pandas as pd
import numpy as np
from shapely.geometry import LineString

input = "/home/jcohen/permafrost_ground_layer/data/permaice.shp"
perm = gpd.read_file(input)
# drop rows that have a missing value for the extent attribute
# since this is the attribute to visualize
perm.dropna(subset = ['EXTENT'], inplace = True)

# create a new gdf of just the polygons that have a negative min longitude
# and a positive max longitude, which indicates that the polygon crosses
# the antimeridian or prime meridian (7 polygons)
meridian_polys = (perm['geometry'].bounds['minx'] < 0) & (perm['geometry'].bounds['maxx'] > 0)
# subset the data to just polygons that cross either of the meridians
# and retain all attributes by appending `.copy()`
prob_polys = perm[meridian_polys].copy()
# subset the prob_polys to those that cross the antimeridian, not the prime
# meridian; the only polygon that crosses the prime meridian is at index 60
polys_AM = prob_polys.drop([60], inplace = False)
# remove the antimeridian polygons from the original gdf
# so when the split ones are appended later, there won't be duplicates
perm.drop(polys_AM.index, inplace = True)

# Split polygons that cross the antimeridian
# Step 1: create a line at the 180th longitude,
# which is where the center is in this dataset according to the metadata;
# the units are meters, not degrees, so use 20,000,000 instead of 180
am = gpd.GeoSeries(LineString([(0, -20000000), (0, 20000000)]))
am.set_crs(perm.crs, inplace = True)
# buffer the line with 1 meter (units of CRS) to convert it to a polygon
am_buffered = am.buffer(distance = 1,
                        cap_style = 2,
                        join_style = 2,
                        mitre_limit = 2)

# create an empty list to store the split polygons with their attributes
all_data = []
# iterate over each geometry that crosses the antimeridian
for index, row in polys_AM.iterrows():
    # define the geometry and attributes separately
    geom = row['geometry']
    atts = gpd.GeoDataFrame(row.drop('geometry').to_frame().T)
    # split the single geometry with the buffered antimeridian GeoSeries,
    # outputting multiple geoms stored within a MultiPolygon
    split_geoms_MP = geom.difference(am_buffered)
    # make the index match the atts to concat correctly
    split_geoms_MP.index = atts.index
    # convert to a GDF to define the geometry col
    split_geoms_MP_gdf = gpd.GeoDataFrame(geometry = split_geoms_MP)
    split_geoms_MP_gdf.set_crs(perm.crs, inplace = True)
    MP_with_atts = pd.concat([split_geoms_MP_gdf, atts], axis = 1)
    # MP_with_atts.reset_index(inplace = True) # not sure if this is needed
    P_with_atts = MP_with_atts.explode(ignore_index = False,
                                       index_parts = False)
    # collect the exploded geometries with their attributes
    all_data.append(P_with_atts)

# create an empty gdf to store the final result
all_data_concat = gpd.GeoDataFrame()
# iterate over each gdf in all_data, concatenating into a single gdf
for gdf in all_data:
    all_data_concat = pd.concat([all_data_concat, gdf],
                                ignore_index = True)
all_data_concat.reset_index(drop = True, inplace = True)
# append the split polygons to the same gdf as the other polygons
perm = pd.concat([perm, all_data_concat], ignore_index = True)

# add a column that codes the categorical extent strings into numbers
# in order to do stats with the workflow and assign the palette to this;
# first, define the conditions and choices for the new extent_code attribute
conditions = [
    (perm['EXTENT'] == "C"),
    (perm['EXTENT'] == "D"),
    (perm['EXTENT'] == "S"),
    (perm['EXTENT'] == "I")
]
choices = [4, 3, 2, 1]
# assign extent_code based on the conditions and choices
perm['extent_code'] = np.select(conditions, choices)

# save as a cleaned input file for the viz workflow
output_file = "/home/jcohen/permafrost_ground_layer/split_AM_polys/cleaned_0501/permaice_clean_split.gpkg"
perm.to_file(output_file, driver = "GPKG")
print("Cleaning complete.")
```
In order to integrate this approach as a generalized step in …
The limiting factor may be the bounding box created in viz-staging here. Perhaps the bottom of the bounding box is hard-coded to a certain latitude somehow.
In viz-staging, we use geopandas …
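One quick way to check that, assuming the tile grid is addressable through morecantile (which the PDG visualization packages use for tile matrix sets), is to inspect the TMS bounds directly and see whether the grid itself imposes a latitude limit:

```python
import morecantile

# WGS1984Quad is an illustrative TMS choice here
tms = morecantile.tms.get("WGS1984Quad")
print(tms.bbox)  # overall bounds of the tile matrix set
```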
It was requested that we display a permafrost layer as the base layer when someone visits the PDG for the first time. The permafrost layer could be combined with the ice-wedge polygon map, so the two would be presented together as base layers.
Feedback collected by Moss & Andrew (the K-12 teacher in Bethel) indicated that people coming to the PDG do not see or understand right away that the PDG's focus is on permafrost.
The ideal permafrost layer would be the Circum-Arctic Map of Permafrost and Ground-Ice Conditions, Version 2.
According to the metadata, it looks to be a 24 MB shapefile.