Rather than implicitly setting (or assuming) a Coordinate Reference System (CRS) for a Sedona dataframe derived from a shapefile, it would be helpful to include the known CRS as a dataframe property or as EWKT geometry.
Suggestion:
The CRS can be obtained from the corresponding shapefile .prj file (if one exists) using the parent ID property of the WKT CRS definition based on an OGC WKT CRS standard (https://www.ogc.org/standard/wkt-crs/). If the .prj file is not formatted to this standard or does not exist then no CRS shall be retrieved.
The parent ID for this WKT CRS example is ["EPSG": 7930]
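The lookup suggested above could be sketched roughly as follows. This is only an illustration, not Sedona code: the function name `epsg_from_prj` is hypothetical, the sample WKT strings are made up for the example, and a real implementation would more likely delegate to a full WKT parser (such as the one in GeoTools) than to a regular expression. The sketch walks the .prj text, tracks bracket depth, and returns the EPSG code of the ID node (or the older AUTHORITY node) that is a direct child of the root CRS element, ignoring IDs of nested elements.

```python
import re
from typing import Optional

# Matches a WKT2 node such as ID["EPSG",7930] or a WKT1 node such as
# AUTHORITY["EPSG","7930"], capturing the numeric code.
_ID_NODE = re.compile(
    r'(?:ID|AUTHORITY)\s*[\[(]\s*"EPSG"\s*,\s*"?(\d+)"?', re.IGNORECASE
)


def epsg_from_prj(wkt: str) -> Optional[int]:
    """Return the EPSG code of the root-level ID node, or None if absent."""
    depth = 0
    in_quotes = False
    for i, ch in enumerate(wkt):
        if ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue  # skip text inside quoted names
        elif ch in "[(":
            depth += 1
        elif ch in "])":
            depth -= 1
        elif depth == 1 and ch in "IiAa":
            # Only consider keywords that start here, not ones that merely
            # contain these letters (for example BASEGEOGCRS).
            if i == 0 or not wkt[i - 1].isalnum():
                match = _ID_NODE.match(wkt, i)
                if match:
                    return int(match.group(1))
    return None


# Made-up sample: the nested ID (1111) belongs to the base CRS and is ignored.
sample = (
    'PROJCRS["example projected",'
    'BASEGEOGCRS["example base",ID["EPSG",1111]],'
    'ID["EPSG",7930]]'
)
parent_id = epsg_from_prj(sample)
```

If the .prj file is missing, or its text carries no root-level ID, the function returns None, which matches the "no CRS shall be retrieved" case described above.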
Expected behavior
When reading a shapefile from a folder path using the ShapefileReader class (using any method: readToGeometryRDD, readToPolygonRDD, readToLineStringRDD), include a dataframe property (or return geometry as EWKT) that stores the shapefile CRS as defined in shapefile.prj (if it exists in the input folder).
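To make the EWKT option concrete: EWKT prefixes ordinary WKT with an `SRID=<code>;` tag. The helper below is a hypothetical sketch of what the reader's output could look like, not an existing Sedona function, and the point coordinates are arbitrary.

```python
from typing import Optional


def to_ewkt(wkt: str, srid: Optional[int]) -> str:
    """Prefix WKT with an SRID tag; leave it unchanged when no SRID is known."""
    if srid is None:
        return wkt
    return f"SRID={srid};{wkt}"


# With a CRS found in the .prj file, and without one.
with_crs = to_ewkt("POINT (1 2)", 7930)
without_crs = to_ewkt("POINT (1 2)", None)
```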
Actual behavior
Shapefile geometry is retrieved as WKT without any CRS information, even if it exists in the .prj file. This makes it difficult to work with a variety of shapefile inputs that may be based on a variety of coordinate systems.
Steps to reproduce the problem
Example usage here is to transform to a desired coordinate system regardless of the input coordinate system (i.e. without calling ST_SetSRID), making use of the sedona-1.5.1 supported format ST_Transform (A: Geometry, TargetCRS: String) as described here: https://sedona.apache.org/1.5.0/api/sql/Function/#st_transform.
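from sedona.core.formatMapper.shapefileParser import ShapefileReader
from sedona.utils.adapter import Adapter
# SEDONA and shp_file are defined elsewhere in the job (not shown in this report)
shp_rdd = ShapefileReader.readToGeometryRDD(SEDONA, shp_file)
shp_df = Adapter.toDf(shp_rdd, SEDONA)
shp_df.createOrReplaceTempView("shp_data")
output_df = SEDONA.sql("select ST_Transform(geometry, 'EPSG:4326') as geometry from shp_data")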
This returns the following error:
Source CRS must be specified. No SRID found on geometry.
Settings
Sedona version = 1.5.1
Apache Spark version = 3.3.0
API type = Python
Python version = 3.10
Environment = AWS Glue 4.0 using sedona-spark-shaded-3.0_2.12-1.5.1.jar and geotools-wrapper-1.5.1-28.2.jar
@adamaps Thanks for raising this issue. We are thinking of completely re-writing our Shapefile reader using Dataframe API.
For now, you can use ST_SetSRID to set the SRID on your geometries, assuming all geometries are in the same CRS. Alternatively, pass the source CRS explicitly: ST_Transform(geometry, 'EPSG:XXXX', 'EPSG:4326').
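A minimal sketch of the second workaround, assuming the source CRS is already known to the caller. The helper name `transform_sql` is hypothetical, `EPSG:7930` is simply the ID from the issue's example, and the `shp_data` view is the one registered in the reproduction steps. The helper only builds the SQL text; running it still requires the reporter's Sedona session.

```python
def transform_sql(view: str, source_crs: str, target_crs: str = "EPSG:4326") -> str:
    """Build a query that uses the three-argument ST_Transform form."""
    return (
        f"SELECT ST_Transform(geometry, '{source_crs}', '{target_crs}') AS geometry "
        f"FROM {view}"
    )


query = transform_sql("shp_data", "EPSG:7930")
# output_df = SEDONA.sql(query)  # SEDONA is the session from the original report
```

Because the source CRS is passed as an argument, the geometries do not need an SRID set beforehand.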