I have around 200k polygons in a shapefile, and I want to dissolve the polygons that are connected to each other. ArcGIS offers simple techniques to achieve this, but I was wondering if there are quicker ways to do it. I’ve tried the following but it took ages to execute.
import dask_geopandas as dd

# Read the shapefile
ddf = dd.read_file(input_shapefile, npartitions=10)

# Dissolve polygons that are connected with each other
ddf['dissolve'] = 1  # Create a dummy column for dissolving
dissolved_gdf = ddf.dissolve('dissolve', split_out=11, sort=False)

# Explode the dissolved multipolygon into individual polygons and reset index
dissolved_gdf = dissolved_gdf.explode().reset_index(drop=True)

# Add an index column
dissolved_gdf['index'] = dissolved_gdf.index
dissolved_gdf.compute().to_file(output_shapefile_filled, use_arrow=True)
Do you need dask-geopandas? If you are fine with vanilla geopandas, this will be much easier, and 200k polygons should be perfectly fine.
You need to identify connected components and dissolve by a component label. That is tricky in a distributed setting, but in a single GeoDataFrame it is easy with the help of libpysal (or scipy alone).
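A minimal sketch of the single-GeoDataFrame approach using scipy only. It assumes geopandas 1.0 or later (where `sindex.query` accepts an array of geometries) and treats two polygons as connected when they intersect; the function name `dissolve_connected` and the `component` column are hypothetical names chosen for illustration, not part of any library.

```python
import geopandas as gpd
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components
from shapely.geometry import box


def dissolve_connected(gdf):
    """Merge each group of mutually intersecting polygons into one geometry."""
    gdf = gdf.reset_index(drop=True)
    n = len(gdf)
    # Pairs of positional indices whose geometries intersect
    # (each polygon also pairs with itself, which is harmless).
    left, right = gdf.sindex.query(gdf.geometry, predicate="intersects")
    # Adjacency matrix of the "intersects" graph.
    adjacency = coo_matrix((np.ones(len(left)), (left, right)), shape=(n, n))
    # One integer label per polygon; connected polygons share a label.
    _, labels = connected_components(adjacency, directed=False)
    gdf["component"] = labels  # hypothetical column name
    return gdf.dissolve(by="component").reset_index()


# Toy data: two squares sharing an edge, plus one isolated square.
gdf = gpd.GeoDataFrame(
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(5, 5, 6, 6)]
)
result = dissolve_connected(gdf)
```

Dissolving by component label avoids building one giant multipolygon and exploding it afterwards, which is what the dummy-column approach does.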
Thanks! Yes, I do need Dask since I’ll be processing millions of polygons. I applied my function with map_partitions, and it worked. However, the problem now is that converting the result to a GeoPandas GeoDataFrame takes a long time.
map_partitions will work only if you ensure that a single component always lies within a single partition. If a component stretches across multiple partitions, the approach will not work.
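One way to check that condition, sketched under the assumption that you already have a component label and a partition number for every polygon. The helper `components_spanning_partitions` is hypothetical and uses plain pandas; it is not part of dask-geopandas.

```python
import pandas as pd


def components_spanning_partitions(component, partition):
    """Return labels of components whose polygons sit in more than one partition."""
    table = pd.DataFrame(
        {"component": list(component), "partition": list(partition)}
    )
    # Count the distinct partitions that each component touches.
    counts = table.groupby("component")["partition"].nunique()
    return sorted(counts[counts > 1].index.tolist())


# Toy labels: component 0 is split between partitions 0 and 1.
component = [0, 0, 1, 2, 2]
partition = [0, 1, 1, 2, 2]
split = components_spanning_partitions(component, partition)
```

If the returned list is non-empty, a per-partition dissolve would leave those components in separate pieces.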