-
Hello, I'm using osmextract to extract large portions of the osm.pbf file for Europe. I am using a standard config.ini and tweaked the vectortranslate_options argument to select only certain fields and tags. This works fine until the very end, when the process reports a very unspecific error: "Error in sf::gdal_utils(util = "vectortranslate", source = normalizePath(file_path), : gdal_utils vectortranslate: an error occured". The warnings all read: "In CPL_gdalvectortranslate(source, destination, options, ... : GDAL Message 1: Non closed ring detected. To avoid accepting it, set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO."

It seems that the vectortranslate operation finishes despite the error but does not extract all features. When I run the same extraction for a smaller bounding box, no errors occur, and the comparison shows that many valid features were excluded from the first extract.

My questions:
Does the "Error in sf::gdal_utils" cause osmextract to stop the process right away?
Is it possible to set the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING configuration option to NO, and would this help in this situation?
How can I get more information on the "Error in sf::gdal_utils" that could give more hints for debugging?

In another extract I also got the OGR_GEOMETRY_ACCEPT_UNCLOSED_RING warning but no "Error in sf::gdal_utils", and the extraction seemed to complete just fine. Sorry for the long question; I hope someone has some clues on this issue.
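On the configuration-option question: GDAL configuration options can generally be supplied as environment variables, so one thing to try is setting them from R before starting the extraction. The snippet below is only a minimal sketch under that assumption. `CPL_DEBUG` is GDAL's switch for more verbose diagnostics; whether sf shows those messages in the R console, and whether either option changes the outcome of the failing run, has not been verified here.

```r
# Sketch only: pass GDAL configuration options through environment variables.
# Assumption: the GDAL build linked by sf reads these when the extraction runs.
gdal_config <- c(
  OGR_GEOMETRY_ACCEPT_UNCLOSED_RING = "NO", # the option named in the warning
  CPL_DEBUG = "ON"                          # ask GDAL for verbose debug messages
)

# Sys.setenv() takes named arguments, so spread the named vector with do.call().
do.call(Sys.setenv, as.list(gdal_config))

# Confirm that the values are visible in the current R session.
print(Sys.getenv(names(gdal_config)))
```

Run this in the same R session before calling the osmextract function, then compare the warnings and the feature count with the earlier run.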
-
Hi @jkaucic. Thank you very much for your question and for using this package. I will try to investigate the issue that you mentioned as soon as possible. Just a couple of questions:
-
Thank you for your swift reply! I can provide you with the relevant portion of the code, but it is a very complex query on the multipolygons layer, which took about 6 hours on an i5-8500 3.00 GHz processor with 16 GB RAM. It extracts green spaces from the whole Europe.osm.pbf, combined from a selection of OSM tags (leisure, landuse and natural). Admittedly, these are huge data quantities, which is why I resorted to osmextract. First, I define the vectortranslate_options:
Then, I extract the gpkg layer from the pre-downloaded Europe.osm.pbf from Geofabrik (27.5 GB):
This first extract for all of Europe yielded 802,277 features (2,278 of them with empty geometries, which is strange in itself). To debug, I tried extracting only landuse=grass, and this alone yielded 3,783,685 features (in about 7 hours of processing time). It is therefore clear that the complex extraction with the error message missed a lot of features. Visual inspection confirmed this: an extract with all relevant tags for Austria only contains many features that are missing from the extract for the whole of Europe. So it seems that the extraction finished despite the error message but was incomplete. However, I do not know what exactly the error was or how to define the query to get a full extraction of features. Currently I'm trying to run a code with
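For readers following along, here is a rough sketch of the general shape such an extraction and feature count could take. It is not the code from this post: the file path, the selected fields and the tag values are all hypothetical placeholders, and it is best tried on a small extract first.

```r
# Hypothetical path to a pre-downloaded extract (placeholder, not the original).
pbf_path <- "europe-latest.osm.pbf"

# Illustrative tag values only; the original selection is not reproduced here.
# "natural" is double-quoted because it can clash with an SQL keyword.
where_clause <- paste(
  "landuse IN ('grass', 'forest')",
  "OR leisure IN ('park', 'garden')",
  "OR \"natural\" IN ('wood', 'scrub')"
)
green_options <- c(
  "-select", "osm_id,name,landuse,leisure,natural",
  "-where", where_clause
)

# Only run the heavy part when the package and the file are actually available.
if (requireNamespace("osmextract", quietly = TRUE) && file.exists(pbf_path)) {
  gpkg_path <- osmextract::oe_vectortranslate(
    pbf_path,
    layer = "multipolygons",
    vectortranslate_options = green_options
  )
  green <- sf::st_read(gpkg_path, layer = "multipolygons", quiet = TRUE)
  # Count features and empty geometries so that different runs can be compared.
  print(c(features = nrow(green), empty = sum(sf::st_is_empty(green))))
}
```

Comparing these two counts between a country-level extract and the Europe-wide run is one way to quantify how many features went missing.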
-
Wow, this is probably the largest example of using OSM data I've ever seen. It sounds like a super interesting use case (and, unfortunately, also a quite challenging problem to diagnose...). I'll do my best to run some tests in the next few days and report here in case I find any solution.
What is the largest OSM extract where you observe this problem?
-
Hi! Today I ran some tests with the code that you attached (on a powerful VM with a lot of free space) using the OSM extracts for FR and GB, and the function ran without any problem. Tomorrow I will test it with the whole of Europe.
At the moment, I don't know... According to the tests I previously mentioned, the
To be honest, I don't know all the details. I think that GDAL writes intermediate data steps to a so-called
I don't think so since I see exactly the same behaviour but the process completes without any problem.
During the weekend I will run some tests using my laptop (which has just a few GB of free space) and I will try to replicate the error that you mentioned. I'll add some updates here ASAP.
-
My large vectortranslate operations always slow down considerably once they reach about 60% progress, so I wonder: will setting -lco SPATIAL_INDEX=NO speed up the vectortranslate step? As far as I understand, ogr2ogr sets it to YES by default. If I understand the sf post at https://r-spatial.org/r/2017/06/22/spatial-index.html correctly, a spatial index is created from scratch for spatial operations on sf objects anyway. So the index is not really needed in the gpkg (it could be added later) and might just slow down the conversion from pbf to gpkg?
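In case it helps, this is roughly how the layer creation option could be appended to the options vector. It is a minimal sketch: the filter and the file path are placeholders, and whether skipping the index actually shortens the run is exactly the open question, so the timing call is only there to measure it.

```r
# Placeholder filter; replace it with the real vectortranslate options.
base_options <- c("-where", "landuse = 'grass'")

# Append the GeoPackage layer creation option that disables the spatial index.
no_index_options <- c(base_options, "-lco", "SPATIAL_INDEX=NO")
print(no_index_options)

# Hypothetical path; the timed run only happens if the package and file exist.
pbf_path <- "europe-latest.osm.pbf"
if (requireNamespace("osmextract", quietly = TRUE) && file.exists(pbf_path)) {
  timing <- system.time(
    osmextract::oe_vectortranslate(
      pbf_path,
      layer = "multipolygons",
      vectortranslate_options = no_index_options,
      force_vectortranslate = TRUE # redo the conversion even if a gpkg exists
    )
  )
  print(timing)
}
```

Running the same call once with `base_options` and once with `no_index_options` on a small extract would show whether the index creation is the slow part.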
Hi! Today I ran some tests with the code that you attached (on a powerful VM with a lot of free space) using the OSM extracts for FR and GB, and the function ran without any problem. Tomorrow I will test it with the whole of Europe.
At the moment, I don't know... According to the tests I previously mentioned, the `.gpkg` file returned by the previous operations is smaller than the corresponding `.pbf` file (and I think that's reasonable since the `.pbf` file contains data for 5 d…