extend NextRelation/FindInRelation to nodes #632
This also makes `NextRelation` return a tuple of ID, role rather than just ID. According to https://www.lua.org/pil/5.1.html, I think that's not a breaking change. Motivation: When I naively try to create labels from OSM entities with the `place` tag, I occasionally get duplicates due to relations. For example, the city of Guelph has [relation 7486148](https://www.openstreetmap.org/relation/7486148), which has [node 36576733](https://www.openstreetmap.org/node/36576733) as its `label` member. They both have `place=city`, so my Lua script faithfully spits out two labels. This PR allows me to suppress the label from the node, which is a start. Ideally, I'd actually prefer to use the node, as it'll likely have a nicely human-curated location. For that, I'd need the relation to be able to interrogate its members. Would you be open to me extending this PR to add `NextMember`, `FindInMember` and `RestartMembers` functions that mirror the ones used by nodes/ways?
Thanks! Reading roles, and extending support to nodes, is a big win. Providing NextMember etc. sounds good too. Relations are generally a mess, and each The only (reluctant) exceptions are So I'm a little anxious about hard-coding
which would write a node to the |
Great feedback. I'll rework this:
Something that gives me pause: I am separately considering implementing support for polylabel. I think this would address issues #392 and #447 (and perhaps lessen the urgency of #467). I hadn't thought about how that would surface to the user, though. Did you have anything in mind? One option in line with what you'd proposed here would be:
But I wonder if the user would almost always want polylabel as the fallback? So maybe |
Polylabel support would be excellent. Maybe something like
where the 2nd parameter onwards are optional, and strategy is an enum for the calculation algorithm: 0 for polylabel, 1 for centroid, 2 for Boost.Geometry's point_on_surface, and so on. Lua doesn't actually have enums of course, but we could potentially define constants somewhere. |
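For illustration only, here is a minimal C++ sketch of how the numbering proposed above (0 for polylabel, 1 for centroid, 2 for point_on_surface) might be represented on the tilemaker side. The names `CentroidStrategy`, `isKnownStrategy` and `strategyOrDefault` are hypothetical and not taken from the codebase, and falling back to polylabel for unknown values is an assumption.

```cpp
// Hypothetical enum mirroring the numbering proposed in the comment above.
enum class CentroidStrategy : int {
    Polylabel = 0,
    Centroid = 1,
    PointOnSurface = 2  // Boost.Geometry's point_on_surface
};

// True if the integer passed from Lua maps to one of the three strategies.
constexpr bool isKnownStrategy(int value) {
    return value >= 0 && value <= 2;
}

// Assumption: unknown values fall back to polylabel rather than raising an error.
constexpr CentroidStrategy strategyOrDefault(int value) {
    return isKnownStrategy(value) ? static_cast<CentroidStrategy>(value)
                                  : CentroidStrategy::Polylabel;
}
```

Lua-side constants would then only need to agree with these integer values.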
I dunno what I was thinking, this is much better. (And I see now that there are many more roles than these four.)
That ended up being pretty straightforward. No more duplicate labels, and I can now use the |
Ugh, I guess this could be considered a breaking change after all. If a user previously passed `LayerAsCentroid("layername", false)`, we'd ignore the false. With this change, the false causes us to fail, because we expect a string. I'd normally just say this was undefined behaviour, and those users deserve to be broken...but it's tricky, since this is a script from the official tilemaker repo.
D'oh, I guess this could be considered a breaking change after all: 154d3ab |
Looks good! I'm not too worried about breaking undocumented behaviour even if we do erroneously use that behaviour ourselves (oops) - and I think we're on the verge of 3.0 anyway. Passing thought - is there any benefit in using some sort of table, or PooledString maybe, for relation roles rather than std::string? A quick check with taginfo suggests there's only 129 roles in widespread use (https://gist.github.com/systemed/29ea4c8d797a20dcdffee8ba907d62ea). (taginfo doesn't show roles for relations used <100 times.) |
I guess there are two concerns:
Re 1 - these don't spill to disk and are each taking 24 bytes. PooledString would get us down to 16 bytes, so save 8 bytes. Or we could do some custom mapping that could fit in a short for 2 bytes. So if we did the most aggressive thing, we could save 22 bytes per relation member. The UK has 3.9M relation members in a 1.6GB PBF. If I scale that to a 72GB planet file, that suggests 175M relation members, so the savings could be as much as 3.8GB. Yeah, that's probably worth doing. I'll have a think on how to do it without introducing a lot of locking. Re 2 - for the most popular roles, I think if we saved a reference to a string in the Lua stack, we could avoid the expense of Lua's defensive copies. Relation roles are a good candidate for this, as they're a small set used many times. I expect this to be relatively small compared to other Lua interop costs we're paying, so wouldn't do anything about it immediately. |
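The back-of-envelope estimate above can be reproduced as follows. This is a sketch using only the figures quoted in that comment; the constant and function names are made up for illustration, and scaling linearly by PBF size is the same rough assumption the estimate itself makes.

```cpp
// Figures quoted in the comment above.
constexpr double kUkRelationMembers = 3.9e6;  // relation members in the UK extract
constexpr double kUkPbfGb = 1.6;              // size of the UK PBF
constexpr double kPlanetPbfGb = 72.0;         // size of the planet PBF
constexpr double kBytesSavedPerMember = 24.0 - 2.0;  // std::string (24) -> 2-byte index

// Assumption: relation members scale linearly with PBF size.
constexpr double estimatedPlanetMembers() {
    return kUkRelationMembers * (kPlanetPbfGb / kUkPbfGb);
}

// Estimated savings in GB (10^9 bytes) for the most aggressive encoding.
constexpr double estimatedSavingsGb() {
    return estimatedPlanetMembers() * kBytesSavedPerMember / 1e9;
}
```

This works out to roughly 175M members and a little under 3.9GB, in line with the numbers above.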
Yes - it was memory I was mostly thinking of, in particular for route relations. Way/node ID, plus an index into a vector of strings, should fit in a 64-bit struct. |
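A rough sketch of that idea, not the code that ended up in the PR: the ID sits in the low 48 bits and a 16-bit index into a table of role strings sits in the high bits. `PackedMember`, `packMember`, `RoleTable` and the 48/16 split are all hypothetical. The sketch does not record whether the member is a node or a way, and it does no locking, so it sidesteps the concurrency concern raised earlier.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical layout: low 48 bits = OSM ID, high 16 bits = role index.
constexpr int kIdBits = 48;
constexpr uint64_t kIdMask = (uint64_t{1} << kIdBits) - 1;

struct PackedMember {
    uint64_t bits;
    constexpr uint64_t id() const { return bits & kIdMask; }
    constexpr uint16_t roleIndex() const {
        return static_cast<uint16_t>(bits >> kIdBits);
    }
};
static_assert(sizeof(PackedMember) == 8, "member must fit in 64 bits");

// IDs wider than 48 bits would be truncated by the mask.
constexpr PackedMember packMember(uint64_t id, uint16_t roleIndex) {
    return PackedMember{(static_cast<uint64_t>(roleIndex) << kIdBits) | (id & kIdMask)};
}

// Interns role strings so each distinct role is stored once.
// Not thread-safe: a real implementation would need to handle concurrent writers.
class RoleTable {
public:
    uint16_t intern(const std::string& role) {
        auto it = index_.find(role);
        if (it != index_.end()) return it->second;
        assert(roles_.size() < 65536 && "role table full");
        const auto idx = static_cast<uint16_t>(roles_.size());
        roles_.push_back(role);
        index_.emplace(role, idx);
        return idx;
    }
    const std::string& lookup(uint16_t idx) const { return roles_.at(idx); }

private:
    std::vector<std::string> roles_;
    std::unordered_map<std::string, uint16_t> index_;
};
```

With only a hundred or so roles in widespread use, a 16-bit index leaves plenty of headroom for rare roles.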
This is just the translation of the algorithm and some smoke testing that it seems to do the right thing. It's not integrated into Tilemaker yet, that's blocked on systemed#632
This seems to work well and, at least for large polygons (city parks, national parks), is faster than Boost's centroid algorithm. That surprised me! I haven't benchmarked it on building polygons yet. There are several todos here around making it configurable, and making it play well with lazy geometries. Going to finish off the relation memory stuff then fix those.
You can now pass the preferred algorithm to use to LayerAsCentroid. Still to do: teach lazy geometries about which algorithm was used.
This ensures that the user gets the same, correct behaviour both in --materialize-geometries and --lazy-geometries. We can extend support to materialize geometries, but this PR is already getting big (and it has conflicts with the other 2 PRs), so I'm leery of getting further out over my skis.
Updated to reduce memory use for tracking roles, and to add polylabel support. The parks and forests in the Appalachians of America are a little better now. Note the locations of Cherokee National Forest and Pisgah National Forest. (This is a truncated extract - on the full map, Pisgah would likely be in a different location.) This switches polylabel to be the default. There are some areas for improvement:
These areas can all be improved -- I'm a little hesitant to do it while there are other Lua PRs outstanding that also have perf implications and will have code conflicts with this set of changes. Maybe we should start merging the Lua things into a shared branch, or start merging them to master? |
Reading through https://github.com/mapbox/polylabel/issues, I think there's also some risk of infinite loops if we don't pick the It seemed to work for the areas I tried. I think before cutting a new release, I'd want to run tilemaker with all the Lua changes on the planet in order to confirm there are no regressions in runtime or memory usage. |
Excellent - that’s a great improvement in the screenshot. I’m AFK today/tomorrow but will run it on the planet at the weekend unless you beat me to it! |
I've just run this (only) over the planet and it works really well. No hangs and I couldn't see any impact on runtime speed. (Planet completed in 4h38m53 with no runtime options other than Label placement is pretty good. Here's Rutland Water, which is U-shaped. Before, it's over the peninsula: After, it's in the water as it should be: We don't get all the country labels we'd like at lower zoom levels (particularly the USA) but this is because the points are near tile boundaries - so more of a style/rendering issue than a tile generation issue. At some point in the 3.X series I'd like to ship a new, better style (possibly to go with Shortbread rather than OMT) but that's for the future! |
Great, thanks for running it on the planet!
I see this, too, using OSM Bright: For OSM Bright, at least, I think the issue is unrelated to polylabel. It looks like the OMT Lua script only writes to the place layer in the tilemaker/resources/process-openmaptiles.lua Lines 170 to 175 in 7405050
OSM Bright only draws a rank 1 country label if there is an
For Canada, both its relation and its label node have such an attribute. For the US, only its relation has that attribute, not its label node. Hey, that's exactly the sort of thing that this PR can help with. I've pushed a commit that uses the new support for FindInRelation in nodes to propagate it from the relation if needed, and now USA shows up in OSM Bright. |
Well hunted! Think this is ready to go? |
Yup! |
Superb - thank you! |