Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeat Parks results #57

Open
allthesignals opened this issue Dec 17, 2014 · 8 comments
Open

Repeat Parks results #57

allthesignals opened this issue Dec 17, 2014 · 8 comments
Labels

Comments

@allthesignals
Copy link
Contributor

When filtering parks in the API:
http://0.0.0.0:8000/parks/search?neighborhoods=15

The database returns, for example, Mystic River Reservation multiple times. The actual database table has these repeated entries, and they have unique ids, but I'm not sure why they are repeated and I don't want it to break the map.

So, how do we eliminate duplicate results and not break the map?

@cspanring
Copy link
Contributor

I can imagine 2 strategies:

A) data cleanup on the database layer: aggregate all "duplicate" records to multi-part geometries
B) masquerade and return a unique list of park names by using a set or the distinct QuerySet method

In case of B you'll need to do an aggregation on the park detail view then too, to show all geometries of the park.

@allthesignals
Copy link
Contributor Author

Thank you, @cspanring. I like strategy A. It looks like the parks_park geometry field is already a MultiPolygon type.

Quickly pulling it in QGIS, I can dissolve on the name field. That way there aren't "duplicates". Dissolving or grouping by a string is not my favorite way of doing this, but I don't see another field to use for this operation.

Presumably, if I write a query that does this, it'll have to update the foreign keys in other tables.

Thanks for any help you're able to provide!

@allthesignals
Copy link
Contributor Author

I think I will have to create a join key for matching old IDs to aggregated IDs. Then somehow update the foreign-key constraints and references. Here's the \d+ output:

Foreign-key constraints:
    "parks_park_friendsgroup_id_fkey" FOREIGN KEY (friendsgroup_id) REFERENCES parks_friendsgroup(id) DEFERRABLE INITIALLY DEFERRED
    "parks_park_parkowner_id_fkey" FOREIGN KEY (parkowner_id) REFERENCES parks_parkowner(id) DEFERRABLE INITIALLY DEFERRED
    "parks_park_parktype_id_fkey" FOREIGN KEY (parktype_id) REFERENCES parks_parktype(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "parks_park_images" CONSTRAINT "park_id_refs_id_3ddc7e99c10af801" FOREIGN KEY (park_id) REFERENCES parks_park(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "parks_park_neighborhoods" CONSTRAINT "park_id_refs_id_ba495371" FOREIGN KEY (park_id) REFERENCES parks_park(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "parks_park_events" CONSTRAINT "park_id_refs_id_d04aff1e" FOREIGN KEY (park_id) REFERENCES parks_park(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "parks_facility" CONSTRAINT "parks_facility_park_id_fkey" FOREIGN KEY (park_id) REFERENCES parks_park(id) DEFERRABLE INITIALLY DEFERRED
    TABLE "parks_story" CONSTRAINT "parks_story_park_id_fkey" FOREIGN KEY (park_id) REFERENCES parks_park(id) DEFERRABLE INITIALLY DEFERRED

I don't think it's possible to update a table with group by, then update referencing tables.

@allthesignals
Copy link
Contributor Author

Ah, theres a m:m between parks and neighborhoods. That join table will have to be updated especially.

@allthesignals
Copy link
Contributor Author

Same with park_images. At first, I wasn't sure why, but many images are related to many "parks" which are split into separate geometries. Similarly, this join table would need to be update.

@cspanring I remember you mentioned that serializing GeoJSON was the reason for using only singlepart geometries. I'm not sure where in the app we needed to serialize GeoJSON - the map data itself is a static topo.json.

  • Update parks:images join table
  • Update parks:neighborhoods join table
  • Update parks_facility park_id column

@cspanring
Copy link
Contributor

I think the initial version used GeoJSON. The park detail view also uses GeoJSON on the map I believe.

@allthesignals
Copy link
Contributor Author

CREATE TABLE update_parks_test AS SELECT ST_UNION(geometry), name FROM parks_park WHERE ST_IsValid(geometry) = true GROUP BY name;

Works nicely, but excludes 5 geometries that are invalid. Running something like update clean_parks_test set geometry=ST_MakeValid(geometry); gives this error: ERROR: Geometry type (GeometryCollection) does not match column type (MultiPolygon)

I'm repairing the geometries in ArcGIS instead and will move forward after that!

@allthesignals
Copy link
Contributor Author

There are a number of data migrations required if we use strategy A.

Strategy B, #distinct on 'name' works perfectly. This will break the existing map, but that can be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants