Feature: Create script to restructure shape files #10
base: staging
Next step is to write a script to run the `loadshp` command to load the shapefiles into the dir.
Maybe I missed it, but why is this still a draft, @DavidTheProgrammer?
I think my network died when I clicked the button. It was supposed to be marked ready for review.
LGTM
- Don't know enough about shapefiles to know how useful this would be
- Shouldn't (level) keys be part of the app itself? (profile, settings, etc.) i.e. basically run this command to ensure the given shapefiles will work for your app.
- ./Country/Province/province.{shp,shx,dbf}
- ./Country/Municipality/municipality.{shp,shx,dbf}
And wazimap needs and can handle this nested directory structure?
Hello @kilemensi, no it doesn't need the nested directory structure for the output, but it's super helpful if you have your files grouped into directories for input. I've updated it in the other PR to specify an output directory. So what's happening is actually:
Input:
Country/
└── country.shp
Output:
Country/
└── <output-dir>/
    ├── City.shp
    ├── Town.shp
    └── District.shp
For simplicity I'm only showing the `.shp` file, but each shapefile comes with `.dbf` and `.shx` files to complete the set. They always come in threes, so the chances of you having them in a directory in the first place are very high.
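To illustrate the "always in threes" point, here is a minimal sketch of a check that a `.shp` file has its `.shx` and `.dbf` companions next to it. The helper name `missing_companions` is hypothetical and not part of this PR; it only looks at file names, not at file contents.

```python
from pathlib import Path

# Extensions named in the comment above; real datasets may carry more (.prj, .cpg).
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_companions(shp_path):
    """Return the required extensions that are absent next to `shp_path`."""
    shp = Path(shp_path)
    # with_suffix swaps the extension while keeping the directory and file stem
    return [ext for ext in REQUIRED_EXTENSIONS if not shp.with_suffix(ext).exists()]
```

An empty list means the set is complete; anything else names the files to look for before splitting or loading.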
Cool @DavidTheProgrammer ... how does this affect loading of the shapefiles, i.e. does the `loadshp` command also support the nested structure?
@kilemensi the `loadshp` command loads file by file. The other PR I've worked on can now take advantage of this split directory and load all the files in one go by calling the `loadshp` command with appropriate arguments for each shapefile. It takes a glob pattern that matches all the `.shp` files you intend to load.
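A rough sketch of the glob step described above, under stated assumptions: the function name `plan_loads` is hypothetical, and taking the level name from the file stem (e.g. `City.shp` gives `City`) is an assumption based on the output layout shown earlier. The remaining `loadshp` arguments are not shown in this thread, so the sketch stops at pairing each file with a level.

```python
from pathlib import Path


def plan_loads(root, pattern="**/*.shp"):
    """Pair every matching .shp file with a level name taken from its file stem."""
    root = Path(root)
    # sorted() keeps the order stable between runs
    return [(path, path.stem) for path in sorted(root.glob(pattern))]
```

Each `(path, level)` pair would then become one `loadshp` call, with the other arguments supplied by the caller.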
Initial thoughts: a few typos in the Command docstring.
Code LGTM 👍
Still trying to figure out the relationship between this management command and the earlier one to load shapefiles, available [here](https://github.com/CodeForAfrica/wazimap-ng/blob/staging/wazimap_ng/boundaries/management/commands/loadshp.py).
So on running this, I see the output below. A few questions:
- Given a directory `../tmp/shapefile/countries-regions` that contains all shapefiles for African countries, the rebuilt files are each written to the specific country sub-directory, e.g. `/tmp/shapefiles/countries-regions/Angola/rebuilt`, which has the files `Province.dbf Province.shp Province.shx`. However, these files already exist in the pre-built country sub-directory, one level back, as `Angola.cpg Angola.dbf Angola.prj Angola.qgz Angola.qmd Angola.shp Angola.shx`. Is my understanding correct that we are rebuilding indices and grouping them in folders based on region, such as `Province`, `County`, etc., and wouldn't that necessitate a restructuring of how we re-organise our rebuilt indices?
- I still believe we intend to use `python manage.py loadshp` to load the shapefiles.
…pfiles command. - Added a command to loadshpfiles from a directory, specifying multiple options for possible fields of each necessary value needed for the loadshp command. - Modified the rebuildshpfiles command to specify output directory and have that directory be skipped
Added new line at EOF Co-authored-by: Clemence Kyara
Hi @thepsalmist,
We can't accurately load this file with the correct level into our Geography using the … This command will split this file into 3 files as shown below: … This split allows us to load these records with 3 calls to the … The second script will use the output of this command and load the shapefiles by calling …
I can respond to 2: unfortunately, that wouldn't work, as this is a prerequisite to running the app in the first place. Loading the geographies into the wazimap backend is one of the first things we do when trying to run the app.
…on. Fixed Docstring to reflect true actions.
I'm likely missing something @DavidTheProgrammer but:
Why wouldn't any of this work?
Cool, cool ... I could be wrong, but I don't think Django management commands are the kind of commands that someone will install on their system to do generic things; they're mostly tied to the running/management of the app. I'm sure there are apps/tools/scripts/commands out there to process shapefiles. The utility of this command lies in it processing the shapefiles in a way that's useful to this app moving forward. But it's cool if you / Xavier think the current approach works; I feel like just saying "I have these shapefiles for this profile" is enough for this kind of management command; it should be able to pull levels, etc. from the specified profile.
Yes, my point was a
That's a fair point @kilemensi. I guess the decision then lies on whether the options should be configurations in the app (in settings or somewhere) or passed as arguments to the command. Do I get you right? My design was inspired by the existing
Yes, that's true. And actually, this command's only job is to split the shapefiles to prepare them for loading, and it technically doesn't need anything other than the level keys. The other command in PR #11 is the one that actually does all the loading.
Two things to keep in mind @DavidTheProgrammer / @thepsalmist
Description
Created a management command in the boundaries application to open and split shape files based on specific arguments passed in.
Why?
The wazimap backend stores geographies in a hierarchy, and this data should reflect the shapefiles loaded into the application. For example, the Africa root geography should contain countries, and each country should contain its respective sub-regions, i.e. province, county, district, etc. Some geographies contain multiple sub-regions at the same level, e.g. City, Municipality and Province can all be children of Country, and this should be accurately represented in the data. This script splits the shapefile into multiple files based on these children so that they can be loaded using the `loadshp` command with the correct argument, e.g. `...loadshp ./cities/shape.shp Botswana City ...`
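The splitting idea can be sketched without any GIS library by treating each feature's attributes as a plain dict. This is an illustration only, not the command's actual implementation: `split_by_level` is a hypothetical name, the sample rows are made up, and reading the level keys as "candidate attribute names, first match wins" is an assumption.

```python
from collections import defaultdict


def split_by_level(records, level_keys):
    """Group records by the value of the first level key present on each record."""
    groups = defaultdict(list)
    for record in records:
        for key in level_keys:
            level = record.get(key)
            if level:
                groups[level].append(record)
                break  # first matching key wins; later keys are ignored
    return dict(groups)


# Made-up attribute rows standing in for the features of one country shapefile.
records = [
    {"NAME_1": "Example A", "ENGTYPE_1": "City"},
    {"NAME_1": "Example B", "ENGTYPE_1": "Town"},
    {"NAME_1": "Example C", "ENGTYPE_1": "City"},
]
groups = split_by_level(records, ["ENGTYPE_1", "Region"])
```

Each key of `groups` would correspond to one output file (for example `City.shp`), which is what lets `loadshp` be called once per level.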
Related Issue
#7
How to test it locally
Download the shapefiles from Google Drive and store them in a directory. Run the management command with the right options and verify that the files have been split according to your keys, e.g.
python manage.py rebuildshpfiles ./datasets/shapefiles/country-regions ENGTYPE_1,Region
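For reviewers unfamiliar with the argument shape, here is a hedged sketch of how the two positional arguments in the example could be parsed. It uses plain `argparse` rather than the command's real `add_arguments`, and the argument names `directory` and `level_keys` are assumptions, not taken from the PR.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="rebuildshpfiles")
    parser.add_argument("directory", help="directory that holds the shapefiles")
    parser.add_argument(
        "level_keys",
        # split "ENGTYPE_1,Region" into a list and drop empty entries
        type=lambda raw: [key.strip() for key in raw.split(",") if key.strip()],
        help="comma-separated attribute names to split on",
    )
    return parser


args = build_parser().parse_args(
    ["./datasets/shapefiles/country-regions", "ENGTYPE_1,Region"]
)
```

Here `args.level_keys` is a list of key names that the splitting step can iterate over.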