Administrator review of contributed councillor data #1203
@henare @hisayohorie it would be good to extract some more specific issues from the idea above. We've implemented the CSV API part with #1237. What's next?
Riffing off what @henare has written above, here's how I could imagine this working, and an idea for the next (small) step of getting the CSV data into Popolo format:
How is this different from the current situation? Currently the admin has to find the authority in the Google sheet and add the new councillors. This is tricky because we generally already have councillors for most authorities, which we got in our initial scraping. These existing councillors don't have emails, and that data is out of date after two years of natural turnover and elections. It's not clear what's been merged into WriteIt etc. Generally, it's tricky!

If the CSVs only contained councillors with emails, this would be a lot more straightforward because you'd just be appending rather than merging.

Having the CSVs in the repo would make things much clearer for admins because you could always work from clean files. With the Google sheet, other edits can happen between contributions that change the Popolo, which can be really confusing. If the contributor is making changes to the CSV and the Popolo in one go, then the diffs should be a lot clearer, and the cause-and-effect line between the CSV and the Popolo should be clearer too. So the CSV file remains the editable interface, and the JSON is generated data.

Steps 2-6 above could also be automated as a next step. This feels like a stepping stone to some better solution, which is good. I can imagine how editing existing councillors fits into this flow too. Having the CSVs in the repo, rather than editable Google Sheets, would make automating a merge step for existing councillors much easier.

Adding the councillors to the bottom could be done in a rake task using the public CSV API for contributions, manually, or on the command line (a rough sketch follows at the end of this comment). So if we like this idea, here's how we could get there.
Now there are the production CSVs and the archived ones that keep all the original data. Admins can go through the steps above to add new councillors.
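To make the "just appending" idea concrete, here's a minimal sketch in Ruby. The file paths and the email column name are assumptions for illustration, not the repo's actual layout:

```ruby
require "csv"

# Read a reviewed contribution CSV (path assumed for illustration)
contribution = CSV.read("contribution.csv", headers: true)

# Append its rows to the authority's production CSV; the path and the
# "email" column name are assumptions, not the real schema
CSV.open("data/nsw/councillors.csv", "a") do |csv|
  contribution.each do |row|
    # Only append rows that actually carry an email address
    csv << row.fields if row["email"]
  end
end
```

Because it only ever appends, the diff in a PR would show exactly the contributed rows and nothing else.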
From step 2 above:
I think so :) So I think the next step here is to add the ability on that repo to collect a reviewed contribution from the CSV API on PlanningAlerts and merge it into its local CSV. We could then provide a way for admins to trigger that process (maybe by running a scraper?) and have a PR opened automagically. At what point in this process can the admin make edits to the contribution if they need to? (update: they can now edit the contribution in PlanningAlerts before accepting it #1259)
The scraper path seems like a simple first pass at this. Sounds like we'll need a way for an external service to find out about new reviewed contributions. Contributions API?
The data store now has functionality and a Rake task for collecting a contribution CSV from a remote URL. We have an API of accepted contributions: https://www.planningalerts.org.au/councillor_contributions.json
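For flavour, a Rake task along those lines might look something like this sketch; the task name, arguments, and target path are assumptions rather than the data store's actual code:

```ruby
require "open-uri"
require "csv"

namespace :councillors do
  desc "Collect a contribution CSV from a remote URL and store it locally"
  task :collect, [:url, :authority] do |_task, args|
    body = URI.open(args[:url]).read
    CSV.parse(body, headers: true) # fail early if the response isn't valid CSV
    File.write("data/#{args[:authority]}/contribution.csv", body)
  end
end
```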
I've opened a [WIP] PR over there with pseudocode for a scraper that could turn the accepted contributions in the JSON feed into pull requests to the councillor data. That scraper could be set to auto-run to automatically open PRs when we accept new contributions, and admins could pop over to morph.io and hit run if they felt like it. It would be nice for admins to be able to trigger the PR step directly from PlanningAlerts too. Another option would be to run a little app that receives a webhook from PlanningAlerts and then does what the scraper would do. Sounds like a second pass? Or someone could add the functionality to PlanningAlerts to be able to trigger runs with a webhook ;-)
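For a sense of shape (this is not the [WIP] PR's code), a scraper like that might look roughly like the following; the repo name, branch naming, file layout, and JSON field names are all assumptions:

```ruby
# scraper.rb
require "net/http"
require "json"
require "base64"
require "octokit"

REPO = "openaustralia/australian_local_councillors_popolo" # assumed repo name
FEED = URI("https://www.planningalerts.org.au/councillor_contributions.json")

client = Octokit::Client.new(access_token: ENV["MORPH_GITHUB_TOKEN"])

JSON.parse(Net::HTTP.get(FEED)).each do |contribution|
  branch = "contribution-#{contribution["id"]}" # "id" field assumed
  master_sha = client.ref(REPO, "heads/master").object.sha

  begin
    client.create_ref(REPO, "heads/#{branch}", master_sha)
  rescue Octokit::UnprocessableEntity
    next # branch already exists, so we've presumably opened this PR before
  end

  # Append the contributed rows to the authority's CSV on the new branch
  path = "#{contribution["authority"]}/councillors.csv" # layout assumed
  file = client.contents(REPO, path: path, ref: branch)
  updated = Base64.decode64(file.content) + contribution["csv"] # "csv" field assumed
  client.update_contents(REPO, path, "Add reviewed councillor contribution",
                         file.sha, updated, branch: branch)

  client.create_pull_request(REPO, "master", branch,
                             "Councillor contribution for #{contribution["authority"]}")
end
```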
So far, as part of Contributing Councillor Information, we've built a way for people to suggest new councillor data. That data is then just stored in a new table in the PlanningAlerts database. What happens to it next is what this issue is about.
The simplest thing we can do to start is to replicate the existing process as much as possible while automating some parts of it. So far we've effectively built a way to enter the data that's not Google Docs but creates basically the same output, i.e. tabular data. So the next step is converting that into Popolo and opening a pull request to have it merged into the existing repository.
Here are some ideas for how we could do that:
First of all we could expose all of the contributions in PlanningAlerts via a CSV API. This pretty closely replicates what Google Docs provides us, albeit only for one council instead of a whole Australian state. (Done in #1237: Added download link for suggested councillors CSV file in admin/councillor_contribution#show)
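As a sketch of how small that API surface can be (this is not the actual #1237 code; the controller, model, and column names are assumptions), a Rails action could render the contribution as CSV like so:

```ruby
require "csv"

class CouncillorContributionsController < ApplicationController
  def show
    # Model and association names are assumed for illustration
    contribution = CouncillorContribution.find(params[:id])

    respond_to do |format|
      format.csv do
        csv = CSV.generate do |rows|
          rows << %w[name email council]
          contribution.suggested_councillors.each do |councillor|
            rows << [councillor.name, councillor.email, councillor.council]
          end
        end
        send_data csv, filename: "councillor_contribution_#{contribution.id}.csv"
      end
    end
  end
end
```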
We then need to convert that CSV into Popolo. The existing repository already has a Rake task to do this. It will need modification because it's currently designed to convert a whole Australian state's CSV into Popolo. Maybe here we need to add the contributed data onto the cached CSVs and then generate the Popolo from that?
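A minimal sketch of that conversion step, assuming made-up CSV column names and a simplified Popolo shape (the repo's real Rake task will differ):

```ruby
require "csv"
require "json"

# Cached CSV with the contributed rows already appended (path assumed)
rows = CSV.read("data/nsw/councillors.csv", headers: true)

# Map each CSV row to a Popolo-style person; column names are assumptions
popolo = {
  "persons" => rows.map do |row|
    {
      "id" => row["id"],
      "name" => row["name"],
      "email" => row["email"],
      "council" => row["council"]
    }
  end
}

File.write("nsw_local_councillor_popolo.json", JSON.pretty_generate(popolo))
```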
Once we've got the changed data we need to commit that onto a new branch, push to GitHub, and open a pull request. Here we'll need to use some kind of Git library or shell out, and use the GitHub API to open the pull request. That should give administrators a nice diff they can review and easily merge.
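One way that step could look, as a hedged sketch: shell out to git for the branch, commit, and push, then use the GitHub API via Octokit for the pull request. The branch name, repo name, paths, and token env var are all assumptions:

```ruby
require "octokit"

branch = "councillor-contribution-#{Time.now.to_i}"

# Shell out to git for the branch, commit, and push
system("git", "checkout", "-b", branch) or abort "checkout failed"
system("git", "add", "data/") or abort "add failed" # path assumed
system("git", "commit", "-m", "Add reviewed councillor contribution") or abort "commit failed"
system("git", "push", "origin", branch) or abort "push failed"

# Open the pull request via the GitHub API so admins get a reviewable diff
client = Octokit::Client.new(access_token: ENV["GITHUB_TOKEN"])
client.create_pull_request(
  "openaustralia/australian_local_councillors_popolo", # assumed repo name
  "master",
  branch,
  "Add reviewed councillor contribution",
  "Generated from an accepted contribution on PlanningAlerts."
)
```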
Where should all this code live? @equivalentideas suggested a morph.io scraper, which made me think we could even put this code as a morph scraper into the data repository itself. That's worth considering, but if somewhere else makes more sense then let's do that.
Of course that still doesn't get this data back into PlanningAlerts, but those next steps should be extracted into other issues. These are the simple, largely status-quo, first steps.