-
Choose stories based on the frequency of placename mentions across the entire novel. The shortlist was developed from an analysis of the highest frequency of mentions in novels/novellas/short stories written by Australian authors. Locations for the spreadsheet were extracted using Stanford’s NER 3-class classifier (NER code is included in the repository). The CSV for this information is located here.
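The counting step can be sketched roughly as follows. This is not the repository's NER code. It assumes the classifier output has been saved in slash-tagged form (for example `Sydney/LOCATION`), and the function name `count_locations` and the sample sentence are made up for illustration. Consecutive LOCATION tokens are joined into one placename, so two different places that sit side by side would be merged.

```python
from collections import Counter

def count_locations(tagged_text: str) -> Counter:
    """Count placenames in slash-tagged NER output such as 'Sydney/LOCATION'."""
    counts = Counter()
    current = []  # tokens of the placename currently being assembled
    for item in tagged_text.split():
        token, _, tag = item.rpartition("/")
        if tag == "LOCATION":
            current.append(token)
        elif current:
            # A non-location token ends the current placename
            counts[" ".join(current)] += 1
            current = []
    if current:  # flush a placename that ends the text
        counts[" ".join(current)] += 1
    return counts

# Illustrative sample, not taken from the project data
sample = ("She/O left/O New/LOCATION South/LOCATION Wales/LOCATION for/O "
          "Sydney/LOCATION ./O Sydney/LOCATION was/O hot/O ./O")
print(count_locations(sample).most_common())
```

Sorting the resulting counts per story gives a ranking from which a shortlist could be drawn.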
-
Compose a CSV shortlist of 50 novels/novellas/short stories and assess each according to the quality of its references to ‘place’, whether place is a feature of the narrative, and the appropriateness of the narrative for high school students. The selection criteria are as follows:
- 3000 words desirable
- Mentions of place - something is said about the place other than the name (description, attitude), even if only a couple of sentences
- Doesn't have to be Australian places
- OCR - if it is terrible, find out whether another version (or versions) in different newspapers is more legible
Link to spreadsheet is located here
-
Create a new spreadsheet in Google Docs of 28 narratives. Link to spreadsheet is located here
-
Manual cleaning of placenames and extracts:
- Identify false positives in placenames, e.g. ‘Miss’ or ‘French’, and remove these rows
- Correct OCR errors and spelling in extracts
- Create a column that links to bibliographic information about the author, preferably from http://adb.anu.edu.au/biography, or from AustLit if this is not available.
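Part of the false-positive clean-up could be pre-screened in code before the manual pass. The sketch below is only an illustration. The column names `placename` and `extract`, the stop-list contents and the sample rows are assumptions, not the project's actual spreadsheet layout.

```python
import csv
import io

# Illustrative stop-list, seeded with the two examples named above
FALSE_POSITIVES = {"miss", "french"}

def drop_false_positives(rows, column="placename"):
    """Return only the rows whose placename is not in the stop-list."""
    return [
        r for r in rows
        if r[column].strip().strip(".").lower() not in FALSE_POSITIVES
    ]

# Hypothetical sample data standing in for the exported spreadsheet
sample_csv = (
    "placename,extract\n"
    "Miss,Miss Brown smiled\n"
    "Ballarat,They reached Ballarat at dusk\n"
    "French,a French window\n"
)
rows = list(csv.DictReader(io.StringIO(sample_csv)))
kept = drop_false_positives(rows)
print([r["placename"] for r in kept])
```

A stop-list only catches errors that are already known, so the remaining rows would still need to be checked by hand.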
-
Develop extracts from narratives that mention placenames (see document ‘Extract_Rule’) and integrate them into the spreadsheet (extraction code is included in the repository).
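As a rough illustration of this step, the sketch below pulls out every sentence that contains a given placename. It is not the repository's extraction code and does not reproduce the rule in ‘Extract_Rule’. The one-sentence-per-mention rule, the simple punctuation-based sentence split and the function name `extract_mentions` are all assumptions.

```python
import re

def extract_mentions(text, placename):
    """Return the sentences in text that mention placename as a whole word."""
    # Naive sentence split: break after ., ! or ? followed by whitespace
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(placename) + r"\b")
    return [s for s in sentences if pattern.search(s)]

# Invented sample text, not taken from the corpus
story = ("The coach left at dawn. By noon they had reached Ballarat. "
         "It rained. Ballarat was crowded with diggers!")
print(extract_mentions(story, "Ballarat"))
```

A split this simple will break on abbreviations such as "Mr." and on OCR noise, so extracts produced this way would still need manual correction.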
-
Geo-code locations. Use EZ-Geocode, an add-on for Google Sheets that allows 250 queries per day.
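Because of the 250-query daily limit, it may help to de-duplicate placenames and split them into daily batches before pasting them into the sheet. The sketch below shows one way to do that. The function name `daily_batches` and the sample names are hypothetical, and the code does not call the add-on itself.

```python
def daily_batches(placenames, limit=250):
    """Split unique placenames into batches that fit a daily query limit."""
    unique = list(dict.fromkeys(placenames))  # de-duplicate, keeping first-seen order
    return [unique[i:i + limit] for i in range(0, len(unique), limit)]

# Hypothetical input: 600 distinct names plus one repeat
names = [f"Place {i}" for i in range(600)] + ["Place 0"]
batches = daily_batches(names)
print([len(b) for b in batches])  # number of queries needed on each day
```

De-duplicating first means a placename that recurs across many extracts uses only one query.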
-
Visualise in an ArcGIS Developers account and create in-browser pop-ups for Trove and the To Be Continued website (the code for the website is titled ‘index.html’).
-
Manual cleaning of place-names based on incorrect geo-locations.
Link to Google Drive: