This repository contains the source code and datasets for the Dynamic Vicmap project at the state level. The code has been tested on the state of Victoria, Australia, using the Water_Hydro_Dataset. The process has four main steps:
- Ontology Engineering: The DV_project ontology has been created based on the following standardized ontologies, as well as data properties we created ourselves (shown in green):
- Vicmap_ontology: http://www.semanticweb.org/mkazemi/ontologies/2023/9/Ontology_Vicmap2#
- prov: http://www.w3.org/ns/prov#
- foaf: http://xmlns.com/foaf/0.1/
- xsd_ns: http://www.w3.org/2001/XMLSchema#
- schema_geo: http://schema.org/
- dc: http://purl.org/dc/elements/1.1/
- dcam: http://purl.org/dc/dcam/#
- owl: http://www.w3.org/2002/07/owl#
- skos: http://www.w3.org/2004/02/skos/core#
- terms: http://purl.org/dc/terms/#
- vann: http://purl.org/vocab/vann/#
- xml: http://www.w3.org/XML/1998/namespace
- dqv: http://www.w3.org/ns/dqv#
- dcat: http://www.w3.org/ns/dcat#
- geosparql: http://www.opengis.net/ont/geosparql#
- ns: http://www.w3.org/2006/vcard/ns#
- cube: http://purl.org/linked-data/cube#
- fsdf: https://linked.data.gov.au/def/fsdf/
The image below shows a snapshot of the classes, Object Properties (OP), and Data Properties (DP) of Vicmap_ontology.rdf: blue denotes classes, yellow denotes DPs captured from the standardized ontologies listed above, and green denotes the DPs we created ourselves (a short rdflib sketch of these namespaces follows).
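As a point of reference, here is a minimal sketch of binding a few of these namespaces with rdflib (the variable names are our own, and only a subset of the prefixes above is shown):

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, SKOS

# Bind a subset of the project namespaces to their prefixes (illustrative).
VICMAP = Namespace("http://www.semanticweb.org/mkazemi/ontologies/2023/9/Ontology_Vicmap2#")
PROV = Namespace("http://www.w3.org/ns/prov#")
GEOSPARQL = Namespace("http://www.opengis.net/ont/geosparql#")
DQV = Namespace("http://www.w3.org/ns/dqv#")

g = Graph()
g.bind("vicmap", VICMAP)
g.bind("prov", PROV)
g.bind("geosparql", GEOSPARQL)
g.bind("dqv", DQV)
g.bind("owl", OWL)
g.bind("skos", SKOS)
```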
- Data Stage: This stage has three steps, as follows:
- Pre-processing of input data: four hydro datasets, comprising authoritative data (Waterbody Hydro Polygon and Waterbody Hydro Point) and non-authoritative data (waterbodies captured from ML), are pre-processed by checking their geometry, enforcing consistent attribute names (including UFI, PFI, C_DATE_PFI, and F_TYPE_COD), and checking that the CRS of the different layers is consistent with the authoritative data.
- Embedded rules: Two main rules are embedded in the script for creating waterbody instances: 1) Intersection: when two polygons intersect, they belong to the same waterbody. 2) Buffer: when a point lies within a buffer of a specified distance (e.g., 10 m) around a polygon, the point and the polygon are two representations of the same waterbody.
- Creating waterbody instances: In the final step, the embedded rules are applied to the pre-processed data to return the final waterbody instances (wb_instances). Each waterbody instance records whether it has a single or multiple representations (a sketch of the rules follows this list).
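The following is a minimal sketch of the two embedded rules with geopandas (version 0.10 or later); the file and column names are illustrative assumptions, not the actual main.py implementation:

```python
import geopandas as gpd

# Load an authoritative polygon layer and a non-authoritative ML layer
# (illustrative file names; substitute the downloaded datasets).
polygons = gpd.read_file("HY_WATER_AREA_POLYGON_GeomChecked.shp")
ml_polygons = gpd.read_file("DamPredictions_Preliminary_V2.shp")
points = gpd.read_file("waterbody_points.shp")  # hypothetical point layer

# Pre-processing: repair invalid geometries and align every layer's CRS
# with the authoritative data.
polygons["geometry"] = polygons.geometry.buffer(0)
ml_polygons = ml_polygons.to_crs(polygons.crs)
points = points.to_crs(polygons.crs)

# Rule 1 (Intersection): intersecting polygons represent the same waterbody.
same_waterbody = gpd.sjoin(ml_polygons, polygons, how="inner", predicate="intersects")

# Rule 2 (Buffer): a point within 10 m of a polygon is another
# representation of that waterbody (assumes a projected CRS in metres).
buffered = polygons.copy()
buffered["geometry"] = buffered.geometry.buffer(10)
matched_points = gpd.sjoin(points, buffered, how="inner", predicate="within")
```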
- Building Knowledge Graph (KG): The wb_instances are populated into the Vicmap_Ontology.rdf ontology to create the Knowledge Graph of the state of Victoria (KG_State).
- Smart Queries: By translating the intended questions from natural language into SPARQL, we can run them over the statewide KG file and retrieve answers (see the example below).
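For instance, a question like "which waterbodies have more than one representation?" might translate into SPARQL along the following lines; this is a sketch with hypothetical class and property names, not a query taken from the actual ontology:

```python
from rdflib import Graph

g = Graph()
g.parse("KG_State.rdf")  # the statewide KG file

# Hypothetical vocabulary for illustration only.
query = """
PREFIX vicmap: <http://www.semanticweb.org/mkazemi/ontologies/2023/9/Ontology_Vicmap2#>
SELECT DISTINCT ?wb WHERE {
    ?wb a vicmap:Waterbody ;
        vicmap:hasRepresentation ?rep1, ?rep2 .
    FILTER (?rep1 != ?rep2)
}
"""
for row in g.query(query):
    print(row.wb)
```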
In summary, the chart below displays the flowchart of the work, covering the four main steps above.
The following datasets need to be downloaded to run the script:
- Waterbody hydro water polygon (HY_WATER_AREA_POLYGON_GeomChecked.shp): download from here.
- Waterbody hydro water point (HY_WATER_AREA_POLYGON.shp): download from here.
- Waterbody captured from ML (DamPredictions_Preliminary_V2.shp): download from here.
- Waterbody captured from LIDAR (bendigo_2020mar05_lakes_mga55.shp): download from here.
- Victoria Parcels (Vicmap_Parcels.shp): download from here.
- Authoritative Floods (VIC_FLOOD_HISTORY_PUBLIC.shp): download from here.
- Non-Authoritative Flood (FullFloodExtents_25Jan_Detailed.shp): download from here.
Dependencies:
- Python 3.5
- geopandas
- rdflib
- pickle (standard library)
- multiprocessing (standard library)
- rtree
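Assuming a standard Python environment, the third-party packages can usually be installed with `pip install geopandas rdflib rtree`; rtree additionally requires the libspatialindex library on some platforms.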
To run the main.py script, follow these steps:
- Importing data: the datasets listed above are fed in as input.
- Creating wb_instances: this step creates the wb_instances.pickle file.
- Creating KG: our ontology (Ontology_Vicmap.rdf) in the Ontology folder is given as input along with its namespaces. Next, metadata files are defined for the four dataset layers based on the ontology. Finally, the wb_instances created in the previous step are populated into our ontology, returning the KG_State file as output (see the sketch below).
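As an illustration of this population step, here is a minimal rdflib sketch; the shape of wb_instances and the vicmap:Waterbody class are assumptions for the example, though geo:asWKT and geo:wktLiteral are standard GeoSPARQL terms:

```python
import pickle
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

VICMAP = Namespace("http://www.semanticweb.org/mkazemi/ontologies/2023/9/Ontology_Vicmap2#")
GEO = Namespace("http://www.opengis.net/ont/geosparql#")

# Load the ontology and the instances created in the previous step.
g = Graph()
g.parse("Ontology/Ontology_Vicmap.rdf")
with open("wb_instances.pickle", "rb") as f:
    wb_instances = pickle.load(f)

# Assumed shape: each instance carries an identifier and a WKT geometry.
for wb in wb_instances:
    uri = VICMAP["waterbody_" + str(wb["ufi"])]  # hypothetical keys
    g.add((uri, RDF.type, VICMAP.Waterbody))
    g.add((uri, GEO.asWKT, Literal(wb["wkt"], datatype=GEO.wktLiteral)))

g.serialize(destination="KG_State.rdf", format="xml")
```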
- Importing the large RDF file into GraphDB: The created KG_State file is around 5 GB for the whole state, so it should be imported as a server file into GraphDB as follows:
- Create a folder named graphdb-import in your GraphDB server directory (on macOS this is ~/, or equivalently /Users/my_username), and place the RDF file there.
- Restart the GraphDB Workbench and go to the Import tab.
- You should see the RDF file name under Server files.
- Finally, provide the IRI of the RDF file (you can find this on line 2 of the ontology RDF file); it should then be successfully imported into the online GraphDB Workbench.
- You can find the list of SPARQL queries in the Queries folder; you can run these over the imported Knowledge Graph (the KG for the whole state) in GraphDB and retrieve answers. They can also be run programmatically, as sketched below.
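Here is a minimal sketch of querying the repository's SPARQL endpoint with the requests library (the repository name and query are placeholders):

```python
import requests

# GraphDB exposes each repository as a SPARQL endpoint under /repositories/.
endpoint = "http://localhost:7200/repositories/YOUR_REMOTE_GRAPHDB_ENDPOINT"
query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"  # placeholder query

resp = requests.get(
    endpoint,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()
for binding in resp.json()["results"]["bindings"]:
    print(binding)
```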
We tested this work on AWS with the following EC2 instances:
- c6i.32xlarge for creating the Knowledge Graph for the whole state.
- r3.2xlarge for running queries in GraphDB Desktop.
The following table shows the processing time of each step on AWS:
Step | EC2 instance | vCPUs | RAM (GiB) | Processing time
---|---|---|---|---
Data stage (loading datasets, creating wb_instances) | c6i.32xlarge | 128 | 256 | 2.5 hrs
Populating KG (~2 million data instances) | c6i.32xlarge | 128 | 256 | 10.23 min
Loading the KG RDF file into GraphDB | r3.2xlarge | 8 | 61 | 9 min
Querying over the KG | r3.2xlarge | 8 | 61 | 3.5 min
To run the web query visualisation dashboard, CORS must first be enabled in GraphDB:
- Open settings in GraphDB desktop
- Add a custom Java property: 'graphdb.workbench.cors.enable', value: 'true'
- Set the property, then save and restart GraphDB
- After GraphDB restarts, confirm the change has been applied by checking Help -> System Information -> Configuration parameters.
Before running the web query visualisation dashboard, the local SPARQL endpoint URL must be specified in the dashboard HTML file, e.g., vicmap.html:
- In GraphDB, go to Setup -> Repositories
- Copy the repository URL to the clipboard using the link icon next to the repository you want to use
- Open the relevant visualisation dashboard HTML file using a text editor (ideally make your own copy so as not to overwrite others' work when committing).
- Search for 'const endpointUrl = ' and set it to the copied repository URL, e.g.:
  const endpointUrl = 'http://localhost:7200/repositories/YOUR_REMOTE_GRAPHDB_ENDPOINT';
Run the web query visualisation dashboard by opening the HTML file in a web browser.
The dashboard includes the following JavaScript libraries:
- Leaflet
- Proj4
- wellknown (this needs to be installed through npm install wellknown)
The image below shows the web query visualisation dashboard, which has the following sections:
- GeoSPARQL endpoint
- Map view to show the GeoSPARQL results on a map
You can access the public URL for the map here.
- Mohammad Kazemi Beydokhti ([email protected])
- Nenad Radosevic ([email protected])
- Matt Duckham ([email protected])
- Yaguang Tao ([email protected])