DBPedia automatically extracts data from Wikipedia and may contain links to the municipalities' websites.
The Portuguese language DBPedia extracts data from the Portuguese language Wikipedia, while the English language DBPedia extracts data from the English language Wikipedia. We are going to query them both.
The Portuguese language DBPedia does not use the dbo:country property, so getting Brazilian cities is a little tricky. Instead, we keep only the cities whose page links to the wiki page "States of Brazil", and treat that link as a sign that the city is located in Brazil.
The foaf:homepage property is rarely used, so we also have to fall back on the dbo:wikiPageExternalLink property. Keep in mind that this pollutes the results with pages that are not the official website of the municipality, so we need to filter them out somehow. The simplest way of doing that is a SPARQL FILTER clause that keeps only links containing .gov.br. Unfortunately, some municipality websites do not follow that convention and will be missing from the results.
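The effect of the .gov.br filter can be tried out offline before running any query. The snippet below is a minimal sketch in Python; the helper name is_government_link and the example URLs are made up for illustration. It also shows why escaping the dots matters: an unescaped "." in a regular expression matches any character.

```python
import re

# Same idea as the SPARQL FILTER: keep only links that contain ".gov.br".
# The dots are escaped so that they match a literal "." and nothing else.
GOV_BR = re.compile(r"\.gov\.br")


def is_government_link(url: str) -> bool:
    """Return True when the URL contains the literal text '.gov.br'."""
    return GOV_BR.search(url) is not None


# Hypothetical links, for illustration only.
links = [
    "http://www.example.gov.br/",
    "http://www.example.com.br/",
    "http://www.examplegovxbr.com/",
]

official = [link for link in links if is_government_link(link)]

# The unescaped pattern ".gov.br" would also accept the third link,
# because each "." is allowed to match any character.
loose = [link for link in links if re.search(".gov.br", link)]
```

Here official keeps only the first link, while loose also lets the third one through.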
The following SPARQL query will extract links from the Portuguese language DBPedia:
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX dbp:<http://dbpedia.org/property/>
PREFIX dbr:<http://dbpedia.org/resource/>
PREFIX foaf:<http://xmlns.com/foaf/0.1/>

SELECT ?city ?name ?state ?link WHERE {
  ?city a dbo:City ;
        dbo:wikiPageWikiLink dbr:States_of_Brazil ;
        dbo:wikiPageExternalLink|foaf:homepage ?link .
  # The dots are escaped so that they only match a literal "."
  FILTER REGEX(STR(?link), "\\.gov\\.br")
  OPTIONAL { ?city rdfs:label ?name }
  OPTIONAL { ?city dbp:estado ?state }
}
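A query like this one can be sent to a SPARQL endpoint over plain HTTP. The following is a rough sketch using only the Python standard library. It assumes that the Portuguese language DBPedia endpoint lives at http://pt.dbpedia.org/sparql and that it honours the standard SPARQL Protocol (a query parameter plus an Accept header). The helper names build_request and run_query are invented for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the Portuguese language DBPedia; adjust if it differs.
ENDPOINT = "http://pt.dbpedia.org/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return one plain dict per result row.

    Each dict maps a SELECT variable name to its value; variables that
    are unbound in a row are simply absent from that row's dict.
    """
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        payload = json.load(response)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]
```

With the query text stored in a string, run_query(query_text) would return rows such as {"city": ..., "link": ...}. It needs network access, and public endpoints may limit the size of the result set.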
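The query for the English language DBPedia is a little more complicated than the one for the Portuguese language DBPedia: while the data is more structured, we cannot always get the state of a city directly. Some cities are linked to a state page whose URI has since changed, some are linked to another city instead of a state, and others have no state information assigned at all.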
At least for filtering by country we can simply use the dbo:country property to determine that a city is located in Brazil.
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:<http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo:<http://dbpedia.org/ontology/>
PREFIX dbr:<http://dbpedia.org/resource/>
PREFIX dbp:<http://dbpedia.org/property/>
PREFIX foaf:<http://xmlns.com/foaf/0.1/>
PREFIX yago:<http://dbpedia.org/class/yago/>

SELECT ?city ?name ?state_abbr ?state_name ?link ?external_link WHERE {
  ?city a dbo:City ;
        dbo:country dbr:Brazil .
  OPTIONAL {
    ?city foaf:homepage ?link .
  }
  OPTIONAL {
    ?city dbo:wikiPageExternalLink ?external_link .
    # The dots are escaped so that they only match a literal "."
    FILTER REGEX(STR(?external_link), "\\.gov\\.br")
  }
  OPTIONAL {
    ?city rdfs:label ?name .
    FILTER(LANG(?name) = "" || LANGMATCHES(LANG(?name), "pt"))
  }
  OPTIONAL {
    ?city dbo:isPartOf ?state .
    ?state a yago:WikicatStatesOfBrazil .
    ?state dbp:coordinatesRegion ?state_abbr .
  }
  OPTIONAL {
    ?city dbo:isPartOf ?state .
    ?state a yago:WikicatStatesOfBrazil .
    ?state rdfs:label ?state_name .
    FILTER(LANG(?state_name) = "" || LANGMATCHES(LANG(?state_name), "pt"))
  }
  OPTIONAL { # cities linked to a state whose URI has changed
    ?city dbo:isPartOf ?state_old_page .
    ?state_old_page dbo:wikiPageRedirects ?state .
    ?state a yago:WikicatStatesOfBrazil .
    ?state dbp:coordinatesRegion ?state_abbr .
  }
  OPTIONAL { # cities wrongfully linked to a city instead of state
    ?city dbo:isPartOf ?other_city .
    ?other_city dbo:isPartOf ?state .
    ?state a yago:WikicatStatesOfBrazil .
    ?state dbp:coordinatesRegion ?state_abbr .
  }
}
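Because every OPTIONAL block can match more than once, this query may return several rows for the same city, with the homepage in ?link and the filtered external links in ?external_link. One possible way to collapse those rows into a single record per city is sketched below. The function name merge_rows, the record layout and the sample rows are all made up for illustration; the rows are assumed to be plain dicts keyed by the SELECT variable names, with unbound variables left out.

```python
def merge_rows(rows):
    """Collapse result rows into one record per city URI."""
    cities = {}
    for row in rows:
        record = cities.setdefault(
            row["city"],
            {"name": None, "state_abbr": None, "state_name": None, "links": []},
        )
        # Keep the first non-empty value seen for each descriptive field.
        for key in ("name", "state_abbr", "state_name"):
            if record[key] is None and row.get(key):
                record[key] = row[key]
        # Collect homepage and external links together, without duplicates.
        for key in ("link", "external_link"):
            url = row.get(key)
            if url and url not in record["links"]:
                record["links"].append(url)
    return cities


# Hypothetical rows, for illustration only.
sample_rows = [
    {
        "city": "http://dbpedia.org/resource/Example_City",
        "name": "Example City",
        "state_abbr": "BR-XX",
        "external_link": "http://www.example.gov.br/",
    },
    {
        "city": "http://dbpedia.org/resource/Example_City",
        "name": "Example City",
        "link": "http://www.example.gov.br/",
        "external_link": "http://turismo.example.gov.br/",
    },
]

merged = merge_rows(sample_rows)
```

In this sketch a city that never gets a state keeps None in both state fields, which makes the cities without state information easy to spot afterwards.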