Skip to content
This repository has been archived by the owner on Dec 19, 2024. It is now read-only.

Latest commit

 

History

History
61 lines (54 loc) · 4.12 KB

README.md

File metadata and controls

61 lines (54 loc) · 4.12 KB

local-branch-mapping

Code and documentation for the local branch mapping project, a cooperation of 510 and the International Federation of Red Cross and Red Crescent Societies (IFRC).

Authors: Jacopo Margutti, You Hu, Anna Pechina (2019)

Objectives

The technical goals of the project were to build tools and methodologies to automatise the following operations:

  1. Extract locations of local branches of a given Red Cross National Society (NS) from the web.
  2. Extract information on those branches (contacts, capacity, activities).
  3. Compare and contrast this information with (1) relevant needs of the given country and (2) the IFRC's strategy 2030.

The tools described belowed were built and tested studying 4 pilot countries:

  1. Malawi
  2. Guatemala
  3. Lebanon
  4. Netherlands

Data Sources

Introduction

The data sources investigated during this project were the following:

  1. HDX
  2. UNdata
  3. World Bank Open Data
  4. FDRS
  5. Specific datasets obtained from the NS
  6. Social media: Facebook and Twitter
  7. Google Maps
  8. OpenStreetMap
  9. GDELT (database of events, locations and organisation built from newspapers)
  10. National and regional online newspapers
  11. NS' website(s).

Source Selection

After reviewing the above sources, the sources #6, #7, #8, #10 and #11 were selected as most relevant and included in the following analysis. First observations:

  • Social media: contain information on locations and activities of some local branches
    • The fraction of local branches with a dedicated account on social media platforms varies from country to country
    • Facebook shows the most number of local branch pages, while Twitter usually hosts very few accounts
  • Google maps: contains locations and some information (contacts, opening hours) of some local branches
  • OpenStreetMap: contains locations of some local branches; less populated that Google maps.
  • National and local online newspapers: contain information on some local branch activities; difficult to unambigously associate one activity to one specific local branch, if not explicitely mentioned, but possible to guess by geographical proximity.
  • NS website(s): can contain locations and contact information of local branches, but not for every country: among the pilot countries, only the Netherlands and Guatemala had a dedicated page with such information.

The other sources were discarded for the following reasons:

  • Sources #1 to #4 do not contain any information on RC local branches.
  • Source #4 contains detailed information on the NS, but little on local branches (only the total number of them).
  • Source #5 was concluded to be out-of-scope, as it is impossible to automate the analysis of an arbitrary, possibly unstructured data set.
  • Source #9 was found to contain little information on RC local branches, as it is built mostly upon content produced by major international news agencies.

Data Collection

Data was collected from the aformentioned sources using the corresponding APIs (Application Programming Interfaces). Python scripts were developed to query and download data in an automated fashion; they can be found in the following directories, together with documentation on how to run them:

  • collect_social_media_data: Social media (Facebook, Twitter)
  • collect_google_maps_data: Google Maps
  • collect_openstreetmap_data: OpenStreetMap
  • collect_local_news: National and local online newspapers
  • collect_addresses_from_web: NS website(s)

Data Analysis

Code and methodologies developed to perform data analysis is stored in 3 separate directories, one for each of the following tasks

  1. merge_locations: merge locations from social media, Google Maps, OpenStreetMap and the web to assign a best estimate for the location of each local branch
  2. extract_topics: extract topics from social media content
  3. map_to_dashboard: map social media content and news articles to the dashboard mockup