This set of scripts enriches citation data with links to DBpedia and Wikidata, then transforms the enriched data into a dynamic GML graph.
The citation data used by these scripts was downloaded from https://sites.google.com/site/vispubdata/home
It contains information on IEEE Visualization (IEEE VIS) publications from 1990-2020.
Data Citation: Petra Isenberg, Florian Heimerl, Steffen Koch, Tobias Isenberg, Panpan Xu, Chad Stolper, Michael Sedlmair, Jian Chen, Torsten Möller, and John Stasko. vispubdata.org: A Metadata Collection about IEEE Visualization (VIS) Publications. IEEE Transactions on Visualization and Computer Graphics, 23(9):2199–2206, September 2017. (doi: 10.1109/TVCG.2016.2615308)
The data pipeline is as follows:
Download from https://sites.google.com/site/vispubdata/home
↓
publications.csv --> Transform to JSON using CSVtoJSON.py
↓
publications.json --> Enrich with DBpedia and Wikidata links using get-concepts.py
↓
enriched-publications.json --> Transform JSON to XML using JSONtoXML.py
↓
enriched-publications.xml (intermediate file generated by JSONtoXML.py)
↓
enriched-publications-eprints-model.xml --> Transform to a dynamic co-concept graph in GML format using Pig Latin script eprints-items-publications-date-merged-edges.pig
↓
OUTPUT/merged-file-co_node-dynamic-gml-with_edge_labels-withheader.gml --> Open directly with Gephi, apply layout and visual mappings, save and export renders
↓
- enriched-dynamic-GML.gephi (saved Gephi file)
↓
- interactive folder contains the interactive visualization exported using the Gephi Sigma Exporter plugin
- renders folder contains exports of visualizations of the graph from Gephi
The interactive visualization allows online interaction with the graph (search, community display, zoom/pan, etc.). The Sigma export results are here:
- (English labels): http://photomedia.ca/visualizations/InfoViz/
- (French labels): http://photomedia.ca/visualizations/InfoViz-fr/
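The first transformation step of the pipeline (publications.csv to publications.json) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the actual contents of CSVtoJSON.py. The function name `csv_to_json` and the three sample columns are assumptions made for the example.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text (header row first) into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each CSV row becomes one JSON object keyed by the header names.
    records = [dict(row) for row in reader]
    return json.dumps(records, ensure_ascii=False, indent=2)


# Hypothetical two-row sample; the real publications.csv has more columns.
sample = (
    "Conference,Year,Title\n"
    "Vis,1990,Example paper A\n"
    'InfoVis,1995,"Example paper B, with a comma"\n'
)

if __name__ == "__main__":
    print(csv_to_json(sample))
```

All values stay strings after this conversion, so a field such as the year would need an explicit `int(...)` if later steps expect numbers.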
The renders folder contains some exported PNG files of visualizations of the GML graph:
https://github.com/photomedia/citationDataEnrichTransform/tree/main/renders
- Giant Component, Node Size mapped to Betweenness Centrality (BC) on a spline.
- Giant Component, Node Size mapped to Betweenness Centrality (BC) on a spline, Filter Nodes with BC greater than 0.01
- Giant Component, Node Size mapped to Betweenness Centrality (BC) on a spline, Filter Nodes with Degree greater than 10
- Giant Component, Node Size mapped to Betweenness Centrality (BC) on a spline, Filter only Concepts that are related by publications from more than 1 conference
- Temporal Filters
  - Filter leaving only concept relations that span 25 years or longer
  - Filter by time 1990-2000, 2000-2010, 2010-2020
  - Filter by time 1990-2000, 2000-2010, 2010-2020 and Duration of concept relations LESS than 10 years
  - Filter by time (2015-2020) and Duration of concept relations LESS than 5 years
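The renders listed above were produced with Gephi's own filters. As a rough plain-Python approximation of the temporal filters, the sketch below assumes each concept relation is an edge with a first and last year of co-occurrence, and that its duration is the difference between the two. The `Edge` type, the `filter_edges` function, and the sample edges are hypothetical and are not part of the repository.

```python
from typing import NamedTuple, Optional, Tuple


class Edge(NamedTuple):
    source: str
    target: str
    start: int  # first year the two concepts co-occur (assumed field)
    end: int    # last year the two concepts co-occur (assumed field)


def filter_edges(edges, window: Optional[Tuple[int, int]] = None,
                 min_duration: Optional[int] = None,
                 max_duration: Optional[int] = None):
    """Keep edges that overlap `window` and whose duration is within bounds.

    min_duration is inclusive ("N years or longer");
    max_duration is exclusive ("LESS than N years").
    """
    kept = []
    for e in edges:
        if window is not None:
            lo, hi = window
            # Drop edges whose interval lies entirely outside the window.
            if e.end < lo or e.start > hi:
                continue
        d = e.end - e.start
        if min_duration is not None and d < min_duration:
            continue
        if max_duration is not None and d >= max_duration:
            continue
        kept.append(e)
    return kept


# Hypothetical sample edges for illustration only.
edges = [
    Edge("Concept A", "Concept B", 1990, 2018),  # duration 28
    Edge("Concept A", "Concept C", 1996, 2001),  # duration 5
    Edge("Concept B", "Concept D", 2016, 2019),  # duration 3
]

long_lived = filter_edges(edges, min_duration=25)
recent_short = filter_edges(edges, window=(2015, 2020), max_duration=5)
```

Here `long_lived` keeps only the Concept A/Concept B relation, and `recent_short` keeps only the Concept B/Concept D relation. Gephi's dynamic range filter may treat interval endpoints differently, so results on the real graph could differ at the boundaries.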