Using a collection of maps from the published literature, this repository forms the basis of a full content analysis of graphical representations of spatio-temporal data in the scientific literature.
Because the content analysis was largely performed on copyrighted material from the literature, we do not include figures directly in this repository. Instead, we link to journal article DOIs and indicate figure numbers in a data file. All subsequent analysis is publicly available, and described below in the
This project is an open project, and contributions are welcome from any individual. All contributors to this project are bound by a code of conduct. Please review and follow this code of conduct as part of your contribution.
Issues and bug reports are always welcome. Code clean-up and feature additions can be submitted through pull requests to project forks or branches.
All products of the Throughput Annotation Project are licensed under an MIT License unless otherwise noted.
This project is intended to function as an RMarkdown file that will render the paper manuscript for publication. To facilitate editing and the use of continuous integration tools, we use a Makefile as well as a bash script that will render the RMarkdown file to an HTML document, suitable for viewing in web applications. The ultimate publication branch of this repository will be used for rendering to PDF/DOCX format for submission to the publisher.
You can find an html rendering of the data analysis at https://fcgi.github.io/ContentAnalysis/paleomultivar.html.
To render the RMarkdown document, use the Makefile included with this repository:
make content
The Makefile can also be used to clean supporting files out of the repository (make clean), and can render both the paper and a version of the paper (using knitr::purl()) that consists only of the R code (make localbuild) to ease debugging.
We use a bash script (autobuild.sh) to monitor changes in the RMarkdown file. This script checks the current timestamp of the Rmd file against the last saved version. If there is a difference between the two, it will re-build the RMarkdown document. This allows a user to see their changes almost instantly, without having to stop their writing process.
bash autobuild.sh paleomultivar.Rmd
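The timestamp check described above can be sketched roughly as follows. This is a minimal illustration and not the contents of autobuild.sh itself: the function names (file_mtime, has_changed, watch_and_build), the two-second polling interval, and the direct call to rmarkdown::render() are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a timestamp-polling rebuild loop.
# It is NOT the project's autobuild.sh; all names here are illustrative.

# Print a file's modification time in seconds since the epoch.
# GNU stat uses `-c %Y`; BSD/macOS stat uses `-f %m`.
file_mtime() {
  stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

# Succeed (exit status 0) when the file's current mtime differs
# from the previously recorded value passed as the second argument.
has_changed() {
  local file="$1" last_seen="$2"
  [ "$(file_mtime "$file")" != "$last_seen" ]
}

# Poll the file and re-render it whenever its timestamp changes.
# The render command is an assumption; the real script may call make instead.
watch_and_build() {
  local file="$1" last_seen=""
  while true; do
    if has_changed "$file" "$last_seen"; then
      last_seen="$(file_mtime "$file")"
      Rscript -e "rmarkdown::render('$file')"
    fi
    sleep 2  # assumed polling interval
  done
}

# Start watching only when a file name is supplied on the command line.
if [ "$#" -gt 0 ]; then
  watch_and_build "$1"
fi
```

Polling the modification time keeps the sketch dependency-free; a tool that listens for file-system events would react faster but would need to be installed separately on each platform.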
The project uses a hand-coded XLSX file as the basis for analysis. The XLSX file is structured in such a way as to facilitate hand-encoding. This file is then processed into CSV files using an R script, saved in the R folder.
Statistical analysis and figure generation are then performed within an RMarkdown document in the home directory of this repository.
This workflow allows us to run a single bash script to convert the XLSX file to a useful CSV format, perform key analysis, and knit the final report to an HTML/PDF/DOCX format in preparation for submission.
This project is developed using pandoc, with figures and analysis performed using R (RMarkdown). It should be possible to render the document using all operating systems. Continuous integration uses TravisCI.
The project pulls data from figures in journal publications. These figures are then assigned parameters based on content analysis following methods outlined in the RMarkdown document based on Rose 2012 and Muehlenhaus 2012 (among others cited within the paper).
This repository is intended to be the canonical repository for a paper intended for publication. Key outputs include the paper in markdown and PDF format, figures for publication, key data outputs, and the full bibliography of papers used herein.
This project is to be evaluated using the following metrics:
- Number of unique images analysed.
- Publication of a peer-reviewed journal article.