On May 10, 2018, Democrats on the United States House Intelligence Committee released more than 3,500 ads created by the Internet Research Agency between 2015 and 2017. The Internet Research Agency is believed to have created these ads to influence the outcome of the 2016 United States presidential election and, more broadly, Americans' political views.
You can download this data here. It's also mirrored via Git Large File Storage (LFS) in this repository: https://github.com/russian-ad-explorer/russian-ad-pdfs.
There are five datasets collected here:
- Text files extracted from these PDFs using pdftotext.
- Images extracted from these PDFs using pdfimages and cropped using ImageMagick.
- A JSON file organizing some of the attributes found in the above datasets. Documentation about how this dataset was created, and the meaning of the keys, is coming soon.
- A dataset of thumbnails for easy loading on the Russian Ad Explorer.
- Two CSV files containing some informal categorizations of some of the "Targeted Interests" and "Location" categories, for use in the Russian Ad Explorer web app. You can read more about how these categories were chosen in the About section of that page.
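The text and image extraction described in the list above could be reproduced roughly as follows. This is a minimal sketch, not the script actually used to build these datasets: the function names, the output layout, and the PNG output format are hypothetical, and the `-trim +repage` cropping options are an assumption, since the exact ImageMagick settings are not documented here. It assumes `pdftotext`, `pdfimages` (both from Poppler), and ImageMagick's `convert` are installed and on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def text_command(pdf: Path, out_dir: Path) -> List[str]:
    """Build the command: pdftotext <input.pdf> <output.txt>."""
    return ["pdftotext", str(pdf), str(out_dir / f"{pdf.stem}.txt")]


def image_command(pdf: Path, out_dir: Path) -> List[str]:
    """Build the command: pdfimages -png <input.pdf> <output-prefix>."""
    # pdfimages writes files named <prefix>-000.png, <prefix>-001.png, ...
    return ["pdfimages", "-png", str(pdf), str(out_dir / pdf.stem)]


def crop_command(image: Path) -> List[str]:
    """Build an ImageMagick command that crops an image in place."""
    # Assumption: -trim removes uniform borders and +repage resets the canvas.
    # The cropping actually applied to this dataset may have been different.
    return ["convert", str(image), "-trim", "+repage", str(image)]


def extract(pdf: Path, out_dir: Path) -> None:
    """Extract text and images from one PDF, then crop each extracted image."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(text_command(pdf, out_dir), check=True)
    subprocess.run(image_command(pdf, out_dir), check=True)
    for image in sorted(out_dir.glob(f"{pdf.stem}-*.png")):
        subprocess.run(crop_command(image), check=True)
```

To process one file, call `extract(Path("pdfs/example.pdf"), Path("extracted"))`, where both paths are placeholders. Keeping the command construction separate from execution makes it easy to print and inspect each command before running it over thousands of PDFs.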
I've mirrored some of the datasets on Google Drive. You can download the extracted images here and the downloaded text here.