Data Planes

Here you can find data and reproducible code for some of my projects. Feel free to use the code as is or to modify (and improve) it for your own purposes.

Please do not hesitate to get in touch if you encounter any errors.


Accident analyses

Aviation accidents (replication data; script for projecting, mapping, and analyzing crashes in Florida in 2014 using point pattern analyses) [script slightly outdated]

Road accidents (script for geocoding, mapping, and descriptively analyzing crashes in South Australia in 2016)


Aviation accident database scrapers

Aviation Safety Network (scraper for the aviation accident data provided in the Aviation Safety Network database; covers the years 1919-2019)

Plane Crash Info (scraper for the aviation accident data provided in the Plane Crash Info database; covers the years 1920-2019)


Amazon scraper

Amazon scraper (scraping functions for customer reviews from URLs and for product information on items from best seller lists)

Customer reviews (script for automated scraping of all customer reviews for one or more Amazon products, given their product IDs (ASINs) as input)


Bibliometric analysis

Gender & authorship (replication data and code for analyzing bibliometric data from the Web of Science with regard to gender differences in Computational Social Science publications)


Crime analyses

Death penalty timeline (dataset and script for scraping, wrangling, mapping, and visualizing historical data on executions in the United States, 1801-1900)

Serial killers (dataset containing scraped Wikipedia data on international serial killers; scripts for scraping, geocoding, visualizing, and mapping data on both international and US serial killers)


Network data

Renaissance Florence (datasets containing relational and attribute data on Florentine families for the years 1426-34)


Spatial analyses

Airline flight routes (script for mapping airline routes using OpenFlights data in combination with NASA's night lights images)

Spatial gravity models (replication data; scripts for building dyadic datasets and conducting predictive spatial gravity analyses)


Text mining

Classic literature (script for preprocessing and analyzing public domain works using the example of Bram Stoker's Dracula)

Movie scripts (script for preprocessing and analyzing PDF files using the screenplay of The Room)

Music lyrics (script for preprocessing, analyzing, and visualizing song lyrics using STARSET albums)

Reddit threads (script for scraping, preprocessing, and mining user comments from the Today I Learned subreddit)


Twitter data

Mapping user locations (script for scraping tweets containing the hashtag #MeToo and for extracting, geocoding, and mapping user locations using the Google Maps API)

Timeline analysis (script for scraping, cleaning, and analyzing tweets from Donald Trump's timeline, e.g., via sentiment analysis and publication statistics)


YouTube analysis

Viewer engagement (replication code and additional materials for analyzing YouTube comments and video statistics on the Florida High School Shooting in February 2018)