Data Planes

Here you can find data and reproducible code for some of my projects. Feel free to use the code as is or to modify (and improve) it for your own purposes.

Please do not hesitate to get in touch if you encounter any errors.


Bibliometric & network analyses

Collaboration networks & open access (code and data for analyzing Open Science publications indexed in Dimensions; work in progress)

Renaissance Florence (datasets containing relational and attribute data on Florentine families for the years 1426-34)


Crime analyses

History of the death penalty (dataset and script for scraping, preprocessing, mapping, and visualizing historical data on executions in the United States, 1801-1900)

Serial killers (dataset and scripts for scraping, preprocessing, geocoding, and analyzing Wikipedia data on international and US serial killers for the years 1435-2013)


Data visualizations

Airline flight routes (script for mapping airline routes using OpenFlights data combined with NASA's night lights images)

#TidyTuesday (collection of side projects to create some fun data viz)


Interactive dashboards

GitHub statistics (dashboard showing the Open Science MOOC's interactive collaboration network as well as repository statistics and user activities)

Project KillR (dashboard featuring data, statistics, and interactive visualizations on 576 serial killers from 51 countries for the years 1435-2013)


Scraper functions

Amazon data (functions to retrieve (1) product information for items in best seller lists and (2) customer reviews for one or more products, taking either product IDs (ASINs) or URLs as input)

Aviation Safety Network (scraper for aviation accident data provided in the Aviation Safety Network database, covering the years 1919-2019)

Plane Crash Info (scraper for aviation accident data provided in the Plane Crash Info database, covering the years 1920-2019)


Social media analyses

Reddit: Today I Learned (script for scraping, preprocessing, and mining user comments from the TIL subreddit)

Social activism on Twitter (Markdown document containing text and code for analyzing the development, key actors, and contents of the #We2 movement)

Twitter follower analysis (Markdown replication script and data for analyzing the Open Science MOOC Twitter community)

Viewer engagement on YouTube (replication code and materials for analyzing YouTube comments and video statistics on the Florida High School Shooting in February 2018)


Spatial analyses

Aviation accidents (replication materials for mapping and analyzing crashes in Florida in 2014 using point pattern analyses) [script slightly outdated]

Road accidents (script for geocoding, mapping, and analyzing crashes in South Australia in 2016)

Spatial gravity models (replication materials for building dyadic data sets and conducting predictive spatial gravity analyses)


Text analyses

Movie scripts (script for preprocessing and analyzing PDF files using the screenplay of The Room)

Music lyrics (script for preprocessing, analyzing, and visualizing STARSET album lyrics)