Software

List of research software currently developed and maintained by Bernhard Rieder, Associate Professor in Media Studies at the University of Amsterdam and researcher with the Digital Methods Initiative.

Over the past several years, I have worked on a number of research tools, mainly for the data-driven analysis of social media platforms. My main goal is to gain a deeper understanding of the logics embedded in these platforms and their APIs, and I think that writing research tools is an excellent way to pursue this kind of exploration. Nothing beats first-hand experience.

Development and maintenance of some of these tools are financed by the Dutch Platform Digitale Infrastructuur Social Science and Humanities as part of the CAT4SMR project.

Most of these tools have basic descriptions or FAQ sections. There are also a number of instruction videos on my YouTube channel.

For the tools covered by the CAT4SMR project, we provide support through a subreddit and a Facebook Group. All other tools are provided "as is" and are not supported.

High-quality bug reports are much appreciated. If you have no experience with reporting bugs effectively, please read this piece, preferably twice. Submit bug reports via GitHub.

Data Extraction

DMI-TCAT

The Digital Methods Initiative Twitter Capture and Analysis Toolset, developed with Erik Borra and Emile den Tex, provides various ways to retrieve and collect tweets from Twitter, along with a number of modules to analyze tweet collections. Requires server installation.

source code

Netvizz

A tool that extracted data from different sections of the Facebook platform – in particular groups and pages – for research purposes from 2010 onward; due to API and ToS changes, the feature set changed over the years. Facebook removed page data access on Sept. 4, 2019, and Netvizz is no longer functional.

launch tool intro video

TumblrTool

A simple tool that gets posts tagged with a specific term and creates tabular statistics and co-tag networks.

launch tool source code
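
For readers unfamiliar with the term, a co-tag network links two tags whenever they appear on the same post, weighted by how often that happens. The following is a minimal Python sketch of that idea, not TumblrTool's actual code; the function name `co_tag_edges`, the input format (one list of tags per post), and the sample data are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_tag_edges(posts):
    """Count how often each pair of tags occurs together on one post.

    `posts` is an iterable of tag lists (hypothetical input format).
    Returns a Counter mapping alphabetically ordered (tag_a, tag_b)
    pairs to the number of posts carrying both tags.
    """
    edges = Counter()
    for tags in posts:
        # Normalize case and drop empty or duplicate tags within a post.
        unique = sorted({t.strip().lower() for t in tags if t.strip()})
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    # Made-up sample posts, each given as its list of tags.
    sample = [["cats", "gif", "funny"], ["cats", "funny"], ["gif", "art"]]
    for (a, b), weight in sorted(co_tag_edges(sample).items()):
        print(a, b, weight)
```

The resulting weighted pairs can be written out as an edge list and opened in a network analysis package.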

Visual Tagnet Explorer

A simple tool that gets media from Instagram tagged with a specific term or posted around a specific location and creates tabular statistics and co-tag networks. Since Instagram has changed its API regulations, this tool no longer works.

launch tool source code intro video

YouTube Data Tools

A collection of simple tools for extracting data from the YouTube platform via the YouTube API v3.

launch tool source code intro video

Analysis

LineMiner

A tool for analyzing textual data stored in timestamped lines of text (e.g. files from Netvizz, DMI-TCAT, etc.). Provides fast text searching and some statistical and visual text analysis. Work in progress. Requires server installation.

source code

RankFlow

A visualization tool for analyzing changes in ordered lists (e.g. rankings) over time.

launch tool source code
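
The data behind this kind of visualization can be thought of as a table giving each item's position in each successive list. The Python sketch below builds such a table; it is an illustration under assumed names (`rank_changes`, a list of rankings as input) and does not reflect how RankFlow itself is implemented.

```python
def rank_changes(snapshots):
    """Track each item's position across a sequence of ordered lists.

    `snapshots` is a list of rankings, each a list of items with the
    top-ranked item first (hypothetical input format).
    Returns a dict mapping every item to a list of its 1-based ranks,
    one per snapshot, with None where the item is absent.
    """
    # Collect items in order of first appearance.
    items = []
    for ranking in snapshots:
        for item in ranking:
            if item not in items:
                items.append(item)

    table = {}
    for item in items:
        table[item] = [
            ranking.index(item) + 1 if item in ranking else None
            for ranking in snapshots
        ]
    return table


if __name__ == "__main__":
    # Made-up rankings at two points in time.
    for item, ranks in rank_changes([["a", "b", "c"], ["b", "a", "d"]]).items():
        print(item, ranks)
```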

SimilarityNet

Calculates cosine similarity between lists of quantified variables (i.e. feature vectors) and outputs a similarity network.

launch tool source code
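
As a rough illustration of the calculation, cosine similarity is the dot product of two vectors divided by the product of their lengths, and a similarity network keeps only the pairs whose score passes some cutoff. The Python sketch below shows this under stated assumptions: the function names, the dictionary input format, and the `threshold` parameter are hypothetical and are not taken from SimilarityNet.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; treat it as 0 here.
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_network(vectors, threshold=0.5):
    """Build an edge list from a dict of name -> feature vector.

    Keeps only pairs whose similarity is at least `threshold`
    (the cutoff is an assumption made for this sketch).
    """
    names = sorted(vectors)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine_similarity(vectors[a], vectors[b])
            if sim >= threshold:
                edges.append((a, b, sim))
    return edges


if __name__ == "__main__":
    # Made-up feature vectors for three items.
    data = {"x": [1, 0], "y": [1, 0.1], "z": [0, 1]}
    for a, b, sim in similarity_network(data, threshold=0.9):
        print(a, b, round(sim, 3))
```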

Tag Cloud HTML Generator

Input tags and values in Wordle format to produce an HTML tag cloud or tag list.

launch tool

Textanalysis

Another small text analysis tool for emoji statistics and bigram/collocation extraction.

launch tool
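
Bigram extraction, in its simplest form, means counting pairs of adjacent words. The short Python sketch below shows that basic step only; the function name `bigram_counts` and the whitespace tokenization are simplifying assumptions and do not describe how the tool above processes text.

```python
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs in `text`.

    Tokenization is a plain lowercase whitespace split, which is a
    simplification: punctuation stays attached to words.
    """
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    counts = bigram_counts("the cat saw the cat")
    for pair, n in counts.most_common(3):
        print(pair, n)
```

Collocation measures build on these raw counts by comparing how often a pair occurs with how often its words occur separately.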

Scripts

MemeSpector

A simple PHP script for using Google's Vision API. Takes a comma- or tab-separated file containing a column with image URLs as input, sends images to the Vision API and puts the detected annotations back into the list.

source code

PiperScraper

A series of PHP scripts to scrape and analyze pipermail list archives.

source code

Reddit Tools

This is a collection of PHP command line scripts to grab data from Reddit and transform it into CSV files.

source code

Textprocessing

A (possibly growing) collection of basic Python scripts that interface common data with more complex forms of text processing.

source code

YouTube Transcript Scraper

Script that uses browser automation to click through the YouTube web interface and download the transcript file. A basic example for starting with Selenium.

source code

Funstuff

Spotify Artist Network

Creates networks of related artists, based on data from Spotify.

launch tool