List of research software currently developed and maintained by Bernhard Rieder, Associate Professor in Media Studies at the University of Amsterdam and researcher with the Digital Methods Initiative.
Over the last few years, I have been working on quite a number of research tools, mainly for the data-driven analysis of social media platforms. My main goal is to gain a deeper understanding of the logics embedded in these platforms and their APIs, and I think that writing research tools is an excellent way to pursue this kind of exploration. Nothing beats first-hand experience.
Development and maintenance of some of these tools are financed by the Dutch Platform Digitale Infrastructuur Social Science and Humanities as part of the CAT4SMR project.
Most of these tools have basic descriptions or FAQ sections. There are also a number of instruction videos on my YouTube channel.
For the tools covered by the CAT4SMR project, we provide support through a subreddit and a Facebook Group. All other tools are "as is" and are not supported.
High-quality bug reports are much appreciated. If you have no experience with reporting bugs effectively, please read this piece, preferably twice. Submit bug reports via GitHub.
The Digital Methods Initiative Twitter Capture and Analysis Toolset - developed with Erik Borra and Emile den Tex - offers various ways to retrieve and collect tweets from Twitter and a number of modules to analyze tweet collections. Requires server installation.
Available since 2010, this tool extracts data from different sections of the Facebook platform – in particular groups and pages – for research purposes; due to API and ToS changes, the feature set has changed over the years.
Facebook removed page data access on Sept. 4, 2019 and Netvizz is no longer functional.
A simple tool that gets posts tagged with a specific term and creates tabular statistics and co-tag networks.
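To make the idea of a co-tag network concrete, here is a minimal Python sketch. It is not the tool's own code: the function name `cotag_network` and the sample posts are made up for illustration, and I assume each post is simply available as a list of tags. Two tags are connected when they appear on the same post, and the edge weight counts how many posts they share.

```python
from collections import Counter
from itertools import combinations


def cotag_network(posts):
    """Return (tag_counts, edge_weights) for a list of per-post tag lists.

    Hypothetical helper for illustration; tags are lowercased and
    counted at most once per post.
    """
    tag_counts = Counter()
    edge_weights = Counter()
    for tags in posts:
        unique = sorted({t.lower() for t in tags})
        tag_counts.update(unique)
        # Every unordered pair of tags on the same post is one co-occurrence.
        edge_weights.update(combinations(unique, 2))
    return tag_counts, edge_weights


# Made-up sample data: one list of tags per post.
posts = [
    ["cats", "pets", "cute"],
    ["cats", "cute"],
    ["dogs", "pets"],
]

tag_counts, edge_weights = cotag_network(posts)
for (a, b), weight in sorted(edge_weights.items()):
    print(a, b, weight)
```

The tag counts correspond to the tabular statistics, and the weighted pairs can be written out as an edge list for a network analysis program.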
A simple tool that gets media from Instagram tagged with a specific term or posted around a specific location and creates tabular statistics and co-tag networks.
Since Instagram has changed its API regulations, this tool no longer works.
A collection of simple tools for extracting data from the YouTube platform via the YouTube API v3.
A tool for analyzing textual data stored in timestamped lines of text (e.g. files from Netvizz, DMI-TCAT, etc.). Provides fast text searching and some statistical and visual text analysis. Work in progress. Requires server installation.
A visualization tool for analyzing changes in ordered lists (e.g. rankings) over time.
Calculates cosine similarity between lists of quantified variables (i.e. feature vectors) and outputs a similarity network.
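As a rough illustration of the calculation, the following Python sketch computes cosine similarity between sparse feature vectors and keeps the pairs above a cutoff as network edges. The names `cosine` and `similarity_network`, the `threshold` parameter, and the sample vectors are my assumptions for this example and do not describe the tool's actual input format or options.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse vectors given as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # similarity with an all-zero vector is treated as 0
    return dot / (norm_u * norm_v)


def similarity_network(vectors, threshold=0.0):
    """Return (a, b, similarity) edges for all pairs above the threshold."""
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        s = cosine(vectors[a], vectors[b])
        if s > threshold:
            edges.append((a, b, s))
    return edges


# Made-up sample data: each item is described by weighted features.
vectors = {
    "page_a": {"politics": 3, "sports": 1},
    "page_b": {"politics": 6, "sports": 2},
    "page_c": {"music": 5},
}

for a, b, s in similarity_network(vectors, threshold=0.1):
    print(a, b, round(s, 3))
```

Items with proportional feature values (here `page_a` and `page_b`) end up with a similarity close to 1, while items that share no features get 0 and produce no edge.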
Input tags and values in wordle format to produce an HTML tag cloud or tag list.
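A small Python sketch of this kind of conversion follows. I assume here that "wordle format" means one `tag:value` pair per line; the function names, the linear font-size scaling, and the pixel range are illustrative choices and not necessarily what the tool does.

```python
import html


def parse_tags(text):
    """Parse lines of the assumed form 'tag:value' into (tag, value) pairs."""
    tags = []
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        if not sep or not name.strip():
            continue  # skip blank lines and lines without a 'tag:value' shape
        try:
            tags.append((name.strip(), float(value)))
        except ValueError:
            continue  # skip lines whose value is not a number
    return tags


def tag_cloud_html(tags, min_px=12, max_px=48):
    """Render tags as spans whose font size scales linearly with the value."""
    if not tags:
        return '<div class="tagcloud"></div>'
    lo = min(value for _, value in tags)
    hi = max(value for _, value in tags)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    parts = []
    for name, value in tags:
        size = min_px + (value - lo) / span * (max_px - min_px)
        parts.append(
            f'<span style="font-size:{size:.0f}px">{html.escape(name)}</span>'
        )
    return '<div class="tagcloud">' + " ".join(parts) + "</div>"


# Made-up sample input.
sample = "privacy:10\ndata:40\n\nnot a tag line\nR&D:25"
print(tag_cloud_html(parse_tags(sample)))
```

Escaping the tag names matters as soon as tags come from user-generated content, since they end up inside HTML.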
Another small text analysis tool for emoji statistics and bigram/collocation extraction.
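For readers who want to see what these two steps can look like, here is a hedged Python sketch. It is not the tool's implementation: emoji detection is approximated by the Unicode category "So" (other symbols), which misses skin-tone modifiers and joined sequences, and collocations are scored with pointwise mutual information (PMI), one common measure among several. The function names and sample texts are made up.

```python
import math
import re
import unicodedata
from collections import Counter


def emoji_counts(text):
    """Count characters in Unicode category 'So' as a rough emoji proxy."""
    return Counter(ch for ch in text if unicodedata.category(ch) == "So")


def bigram_pmi(texts, min_count=2):
    """Return (bigram_counts, pmi_scores) over a list of texts.

    Only bigrams seen at least min_count times get a PMI score.
    """
    unigrams, bigrams = Counter(), Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return bigrams, scores


# Made-up sample data.
texts = [
    "social media research 😀",
    "social media platforms 😀 ❤",
    "media studies",
]

bigrams, scores = bigram_pmi(texts)
print(bigrams.most_common(3))
print(scores)
print(emoji_counts(" ".join(texts)))
```

A positive PMI means the two words occur together more often than their individual frequencies would suggest; the `min_count` cutoff keeps rare pairs from dominating the ranking.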
A simple PHP script for using Google's Vision API. Takes a comma- or tab-separated file containing a column with image URLs as input, sends images to the Vision API and puts the detected annotations back into the list.
A series of PHP scripts to scrape and analyze pipermail list archives.
This is a collection of PHP command-line scripts to grab data from Reddit and transform it into CSV files.
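The transformation step can be sketched in a few lines of Python. This is not a port of the PHP scripts: it only assumes the general shape of a Reddit listing (a `data` object holding a `children` array, each child carrying its own `data` object), uses a hand-written sample instead of a live API call, and the selected fields and the name `listing_to_csv` are my own choices.

```python
import csv
import io

# Hypothetical selection of post fields to export.
FIELDS = ["id", "author", "title", "score", "created_utc"]


def listing_to_csv(listing, fields=FIELDS):
    """Flatten a Reddit-style listing dict into CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({f: post.get(f, "") for f in fields})
    return buf.getvalue()


# Hand-written sample in the assumed listing shape (not real data).
sample_listing = {
    "kind": "Listing",
    "data": {
        "children": [
            {"kind": "t3", "data": {"id": "abc", "author": "alice",
                                    "title": "Hello, world", "score": 12,
                                    "created_utc": 1500000000}},
            {"kind": "t3", "data": {"id": "def", "title": "No author here",
                                    "score": 3}},
        ]
    },
}

csv_text = listing_to_csv(sample_listing)
print(csv_text)
```

Using `csv.DictWriter` takes care of quoting, so titles containing commas or quotation marks still end up in a single column.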
A (possibly growing) collection of basic Python scripts that interface common data with more complex forms of text processing.
Script that uses browser automation to click through the YouTube web interface and download the transcript file. A basic example for starting with Selenium.