This is a series of IPython notebooks for analyzing Big Data, specifically Twitter data, using Python’s powerful pandas (Python Data Analysis) library. Through these tutorials I’ll walk you through how to analyze your raw social media data using a typical social science approach. The target audience is those who are interested in covering […]
Do I Need to Learn Programming to Download Big Data?
You want to download and analyze “Big Data” — such as messages or network data from Twitter or Facebook or Instagram. But you’ve never done it before, and you’re wondering, “Do I need to learn computer programming?” Here are some decision rules, laid out in the form of brief case studies. One-Shot Download with Limited […]
Producing a Summary Statistics Table in iPython using PANDAS
Below is an embedded version of an iPython notebook I have made publicly available on nbviewer. To download a copy of the code, click on the icon with three horizontal lines at the top right of the notebook (just below this paragraph) and select “Download Notebook.” I hope you find it helpful. If so, please […]
iPython Notebook and PANDAS Cookbook
More and more of my research involves some degree of ‘Big Data’ — typically datasets with a million or so tweets. Getting these data prepped for analysis can involve massive amounts of data manipulation — anything from aggregating data to the daily or organizational level, to merging in additional variables, to generating data required for […]
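As a rough illustration of the kind of manipulation described above, here is a minimal sketch that aggregates tweet-level rows to the organization-day level and then merges in an additional variable. The data and the column names (`org`, `created_at`, `retweets`, `followers`) are invented for the example and are not taken from the cookbook itself.

```python
import pandas as pd

# Hypothetical tweet-level data; values and column names are assumptions.
tweets = pd.DataFrame({
    "org": ["A", "A", "B", "B", "B"],
    "created_at": pd.to_datetime([
        "2014-11-20 09:15", "2014-11-20 17:40", "2014-11-20 11:05",
        "2014-11-21 08:30", "2014-11-21 22:10",
    ]),
    "retweets": [3, 0, 5, 1, 2],
})

# Aggregate from the tweet level to the organization-day level.
tweets["date"] = tweets["created_at"].dt.date
daily = (
    tweets.groupby(["org", "date"])
    .agg(n_tweets=("retweets", "size"), total_retweets=("retweets", "sum"))
    .reset_index()
)

# Merge in an additional organization-level variable (also hypothetical).
orgs = pd.DataFrame({"org": ["A", "B"], "followers": [1200, 560]})
merged = daily.merge(orgs, on="org", how="left")

print(merged)
```

A left merge keeps every organization-day row even if an organization is missing from the second table, which is usually the safer default when adding variables to an existing dataset.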
Random Python Recipes
This page is mostly for me as a handy reference for all those Python commands I tend to forget. That said, if it proves helpful to any others, all the better! Lists: create list by slicing items of another list; combine two lists into a dictionary; get list of all files in a directory; find […]
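The recipes themselves are not shown in this excerpt, so the following is only one plausible way to write the three list operations named above, using nothing but the standard library. The variable names and sample values are made up for illustration.

```python
import os

# Create a list by slicing items of another list.
letters = ["a", "b", "c", "d", "e"]
first_three = letters[:3]

# Combine two lists into a dictionary (keys from one, values from the other).
keys = ["x", "y", "z"]
values = [1, 2, 3]
lookup = dict(zip(keys, values))

# Get a list of all files (not subdirectories) in the current directory.
files = sorted(f for f in os.listdir(".") if os.path.isfile(f))

print(first_three)
print(lookup)
print(files)
```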
Python Tutorials for Downloading Twitter Data
I often get requests to explain how I obtained the data I used in a particular piece of academic research. I am always happy to share my code along with my data. Having been through the learning process myself about 5 years ago, I understand the confusion and frustration that can go along with learning […]
Setting up Your Computer to Use My Python Code for Downloading Twitter Data
I frequently get requests for how to download social media data in general, as well as for help on how to run code I have written to download and analyze the data for a particular piece of research. Often, these requests are from people who are excited about doing social media research but […]
Tag Cloud Tutorial
In this post I’ll provide a brief tutorial on how to create a tag cloud, as seen here. First, this assumes you have downloaded a set of tweets into an SQLite database. If you are using a different database please modify accordingly. Also, to get to this stage, work through the first 8 tutorials listed […]
#ARNOVA14 – Tag cloud of ARNOVA 2014 Tweets
Word clouds are generally not all that helpful given how the words are taken out of their context (sentences). In certain settings, however, they do provide meaningful information. Hashtags are one of those contexts — they are meant to be single words. The tags denote ideas or topics or places. By examining the hashtags, we […]
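To make the hashtag idea concrete, here is a small sketch of how hashtag frequencies (the input to a tag cloud) might be counted. It is not the code used for the ARNOVA cloud; the sample tweets are invented, and the simple `#\w+` pattern is an assumption that will not match every hashtag Twitter recognizes.

```python
import re
from collections import Counter

# Invented sample tweets, for illustration only.
tweets = [
    "Great panel on nonprofit advocacy #ARNOVA14 #nonprofit",
    "On my way to the conference #arnova14",
    "Slides from my talk on #socialmedia and #nonprofit boards #ARNOVA14",
]

# Simplified hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")

# Lowercase tags so that #ARNOVA14 and #arnova14 are counted together.
counts = Counter(
    tag.lower() for text in tweets for tag in HASHTAG.findall(text)
)

top = counts.most_common(2)
print(top)
```

The resulting counts can then be passed to whatever tool draws the cloud, with font size scaled by frequency.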
Your First Steps with Python: Part II — Four Ways to Run your Code
This is the second in a series of posts to get you up and running on Python. In the first post I showed you which version of Python to install, how to check that the installation succeeded, and how to type in and run your first simple Python command. In this tutorial I will show […]