This page contains brief (generally one-liner) blocks of code for working with Python and PANDAS for data analytics. I created it as a handy reference for PANDAS commands I tended to forget when I was learning. I hope it proves useful to you, too! I also have a page with longer data analytics tutorials. Table […]
Python Data Analytics Tutorials
The bulk of my research involves some degree of ‘Big Data’ — such as datasets with a million or more tweets. Getting these data prepped for analysis can involve massive amounts of data manipulation — anything from aggregating data to the daily or organizational level, to merging in additional variables, to generating data required for […]
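As a rough illustration of the kind of data prep described above, here is a minimal sketch of aggregating tweet-level data to the organization-day level and then merging in an extra variable. The data and the column names (`org`, `created_at`, `retweets`, `sector`) are made up for this example and are not taken from the tutorials themselves.

```python
import pandas as pd

# Hypothetical tweet-level data; all column names are illustrative assumptions.
tweets = pd.DataFrame({
    "org": ["A", "A", "B", "B", "B"],
    "created_at": pd.to_datetime([
        "2014-01-01 09:00", "2014-01-01 17:30", "2014-01-01 12:00",
        "2014-01-02 08:15", "2014-01-02 21:45",
    ]),
    "retweets": [3, 1, 0, 5, 2],
})

# Drop the time of day so that tweets can be grouped by calendar date.
tweets["date"] = tweets["created_at"].dt.normalize()

# Aggregate to one row per organization per day.
daily = tweets.groupby(["org", "date"], as_index=False).agg(
    n_tweets=("retweets", "size"),
    total_retweets=("retweets", "sum"),
)

# Merge in an organization-level variable (hypothetical lookup table).
orgs = pd.DataFrame({"org": ["A", "B"], "sector": ["health", "arts"]})
daily = daily.merge(orgs, on="org", how="left")

print(daily)
```

A left merge keeps every organization-day row even when the lookup table has no match for an organization, which makes missing values easy to spot afterwards.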
Downloading Tweets, Take III – MongoDB
In this tutorial I walk you through how to use Python and MongoDB to download tweets from a list of Twitter users. This tutorial builds on several recent posts on how to use Python to download Twitter data. Specifically, in a previous post I showed you how to download tweets using Python and an SQLite […]
SQLite vs. MongoDB for Big Data
In my latest tutorial I walked readers through a Python script designed to download tweets by a set of Twitter users and insert them into an SQLite database. In this post I will provide my own thoughts on the pros and cons of using a relational database such as SQLite vs. a “noSQL” database such […]
Downloading Tweets – Take II
The goal of this post is to walk you through a Python script designed to download tweets by a set of Twitter users and insert them into an SQLite database. In a previous post I supplied a brief, temporary attempt at providing an overview of how to download tweets sent by a list of Twitter […]
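To give a flavor of the insert step, here is a minimal sketch that uses Python's built-in `sqlite3` module. It is not the script from the post: the table layout, the field names, and the sample tweets are assumptions for illustration, and an in-memory database stands in for a file on disk.

```python
import sqlite3

# Hypothetical tweets, as they might look after parsing an API response.
tweets = [
    {"id": 1, "screen_name": "org_a", "text": "Hello world"},
    {"id": 2, "screen_name": "org_b", "text": "Second tweet"},
    {"id": 1, "screen_name": "org_a", "text": "Hello world"},  # duplicate download
]

# In-memory database for the example; pass a filename to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS tweets ("
    "id INTEGER PRIMARY KEY, screen_name TEXT, text TEXT)"
)

# INSERT OR IGNORE skips rows whose primary key is already in the table,
# so re-downloading the same tweet does not create a duplicate row.
with conn:
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, screen_name, text) "
        "VALUES (:id, :screen_name, :text)",
        tweets,
    )

row_count = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
print(row_count)
```

Using the tweet ID as the primary key is one simple way to keep repeated downloads from piling up duplicate rows.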
Using Your Twitter API Key
Below is an embedded version of an iPython notebook I have made publicly available on nbviewer. To download a copy of the code, click on the icon with three horizontal lines at the top right of the notebook (just below this paragraph) and select “Download Notebook.” I hope you find it helpful. If so, please […]
Setting up Access to the Twitter API
The Twitter API (application programming interface) is your gateway to accessing Twitter data. The image above shows a screenshot of Twitter’s Search API, just one of the key parts of the API you might be interested in. To access any of them you’ll need to have a password. So, in this post I’m going to […]
Analyzing Big Data with Python PANDAS
This is a series of iPython notebooks for analyzing Big Data — specifically Twitter data — using Python’s powerful PANDAS (Python Data Analysis) library. Through these tutorials I’ll walk you through how to analyze your raw social media data using a typical social science approach. The target audience is those who are interested in covering […]
Do I Need to Learn Programming to Download Big Data?
You want to download and analyze “Big Data” — such as messages or network data from Twitter or Facebook or Instagram. But you’ve never done it before, and you’re wondering, “Do I need to learn computer programming?” Here are some decision rules, laid out in the form of brief case studies. One-Shot Download with Limited […]
Levels of Analysis in Big Data
So you want to download “Big Data.” You could be a social scientist wanting to take your first stab at downloading and analyzing 100 organizations’ worth of tweets. Or a marketing or public relations practitioner interested in analyzing Facebook or Instagram or Pinterest or YouTube activity by your competitors. Or a budding data scientist interested in […]