Maybe you're interested About this site or in some of my Projects or Articles. You might even be interested About me or My Friends. If all else fails head back Home.

Morethanseven is where plays with the web

Archiving Twitter data with Python

November 23rd, 2007

Alex from Twitter just got round to adding the ability to export your entire archive of tweets via the API. A few people on the mailing list had been asking for this for a while so good to see it get released.

I couldn’t resist knocking together a very quick and simple Python script to go off and get all your tweets, presented here for anyone else to play around with. Note that simple, fast and works on my machine were watchwords here. Don’t expect fancy parameters, much error handling or artificial intelligence.

import urllib2
username = '<YOUR USERNAME>'
password = '<YOUR PASSWORD>'
format = 'json' # json or xml
filename = 'archive.json' # filename of the archive
tweets = 164 # number of tweets
pages = (int(float(tweets)/float(80)))+1
auth = urllib2.HTTPPasswordMgrWithDefaultRealm()
auth.add_password(None, 'http://twitter.com/account/', username, password)
authHandler = urllib2.HTTPBasicAuthHandler(auth)
opener = urllib2.build_opener(authHandler)
urllib2.install_opener(opener) 
i = 1
response = ''
print 'Downloading tweets. Note that this may take some time'
while i <= pages:
    request = urllib2.Request('http://twitter.com/account/archive.' \
    + format + '?page=' + str(i))
    response = response + urllib2.urlopen(request).read()
    i = i + 1
handle = open(filename,"w")
handle.write(response)
handle.close()
print 'Archived ' + str(tweets) + ' of ' + username + \
'\'s tweets to ' + filename

Download TwitterExporter Script

The main thing to note here though is the ideal of data portability. Let’s just say I wanted to move to Jaiku or Pownce instead of Twitter, but didn’t want to lose all that data. If I can knock up an export script in half an hour then all you need are import data API calls and a little more scripting.

As it is, I’m more interesting in seeing what I can do with my data now I can get at it. Brian just suggested a quick visualisation tool (which already eats JSON) and I’d already been thinking about language analysis and maybe having a look at APML some more. Open data is simply more fun.

Popularity: 6%

Tagged , , , , ,

2 Responses to “Archiving Twitter data with Python”

  1. [...] exported my Twitter data. Thinking about importing the tweets into the blog as quickie posts. « Quickie: 01/11/2008 [...]

  2. I like this idea, but any thoughts on how to automatically download all of your tweets? With the above code you have to know that number beforehand…

Leave a Reply