Our work with Twitter data builds on a number of tools. Many posts on the blog describe how we’re using them. Here are our key tools:
Data Gathering
yourTwapperkeeper – an open source platform building on the popular Twapperkeeper Web service. Both capture tweets containing particular hashtags or keywords.
We have extended and modified the yourTwapperkeeper (yTK) platform to ensure compatibility between Twapperkeeper (TK) and yTK datasets, and to allow data to be exported in comma- and tab-separated formats. These modifications are described here; the modified yTK PHP scripts are available here:
yTK-modifications-v1.0.zip (9.4 kB) – v1.0, released 20 June 2011
Data Processing
Gawk – an open source, multiplatform, programmable command-line tool for processing CSV/TSV documents; essential for manipulating the datasets produced by our gathering tools.
We have developed a number of Gawk scripts for processing Twitter datasets in Twapperkeeper format. Many of the individual scripts are discussed on the blog; the current collection can be downloaded here:
Gawk-Twitter-scripts-v1.0.zip (23.7 kB) – v1.0, released 22 June 2011
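To give a sense of the kind of processing these scripts perform, here is a minimal sketch that counts tweets per user in a comma-separated export. It is written in Python rather than Gawk and is not one of the scripts in the collection above. The column name from_user, the sample rows, and the function name tweets_per_user are assumptions made for illustration; check the header row of your own export before relying on them.

```python
import csv
import io
from collections import Counter


def tweets_per_user(handle, delimiter=",", user_column="from_user"):
    """Count tweets per sender in a delimited export with a header row.

    user_column is an assumed column name; adjust it to match your file.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    counts = Counter()
    for row in reader:
        # Normalise case so "Bob" and "bob" are counted as one account.
        user = (row.get(user_column) or "").strip().lower()
        if user:
            counts[user] += 1
    return counts


# Made-up sample data standing in for a real export file.
sample = io.StringIO(
    "text,from_user,created_at\n"
    '"Polls open, finally #qldvotes",alice,2011-06-20 09:00:00\n'
    "RT @alice: Polls open #qldvotes,Bob,2011-06-20 09:05:00\n"
    "Counting underway #qldvotes,alice,2011-06-20 18:30:00\n"
)

counts = tweets_per_user(sample)
for user, n in counts.most_common():
    print(user, n)
```

For a tab-separated export, pass delimiter="\t". Using a CSV parser rather than splitting on commas matters here because tweet text often contains commas and quotation marks.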
Textual Analysis
Leximancer – commercial, multiplatform textual analysis tool: extracts key concepts from large corpora of text, examines and visualises concept co-occurrence
WordStat – commercial, PC-only textual analysis tool, part of a larger text statistics package: similar to Leximancer but more powerful; generates concept co-occurrence data that can be exported in standard formats for subsequent visualisation
Visualisation
Gephi – open source, multiplatform network visualisation tool: wide range of visualisation options, extensible plugin system, exports maps as PDF or SVG
Wordle – simple word cloud visualisation tool
Seadragon – handy tool for embedding large-scale images on a Web page; handles images, PDFs, SVGs, even URLs for Web pages…