OK, so this is the second part of my post on turning Twitter data from Twapperkeeper into a dynamic network visualisation in Gephi. Last night’s post did the groundwork, generating a GEXF file from our #spill hashtag dataset, which covers Twitter discussion of an Australian Labor Party leadership spill between 7 p.m. and midnight (AEST) on 23 June 2010. In this post, we’ll work with this data file to generate a number of dynamic visualisations of the @reply activity (including old-style ‘RT @username’ retweets) during this time.
Essentially, here’s the overall network of the most active participants which we ended up with last night, now with each node’s degree value (number of @replies sent + number of @replies received, from within this most active group) next to its name. (If the positions of nodes have shifted slightly from where they were, that’s because I had to recalculate the map.) As noted at the end of part one, this overall map somewhat underestimates the weight of connections within the network, due to a limitation in how Gephi currently calculates its edge weight averages, but hopefully this will be fixed soon. What I’ve done in this new version of the map, though, is to highlight a number of interesting nodes in the network whom we’ll want to follow further:
I’ve highlighted the accounts of key politicians and journalists (no specific meaning implied in the colour choices, incidentally):
- then-Prime Minister @KevinRuddPM in red, and the spoofs @KevinRuddExPM and @KevinDuddPM in pink;
- deputy PM @juliagillard in dark red (a small dot down the bottom of the graph);
- a fake account for former Opposition Leader @malcolmturnbull in blue;
- then Fairfax reporter @LatikaMBourke in dark green;
- ABC reporter @annabelcrabb in light blue;
- The Punch writer Leo Shanahan (@_leo_s) in medium blue;
- ABC journalist Helen @Tzarimas in medium green;
- independent journalist @renailemay in yellowish green;
- News Ltd. journalist and The Punch editor David Penberthy (@penbo) in brown;
- ABC political editor Chris Uhlmann (@CUhlmann) in light green;
- Nine Network political editor @LaurieOakes in yellow;
as well as:
- the official @abcnews account in orange; and
- the account of ABC Managing Director @abcmarkscott in purple.
There are also a handful of other major nodes in the network, whom I haven’t coloured – part of my interest in this exploration is to examine the influence of known institutional actors (journalists and politicians), as opposed to prominent Twitter users who have no role that is relevant to reporting on the leadership spill. (A PDF of the map is here.)
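For anyone who wants to check the degree values shown next to each name outside Gephi, here’s a minimal sketch in Python. It assumes the @replies are available as simple (sender, recipient) pairs; the function name, the sample pairs and the `active` set are all made up for illustration and are not taken from the #spill dataset. It follows the definition used above: @replies sent plus @replies received, counted only within the most active group.

```python
from collections import Counter

def reply_degrees(replies, active):
    """Count @replies sent plus @replies received, restricted to the active group."""
    degree = Counter()
    for sender, recipient in replies:
        # Only count replies where both ends belong to the most active group.
        if sender in active and recipient in active:
            degree[sender] += 1
            degree[recipient] += 1
    return degree

# Hypothetical sample data, for illustration only.
active = {"alice", "bob", "carol"}
replies = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
    ("alice", "dave"),  # ignored: dave is not in the active group
]

for user, value in sorted(reply_degrees(replies, active).items()):
    print(user, value)
```

Note that this counts every individual @reply; whether Gephi’s own degree figure merges repeated replies between the same pair into a single edge depends on how the GEXF file was built, so treat the numbers as an approximation.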
Right now, though, we have a choice in Gephi between pretty and dynamic maps – so our further work will build on Gephi’s preview visualisations rather than its fully featured output. So, here’s what the map above looks like in Gephi’s preview:
Now, as I’ve outlined in a previous post, there are two major ways we can turn this map into a dynamic network graph. In the first approach (which is also the less processor-intensive one), I’ll simply leave the network structure as is, but use the Gephi timeline tool to zoom in on specific timeframes within the overall timeline.
Dynamic Visualisation 1:
Static Network, Dynamic Connections
That still leaves me with a choice of what length to select for the period on the timeline that we’ll focus on. Let’s call this the ‘timeline aperture’, for want of a better term. A longer aperture (say, one hour) combines more consecutive @replies into the same visualisation: any @replies which were active at any point during that hour will be shown as part of the network. Remember, too, that we’ve chosen to regard @replies as having a decay time of 1800 seconds, or half an hour. Our one-hour aperture window will therefore contain both those @replies which are about to expire at the very start of the chosen hour, and those which only occurred in the dying seconds of the aperture timeframe. Here’s what this network looks like in our case (you’ll want to select the HD version of this YouTube video, and display it on the full screen):
Alternatively, if I use a floating 15-minute (900-second) aperture period within the overall five hours, we’re able to see the ebb and flow of @reply activity in much finer detail. Our aperture time is now much smaller than the decay time which we’ve set for @replies, so in this network visualisation, in the main we’ll be seeing those @replies which were made before the start of the aperture period and will expire after it ends, with other @replies that were made or are due to expire during the period added into the mix. Again, choose HD and full screen as you view this:
In both these animations, I’ve begun by mapping the overall network for the entire period (of five hours, in this case); only once that was done – and Gephi’s ‘Force Atlas’ mapping algorithm had terminated – did I zoom in on specific timeframe apertures. Essentially, then, what we’ve visualised in these two animations is somewhat analogous to a roadmap, or to the synapses in the brain: we begin with a map of the overall connections, and then plot onto that map the greatest levels of activity that are currently taking place.
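To make the interplay between aperture and decay time a little more concrete, here’s a rough sketch of the logic in Python. This is not how Gephi implements its timeline; it’s simply my reading of the behaviour described above, with times given in seconds from the start of the dataset. The function names and category labels are hypothetical, and I’ve assumed that interval boundaries are inclusive.

```python
DECAY = 1800  # seconds an @reply stays active, as chosen for this dataset

def is_visible(sent_at, start, length, decay=DECAY):
    """True if the @reply is active at any point within the aperture."""
    end = start + length
    # Two intervals overlap if each one begins before the other one ends.
    return sent_at <= end and sent_at + decay >= start

def classify(sent_at, start, length, decay=DECAY):
    """Describe how an @reply's active period relates to the aperture."""
    end = start + length
    expires_at = sent_at + decay
    if not is_visible(sent_at, start, length, decay):
        return "hidden"
    if sent_at <= start and expires_at >= end:
        return "spans the aperture"
    if sent_at > start and expires_at < end:
        return "made and expires inside"  # only possible if aperture > decay
    if sent_at > start:
        return "made inside"
    return "expires inside"

# A 15-minute aperture beginning one hour into the dataset (hypothetical times).
for sent_at in (1000, 2000, 3000, 4000):
    print(sent_at, classify(sent_at, start=3600, length=900))

# The same @reply at 4000 seconds, seen through a one-hour aperture.
print(4000, classify(4000, start=3600, length=3600))
```

With the 15-minute aperture, most visible @replies fall into the ‘spans the aperture’ category, which is why that animation shows so much more ebb and flow than the one-hour version.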
Dynamic Visualisation 2:
Dynamic Network, Dynamic Connections
But that’s not the only way to approach our task. Another, albeit significantly more computing-intensive, approach is to keep the mapping algorithm running, so that the network structure itself is constantly adjusted to encompass only that subset of network connections that is currently visible through our timeline aperture. For a #hashtag network, whose participants may not be Twitter followers of one another already, this is probably the more appropriate visualisation approach, as it shows the #hashtag network wax and wane over time as more or fewer users @reply (and retweet) one another under the #hashtag umbrella.
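The waxing and waning itself can be sketched without any layout algorithm at all. The Python below is only an illustration under my own assumptions: @replies are (time sent, sender, recipient) tuples with times in seconds, the decay time is the 1800 seconds used throughout, and the function names and sample data are invented. It slides the aperture along the timeline and reports how many nodes and connections would be visible at each position; the actual repositioning of nodes is what Force Atlas does in Gephi, and is not reproduced here.

```python
DECAY = 1800  # seconds an @reply stays active

def window_subgraph(replies, start, length, decay=DECAY):
    """Return the nodes and edges visible through the aperture at this position."""
    edges = set()
    for sent_at, sender, recipient in replies:
        # Keep the reply if its active period overlaps the aperture.
        if sent_at <= start + length and sent_at + decay >= start:
            edges.add((sender, recipient))
    nodes = {user for edge in edges for user in edge}
    return nodes, edges

def sweep(replies, total, length, step, decay=DECAY):
    """Slide the aperture across the timeline, recording network size each time."""
    sizes = []
    for start in range(0, total - length + 1, step):
        nodes, edges = window_subgraph(replies, start, length, decay)
        sizes.append((start, len(nodes), len(edges)))
    return sizes

# Hypothetical sample data: two early replies, then a lull, then one late reply.
replies = [
    (0, "alice", "bob"),
    (600, "bob", "carol"),
    (5000, "carol", "alice"),
]

for start, n_nodes, n_edges in sweep(replies, total=7200, length=900, step=900):
    print(f"aperture at {start:>5}s: {n_nodes} nodes, {n_edges} edges")
```

In a dynamic layout, each of these subsets would be handed to the mapping algorithm in turn, so that nodes with no currently visible connections drift away from the core of the network.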
In visualising the network dynamically, we’re also starting to scrape against the boundaries of what’s technically possible with Gephi (at least using the processing power I have available right now). Gephi’s Force Atlas algorithm is pass-based, which means that after we move our timeline aperture slider, it can take some time until the next processing pass begins and the changed aperture parameters have any effect on the visualisation. Patience is required: you’ll see a number of such update lags in the animation below, and in fact I’ve sped up the entire animation by a factor of two to stay within YouTube’s 15-minute limit. Still, I think the result has been worth the effort:
And that, most definitely, is it for now. More in the new year, no doubt!