This is a follow-up to my post on our first, very tentative and incomplete, map of the Australian Twittersphere, taking a slightly more detailed look. First, though, and partly in response to some of the Twitter comments on that first post, here’s a further clarification of what you’re seeing.
In the first place, the total number of ‘Australian’ (by our criteria) Twitter users we’ve identified so far is about 550,000. Of these, at this point we have data on their follower/followee networks for about 450,000. If we exclude from this group all those users who have fewer than five followers, we’re left with roughly 150,000. So, if you see me use these numbers, that’s where they’re coming from.
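For anyone who wants to try this kind of reduction on their own data, here is a minimal sketch of the filtering step. It assumes the network is available as a list of (follower, followee) pairs and that follower counts are derived from that list rather than from Twitter profile data; the function name and the `min_followers` parameter are my own illustrative choices, not part of our actual pipeline.

```python
from collections import Counter

def filter_by_min_followers(edges, min_followers=5):
    """Keep users with at least `min_followers` followers, plus the edges between them.

    `edges` is an iterable of (follower, followee) pairs (an assumed format).
    This is a single pass: users are not re-checked after others are removed.
    """
    edges = list(edges)
    # Count how many distinct edges point at each user (their in-network followers).
    follower_counts = Counter(followee for _, followee in edges)
    kept = {user for user, n in follower_counts.items() if n >= min_followers}
    # Retain only connections where both ends survive the cut.
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges
```

Calling `filter_by_min_followers(edges, 5)` returns the surviving users and the connections among them, which is the kind of reduced network the smaller map is based on.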
The maps I’ve posted display these follower/followee networks based on affinity – those users who are closely interlinked through a range of connections with each other appear close to one another in the network, forming clusters which (we should assume) are determined by shared interests and other shared attributes. From a quick glance at the users involved in these clusters (or at least, at their usernames, which often indicate their interests or backgrounds), this seems to hold true, especially for the most connected users.
We also find a number of very highly connected users (accounts that a great number of Twitter users follow, well beyond specific interest clusters) who transcend such interests to some extent. This is true, for example, of accounts like those of Julia Gillard or Kevin Rudd, as well as ABC News.
Now, beyond manually identifying clusters and determining the specific drivers for their formation, our network visualisation software Gephi also provides its own functionality for identifying sections of the network which are especially closely interlinked – so, in addition to my own rough exploration of the network structure, I thought I’d also run Gephi’s partitioning algorithm over the data to see what it would come up with. Here are the results – first, for the full network of over 450,000 known users (click here for the full PNG – 8MB). The positioning of nodes in this and in the following map aligns with the corresponding maps in my previous post, incidentally:
In the first place, of course, it’s good to see that Gephi actually did identify a number of larger partitions in the network (it would be very disconcerting if it hadn’t). We may also note that they broadly align with the larger divisions we’ve identified manually: the turquoise top left quadrant largely matches the large political/media cluster; the red in the bottom left combines the various sporting codes; and the fuchsia-coloured nodes in the top right cover PR as well as the Adelaide and wine clusters (the latter of which could be understood as focussing on a specific form of PR, perhaps). On the bottom right things get more interesting, and we may distinguish a smaller food cluster (in yellow) from magazines, radio, and television in blue; the black offshoot towards the bottom is what we’ve previously identified as teen culture; and the greens in the far bottom right are our false positives from the Philippines. In between all this, we see a smattering of pink and orange, and a handful of purple nodes; we’ll need to do more work to see whether these are genuine clusters in their own right, or simply various categories of leftovers which don’t clearly belong to any one of the major groups. (It’s interesting to note in this respect that there are a number of very large purple nodes; perhaps some of them form a distinct category of widely followed nodes that cut across clusters?)
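One way to put a number on whether a proposed grouping reflects a genuine cluster structure is a modularity score, which compares the share of connections that fall inside each group with the share you would expect if connections were placed at random. As far as I’m aware, Gephi’s built-in community detection is based on optimising this kind of measure, but treat that as my assumption rather than a description of exactly what we ran. The sketch below only scores a given partition; it does not search for one, and it simplifies the follower network to an undirected, unweighted graph.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    `edges` is an iterable of (u, v) pairs; `community_of` maps node -> label.
    Q = sum over communities of (internal_edges / m) - (degree_sum / 2m)^2.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both ends in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    degree_sum = defaultdict(int)  # total degree of each community
    for node, d in degree.items():
        degree_sum[community_of[node]] += d
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

A score near zero suggests the grouping is no better than chance, while clearly positive values suggest the groups are more tightly interlinked internally than a random network would be.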
At this point, Gephi doesn’t provide us with any additional tools for setting the sensitivity with which it identifies clusters, by the way – so this is all we’re able to get for the moment. However, we are able to repeat the process for our reduced map of the best-connected 150,000 nodes in the network – here’s the result (click here for full PNG – 10MB):
In this graph, some further sub-partitions are becoming apparent: while politics and sports remain roughly as they were, PR has now splintered, and Adelaide, wine, and food have become their own cluster in the top right (in orange). Additionally, a particularly Perth-focussed group of PR-related accounts forms a satellite cluster at the centre top of the main graph (in ultramarine). By contrast, the bottom right quadrant has formed new allegiances: fashion and fashion magazines have formed their own cluster in the centre right (in green), while the remnants of radio, TV, and teen culture now form one loose association (in yellow). Finally, the Filipino false positives are now located at the very top, and have been reduced to a small green speck.
What’s particularly striking about this image, though, is the thoroughly intermixed centre of the graph, where the four major communities (red, green, purple, and yellow) meet and interconnect. Here, cluster boundaries break down, and we should probably pay particular attention to what’s happening in this area: the users who connect with one another here (as well as along the various bilateral intersections between individual clusters) are the boundary riders who, we might assume, play a particularly important role in brokering the flow of information from one community to the next, ensuring that individual interest groups are never entirely disconnected from one another. To stretch the metaphor: we might infer that they’re what keeps the Australian Twittersphere a federation, rather than simply an assemblage of individual states.
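A rough way to start looking for these boundary riders is to ask, for each user, what share of their connections lead into a different cluster from their own. The sketch below does just that; it treats connections as undirected and assumes a cluster label is already available for every user, and the function name is purely illustrative rather than anything we have actually used.

```python
from collections import defaultdict

def cross_cluster_share(edges, cluster_of):
    """For each user, the fraction of their connections that cross into another cluster.

    `edges` is an iterable of (u, v) pairs; `cluster_of` maps user -> cluster label.
    """
    total = defaultdict(int)
    cross = defaultdict(int)
    for u, v in edges:
        is_cross = cluster_of[u] != cluster_of[v]
        # Each connection counts once for both of its endpoints.
        for node in (u, v):
            total[node] += 1
            if is_cross:
                cross[node] += 1
    return {node: cross[node] / total[node] for node in total}
```

Users with a high share, and enough connections for that share to mean something, would be the first candidates to examine more closely as brokers between communities.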
Again, I stress that these maps are entirely preliminary; we do not claim that they show the complete and final shape of the Australian Twitterverse. They do already provide a tantalising glimpse of what’s to come, though – some time further down the track.