Visualising topic-based conversation networks: the #masterchef edition

In future analysis we’ll be interested in doing some form of comparison between the #ausvotes data we’ve been looking at (and that Axel has already blogged about earlier this week), and other topics of shared interest among Australian Twitter users. As an exceptionally high-rating Australian prime-time TV show that was also a trending topic on Twitter, Masterchef is a particularly interesting example of such a topic drawn from popular culture. The patterns of Twitter use around this highly popular, nationally-based show (perhaps even more so than around the pre-election debate) can hopefully help us to understand something about the practices of the networked television audience as a public.

Working with the complete Twapperkeeper archive of tweets that contain the #masterchef hashtag from 09 July (when I set it up) until the morning after the nailbiting final episode (26 July), I plan to run through the processes we’ve been using for analysing Twitter data related to the Australian election in a series of posts over the next few days. Hopefully this will be an interesting counterpoint to the uses of social media around the election (and a bit of a break from it!). Much of this is still experimental – I’m in the process of exploring various tools and techniques and thinking through how they might be useful in a more purposeful inquiry.

One thing we might be interested in is the patterns of conversation and attention among the livetweeting Masterchef audience. To be more precise, if we take @replies (rather than number of followers) as an indicator of attention, who was being referred to or “spoken to” most frequently as the series went along? And did they talk back? To get a sense of this, we can map the @replies as a directed social network graph and visualise the results using Gephi.

I ran a script over the complete Twapperkeeper archive of 60,138 tweets to extract each @reply instance and output it as a basic edge list in two columns: one column contains the username of the tweet author; the other contains the username of anyone they @replied in their tweets. The resulting .csv can then be directly imported into Gephi for visualisation. By the way, I’ve stripped the retweets out of the file first – while retweeting practices are interesting in themselves, they also result in a massive duplication of usernames and keywords, which tends to skew our results, making it hard to get a meaningful sense of the networks of conversation Masterchef viewers have been engaged in.
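For anyone wanting to try something similar, here is a minimal sketch of what such an extraction step might look like. It is not the script I actually ran: the input format (a list of author/text pairs), the function names, the retweet pattern (old-style "RT @user") and the Source/Target header row are all assumptions made for illustration, and the header in particular may need adjusting to whatever your version of Gephi expects.

```python
import csv
import io
import re

# Assumption: a mention is "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")
# Assumption: retweets are old-style "RT @user" tweets.
RETWEET = re.compile(r"(^|\s)RT\s+@", re.IGNORECASE)


def extract_edges(tweets):
    """Return (author, mentioned_user) pairs, skipping retweets.

    `tweets` is a hypothetical list of (author, text) tuples.
    Usernames are lowercased so that @Bob and @bob become one node.
    """
    edges = []
    for author, text in tweets:
        if RETWEET.search(text):
            continue  # drop retweets entirely
        for target in MENTION.findall(text):
            edges.append((author.lower(), target.lower()))
    return edges


def edges_to_csv(edges):
    """Serialise the edge list as two-column CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target"])  # assumed header names
    writer.writerows(edges)
    return buf.getvalue()
```

Note that this treats every @mention in a non-retweet as an edge, which is one reading of "anyone they @replied in their tweets"; restricting it to tweets that begin with an @username would give a stricter definition of a reply.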

After getting the visualisation underway, I’m immediately struck by the fact that there are a huge number of users who @replied others but did not receive any replies themselves, and an enormous number of interconnected mini-conversations going on. This is interesting to see, and deserves a closer look. But at the level of trying to understand the dominant patterns of attention and conversation, it also makes the visualisation too messy to look at, however pretty it looks (and this is just a tiny section of the thing):

So I then filter the list to display only those users who received at least 3 replies:

So, here it is (try clicking ‘fullscreen’ in the seadragon viewer – you can zoom right in).

The nodes and their labels are scaled according to in-degree (the larger the node, the more incoming @replies that user has had). They are coloured according to out-degree: the reddest nodes represent users who send @replies the most, through yellow, and down to white for zero.
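Gephi does all of this through its filter and ranking panels, but the same logic can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the function names are hypothetical, in-degree here counts every @reply (including repeats from the same sender), and the white-to-yellow-to-red colour ramp is a simple linear interpolation that may not match Gephi's own gradient exactly.

```python
from collections import Counter


def degrees(edges):
    """Count incoming and outgoing @replies per user."""
    in_deg = Counter(target for _, target in edges)
    out_deg = Counter(source for source, _ in edges)
    return in_deg, out_deg


def filter_by_in_degree(edges, min_replies=3):
    """Keep only edges between users who received >= min_replies."""
    in_deg, _ = degrees(edges)
    keep = {user for user, n in in_deg.items() if n >= min_replies}
    return [(s, t) for s, t in edges if s in keep and t in keep]


def out_degree_colour(out_degree, max_out):
    """Map out-degree to an (R, G, B) tuple: white -> yellow -> red."""
    if max_out == 0:
        return (255, 255, 255)
    t = out_degree / max_out
    if t <= 0.5:
        # First half of the ramp: fade blue out (white to yellow).
        return (255, 255, round(255 * (1 - 2 * t)))
    # Second half: fade green out (yellow to red).
    return (255, round(255 * (2 - 2 * t)), 0)
```

Node size would then simply be proportional to each user's entry in the in-degree counter.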

Simply by having a look at the image itself, we can start to make some basic observations. I’ll just take three nodes for now. Perhaps unsurprisingly, Masterchef judge and celebrity food critic Matt Preston’s account @MattsCravat receives lots of mentions (so has a high in-degree ranking and therefore a large node size), but at least in this version of the graph, his reciprocity is zero. This matches up nicely with his follow-to-follower ratio: he follows only 67 people, but as of today he still has 13,941 followers. Then again, it may indicate that he never bothers to use the #masterchef hashtag when replying to followers, since it’s kind of obvious that’s what he’s tweeting about. That would make sense – and it points to a limitation of relying on hashtag archives.
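To make "reciprocity" a little more concrete, here is one possible way of scoring it for a single user from the edge list. The definition is my own assumption for illustration (the share of people who @replied a user and got an @reply back from that user), and the function name is hypothetical; other definitions are equally defensible.

```python
def reciprocity(edges, user):
    """Share of a user's repliers who were replied to in return.

    `edges` is a list of (source, target) pairs. Returns 0.0 when
    nobody has @replied the user.
    """
    # Everyone who @replied this user (ignoring self-mentions).
    senders = {s for s, t in edges if t == user and s != user}
    # Everyone this user @replied.
    replied_to = {t for s, t in edges if s == user}
    if not senders:
        return 0.0
    return len(senders & replied_to) / len(senders)
```

On this measure, an account that receives many @replies but never sends any within the hashtag archive scores zero, whatever it may be doing outside the hashtag.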

By contrast, @Masterchef_PR appears as a small but intensely red dot somewhat off to the side of the main action – this indicates they were very active at @replying other Twitter users, but the love was not always returned. [But at least they aren’t as far out in the cold as the ‘official’ Masterchef account – see the lower left of the map].

A third and more interesting example: @MolksTVTalk, with only 436 followers, appears to be quite central to the conversation, and scores fairly highly on both in-degree and out-degree – this, I think, is what the node of a highly conversational member of the Twitterati looks like: engaging with topics of common interest, using hashtags to do it, livetweeting the telly, and sharing jokes and running banter with fellow audience members.

If we were to do an in-depth analysis of the contents of these conversations, we could start by going back to a version of the archive that contained both the edge list and the contents of the tweets, and start to do some much more qualitative analysis of how these clusters of @replies work differently across different parts of the network. But I’ll leave that for later…

Published by Jean Burgess