This post was originally featured on http://blog.sociomantic.com, published on May 6th, 2010. Since the website will be relaunched and the post removed, I have relocated the tutorial to my personal page so that the Gephi community can continue to benefit from it.
If a picture is worth a thousand words, then a graph must be worth a thousand spreadsheet rows, right?
Okay, maybe not, but for practitioners and researchers alike, data visualization can reveal insights that aren’t always obvious from looking at the raw data, no matter how well organized it may be. When we’re talking about social networks, data visualization takes the form of a “social graph,” and it can be a powerful tool for discovering deeper meanings and applications behind the relationships and communities within a network.
Here you can see some social graphs of the French political blogosphere created by researcher Tim Highfield using an open-source network visualization tool called Gephi. After exploring Tim’s amazing Flickr full of graphs and reading @kristtina’s recent introduction to Gephi, I wanted to try out some of these social graph visualizations myself.
If you’re interested in something with less of a learning curve, there are lots of easy-to-use, mostly Flash-based visualization apps for Facebook and Twitter. These are the ones I’m aware of:
The great thing about these apps is that they do most of the work for you, and a lot of them look pretty cool. The problem is that they don’t give you much room to explore. If you’re hoping to analyze your Facebook network in a little more depth (to discover community clusters and explore network science parameters like degree, betweenness, closeness, etc.), I’d recommend using Netvizz and Gephi. A colleague told me about Netvizz some time ago. It’s a Facebook app that lets you generate a .gdf file from your Facebook friends or the groups you’re in (.gdf is the file type read by programs like GUESS and Gephi).
Two quick notes about Netvizz:
1) Right now it can only analyze the friends of your Facebook “profile” (for individuals) and the members of groups you’re in. Hopefully it will soon be able to provide .gdfs for “Page” fans as well, so brands and companies can do Facebook social graph analysis using Gephi, too.
2) The .gdf files for the Facebook groups are limited to 500 randomly selected nodes, no matter the size of the group. (Theoretically you could generate the random list .gdf enough times to discover all the nodes in the group and combine them into one all-encompassing file if you were looking to do some serious network crunching.)
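The merging idea in note 2 can be sketched in a few lines of Python. This is only a rough sketch under some assumptions of mine: each sample is a .gdf text with one `nodedef>` section followed by one `edgedef>` section, the first node column is a unique id, the first two edge columns are the endpoints, and no field contains an embedded comma. The column names in the sample data are illustrative (not necessarily what Netvizz writes), and `parse_gdf` and `merge_gdf` are hypothetical helper names.

```python
def parse_gdf(text):
    """Split GDF text into (node_header, nodes, edge_header, edges).

    Assumption: one 'nodedef>' section, then one 'edgedef>' section,
    and no commas inside field values.
    """
    node_header = edge_header = None
    nodes, edges = {}, set()
    section = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("nodedef>"):
            section, node_header = "nodes", line
        elif line.startswith("edgedef>"):
            section, edge_header = "edges", line
        elif section == "nodes":
            row = line.split(",")
            nodes[row[0]] = row  # keyed by node id, so repeats collapse
        elif section == "edges":
            row = line.split(",")
            # Friendships are undirected: store endpoints in sorted order
            edges.add(tuple(sorted(row[:2])))
    return node_header, nodes, edge_header, edges


def merge_gdf(texts):
    """Union the nodes and edges of several GDF samples into one GDF text."""
    node_header = edge_header = None
    all_nodes, all_edges = {}, set()
    for text in texts:
        nh, nodes, eh, edges = parse_gdf(text)
        node_header = node_header or nh
        edge_header = edge_header or eh
        all_nodes.update(nodes)
        all_edges.update(edges)
    lines = [node_header]
    lines += [",".join(all_nodes[key]) for key in sorted(all_nodes)]
    lines.append(edge_header)
    lines += [",".join(edge) for edge in sorted(all_edges)]
    return "\n".join(lines) + "\n"


# Two made-up samples that overlap on node 2
sample_a = (
    "nodedef>name VARCHAR,label VARCHAR\n1,Ann\n2,Bob\n"
    "edgedef>node1 VARCHAR,node2 VARCHAR\n1,2\n"
)
sample_b = (
    "nodedef>name VARCHAR,label VARCHAR\n2,Bob\n3,Cy\n"
    "edgedef>node1 VARCHAR,node2 VARCHAR\n3,2\n"
)
merged = merge_gdf([sample_a, sample_b])
print(merged)
```

One caveat with this approach: an edge between two members only shows up if both of them landed in the same random sample, so the merged file can end up with every node but still be missing some connections.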
Here are some of the networks I analyzed using the .gdf from Netvizz in Gephi:
Here’s a quick key to understanding these graphs:
- Circle = Node = Facebook friend or group member
- Line = Edge = Facebook connection (friendship)
- Node size = Betweenness centrality (a measure of how often a node lies on the shortest paths between other nodes, and so how much it connects otherwise disconnected communities)
- Node color = randomly chosen colors used to represent the communities/clusters, determined here based on their modularity class via the Louvain method
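To make the "node size" entry in the key a bit more concrete, here is a small pure-Python sketch of betweenness centrality on a toy graph: two tight triangles of friends joined by a single bridging node. It uses the standard breadth-first approach usually credited to Brandes, for an undirected, unweighted graph, and returns raw (unnormalized) counts. The graph, the node names, and the `betweenness` function are made up for illustration, and the numbers a tool like Gephi reports may be scaled differently.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness for an undirected, unweighted graph.

    adj maps each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v is on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out path credit.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {v: total / 2 for v, total in score.items()}


# Toy network: triangle a-b-c and triangle d-e-f, bridged by x
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("c", "x"), ("x", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

scores = betweenness(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, scores[node])
```

In this toy network the bridging node scores highest because every path between the two triangles has to pass through it, which is the kind of node that ends up drawn largest in the graphs above.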
Taking a Closer Look at Using Gephi
I think the most interesting network I analyzed was the one pictured at the top of this post. You can easily see the different communities my friend is connected to, and it’s interesting to see which nodes have the most influence across multiple groups.
Since I took screenshots along the way, I made this slideshow to explain the steps I took to reach the final visualization.
Since I’m still learning, I initially followed the Gephi Quick Start guide. It comes with a sample file you can use to try out this process if you don’t want to use your Netvizz .gdf.
From an industry standpoint, studying social graphs like these over time can enable companies and brands to understand things such as:
- Which individuals are connecting disparate communities within their customer base. (If this Facebook network were my customer base, I’d definitely want to make sure I was reaching out to our managing director Thomas Nicolai, who has many connections to multiple communities within the greater network.)
- Which individuals are gaining influence within particular clusters. (Tracking parameters like reputation and bandwidth over time can show, for example, that someone who starts small is becoming more influential.)
I hope you found this tutorial helpful! Please feel free to share the link to help others learn.