Sarj

Archive for May, 2010|Monthly archive page

Sea Lion Woman

In Life in Berlin on May 8, 2010 at 7:12 pm

I had no idea that the original song was actually called “Sea Lion Woman,” or that it was an old-time American folk song. In any case, I’m pretty sure it’s one of the best songs of all time, which must be why there are so many terrible remixes and renditions. But Nina Simone does it so right, perhaps even better in her old age. What a gal.

Pretty sure the last time I wrote in this thing I was sitting in almost the exact same window seat at Cafe Cuccuma, thinking about many of the same things, watching row upon row of fashionable Berliners walk and bike by with their amazing shoes and adorable, mangy dogs. A lot has happened in the two months since my last update, which is precisely why it’s taken so long to get back to this thing.

[PAUSE: How the f*&^ is everyone in this city so incredibly gorgeous? Seriously, WTF??]

To be honest, I don’t even know where to begin. Thanks to my (wonderful) job and the always-eventful weekend schedules, my weeks are absolutely flying by. My weekdays are filled with techy tweeting and blogging, project management, and brainstorming galore. I am so proud of the guys that I work with: next-level intelligence and incredible kindness. I feel exceptionally lucky to have found myself a part of such a wonderful team. Every day is a healthy challenge, and the work is always changing, so every day feels fresh. Love it.


Using Netvizz & Gephi to Analyze a Facebook Network

In Technology on May 6, 2010 at 3:00 am

This post was originally featured on http://blog.sociomantic.com, published on May 6th, 2010. Since the website will be relaunched and the post removed, I have relocated the tutorial to my personal page so that the Gephi community can continue to benefit from it.

If a picture is worth a thousand words, then a graph must be worth a thousand spreadsheet rows, right?

A Facebook network rendered in Gephi

Okay, maybe not, but for practitioners and researchers alike, data visualization can reveal insights that aren’t always obvious from looking at the raw data, no matter how well organized it may be. When we’re talking about social networks, data visualization takes the form of a “social graph,” and it can be a powerful tool to discover deeper meanings and applications behind the relationships and communities within a network.

Here you can see some social graphs of the French political blogosphere created by researcher Tim Highfield using Gephi, an open-source network visualization tool. After exploring Tim’s amazing Flickr full of graphs and reading @kristtina’s recent introduction to Gephi, I wanted to try out some of these social graph visualizations myself.

The Alternatives

If you’re interested in something with less of a learning curve, there are lots of easy-to-use, mostly Flash-based visualization apps for Facebook and Twitter. These are the ones I’m aware of:

Facebook:

Twitter:

The great thing about these apps is that they do most of the work for you, and a lot of them look pretty cool. The problem is that they don’t give you much room to explore. If you’re hoping to analyze your Facebook network in a little more depth (to discover community clusters and explore network science parameters like degree, betweenness, closeness, etc.), I’d recommend using Netvizz and Gephi. A colleague told me about Netvizz some time ago. It’s a Facebook app that lets you make a .gdf file out of your Facebook friends or the groups you’re in (.gdf is the file type read by programs like GUESS and Gephi).
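If you're curious what is inside a .gdf before loading it into Gephi, here is a minimal Python sketch that reads one and counts each node's degree. The sample text and its two columns are made up for illustration; a real Netvizz export will likely carry more columns, and `parse_gdf` and `degree` are hypothetical helper names, not part of Netvizz or Gephi.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in GDF layout: a node section, then an edge section.
SAMPLE_GDF = """nodedef>name VARCHAR,label VARCHAR
1,Alice
2,Bob
3,Carol
edgedef>node1 VARCHAR,node2 VARCHAR
1,2
2,3
"""


def parse_gdf(text):
    """Return (nodes, edges): nodes maps id -> label, edges is a list of id pairs."""
    nodes, edges = {}, []
    section = None
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if row[0].startswith("nodedef>"):
            section = "nodes"
        elif row[0].startswith("edgedef>"):
            section = "edges"
        elif section == "nodes":
            # First column is the id; use the second as a label if present.
            nodes[row[0]] = row[1] if len(row) > 1 else row[0]
        elif section == "edges":
            edges.append((row[0], row[1]))
    return nodes, edges


def degree(edges):
    """Count how many edges touch each node id."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


nodes, edges = parse_gdf(SAMPLE_GDF)
deg = degree(edges)
for node_id, label in nodes.items():
    print(label, deg[node_id])
```

Degree is the simplest of the parameters mentioned above: it is just the number of friends a node has inside the network.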

Two quick notes about Netvizz:

1) Right now it can only analyze the friends of your Facebook “profile” (for individuals) and the members of groups you’re in. Hopefully soon it will be able to provide .gdfs for “Page” fans as well so brands and companies can do Facebook social graph analysis using Gephi, too.

2) The .gdf files for the Facebook groups are limited to 500 randomly selected nodes, no matter the size of the group. (Theoretically you could generate the random list .gdf enough times to discover all the nodes in the group and combine them into one all-encompassing file if you were looking to do some serious network crunching.)
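The combining idea in the second note could look roughly like this in Python. It is only a sketch: it assumes each sample has already been read into a dict of node ids to labels plus a list of edge pairs, and `merge_samples` and `to_gdf` are hypothetical names. One caveat worth flagging: if each export only lists edges among its own sampled nodes (an assumption on my part), a friendship between two people who never land in the same sample will not show up in the merged file.

```python
import csv
import io


def merge_samples(samples):
    """Union several (nodes, edges) samples taken from the same group.

    nodes: dict of id -> label; edges: iterable of (id, id) pairs.
    Friendships are undirected, so (a, b) and (b, a) count as one edge.
    """
    merged_nodes = {}
    merged_edges = set()
    for nodes, edges in samples:
        merged_nodes.update(nodes)
        for a, b in edges:
            if a != b:  # ignore self-loops
                merged_edges.add(tuple(sorted((a, b))))
    return merged_nodes, sorted(merged_edges)


def to_gdf(nodes, edges):
    """Write the merged network back out as GDF text with two node columns."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["nodedef>name VARCHAR", "label VARCHAR"])
    for node_id, label in sorted(nodes.items()):
        writer.writerow([node_id, label])
    writer.writerow(["edgedef>node1 VARCHAR", "node2 VARCHAR"])
    for a, b in edges:
        writer.writerow([a, b])
    return out.getvalue()


# Two made-up samples that overlap on node "2".
sample_a = ({"1": "Alice", "2": "Bob"}, [("1", "2")])
sample_b = ({"2": "Bob", "3": "Carol"}, [("3", "2"), ("2", "1")])

merged_nodes, merged_edges = merge_samples([sample_a, sample_b])
print(to_gdf(merged_nodes, merged_edges))
```

Sorting each pair before adding it to the set is what keeps the same friendship from being counted twice when it appears in more than one sample.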

Here are some of the networks I analyzed using the .gdf from Netvizz in Gephi:

Analyzing a friend's Facebook network -- You can see distinct community clusters

Facebook Group "Graphs & Social Networks": A highly connected network!

Here’s a quick key to understanding these graphs:

  • Circle = Node = Facebook friend or group member
  • Line = Edge = Facebook connection (friendship)
  • Node size = Betweenness centrality (a measure of how often a node sits on the shortest paths between other nodes, and so how much it bridges otherwise disconnected communities)
  • Node color = randomly chosen colors used to represent the communities/clusters, determined here based on their modularity class via the Louvain method
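To make the node-size rule in the key less abstract, here is a small Python sketch of betweenness centrality on a made-up network: two friend triangles joined through a single bridge person, "D". It uses Brandes' algorithm for unweighted, undirected graphs. This is my own illustration, not Gephi's code, and Gephi's numbers may differ if it normalizes the scores. Community colors (modularity classes) are not reproduced here.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm; adj maps each node to the set of its neighbours."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was counted from both ends, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical network: triangle A-B-C, triangle E-F-G, bridged by D.
edge_list = [("A", "B"), ("A", "C"), ("B", "C"),
             ("C", "D"), ("D", "E"),
             ("E", "F"), ("E", "G"), ("F", "G")]
adj = {}
for a, b in edge_list:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

bc = betweenness(adj)
for node in sorted(bc, key=bc.get, reverse=True):
    print(node, bc[node])
```

In this toy network "D" scores highest because every shortest path between the two triangles runs through it, which is exactly the kind of node that shows up as a big circle in the graphs above.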

Taking a Closer Look at Using Gephi

I think the most interesting network I analyzed was the one pictured at the top of this post. You can easily see the different communities my friend is connected to, and it’s interesting to see which nodes have the most influence across multiple groups.

Since I took screen shots along the way, I made this slideshow to explain the steps I took to reach the final visualization.

Since I’m still learning, I initially followed the Gephi Quick Start guide. It comes with a file you can use to try out this process if you don’t want to use your Netvizz .gdf.

From an industry standpoint, studying social graphs like these over time can enable companies and brands to understand things such as:

  • Which individuals are connecting disparate communities within their customer base. (If this Facebook network were my customer base, I’d definitely want to make sure I was reaching out to our managing director Thomas Nicolai, who has many connections to multiple communities within the greater network.)
  • Which individuals are gaining influence within particular clusters over time, using methods that estimate parameters like reputation and bandwidth. (For example, someone who starts small might become more influential over time.)

I hope you found this tutorial helpful! Please feel free to share the link to help others learn 🙂