Visualizing the ‘Power Struggle’ in Wikipedia
May 20th, 2007
A new visualization Bruce Herr and I recently completed is being featured in this week’s New Scientist Magazine (the article is free online, minus the viz). They did a good job jazzing up the language used to describe the viz–’power struggle’, ‘bubbling mass’, ‘blitzed articles’–but they also dumbed down the technical accomplishments. I guess not everyone gets as excited about algorithms as I do.
Before I talk any more about the viz, though, let me mention that it’s appearing at the NetSci 2007 Conference this week, and hopefully a variant will appear at Wikimania later this summer as well. The viz is a huge 5 feet by 5 feet when printed, so I only include a smaller, low-res version here. At some point, high-quality art prints of it will go on sale at SciMaps to fund further visualization research.
Now for the good stuff. Much like my visualization of the Netflix Prize competition data, we began this piece by representing the data as a network. In this case the nodes in the network are Wikipedia articles and the edges are the links between articles. We then (with some help from our friends at Sandia) used an algorithm to lay out all 650,000 nodes (Wikipedia articles) that had at least one link, in such a way that similar articles are near one another. These are the yellow dots, which when viewed at low res give a yellow tint to the whole picture.
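For the curious, here is a minimal sketch of that first step: turning a list of article-to-article links into a network and keeping only the articles that have at least one link. The function names and the toy data are made up for illustration, and the layout algorithm itself is not shown.

```python
from collections import defaultdict


def build_link_graph(links):
    """Build an undirected adjacency map from (source, target) article pairs."""
    neighbors = defaultdict(set)
    for source, target in links:
        if source == target:
            continue  # a page linking to itself says nothing about similarity
        neighbors[source].add(target)
        neighbors[target].add(source)
    return dict(neighbors)


def linked_articles(all_articles, graph):
    """Keep only the articles that take part in at least one link."""
    return [title for title in all_articles if graph.get(title)]


# Toy data (hypothetical): four articles, two links between them.
articles = ["Jesus", "Bible", "Cheese", "Orphan page"]
links = [("Jesus", "Bible"), ("Bible", "Cheese")]

graph = build_link_graph(links)
nodes = linked_articles(articles, graph)
```

In this toy example "Orphan page" drops out because nothing links to or from it, which mirrors the "at least one link" filter described above.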
The sizes of the nodes (circles, dots, whatever you want to call them) are based on a model of revision activity. So large circles indicate that an article might be controversial, or the subject of lots of vandalism, or just a topic whose content frequently changes. We labeled only the largest nodes, to keep it readable. There is an interactive version of this in the works, based on the Google Maps platform, which will change the labels and pictures used as the user ‘zooms’ in or out. Stay tuned for that.
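To make the sizing and labeling idea concrete, here is a rough sketch. It is not our actual revision-activity model. It assumes each article already has some non-negative activity score, uses a square-root scaling (my assumption here, so that circle area grows in proportion to activity), and labels only the top few articles. All names and numbers are made up for illustration.

```python
import math


def node_radius(activity, min_radius=1.0, scale=0.5):
    """Map a non-negative activity score to a circle radius.

    The square root makes the circle's area, not its radius,
    grow roughly in proportion to the activity score.
    """
    return min_radius + scale * math.sqrt(max(activity, 0.0))


def nodes_to_label(activity_by_article, top_n=20):
    """Return the titles of the top_n most active articles, most active first."""
    ranked = sorted(activity_by_article.items(), key=lambda kv: kv[1], reverse=True)
    return [title for title, _ in ranked[:top_n]]


# Toy activity scores (hypothetical values, not real Wikipedia data).
activity = {"Jesus": 400.0, "Cheese": 16.0, "Orphan page": 0.0}

radii = {title: node_radius(score) for title, score in activity.items()}
labels = nodes_to_label(activity, top_n=2)
```

Labeling only the top few nodes is what keeps a print with hundreds of thousands of dots readable.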
The image used for each tile was selected automatically, simply by using the first image in the most-linked-to article among all the articles in that tile. We were pleasantly surprised by the quality of the images that appeared.
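That selection rule is simple enough to sketch in a few lines. This version assumes "most linked to" means the highest count of inbound links, and that a tile whose winning article has no images is simply left without a picture. The field names and the toy data are hypothetical.

```python
def pick_tile_images(articles):
    """Choose one image per tile.

    articles: iterable of dicts with keys 'title', 'tile', 'inlinks', 'images'.
    Returns a dict mapping each tile to the first image of its most-linked-to
    article, or None if that article has no images.
    """
    best = {}
    for article in articles:
        current = best.get(article["tile"])
        if current is None or article["inlinks"] > current["inlinks"]:
            best[article["tile"]] = article
    return {
        tile: (article["images"][0] if article["images"] else None)
        for tile, article in best.items()
    }


# Toy data (hypothetical link counts and file names).
articles = [
    {"title": "Jesus", "tile": (0, 0), "inlinks": 900, "images": ["jesus.jpg", "cross.jpg"]},
    {"title": "Cheese", "tile": (0, 0), "inlinks": 300, "images": ["cheese.jpg"]},
    {"title": "Japan", "tile": (0, 1), "inlinks": 1200, "images": []},
]

tile_images = pick_tile_images(articles)
```

Ties are broken by whichever article comes first in the input, which is an arbitrary choice in this sketch.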
Our hope for this visualization approach, which we continue to improve on, is that it could be updated in real time to give a macro sense of what is happening in Wikipedia. I personally hope that some variation of it will end up in high schools as a teaching tool and for generating discussions.
Top 20 Most Hotly Revised Articles
- Jesus
- Adolf Hitler
- October 2003
- Nintendo Revolution
- Hurricane Katrina
- India
- RuneScape
- Anarchism
- Britney Spears
- PlayStation 3
- Saddam Hussein
- Japan
- Albert Einstein
- 2004 Indian Ocean Earthquake
- New York City
- Germany
- Muhammad
- Pope Benedict XVI
- Ronald Reagan
- Hinduism
May 22nd, 2007 at 3:23 pm
Great work! Do you plan to work on a specific topic like medicine?
May 23rd, 2007 at 5:22 am
Awesome! How did you find how often they were edited?
May 23rd, 2007 at 8:32 am
That’s awesome. I think it’s fascinating to be able to make such stunning visualizations as this. I also find it kind of cool that, in the scheme of things, I played a part, since I have contributed to Wikipedia. My part is small, but it’s there, and that’s kind of cool.
May 23rd, 2007 at 8:33 am
Very interesting stuff… I’d be curious to see something like this done for other large sites, like Digg or MySpace…
May 27th, 2007 at 5:18 pm
Really cool!
I didn’t expect October 2003 being on the list of “Hotly Revised Articles”.
May 29th, 2007 at 9:16 pm
You might want to update your algorithm to weigh recent changes a bit more heavily; and to discount bot edits if you don’t already. Through 2004 there were some bots that weren’t flagged as such.
October 2003 was the last month for which the [[Current Events]] page kept on rolling over from month to month; it was finally moved to [[October 2003]] and a new naming system started. So all edits to the current events page from 2001 to October 2003 get counted in that total.
“Muhammid” (in your last sidebar above) is I hope a misspelling!
Lovely work. I want to know how I can get a large-scale version printed out for the Boston Wikipedia group…
SJ
June 4th, 2007 at 4:54 pm
Lovely image; it would be great to see more of it in detail.
I’m wondering about the licence for this image. As most images on Wikipedia are released under a sharealike licence, I guess this counts as a derived work, so GFDL would seem appropriate.
Rich (Salix alba on wikipedia)
July 10th, 2007 at 5:18 am
It’d be great if it could create this on-the-fly for individual topics that you searched Wiki for, eg: showing everything that featured within ‘Technology’.
July 21st, 2007 at 2:22 pm
The above-mentioned zoomable version exists now:
http://scimaps.org/maps/wikipedia/
I found it amusing how much anime was represented, e.g. Ghost in the Shell (with a picture of Motoko) is diagonal from Albert Einstein, and Cowboy Bebop and the Big O are nearby.
August 14th, 2007 at 7:32 pm
I like how cheese is right next to Jesus, the Bible, and a bunch of other religious oriented stuff
August 15th, 2007 at 1:02 am
The juxtaposition of the first two items on the list somehow gives me a truly eerie feeling…
August 28th, 2007 at 4:46 pm
Any update on when/whether the “high quality art print” will be available?
November 6th, 2007 at 5:12 pm
There are several versions of the Wikipedia visualization available for purchase at http://scimaps.org/ordermaps/
Also, a Google Maps version of the Places and Spaces version of the Wikipedia visualization is available here: http://www.scimaps.org/maps/wikipedia-ps3/ It shows the math, science, and technology related articles contained in Wikipedia.
Also, we are working on an updated version with newer data and a paper describing our work.
December 29th, 2007 at 10:32 am
What would I need to learn and know if I am trying to duplicate this process on Chinese Wikipedia? It is very interesting and I would love to see the results when this is applied to other versions. Is it possible for me to do this as part of my non-commercial PhD work in Oxford?
March 16th, 2008 at 10:24 am
Hi,
I think this is an incredible project.
I am trying to build an interactive structure along the same lines, but much different.
I am an artist by trade, and love programming.
Can you inform me of current work?
Also, I am looking for someone to collaborate with.
Anyone interested?
Great work, keep it going!
May 6th, 2008 at 6:28 pm
Hi there,
I’m a writer for Australian Geographic magazine and am very interested in reproducing your “Top 20 most hotly revised pages in Wikipedia” list. Can you please email me ASAP?
Many thanks
Kathy
November 8th, 2008 at 1:25 am
[...] could stare at this amazing five foot square zoomable Visualization of Wikipedia all [...]
October 8th, 2009 at 12:31 pm
[...] Visualizing the Power Struggle in Wikipedia displays the most popular articles and the most frequent search queries in the heatmap. [...]
May 10th, 2010 at 2:30 pm
This is probably the best and most relevant data viz example I have seen related to human dynamics on the Internet. What is the current state of this project?
May 9th, 2012 at 2:01 pm
Hi Todd,
My name is Mike Patterson, I’m the Director of Feedback for TEDMED.
I’ve been desperately trying to get my hands on a print of your brilliant Mosaic of Wikipedian Activity and wondered if you knew of anywhere I could CURRENTLY purchase one.
Can you help me out at all here? I’d greatly appreciate it!
Thanks in advance,
Mike Patterson
Director, Feedback and TEDMED Live
mike@tedmed.com
2 High Ridge Park
Stamford, CT 06905
dir. 203-461-7311
cel. 917-250-2021