We’ve heard a lot this week about earthquake aid for Haiti. As is always the case when large numbers are bandied about in the news media, it’s hard to get a feeling of scale. For example, Canada has, at the time of writing, pledged to donate nearly 5.5M dollars to the aid effort. What does this number really mean? Well, considering Canada’s population of 33.3M, the aid works out to about 16 cents per Canadian citizen. 16 cents doesn’t buy you much these days. A sip of coffee, or – say – 3.14 minutes of Avatar; barely enough to get through the credits.
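For anyone who wants to check the arithmetic, here is a rough Python sketch of the conversion. The pledge and population figures come from this post and the 162-minute running time is Avatar's theatrical length. The $8.50 ticket price is only an assumption, chosen because it lands close to the 3.14 minutes quoted above. The tool itself may have used different inputs.

```python
# Figures quoted in the post.
PLEDGE_DOLLARS = 5.5e6   # Canada's pledge at the time of writing
POPULATION = 33.3e6      # Canada's population

# Assumptions, not taken from the post.
AVATAR_MINUTES = 162     # theatrical running time
TICKET_DOLLARS = 8.50    # hypothetical ticket price


def avatar_minutes(pledge, population,
                   ticket=TICKET_DOLLARS, runtime=AVATAR_MINUTES):
    """Convert a national pledge into minutes of Avatar per citizen."""
    dollars_per_person = pledge / population
    dollars_per_minute = ticket / runtime
    return dollars_per_person / dollars_per_minute


per_person = PLEDGE_DOLLARS / POPULATION
minutes = avatar_minutes(PLEDGE_DOLLARS, POPULATION)
print(f"{per_person * 100:.1f} cents per person, "
      f"about {minutes:.1f} Avatar minutes")
```

Changing the ticket price shifts every country's time by the same factor, so the ranking stays the same whatever price is assumed.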
How many Avatar minutes are governments around the world pledging? Sweden leads the way, with almost 38 minutes per citizen – nearly a quarter of the movie. Other Scandinavian countries round out the top 6, along with Luxembourg, Guyana, and Estonia.
Here are the times for a selection of countries:
Sweden: 38 minutes
Luxembourg: 28 minutes
Denmark: 26 minutes
Guyana: 25 minutes
Norway: 20 minutes
Estonia: 14 minutes
Australia: 8 minutes
Finland: 6 minutes
United States: 6 minutes
Switzerland: 5 minutes
New Zealand: 4 minutes
Netherlands: 3 minutes
United Kingdom: 3 minutes
Canada: 3 minutes
Spain: 2 minutes
Brazil: 2 minutes
Germany: 1 minute
Japan: 1 minute
Morocco: 1 minute
Poland: 1 minute
Italy: 1 minute
The images in this post are exports from a Processing tool that I built to manage the data and to render the film strips. The application reads data from a Google spreadsheet – the original data was published by the always excellent Guardian Data Blog. If there’s enough interest, I will post the tool and the source later this week.
The essay talks about Haiti’s violent history, and of the country’s incredible tendency towards misfortune:
“How can nature or God or the fates or the universe do this to a country that has borne far too much sadness? An earthquake has now devastated the capital; claiming lives, hopes and the pitifully small dreams that people have held on to, despite political violence, unimaginable poverty, disease, corruption, dictators and nature’s full force of four hurricanes in a row.”
I built this very quick visualization to explore this topic a little further. Specifically, I wanted to compare Haiti to its Caribbean neighbours to see if the country is indeed as unlucky as it seems.
This visualization compares Haiti to 12 other Caribbean nations. It looks at articles published in the New York Times mentioning those countries between 1981 and 2010, and measures the occurrence of specific words in those articles.
The pie charts in each row show the percentage of total articles on each country which contain the words in question. For example, we can see that about 25% of articles published about Haiti mention the word ‘violence’ – twice the frequency of any other country on the list.
Haiti has the highest frequency of the words ‘coup’, ‘violence’, ‘disease’, and ‘strife’. It is second or third in mentions of ‘death’, ‘unrest’ and ‘famine’.
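The measure behind each pie chart is simply the share of a country's articles that contain a given word. Here is a minimal Python sketch of that calculation. The article snippets are invented stand-ins, and the whole-word, case-insensitive matching is an assumption about how a mention might be counted, not a description of how the original tool queried the Times.

```python
import re


def mention_share(articles, word):
    """Fraction of articles containing `word` at least once.

    Matching is case-insensitive and on whole words only (an assumption).
    """
    if not articles:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = sum(1 for text in articles if pattern.search(text))
    return hits / len(articles)


# Made-up snippets standing in for real article text.
sample = {
    "Haiti": [
        "Violence followed the coup in the capital.",
        "Aid arrives after the storm.",
        "A second coup was averted.",
        "Markets reopen this week.",
    ],
    "Jamaica": [
        "Tourism numbers rise again.",
        "A new stadium opens.",
    ],
}

shares = {country: mention_share(texts, "violence")
          for country, texts in sample.items()}
for country, share in shares.items():
    print(f"{country}: {share:.0%} of articles mention 'violence'")
```

Counting articles rather than total word occurrences keeps one long, repetitive article from dominating a country's score.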
Likely this week’s events will lead to many more mentions of these words. As you’re likely aware, many NGOs small and large are organizing to help Haitians – both through emergency assistance and through long-term rebuilding. If you want to donate, I’d highly recommend considering Architecture for Humanity (for long-term projects) or Partners in Health (for emergency assistance). Both organizations are accepting donations through their websites.
In October, I read a fascinating article on GQ.com about head injuries among former NFL players. Written by Jeanne Marie Laskas, the article was a forensic detective story, documenting a little known doctor’s efforts to bring the brain trauma issue to the attention of the medical community, the NFL, and the general public. It is a great read – an in-depth investigative piece with engaging personalities and plenty of intrigue.
A few weeks later, I picked up a copy of The New Yorker on my way home from Pittsburgh. I was surprised to see, on the cover, a promo for an article by Malcolm Gladwell about – you guessed it – brain trauma and the NFL. After having read both articles, I was surprised by how much these two investigative pieces differed. At the time I thought about doing a visualization to investigate, but somehow the idea slipped out of my head.
Until this weekend. I spent a few (okay, more like eight) hours putting together a tool with Processing that would examine some of the similarities and differences between the two articles. The most interesting data ended up coming from word usage analysis (I looked at sentences and phrases as well, but without much luck). The base interface for the tool is an XY chart of the words – they are positioned vertically by their average position in the articles, and horizontally by which article they occur in more. The words in the centre are shared by both articles. Total usage affects the scale of the words, so we can see quite quickly which words are used most, and in which articles.
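The layout rule described above can be sketched in a few lines of Python. This is an illustration under my own assumptions, not the Processing source: the `layout` function, the -1 to +1 horizontal scale, and the simple tokenizer are all hypothetical choices.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def layout(text_a, text_b):
    """Return {word: (x, y, size)} for two bodies of text.

    x    -1.0 if the word appears only in A, +1.0 if only in B,
         0.0 if both use it equally often
    y    mean relative position of the word (0 = start, 1 = end)
    size total number of uses across both texts
    """
    counts = defaultdict(lambda: [0, 0])
    positions = defaultdict(list)
    for index, text in enumerate((text_a, text_b)):
        words = tokenize(text)
        span = max(len(words) - 1, 1)  # avoid dividing by zero
        for i, word in enumerate(words):
            counts[word][index] += 1
            positions[word].append(i / span)

    placed = {}
    for word, (a, b) in counts.items():
        total = a + b
        x = (b - a) / total
        y = sum(positions[word]) / len(positions[word])
        placed[word] = (x, y, total)
    return placed


chart = layout("brain football nfl", "brain football dogs")
for word, (x, y, size) in sorted(chart.items()):
    print(f"{word:9s} x={x:+.1f} y={y:.2f} size={size}")
```

This version uses raw counts for the horizontal position. Dividing each count by its article's length first would be a reasonable variation when the two texts differ a lot in size.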
By focusing our attention on the big words which lie more or less in the center, we can see what the two articles have in common: brains, football, dementia, and a disease called CTE. What is perhaps more interesting is what lies on the outer edges; the subjects and topics that were covered by one author and not by the other.
Laskas’ article is about Dr. Bennet Omalu, dead NFL players (Mike Webster), Omalu’s colleagues (Dr. Julian Bailes & Bob Fitzsimmons) and the NFL (click on the images to see bigger versions):
Gladwell’s article, on the other hand, focuses partly on another scientist, Dr. Ann McKee, the sport of football in general, as well as a central metaphor in his piece – a comparison between football and dogfighting (the bridge between the two is Michael Vick):
The gulf between the two main scientific personalities profiled in the articles is interesting. Omalu and McKee are both experts in chronic traumatic encephalopathy (CTE), so it makes sense that they each appear in both articles (Omalu was the first to describe the condition). However, when we isolate these names we see that Laskas references Dr. Omalu almost exclusively (Omalu is mentioned 96 times by Laskas and only 6 times by Gladwell)* – it’s worth noting here that the Laskas article is 11.4% longer than the Gladwell piece – JT:
In contrast, Laskas only refers to McKee once (Dr. McKee is mentioned by Gladwell 21 times):
What is the relationship between Dr. McKee and Dr. Omalu? McKee is on the advisory board for the Sports Legacy Institute, a group which studies the results of brain trauma on athletes. SLI was founded by four individuals, including Bennet Omalu and the group’s current head, Chris Nowinski, a former professional wrestler. Omalu and the other three founders of SLI have now left the group, but it apparently continues to be a high-profile presence in the CTE field. Laskas writes:
“Indeed, the casual observer who wants to learn more about CTE will be easily led to SLI and the Boston group. There’s an SLI Twitter link, an SLI awards banquet, an SLI Web site with photos of Nowinski and links to videos of him on TV and in the newspapers. Gradually, Omalu’s name slips out of the stories, and Bailes slips out, and Fitzsimmons, and their good fight. As it happens in stories, the telling and retelling simplify and reduce.”
I wonder how much the path of a journalistic piece is affected by who you talk to first? If I had to guess, I’d say Gladwell started with the SLI, whereas Laskas seems to have begun with Dr. Omalu. This single decision could account for many of the differences between the two articles.
Other word-use choices might also give insight into editorial positions. Laskas, for example, uses the term NFL (below, at left) a lot – 57 times to Gladwell’s 11. Gladwell, on the other hand, talks more about the sport in general, using the word ‘football’ (below, at right) 40 times to Laskas’ 23:
According to Laskas, Dr. Omalu has been roundly shunned by the NFL – they have attempted to discredit his research on many occasions (attention that has not been so pointedly focused on Dr. McKee and the SLI). Though both articles are critical of the League, it seems clear both from the article and the data that Laskas and GQ have taken a more severe stance – her piece addresses the NFL much more often, and with more disdain.
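Since the Laskas article is 11.4% longer, it is fair to ask whether the raw counts overstate the differences. Here is a quick Python check using the counts quoted in this post. It assumes the 11.4% figure refers to word count and simply scales the Laskas numbers down to Gladwell's length. The `adjusted` helper is hypothetical.

```python
# Laskas length divided by Gladwell length, from the figure in the post.
LENGTH_RATIO = 1.114

# (Laskas count, Gladwell count), as quoted in the post.
counts = {
    "Omalu": (96, 6),
    "McKee": (1, 21),
    "NFL": (57, 11),
    "football": (23, 40),
}


def adjusted(laskas_count, ratio=LENGTH_RATIO):
    """Scale a Laskas count to what it would be at Gladwell's length."""
    return laskas_count / ratio


for word, (laskas, gladwell) in counts.items():
    print(f"{word:9s} Laskas {laskas:3d} (adjusted {adjusted(laskas):5.1f})"
          f"  Gladwell {gladwell:3d}")
```

Even after the adjustment, every gap keeps its direction and most of its size, so the length difference alone does not explain the contrast.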
This exercise of quantitatively analyzing a pair of articles may seem like a strange way to spend a weekend, but it helped me to more clearly understand the differences between the two stories and to consider my reactions to each. I uncovered a few things that I hadn’t picked up at first, and at the same time was able to reinforce some of the feelings that I had after reading the two articles.
It was also another opportunity to build a quick, lightweight visualization tool dedicated to a fairly specific topic (though in this case the tool could be used to compare any two bodies of text). This strategy holds a lot of appeal to me and I think deserves attention alongside the generalist approach that we tend to see a lot of on the web and in data visualization. It seems to me that this type of investigative technique could be useful for researchers of various stripes.
I will be releasing source code for this project as well as compiled applications for Mac, Linux & Windows. In the meantime, here’s a short video of how the interface behaves:
Many of you have already seen the series of visualizations that I created early in the year using the newly-released New York Times APIs. The most complex of these were in the 365/360 series in which I tried to distill an entire year of news stories into a single graphic. The resulting visualizations (2009 is pictured above) capture the complex relationships – and somewhat tangled mess – that is a year in the news.
This release is a single sketch. I’ll be releasing the Article Search API Processing code as a library later in the week, but I wanted to show this project as it sits, with all of the code intact. The output from this sketch is a set of .PDFs which are suitable for print. Someday I’d like to show the entire series of these as a set of 6′ x 6′ prints – of course, someday I’d also like a solid-gold skateboard and a castle made of cheese.
That said, really nice, archival quality prints from this project (and the one I’ll be releasing tomorrow) are for sale in my Etsy shop. I realize that you’ll all be able to make your own prints now (and you are certainly welcome to do so) – but if you really enjoy the work and want to have a signed print to hang on your wall, you know who to talk to.
Put the folder ‘NYT_365_360’ into your Processing sketch folder. Open Processing and open the sketch from the File > Sketchbook menu. You’ll find detailed instructions in the header of the main tab (the NYT_365_360.pde file).
Most of the credit for this sketch goes to the clever kids at the NYT who made the amazing Article Search API. This is the gold standard of APIs, and really is a dream to use. As you’ll see if you dig into the code, each of these complicated graphics is made with just 21 calls to the API. I can’t imagine the amount of blood, sweat, and tears that would go into making a graphic like this the old-fashioned way.
Speaking of gold standards, Robert Hodgin got me pointed to ArrayLists in the first place, and has been helpful many times over the last few years as I’ve tried to solve a series of ridiculously simple problems in Processing. Thanks, Robert!
Project: GoodMorning!
Date: August, 2009
Language: Processing
Key Concepts: Spherical coordinates, latitude & longitude conversion, Twitter API, MetaCarta API
GoodMorning! is a global Twitter visualization tool. It allows tweets to be placed geographically and temporally, showing how a word or phrase is used around the world over a certain time period. You can watch a video here.
The project consists of two Processing sketches: a small sketch called TwitGather to gather tweets from the Twitter API and store them with location data in a JSON file, and a larger sketch which renders that data, and allows for export as .MOV and hi-res bitmap files. I’ve included a sample file which holds about 24 hours of ‘Good morning’ tweets.
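The core of placing a tweet on a globe is turning a latitude and longitude into a point in 3D space. Here is a minimal Python sketch of that conversion. The axis convention (y up through the north pole, longitude 0 facing +z) and the `lat_lon_to_xyz` name are assumptions for illustration. The GoodMorning sketch itself may orient things differently.

```python
import math


def lat_lon_to_xyz(lat_deg, lon_deg, radius=1.0):
    """Map latitude/longitude in degrees to a point on a sphere.

    Assumed convention: y points up through the north pole and
    longitude 0 on the equator sits on the +z axis.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.sin(lon)
    y = radius * math.sin(lat)
    z = radius * math.cos(lat) * math.cos(lon)
    return x, y, z


# Vancouver sits at roughly 49.25 N, 123.1 W.
x, y, z = lat_lon_to_xyz(49.25, -123.1, radius=100.0)
print(f"x={x:.1f} y={y:.1f} z={z:.1f}")
```

In Processing the screen's y axis points downward, so a sketch using this convention would typically negate y before drawing.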
This is likely the most complicated project that I will be releasing this week. I have tried to document the sketches as thoroughly as possible, but it may be confusing for beginners. However, there are lots of fairly simple things that can be gleaned by looking at various parts of the code.
To compile this project, you’ll need to make sure you have all of the libraries and other dependencies outlined below.
Move the sketches into your Processing sketch folder. Open Processing and open the GoodMorning sketch from the File > Sketchbook menu. You’ll find detailed instructions in the header of the main tab (the GoodMorning.pde file).
GoodMorning uses a variety of libraries, to the authors of which I am extremely grateful: