New York Times Visualizations

NYTimes: Iran and Iraq since 1981 (Radial) - White

The New York Times recently released a fairly thorough API which allows anyone to access a huge database of articles, as well as APIs for movie reviews, congressional vote data, campaign finance data, and more. 

The API is very easy to use – you send it a custom-built query along with your API key, and it returns the requested data in JSON format. It’s easy enough, in fact, that it only took me a few hours to build a simple applet in Processing for examining the frequency of words in NYT articles over time. The image at the top of the post compares the occurrence of the word ‘iran’ (in red) with the word ‘iraq’ (in yellow). The result can be read like a clock: you can see the Iran-Contra affair at about 2:30. The first Gulf War is at about 4:00. The second Iraq invasion is the biggest spike, continuing up until the current day. Interestingly, Iran shows a large increase in frequency in the months leading up to the end of 2008.
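For anyone curious what such a query might look like, here is a minimal sketch in Python. It is not the source of the Processing applet. The endpoint URL, the parameter names (`q`, `begin_date`, `end_date`, `api-key`) and the location of the hit count in the response are assumptions modeled on the Article Search API; check the NYT developer documentation before relying on them. The helper names are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the NYT developer documentation.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_query_url(term, begin_date, end_date, api_key):
    """Build a search URL for one term over one date range (dates as YYYYMMDD)."""
    params = {
        "q": term,
        "begin_date": begin_date,
        "end_date": end_date,
        "api-key": api_key,
    }
    return BASE_URL + "?" + urlencode(params)


def parse_hit_count(json_text):
    """Pull the total number of matching articles out of a JSON response.

    Assumes the count lives at response -> meta -> hits.
    """
    data = json.loads(json_text)
    return int(data["response"]["meta"]["hits"])


def count_articles(term, begin_date, end_date, api_key):
    """Make one API call and return the number of matching articles."""
    url = build_query_url(term, begin_date, end_date, api_key)
    with urlopen(url) as response:
        return parse_hit_count(response.read().decode("utf-8"))
```

Calling `count_articles("iran", "19810101", "19810131", your_key)` once per term and per time bucket would give the kind of frequency series plotted in these charts.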

Here is the same data shown in a more traditional format:

NYTimes: Iran & Iraq since 1981
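The radial and the traditional chart can be drawn from the same series; only the mapping from time to position changes. The sketch below shows one way the clock mapping could work. It assumes monthly buckets, a range of January 1981 through December 2008, and one full revolution starting at 12 o'clock. The function names and the "radius plus value" scaling are illustrative, not taken from the applet.

```python
import math

START_YEAR = 1981
END_YEAR = 2009  # exclusive: the data runs through December 2008


def clock_fraction(year, month):
    """Fraction of the way around the dial for a given month (0.0 = 12 o'clock)."""
    total_months = (END_YEAR - START_YEAR) * 12
    index = (year - START_YEAR) * 12 + (month - 1)
    return index / total_months


def clock_time(year, month):
    """Dial position as (hours, minutes); hour 0 means 12 o'clock."""
    total_minutes = round(clock_fraction(year, month) * 12 * 60)
    return divmod(total_minutes, 60)


def radial_point(year, month, value, cx, cy, base_radius):
    """Screen position for one data point, moving clockwise from the top.

    Screen coordinates are assumed to have y increasing downward,
    as in Processing.
    """
    angle = clock_fraction(year, month) * 2 * math.pi - math.pi / 2
    radius = base_radius + value
    return cx + radius * math.cos(angle), cy + radius * math.sin(angle)
```

Under these assumptions, `clock_time(1986, 11)` returns `(2, 30)`, which matches the position of the Iran-Contra spike on the dial.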

Here is a frequency chart for the words ‘internet’, ‘web’, and ‘twitter’, to remind us how recent all of this fancy technology is (this chart runs from 1981-2008). It’s also interesting to see the term ‘web’ overtake the term ‘internet’:

 NYTimes: Internet, Web & Twitter

This one compares the terms ‘sex’ and ‘scandal’ – see how many times you can find the term ‘catholic church’. The sheer number of peaks in this chart gives you some idea of the mad times that the last two decades have held:

 NYTimes: Sex & Scandal since 1981

You can see the full-size versions of these charts and a few others in this Flickr set. Individual developers are limited to 10,000 API calls per day, so I won’t be able to run any more until tomorrow. I would love to hear suggestions about possible word (and time) combinations to run. I will be releasing the source for this very simple app soon – in the meantime, feel free to get in touch if you have any questions or suggestions.
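To get a feel for how quickly the daily limit is used up, here is a rough back-of-the-envelope sketch. It assumes one API call per term per month, which is a guess about how a chart like these could be built rather than a description of the applet; only the 10,000-call limit comes from the post.

```python
DAILY_CALL_LIMIT = 10_000  # per-developer limit mentioned in the post


def calls_needed(num_terms, start_year, end_year, buckets_per_year=12):
    """API calls for one chart, assuming one call per term per time bucket."""
    years = end_year - start_year + 1  # inclusive range
    return num_terms * years * buckets_per_year


def charts_per_day(num_terms, start_year, end_year, buckets_per_year=12):
    """How many such charts fit inside the daily limit."""
    return DAILY_CALL_LIMIT // calls_needed(
        num_terms, start_year, end_year, buckets_per_year
    )
```

With these assumptions a two-term chart for 1981 through 2008 costs `calls_needed(2, 1981, 2008)` = 672 calls, so about 14 of them fit in a day. Finer time buckets or repeated test runs would use up the budget much faster.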


6 Comments

  1. Posted February 6, 2009 at 4:03 pm | Permalink

    Nice work!

  2. Jer
    Posted February 6, 2009 at 7:56 pm | Permalink

    Thanks, Greg.

    I have spent the last 20 minutes reading articles from your blog, which not only has provided me with a nice break from work, but has been very informative!

  3. Posted February 7, 2009 at 9:06 am | Permalink

    Brilliant. Would love to see the processing bits.

  4. Posted February 7, 2009 at 2:36 pm | Permalink

    Hey Jer,

    Newspapers are kind of an area of interest of mine! I’ve been very excited by these initiatives of the Times, it kind of gives me hope for the industry. Seeing artists like yourself work with their data is exciting. :)

  5. Posted February 10, 2009 at 4:34 pm | Permalink

    Excellent use of the API… I’ll add it to my article: http://poynter.org/column.asp?id=101&aid=158177

  6. Posted May 26, 2009 at 8:13 pm | Permalink

    really nice. just wondering what are the “hairy” random bits at the end of each line?

11 Trackbacks

  1. [...] jer thorp has started visualizing some data comparing keywords like iran+iraq and sex+scandal+catholic church and more… visit blprnt blog [...]

  2. [...] New York Times Visualizations is a project by Jer Thorp that visualizes The New York Times. [...]

  3. [...] entire text archive searchable online, beginning with the year 1981. Jer had made some beautiful graphs of various word occurrences, and we asked him to build some for us based on a few design-related [...]

  4. [...] Links for March 5 SQLZoo: Joins House Members House Positions NYT Visualizations [...]

  5. By 202 » Blog Archive » NYT: Visualisations on March 18, 2009 at 6:24 am

    [...] http://blog.blprnt.com/blog/blprnt/new-york-times-visualizations [...]

  6. [...] (in blue) and “crisis” (in gray) from 1981 to 2009, starting and ending at midnight. This one does the same for “sex” (beige) and “scandal” [...]

  7. By Dear Bill Keller | byJoeyBaker on July 11, 2009 at 7:54 pm

    [...] by saying that bureaus are necessary for reporting. The number of stories about Iraq as dropped off drastically in your paper. Having people on the ground there isn’t helping. If you’d drop you [...]

  8. By pixeldreamz » blog » processing on August 21, 2009 at 4:14 am

    [...] In this context, the talk by Jer Thorp left a lasting impression. His project, which uses Processing to query the New York Times API and generate images from the results, caught my interest [...]

  9. [...] Artist Jer Thorp is one of the latter. Thorp accesses the Times Article Search API to create visualizations that compare the frequency of key words over time. The image below, for example, compares ‘sex’ and ‘scandal’ from 1981 – 2008: [...]

  10. [...] Click here for the article on The Times concept and click here for the Developer API link Posted by Javor Filed in Data Visualization [...]

  11. [...] York Times articles. These discussion threads are so complex and contain so much information that that graphical representation is necessary for humans to be able to analyze them. Using his tool he was able to pinpoint the [...]
