Eyeo Festival 2011 – New York, New York

Last year at Eyeo, I gave a talk about the work that I had done in my first year in New York City, including Cascade (a social network visualization tool), OpenPaths (a secure personal location data project) and my work with Local Projects on the 9/11 Memorial.

Oh, and Batman. And Wayne Gretzky.

You can watch the video below (it’s about 50 minutes). As always, questions, feedback and bizarre non-sequiturs are welcome.

Selected Works (2009 – 2011)

These days my spare time is typically measured in seconds – so I took advantage of a rare free morning today to put together a very simple list of some of the work that I have built since the beginning of 2009. You can find a link to this list at the top of the site – or you can just click here. As always, comments and suggestions are encouraged and appreciated.

UPDATED: Quick Tutorial – Processing & Twitter

** Since I first released this tutorial in 2009, it has received thousands of views and has hopefully helped some of you get started with building projects incorporating Twitter with Processing. In late 2010, Twitter changed the way that authorization works, so I’ve updated the tutorial to bring it in line with the new Twitter API functionality.

Accessing information from the Twitter API with Processing is (reasonably) easy. A few people have sent me e-mails asking how it all works, so I thought I’d write a very quick tutorial to get everyone up on their feet.

We don’t need to know too much about how the Twitter API functions, because someone has put together a very useful Java library to do all of the dirty work for us. It’s called twitter4j, and you can download it here. We’ll be using this in the first step of the building section of this tutorial.

1. Authentication

Here’s the tricky part. In the olden days, Twitter used to be happy with us sending it a username and password pair for authentication. Now, all of the Twitter API functionality uses an OAuth system, which allows people to authenticate applications through a login at twitter.com (so that developers never see the user’s login information).

That’s pretty good from a security standpoint, but it makes our job a little bit harder. Now, instead of just needing a short username and password, we need FOUR things to authenticate our sketch: a consumer key & secret, and an access token & secret. You can get all four of these things from the Twitter developer page.

1. Visit https://dev.twitter.com/ and log in with your Twitter username and password
2. Click on the ‘Create an app’ button at the bottom right
3. Fill out the form that follows – you can use temporary values (like “Jer’s awesome test application”) for the first three fields.
4. Once you’ve agreed to the developer terms, you’ll arrive at a page which shows the first two of the four things that we need: the Consumer key and the Consumer secret. Copy and paste these somewhere so you have them ready to access.

5. To get the other two values that we need, click on the button that says ‘Recreate my access token’. Copy and paste those two values (Access token and Access token secret) so that we have all four in the same place.

2. Building

1. Open up a new Processing sketch.
2. Import the twitter4j library. In your downloaded twitter4j files, you’ll find the core .jar in a directory called ‘lib’. You’re looking for a file called something like ‘twitter4j-core-2.2.5.jar’. We can add this to our sketch by simply dragging the twitter4j-core-2.2.5.jar file onto your sketch window. If you want to check, you should now see this file in your sketch folder, inside of a new ‘code’ directory.
3. The first thing we’ll do in our new file is to store the credentials that we gathered in the first part of this tutorial into a ConfigurationBuilder object (this is a built-in part of the twitter4j library):

** Don’t use the credentials below – these are fake ones that I made up – use the ones that you copy & pasted in steps 4 & 5 above

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey("lPFSpjBppo5u4KI5xEXaQ");
cb.setOAuthConsumerSecret("SYt3e4xxSHUL1gPfM9bxQIq6Jf34Hln9T1q9KGCPs");
cb.setOAuthAccessToken("17049577-Yyo3AEVsqZZopPTr055TFdySop228pKKAZGbJDtnV");
cb.setOAuthAccessTokenSecret("6ZjJBebElMBiOOeyVeh8GFLsROtXXtKktXALxAT0I");

4. Now we’ll make the main Twitter object that we can use to do pretty much anything you can do on the twitter website – get status updates, run search queries, find follower information, etc. This Twitter object gets built by something called the TwitterFactory, which needs our configuration information that we set above:

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey("lPFSpjBppo5u4KI5xEXaQ");
cb.setOAuthConsumerSecret("SYt3e4xxSHUL1gPfM9bxQIq6Jf34Hln9T1q9KGCPs");
cb.setOAuthAccessToken("17049577-Yyo3AEVsqZZopPTr055TFdySop228pKKAZGbJDtnV");
cb.setOAuthAccessTokenSecret("6ZjJBebElMBiOOeyVeh8GFLsROtXXtKktXALxAT0I");

Twitter twitter = new TwitterFactory(cb.build()).getInstance();

5. Now that we have a Twitter object, we want to build a query to search via the Twitter API for a specific term or phrase. This is code that will not always work – sometimes the Twitter API might be down, or our search might not return any results, or we might not be connected to the internet. The Twitter object in twitter4j handles those types of conditions by throwing back an exception to us; we need to have a try/catch structure ready to deal with that if it happens:

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey("lPFSpjBppo5u4KI5xEXaQ");
cb.setOAuthConsumerSecret("SYt3e4xxSHUL1gPfM9bxQIq6Jf34Hln9T1q9KGCPs");
cb.setOAuthAccessToken("17049577-Yyo3AEVsqZZopPTr055TFdySop228pKKAZGbJDtnV");
cb.setOAuthAccessTokenSecret("6ZjJBebElMBiOOeyVeh8GFLsROtXXtKktXALxAT0I");

Twitter twitter = new TwitterFactory(cb.build()).getInstance();
Query query = new Query("#OWS");

try {
  QueryResult result = twitter.search(query);
}
catch (TwitterException te) {
  println("Couldn't connect: " + te);
}

6. This code is working – but we haven’t done anything with the results. Here, we’ll set the results per page parameter for the query to 100 to get the last 100 results with the term ‘#OWS’ and spit those results into the output panel:


ConfigurationBuilder cb = new ConfigurationBuilder();
cb.setOAuthConsumerKey("lPFSpjBppo5u4KI5xEXaQ");
cb.setOAuthConsumerSecret("SYt3e4xxSHUL1gPfM9bxQIq6Jf34Hln9T1q9KGCPs");
cb.setOAuthAccessToken("17049577-Yyo3AEVsqZZopPTr055TFdySop228pKKAZGbJDtnV");
cb.setOAuthAccessTokenSecret("6ZjJBebElMBiOOeyVeh8GFLsROtXXtKktXALxAT0I");

Twitter twitter = new TwitterFactory(cb.build()).getInstance();
Query query = new Query("#OWS");
query.setRpp(100);

try {
  QueryResult result = twitter.search(query);
  ArrayList tweets = (ArrayList) result.getTweets();

  for (int i = 0; i < tweets.size(); i++) {
    Tweet t = (Tweet) tweets.get(i);
    String user = t.getFromUser();
    String msg = t.getText();
    Date d = t.getCreatedAt();
    println("Tweet by " + user + " at " + d + ": " + msg);
  }
}
catch (TwitterException te) {
  println("Couldn't connect: " + te);
}

It seems stupid to come all of this way without doing something visual, so let’s throw all of the words that we see in the tweets onto the screen, quick and dirty, at random positions. The following code takes our existing instructions and puts them into the standard Processing setup/draw enclosures so that we can do some animation over time:


//Build an ArrayList to hold all of the words that we get from the imported tweets
ArrayList<String> words = new ArrayList<String>();

void setup() {
  //Set the size of the stage, and the background to black.
  size(550,550);
  background(0);
  smooth();
 
  //Credentials
  ConfigurationBuilder cb = new ConfigurationBuilder();
  cb.setOAuthConsumerKey("lPFSpjBppo5u4KI5xEXaQ");
  cb.setOAuthConsumerSecret("SYt3e4xxSHUL1gPfM9bxQIq6Jf34Hln9T1q9KGCPs");
  cb.setOAuthAccessToken("17049577-Yyo3AEVsqZZopPTr055TFdySop228pKKAZGbJDtnV");
  cb.setOAuthAccessTokenSecret("6ZjJBebElMBiOOeyVeh8GFLsROtXXtKktXALxAT0I");

  //Make the twitter object and prepare the query
  Twitter twitter = new TwitterFactory(cb.build()).getInstance();
  Query query = new Query("#OWS");
  query.setRpp(100);

  //Try making the query request.
  try {
    QueryResult result = twitter.search(query);
    ArrayList tweets = (ArrayList) result.getTweets();

    for (int i = 0; i < tweets.size(); i++) {
      Tweet t = (Tweet) tweets.get(i);
      String user = t.getFromUser();
      String msg = t.getText();
      Date d = t.getCreatedAt();
      println("Tweet by " + user + " at " + d + ": " + msg);
      
      //Break the tweet into words
      String[] input = msg.split(" ");
      for (int j = 0;  j < input.length; j++) {
       //Put each word into the words ArrayList
       words.add(input[j]);
      }
    }
  }
  catch (TwitterException te) {
    println("Couldn't connect: " + te);
  }
}

void draw() {
  //Draw a faint black rectangle over what is currently on the stage so it fades over time.
  fill(0,1);
  rect(0,0,width,height);
  
  //If the search failed or came back empty, there is nothing to draw (and % 0 would throw an error)
  if (words.size() == 0) return;
  
  //Draw a word from the list of words that we've built
  int i = (frameCount % words.size());
  String word = words.get(i);
  
  //Put it somewhere random on the stage, with a random size and opacity
  fill(255,random(50,150));
  textSize(random(10,30));
  text(word, random(width), random(height));
}
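
A quick aside on the word-splitting step: msg.split(" ") only splits on single spaces, so a tweet with a line break or a doubled space in it gives us empty or glued-together entries. Here is a rough sketch of a slightly more forgiving splitter, written as plain Java so it can be run outside of Processing. The names TweetWords and splitWords are made up for this example; they aren't part of twitter4j or Processing.

```java
import java.util.ArrayList;
import java.util.List;

class TweetWords {
  // Hypothetical helper: break a tweet into words on any run of whitespace
  static List<String> splitWords(String msg) {
    List<String> out = new ArrayList<String>();
    for (String w : msg.trim().split("\\s+")) {
      // An empty or all-whitespace tweet leaves one empty string behind; skip it
      if (w.length() > 0) {
        out.add(w);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Doubled space and a line break, both treated as ordinary word gaps
    System.out.println(splitWords("Marching  downtown\n#OWS now"));
  }
}
```

If you wanted to try this in the sketch, the body of splitWords could be pasted in as an ordinary function and its result added to the words list in place of the inner loop.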


** Extra-coolness: see this example working in the browser with Processing.js!!

That’s it! This example is very simple, but it’s the bare bones of what you need to build a project which connects to Twitter through Processing.

Shortly I’ll add another post which shows how to use Processing with the Twitter Streaming API, so that we can deal with large volumes of tweets over time.

Good luck!

138 Years of Popular Science

Magazine shots - Popular Science

Near the end of this summer, I was asked by the publishers of Popular Science magazine to produce a visualization piece that explored the archive of their publication. PopSci has a history that spans almost 140 years, so I knew there would be plenty of material to draw from. Working with Mark Hansen, I ended up making a graphic that showed how different technical and cultural terms have come in and out of use in the magazine since its inception.

The graphic is anchored by a kind of molecular chain – decade clusters in turn contain year clusters. Every atom in these year clusters is a single issue of the magazine, and is shaded with colours extracted from the issue covers via a colour clustering routine. The size of the issue-atoms is determined by the number of words in each issue.

Surrounding this chain are about 70 word frequency histograms showing the issue-by-issue usage of different terms (like ‘software’ or ‘bakelite’). I used a simple space-filling algorithm to place these neatly around the molecule chain, and to stack them so that one histogram begins shortly after another ends. This resulted in some interesting word chains that show how technology has progressed – some that make sense (microcomputer to e-mail) and some that are more whimsical (supernatural to periscope to datsun to fax).

Picking out interesting words from all of the available choices (pretty much the entire dictionary) was a tricky part of the process. I built a custom tool in Processing that pre-visualized the frequency plots of each word so that I could go through many, many possibilities and identify the ones that would be interesting to include in the final graphic. This is a really common approach for me to take – building small tools during the process of a project that help me solve specific problems. For this visualization, I actually ended up writing 4 tools in Processing – only one of which contributed visually to the final result.

My working process is riddled with dead-ends, messy errors and bad decisions – the ‘final’ product usually sits on top of a mountain of iterations that rarely see the light of day. To give a bit of insight into the steps between concept and product, I’ve put together a Flickr set showing 134 process images that came out of the development of this visualization. Here are a few images from that set:

[Process images, in order: rough beginning; early molecular chain; denser chain with test ‘word span’; a diversion; near-final.]

Lost in the image records are the steps that involved the data – and there were a lot of them. The archive was text that came from an OCR (optical character recognition) process, and was incredibly messy. To make matters worse, the file names for each issue were machine-generated and didn’t tie to the actual date order of the documents. A great deal of our time was spent cleaning up this data, and compiling customized datasets (many of which never ended up getting used).

While the final graphic is pictured above, it looks much better in print – so make sure you get a copy! Better yet, you can order a poster-sized version of the graphic by clicking here.

Tutorial: Processing, Javascript, and Data Visualization

[The above graphic is an interactive 3D bar graph. If you can't see it, it's probably because your browser can't render WebGL content. Maybe try Chrome or Firefox?]

Ben Fry and Casey Reas, the initiators of the Processing project, announced at the Eyeo Festival that a new version of Processing, Processing 2.0, will be released in the fall (VIDEO). This 2.0 release brings a lot of really exciting changes, including vastly improved 3D performance, and a brand-new high-performance video library. Perhaps most interesting is the addition of publishing modes – Processing sketches can now be published to formats other than the standard Java applet. The mode that is going to cause the most buzz around the interweb is Javascript mode: yes, Processing sketches can now be published, with one click, to render in pretty much any web browser. This is very exciting news for those of us who use Processing for data visualization – it means that our sketches can reach a much broader audience, and that the visual power of Processing can be combined with the interactive power of Javascript.

While preview releases of Processing 2.0 will be available for download very soon, you don’t have to wait to experiment with Processing and Javascript. The upcoming JS mode comes thanks to the team behind Processing.js, and we can use it to build sketches for the browser right now.

To get started, you can download this sample file. When you unzip the file, you’ll see a directory with three files – the index.html page, processing.js, and our sketch file, called tutorial.pde. If you open the sketch file in a text editor, you’ll see something like this:

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
}

void draw() {
  //instructions in here happen over and over again
}

If you’re a Processing user, you’ll be familiar with the setup and draw enclosures. If you’re not, the notes inside of each one explain what’s going on: when we run the sketch, the size is set to 500px by 500px, and the background is set to black. Nothing is happening (yet) in the draw enclosure.

You can see this incredibly boring sketch run if you open the index.html file in your browser (this will work in Firefox and Safari, but not in Chrome due to security settings).

Processing.js is a fully-featured port of Processing – all of the things that you’d expect to be able to do in Processing without importing extra libraries can be done in Processing.js. Which, quite frankly, is amazing! This means that many of your existing sketches will run, with no changes, in the browser.

Since this is a data visualization tutorial, let’s look at a simple representation of numbers. To keep it simple, we’ll start with a set of numbers: 13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3. (These numbers are obesity percentages in the U.S. population, between the 1960s and the present – but we’re really going to use them without context).

Typically, if we were to put these numbers into a Processing sketch, we’d construct an array of floating point numbers (numbers with a decimal place), which would look like this:

float[] myNumbers = {13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3};

However, we’re not in Java-land anymore, so we can make use of Javascript’s easier syntax (one of the truly awesome things about Processing.js is that we can include JS in our sketches, inline):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

Our .pde file, then, looks like this:


var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
}

void draw() {
  //instructions in here happen over and over again
}

Now that we have our helpful list of numbers (called, cleverly, ‘numbers’), we can access any of the numbers using its index. Indexes in Processing (and Javascript) are zero-based, so to get the first number out of the list, we’d use numbers[0]. To get the fifth one, we’d use numbers[4]. Let’s use those two numbers to start doing some very (very) simple visualization. Indeed, the first thing we’ll do is to draw a couple of lines:

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  //Draw two lines
  stroke(255);
  line(0, 50, numbers[0], 50);
  line(0, 100, numbers[4], 100);
}

void draw() {
  //instructions in here happen over and over again
}

OK. I’ll admit. That’s a pretty crappy sketch. But, we can build on it. Let’s start by making those lines into rectangles:

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  //Draw two rectangles
  stroke(255);
  rect(0, 50, numbers[0], 20);
  rect(0, 100, numbers[4], 20);
}

void draw() {
  //instructions in here happen over and over again
}

At this point, I’m going to introduce my very best Processing friend, the map() function. What the map() function allows us to do is to adjust a number that exists in a certain range so that it fits into a new range. A good real-world example of this would be converting a number (0.5) that exists in a range between 0 and 1 into a percentage:

map(0.5, 0, 1, 0, 100); //returns 50

We can take any number inside any range, and map it to a new range. Here are a few more easy examples:

map(5, 0, 10, 0, 1000); //returns 500
map(4, 0, 16, 0, 1); //returns 0.25
map(180, 0, 360, 0, 2 * PI); //returns PI
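
Under the hood, map() is just a linear rescaling. Here is a sketch of that arithmetic in plain Java, run against the four examples above. This is a hand-written stand-in rather than Processing's actual source; the class name MapSketch is made up, and it uses double where Processing uses float.

```java
class MapSketch {
  // Rescale value from the range [start1, stop1] to the range [start2, stop2]
  static double map(double value, double start1, double stop1, double start2, double stop2) {
    // How far along the first range is the value? (0 = at start1, 1 = at stop1)
    double t = (value - start1) / (stop1 - start1);
    // Travel the same fraction of the way along the second range
    return start2 + t * (stop2 - start2);
  }

  public static void main(String[] args) {
    System.out.println(map(0.5, 0, 1, 0, 100));
    System.out.println(map(5, 0, 10, 0, 1000));
    System.out.println(map(4, 0, 16, 0, 1));
    System.out.println(map(180, 0, 360, 0, 2 * Math.PI));
  }
}
```

Nothing in this version clamps the result, so a value that sits outside the first range will land outside the second range too.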

In our sketch, the list of numbers that we’re using ranges from 13.4 to 35.1. This means that our rectangles are pretty short, sitting in the 500 pixel-wide sketch. So, let’s map those numbers to fit onto the width of our screen (minus 50 pixels as a buffer):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  //Draw two rectangles
  rect(0, 50, map(numbers[0], 0, 35.1, 0, width - 50), 20);
  rect(0, 100, map(numbers[4], 0, 35.1, 0, width - 50), 20);
}

void draw() {
  //instructions in here happen over and over again
}

Two bars! Wee-hoo! Now, let’s use a simple loop to get all of our numbers on the screen (this time we’ll use the max() function to find the biggest number in our list):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  for (var i = 0; i < numbers.length; i++) {
    //calculate the width of the bar by mapping the number to the width of the screen minus 50 pixels
    var w = map(numbers[i], 0, max(numbers), 0, width - 50);
    rect(0, i * 25, w, 20);
  }
}

void draw() {
  //instructions in here happen over and over again
}

We’re still firmly in Microsoft Excel territory, but we can use some of Processing’s excellent visual programming features to make this graph more interesting. For starters, let’s change the colours of the bars. Using the map() function again, we’ll colour the bars from red (for the smallest number) to yellow (for the biggest number):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  for (var i = 0; i < numbers.length; i++) {
    //calculate the amount of green in the colour by mapping the number to 255 (255 red & 255 green = yellow)
    var c = map(numbers[i], min(numbers), max(numbers), 0, 255);
    fill(255, c, 0);
    //calculate the width of the bar by mapping the number to the width of the screen minus 50 pixels
    var w = map(numbers[i], 0, max(numbers), 0, width - 50);
    rect(0, i * 25, w, 20);
  }
}

void draw() {
  //instructions in here happen over and over again
}

This process of mapping a number to be represented by a certain dimension (in this case, the width of the bars along with the colour) is the basic premise behind data visualization in Processing. Using this simple framework, we can map to a whole variety of dimensions, including size, position, rotation, and transparency.
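
To make that idea a bit more concrete, here is a rough sketch in plain Java that maps one of our numbers onto several visual dimensions at once. The map(), min() and max() helpers are hand-written stand-ins for Processing's, the class name Dimensions is made up, and the target ranges (width up to 450 pixels, opacity from 50 to 255, rotation up to half a turn) are arbitrary choices for illustration.

```java
class Dimensions {
  static final double[] NUMBERS = {13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3};

  // Stand-in for Processing's map(): linear rescale from [a1, b1] to [a2, b2]
  static double map(double v, double a1, double b1, double a2, double b2) {
    return a2 + (b2 - a2) * ((v - a1) / (b1 - a1));
  }

  // Smallest value in the list
  static double min(double[] xs) {
    double m = xs[0];
    for (double x : xs) {
      m = Math.min(m, x);
    }
    return m;
  }

  // Largest value in the list
  static double max(double[] xs) {
    double m = xs[0];
    for (double x : xs) {
      m = Math.max(m, x);
    }
    return m;
  }

  // Turn one number into {width, green, alpha, angle}
  static double[] encode(double v) {
    double lo = min(NUMBERS);
    double hi = max(NUMBERS);
    return new double[] {
      map(v, 0, hi, 0, 450),      // size: bar width in pixels
      map(v, lo, hi, 0, 255),     // colour: amount of green (red to yellow)
      map(v, lo, hi, 50, 255),    // transparency: faint to fully opaque
      map(v, lo, hi, 0, Math.PI)  // rotation: zero to half a turn, in radians
    };
  }

  public static void main(String[] args) {
    for (double v : NUMBERS) {
      double[] e = encode(v);
      System.out.printf("%.1f -> width %.1f, green %.0f, alpha %.0f, angle %.2f%n",
        v, e[0], e[1], e[2], e[3]);
    }
  }
}
```

The width is mapped from zero so that bar lengths stay proportional to the numbers, while the other three are mapped from the smallest value so that they use their full range.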

As an example, here is the same system we saw above, rendered as a radial graph (note here that I’ve added a line before the loop that moves the drawing point to the centre of the screen; I’ve also changed the width of the bars to fit in the screen):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500);
  background(0);
  
  //move to the center of the sketch before we draw our graph
  translate(width/2, height/2);
  for (var i = 0; i < numbers.length; i++) {
    //calculate the amount of green in the colour by mapping the number to 255 (255 red & 255 green = yellow)
    var c = map(numbers[i], min(numbers), max(numbers), 0, 255);
    fill(255, c, 0);
    //calculate the width of the bar by mapping the number to half the width of the screen minus 50 pixels
    var w = map(numbers[i], 0, max(numbers), 0, width/2 - 50);
    rect(0, 0, w, 20);
    //after we draw each bar, turn the sketch a bit
    rotate(TWO_PI/numbers.length);
  }
}

void draw() {
  //instructions in here happen over and over again
}
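
The translate() and rotate() calls are doing some trigonometry for us here. As a rough illustration of what that amounts to, this plain-Java sketch works out where the far end of each bar's starting edge lands, measured from the centre of the sketch. The class and method names are made up for the example, it ignores the 20 pixel thickness of the bars, and y grows downward as it does on the Processing stage.

```java
class RadialSketch {
  static final double[] NUMBERS = {13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3};

  // Largest value in the list
  static double max(double[] xs) {
    double m = xs[0];
    for (double x : xs) {
      m = Math.max(m, x);
    }
    return m;
  }

  // Bar length in pixels: the biggest number gets 200 (half of 500, minus 50)
  static double barLength(double v) {
    return v / max(NUMBERS) * 200;
  }

  // Rotation that has built up by the time bar i of n is drawn, in radians
  static double angleFor(int i, int n) {
    return i * 2 * Math.PI / n;
  }

  // Where the point (w, 0) ends up after rotating by theta, relative to the centre
  static double[] endPoint(double w, double theta) {
    return new double[] { w * Math.cos(theta), w * Math.sin(theta) };
  }

  public static void main(String[] args) {
    for (int i = 0; i < NUMBERS.length; i++) {
      double theta = angleFor(i, NUMBERS.length);
      double[] p = endPoint(barLength(NUMBERS[i]), theta);
      System.out.printf("bar %d: angle %.2f rad, ends at (%.1f, %.1f)%n", i, theta, p[0], p[1]);
    }
  }
}
```

Letting rotate() accumulate, as the sketch above does, saves us from writing any of this by hand; the explicit version is mostly useful if you need the coordinates for something else, like hit-testing the mouse.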

So far we’ve been keeping things very simple. But the possibilities using Processing and Javascript are really exciting. It’s very easy to add interactivity, and it’s even possible to take advantage of Processing’s 3D functionality in the browser.

Here are our numbers again, rendered in 3D (note the change to the size() function), with the rotation controlled by the mouse (via our new friend the map() function):

var numbers = [13.4, 14.5, 15.0, 23.2, 30.9, 31.3, 32.9, 35.1, 34.3];

void setup() {
  //instructions in here happen once when the sketch is run
  size(500,500,P3D);
  background(0);
  
}

void draw() {
  //instructions in here happen over and over again
  background(0);
  //turn on the lights so that we see shading on the 3D objects
  lights();
  //move to the center of the sketch before we draw our graph
  translate(width/2, height/2);
  //Tilt about 70 degrees on the X axis - like tilting a frame on the wall so that it's on a table
  rotateX(1.2);
  //Now, spin around the Z axis as the mouse moves. Like spinning that frame on the table around its center
  rotateZ(map(mouseX, 0, width, 0, TWO_PI));
  for (var i = 0; i < numbers.length; i++) {
    //calculate the amount of green in the colour by mapping the number to 255 (255 red & 255 green = yellow)
    var c = map(numbers[i], min(numbers), max(numbers), 0, 255);
    fill(255, c, 0);
    //calculate the height of the bar by mapping the number to half the width of the screen minus 50 pixels
    var w = map(numbers[i], 0, max(numbers), 0, width/2 - 50);
    //move out 200 pixels from the center
    pushMatrix();
      translate(200, 0);
      box(20,20,w);
    popMatrix();
    //after we draw each bar, turn the sketch a bit
    rotate(TWO_PI/numbers.length);
  }
}

Thirty lines of code, and we have an interactive, 3D data visualization (you can see this sketch running inline in the header). Now, I understand that this isn’t the best use of 3D, and that we’re only visualizing a basic dataset, but the point here is to start to imagine the possibilities of using Processing with Javascript in the browser.

For starters, Javascript lets us very easily import data from any JSON source with no parsing whatsoever – this means we can create interactive graphics that load dynamic data on the fly (more on this in the next tutorial). On top of that, because we can write Javascript inline with our Processing code, and we can integrate external Javascript functions and libraries (like jQuery) into our sketches, we can incorporate all kinds of advanced interactivity.

I suspect that the release of Processing 2.0 will lead to a burst of Javascript-based visualization projects being released on the web. In the meantime, we can use the basic template from this tutorial to work with Processing.js – have fun!