Category Archives: Processing

Text Comparison Tool: SOURCE CODE

About this time last year, I built a light-weight tool in Processing to compare two articles that I read about head injuries in the NFL. Later, I extended the tool to compare any two texts, and promised a source release.

Well, it only took 12 months, but I’ve finally cleaned up the code to a point where I think it will be reasonably easy to use and helpful for those who might want to learn a bit more about the code.

I always find myself in a tricky position with source releases. Often, as in the case of this tool, I have ideas about what the project should look like before it gets released. Here, I wanted to build an interface to allow people to select different text sources within the app, so that people could use the app without having to compile it from Processing. This is what delayed the release of the code for a year – I was waiting for the time to get this last piece done.

Two weeks ago, though, I has a chance to speak at an event with Tahir Hemphill. I had sent Tahir a messy version of the project for use with his Hip Hop Word Count initiative a few months back, and he used it to analyze a famous rap battle between Nas and Jay-Z:

Text Correlation Tool: Nas Versus Jay / Ether Versus The Takeover from Staple Crops on Vimeo.

This reminded me of a fairly valuable lesson as far as source code is concerned: If it works, it’s probably good enough to release.

So, without too much further ado, here’s a link to download the Text Comparison Tool:

Text Comparison Tool (1.1MB)

It’s pretty easy to use. First, you’ll need Processing to open and work with the sketch. Also, you’ll need the toxiclibs core library installed. Assuming you have those two things, these are the steps:

1. Drag the unzipped folder into your sketchbook.
2. Place your text files in the sketch’s data folder.
3. Open the sketch.
4. Look for the code block at the top of the main tab where the article information is set. It’s pretty clearly marked, and looks like this:

String title1 = "Tokyo";
String url1 = "asia.txt";
String desc1 = "Suntory Hall";
String date1 = "November 14th, 2009";
color color1 = #140047;

String title2 = "Cairo";
String url2 = "cairo.txt";
String desc2 = "Cairo University";
String date2 = "June 4th, 2009";
color color2 = #680014;

5. Replace the information here with appropriate information for your files.
6. Run the sketch!

That’s it. If you are getting strange results, you can tweak the clean() and cleanBody() methods at the bottom of the main tab to control how your text is filtered.

Hopefully I’ll still find the time to package this thing up in a bit more of a user-friendly form. But, in the meantime, hopefully people will find this useful as an exploratory tool. Note that at any time you can press the ‘s’ key to save out an image – if you find some interesting texts to compare, let me know!

Processing Tip: Rendering Large Amounts of Text… Fast!

I’ve been working on a project at the New York Times R&D Lab which involves rendering a fair amount of text to screen, which needs to be animated smoothly in real time. I’ve never had much luck working with large amounts of text in Processing, using the traditional approach. But I’ve found a nice way using an external library and a old trick, which allows for the rendering of large amounts of text at high framerates (the image above shows 1000+ text blocks running at ~50fps on my two-year-old MacBook).

How does it work?

Well, let’s first talk about how we’d typically approach this problem. Because they need to be animated independently, each of the text blocks is a member of a custom Class. Once per frame, we loop through all of the text objects, and ask them to render. In the old-school way, each of these text blocks would render using Processing’s text() method, and all of the shapes would be drawn to screen. While this works well with limited amounts of text, if we add more text objects to the sketch, things start to slow down at an alarming rate:

Here, I’ve barely added 200 objects, and the frame rate has crawled down to a measly 10fps. Clearly not good.

Blit it.

The first part of the solution is to use an old-school technique called ‘blitting’. Now, before all of you demo-scene kids get all worked up, I know this isn’t really blitting, but I like the sound of the word, and it’s my tutorial, so deal with it. Instead of drawing every letter of every text block every frame, what we’re going to do is this:

  1. Render the text once to an off-screen graphics object
  2. Transfer the pixels from this off-screen object to a PImage
  3. Render that PImage every frame

The core reason this works is that Processing is much better at rendering hundreds of images than it is at rendering hundreds of text objects. With this implemented, we can get many, many more objects onto our screen:

Here, we’ve got about 550 objects on the screen, at full framerate. But, at least on my Mac, this system seems to collapse after about 600 objects. So, how can we get up to the 1000 objects that I showed in the first image in this post?

Let’s push it a little.

The last step of this process owes its existence to Andres Colubri, who has built an excellent library called GLGraphics which lets us do all kinds of things in Processing that get accelerated by the super-fancy graphics cards (GPUs) which most of us are lucky enough to have in our computers (Andres is also responsible for all of the tasty Processing for Android 3D stuff which I’ll be posting about later this week). In this case, the final step involves rendering all of the text blocks as GL Textures, instead of as PImages (note that this will give you a performance boost in any case where you are using images – yay!)

Around now you’re probably asking about the code. Well, here it is. I’ve tried to make it as simple as possible – just remember that you need the GLGraphics library installed.

GL Text Example (4kb)

This solution isn’t perfect – it doesn’t allow for perfect scaling, or dynamic effects within the text objects. But is does solve the basic problem, and also gives us a set of GLTextures which are ready to be used in whatever wacky 3D system we want to put them in. Let me know how it works for you.

Processing & Android: Mobile App Development Made (Very) Easy

Those of you who have read this blog before will know that I do most of my work using Processing, an open-source programming language that was specifically created for artists and designers. It’s easy enough that you can learn the basics a day or two, but powerful enough to use for any number of applications, ranging from interactive installations to architecture to generative visual systems, and beyond. I’ve used it to visualize huge sets of data, to design a 1000 square-metre playground, and to create hundreds of small digital ‘sketches’. Over the last few months, Processing has gained yet another appealing attribute – you can now use it to quickly and easily create apps for mobile devices that use the Android operating system.

Creating apps for Android with processing is ridiculously easy. How easy? Let’s get a from-scratch Android app working on a device in 25 minutes (15 of those minutes will be spent installing software).

** You don’t need to have an Android device to do this tutorial, since we can see the app that we’ll build in a software emulator. But it’s much cooler if it’s on a device.

** Before you start on this tutorial, I’d recommend that you make sure you have a recent version of Java installed. Mac users can run Software Update – Windows folks should go here and download the latest Java version.

Step One: Install the Android SDK

I promise. This isn’t going to be one of those tutorials that is full of three-letter acronyms that you don’t understand. But before we get started building our app in Processing, we need to download some software that will allow us to author Android applications (Processing manages all of the tricky bits of this, and will eventually download and install the SDK for you as well – for now we have to do a tiny bit of manual labour). This bundle of software is called a ‘Software Development Kit’ and is fairly easy to get and install.

First, go to the URL below, and download the appropriate SDK for your operating system:

http://developer.android.com/sdk

This download should un-zip to a folder called something like ‘android-sdk-mac_x86′. Move that folder to a safe location on your machine , and then open the ‘tools’ directory inside of it. Double-click on the file named ‘android’ – on a Mac this will fire up Terminal and in turn will open the Android SDK and AVD manager, which looks something like this:

We’re going to use the manager to install the Android packages that we need to build our apps. Click on the left menu option ‘Available Packages’ and check the box that appears in the centre window. The manager will check with the Android repository and then list all of the packages and tools that are available for download:

We can install all of the packages – that way we know we’ll get everything we need. Click ‘Install Selected’. In the next window, click ‘Accept All’, and then ‘Install’. That’s it! We now have the Android SDK files that we need to work with Processing Android. Now we need to download the latest and greatest version of Processing.

Step Two: Get an Android-enabled Version of Processing

Existing Processing users (unless they’re a Processing fan(boy/girl)), will probably still have the same version of Processing that they installed when they first learned how to use the software. This will be a good chance, then, for all of you laggers-behind to get with the program. For those of you who don’t have Processing yet, here’s your chance to get in on the fun.

Now normally, as a sane person, I’d ask you to download the rock-solid, tested, official release from the main Processing.org downloads page. But we want to play with some new features here, so we’ll download the freshest, most exciting, and hopefully functional version of Processing from this secret-squirrel Processing Android Wiki Page:

http://wiki.processing.org/w/Android

Download and install whatever the latest version is for your operating system (at the time of writing, the version number was 0190). Once it’s installed, open up Processing, and let’s get Android-ing!

Step Three: Make an Android App

Before we do anything even somewhat complicated, let’s take the time to learn how Processing Android works. Let’s start with a really simple sketch, which draws a white square in the middle of an orange field:

void setup() {
   size(480,800);

   smooth();
   noStroke();
   fill(255);
   rectMode(CENTER);     //This sets all rectangles to draw from the center point
};

void draw() {
   background(#FF9900);
   rect(width/2, height/2, 150, 150);
};

If we press the run button (or hit Apple-R), Processing compiles our sketch to run as a temporary Java applet, which we see running in a separate window (go ahead and try this). This is the basic Processing behaviour. I have a few other options for compiling my little sketch. I could select ‘Sketch > Present’ from the menubar to present the sketch in fullscreen, I could select ‘File > Export’ to compile my sketch as a Java applet to display in the browser, or I could select ‘File > Export Application’ to produce a stand-alone file to launch like a ‘real’ application. These three basic options for compiling (run, present, export) have slightly different functions in Android Mode.

Wait. What? Android Mode? Here we discover the genius of our new, super Processing – we can build sketches the same way that we are normally used to, then switch into Android Mode to preview and publish our droidified sketches. Let’s do that with our simple rectangle sketch.

From the shiny new ‘Android’ menubar heading, select Android mode. Let’s first see what the sketch looks like in the emulator, by pressing the run button:

* The first time you try this you might get an error message telling you that the Android SDK hasn’t been installed. Press ‘yes’ and locate the Android SDK folder that you downloaded in Step One. Then run the sketch again.

** Be patient! The emulator can take a while to get started. You might have to run the sketch again one the desktop appears in the emulator.

Exciting, right? Right?

OK. Maybe not. It’s just a white square. But… let’s make it a white square on an Android Device!!

Step Four: Running the App on a Device

Very Important: To get the app running on your device, you’ll first have to make sure that USB debugging is turned on. You can do this from the Settings>Applications>development menu on your device. If you’re on Windows, you might have to do some other setup, too.

Connect your Android device, and then select ‘Sketch>Present’ from the menubar (or, press Shit-Apple-R).

Yahoo!

Now that we know that everything is working, let’s add a little bit of interactivity. We’ll make the box rotate as we slide our finger left to right, and make the background color change as we move our finger up and down. We’ll also add a little ball-and-stick indicator to show where the ‘mouse’ is:

/*

World's simplest Android App!
blprnt@blprnt.com
Sept 25, 2010

*/

//Build a container to hold the current rotation of the box
float boxRotation = 0;

void setup() {
  //Set the size of the screen (this is not really necessary in Android mode, but we'll do it anyway)
  size(480,800);
  //Turn on smoothing to make everything pretty.
  smooth();
  //Set the fill and stroke colour for the box and circle
  fill(255);
  stroke(255);
  //Tell the rectangles to draw from the center point (the default is the TL corner)
  rectMode(CENTER);

};

void draw() {
  //Set the background colour, which gets more red as we move our finger down the screen.
  background(mouseY * (255.0/800), 100, 0);
  //Chane our box rotation depending on how our finger has moved right-to-left
  boxRotation += (float) (pmouseX - mouseX)/100;

  //Draw the ball-and-stick
  line(width/2, height/2, mouseX, mouseY);
  ellipse(mouseX, mouseY, 40, 40);

  //Draw the box
  pushMatrix();
    translate(width/2, height/2);
    rotate(boxRotation);
    rect(0,0, 150, 150);
  popMatrix();
};

Again, we can use Present (Shift-Apple-R) to run the sketch on our device:

At this point, the world is our proverbial oyster. Those of you who have already worked with Processing are probably already loading your archived sketches onto your phone. If you don’t know much about Processing, you can learn a lot from the Processing website, from the excellent books that are available, and from doing some exploring on your own.

I am very (very) excited about Processing for Android. It enables students, artists, activists – everyone – to make and distribute their own mobile-based software. While we’ve kept things pretty simple in this tutorial, later in the week, I’ll be showing you some nifty Android-specific Processing features, and exploring how to integrate sketches with the device hardware (camera, accelerometer, etc.). We’ll also look at how to package up Processing sketches for distribution.

Stay tuned!

Some helpful tips when you’re working with Processing & Android:

- I know I’ve said this before, but be patient. Canceling a process (ie. the emulator load or a device compile) can cause problems. If you do this inadvertently, you’re best off restarting Processing.

- RTFW – make sure to check out the Processing Android Wiki, where you’ll find some troubleshooting advice, and some tips on how to get your sketches working properly on your device.

Wired UK, Barabási Lab and BIG data

Over the last year, I’ve produced five data-driven pieces for Wired UK. Four of them have been for the two-page infoporn spread that can be found in every issue. I’ve looked at the UK’s National DNA Database, used mined Twitter data to find people’s travel paths, and mapped traffic in some of the world’s busiest sea ports.

In the August issue, out on newsstands right now, I had a chance to work with some spectacular data and extremely talented people. The piece looks at a very, very big data set – cellular phone records from a pool of 10 million users in an anonymous European country. This data came (under a very strict layer of confidentiality) from Barabási Lab in Boston, where they have been using this information to find out some fascinating things about human mobility patterns.

In this post, I’ll walk through the process of creating this piece. Along the way, I’ll show some draft images and unused experiments that eventually evolved into the final project.

Working With Big Data

I can’t get into a lot of detail about the specifics of the data set, but needless to say, phone records for 10 million individuals take up a lot of space. All told, the data for this project consisted of more than 5.5GB of flattened text files. I should say, at this point, that I don’t work on a supercomputer – I churn out all of my work from an often overheated 2.33GHZ MacBook Pro. Since the deadline was reasonably tight on this project, I decided to rule out a distributed computing approach to get at all of this data, and instead chose to work with a subset of the full list of records. Working in Processing, I built a simple script that could filter out a smaller dataset from the complete files. I built several of these at varying file sizes, giving me a nice set of data to work with both in prototyping and in production stages. This is a strategy that I often employ, even with more minimal datasets – save the heavy lifting until the final render.

The first thing I did with the trimmed-down data was to construct ‘call histories’ for each user in the set. I rendered out these histories as stacked bars of individual calls, which could then be placed into a histogram. Here’s a graph of about 10,000 users, sorted by their total time spent on the phone :

Wired UK & Barabási Lab: Process

Here we see a very obvious power law distribution, with a few people talking a lot (really, a lot – 28.3 hours a week), and most callers talking relatively little (these is also a tail of text-only users at the very end). The problem here, of course, is that on a computer screen – or even in print – it’s hard to get into the data to learn anything useful. When I zoom into the graph, we can start to see the individual call histories (I’ve enlarged a few columns for detail). Here, long calls are rendered yellow, short calls are rendered red, and text messages are flat blue rectangles:

Wired UK & Barabási Lab: Process

I took the same graph as above, and added another set of columns extending below – here the white bars show us how many ‘friends’ the individual callers had – ie. how many people they are regularly talking to over the week:

Wired UK & Barabási Lab: Process

If I sort this graph by number of friends (rather than total call time), we can see that the two measures (talkativeness, and number of friends) don’t seem to be strongly correlated:

Wired UK & Barabási Lab: Process

It’s interesting to note here as well, that the data set includes linkage information – so I can also visualize who is calling who within our group of individuals:

Wired UK & Barabási Lab: Process

There is some interesting information to be dug up in here, but the long aspect of the graph and the general over-detail involved makes it not very usable – particularly for a magazine piece.

Ooh, and then Aaah.

The Infoporn section in Wired is a two page spread;  I always think of it as needing to serve two separate purposes for two different kinds of readers. First, it needs to be visually pleasing. I want people to say ‘Oooh…!’ when they turn the page to it. Once they’re hooked, though, I want them to learn something – the ‘Aaah!’ moment.

The data used in the graphs above seemed too complex to do anything truly revealing with – so perhaps it could be built into something sexy enough to draw an ‘Oooh!’ or two? In order to fit the long tails of these graphs onto the page, I wondered if I could add a bit of a curl to them. To make this structural change evident, I turned the graphs on a slight angle and rendered them in 3D. Here, we see five of these graphs, totaling about a million individual users, arranged into a single, tower-like shape:

Wired UK & Barabási Lab: Process

While these structures took a little while to render, I could quite easily generate a unique set of them, which I assembled as a line trailing off to the page edge on the left:

Wired UK & Barabási Lab: Process

Getting Personal

So far, the visuals for this project only tell a part of the story: that our individual calling habits fall into predictable patterns when placed with the larger whole (some excellent text from Michael Dumiak helps clarify this in the final piece). There’s another crucial piece, though. Cel phone usage data is inherently locative, since our provider always knows from which of their cel towers we are placing the call.

This is where the fun starts – we can use this locative data to track the mobility patterns of individual people (it’s worth saying here that all of the data the I worked with was anonymized). To do this, I created a tool (again, in Processing) to make ‘mobility cubes’ – which show a history of an individual’s movements over time:

Wired UK & Barabási Lab: Process

The individual above, for example, travels around an area less than a square kilometer over a period of just under three days. If I flatten this graph, we can see that this person travels mostly between two locations:

Wired UK & Barabási Lab: Process

From the data, we can identify a lot of individuals like this – commuters – who travel short distances between two places (home, and work). We can also find travelers (people who cover a long distance in a short period of time):

Wired UK & Barabási Lab: Process

And others who seem to follow more elaborate (but often still regular) mobility patterns:

Wired UK & Barabási Lab: Process

We can assemble a ‘mobility cube’ for each individual in the database – and very quickly gain a mechanism for recognizing patterns amongst these people:

Wired UK & Barabási Lab: Process

Which brings us to the underlying point of the piece – we are all leaving digital trails behind us, as we make our way around our individual lives. These trails are largely considered individual – even ethereal – yet technology is making these trails more visible and more readable everyday.

Of course, to see the final piece – the polished assembly of some of the drafts and artifacts you’ve seen in this post – you’ll have to buy the magazine. Wired UK is available on newsstands in the UK, and to all of our clever subscribers.

If you want to read more about this – and you should – I’d highly recommend Albert-László Barabási’s Bursts, which goes into much more detail about human mobility & predictability.

Finally, huge thanks have to go out to László and his team at the lab, without whom this piece would have never made it to print!

Your Random Numbers – Getting Started with Processing and Data Visualization

Over the last year or so, I’ve spent almost as much time thinking about how to teach data visualization as I’ve spent working with data. I’ve been a teacher for 10 years – for better or for worse this means that as I learn new techniques and concepts, I’m usually thinking about pedagogy at the same time. Lately, I’ve also become convinced that this massive ‘open data’ movement that we are currently in the midst of is sorely lacking in educational components. The amount of available data, I think, is quickly outpacing our ability to use it in useful and novel ways. How can basic data visualization techniques be taught in an easy, engaging manner?

This post, then, is a first sketch of what a lesson plan for teaching Processing and data visualization might look like. I’m going to start from scratch, work through some examples, and (hopefully) make some interesting stuff. One of the nice things, I think, about this process, is that we’re going to start with fresh, new data – I’m not sure what kind of things we’re going to find once we start to get our hands dirty. This is what is really exciting about data visualization; the chance to find answers to your own, possibly novel questions.

Let’s Start With the Data

We’re not going to work with an old, dusty data set here. Nor are we going to attempt to bash our heads against an unnecessarily complex pile of numbers. Instead, we’re going to start with a data set that I made up – with the help of a couple of hundred of my Twitter followers. Yesterday morning, I posted this request:

Even on a Saturday, a lot of helpful folks pitched in, and I ended up with about 225 numbers. And so, we have the easiest possible dataset to work with – a single list of whole numbers. I’m hoping that, as well as being simple, this dataset will turn out to be quite interesting – maybe telling us something about how the human brain thinks about numbers.

I wrote a quick Processing sketch to scrape out the numbers from the post, and then to put them into a Google Spreadsheet. You can see the whole dataset here: http://spreadsheets.google.com/pub?key=t6mq_WLV5c5uj6mUNSryBIA&output=html

I chose to start from a Google Spreadsheet in this tutorial, because I wanted people to be able to generate their own datasets to work with. Teachers – you can set up a spreadsheet of your own, and get your students to collect numbers by any means you’d like. The ‘User’ and ‘Tweet’ columns are not necessary; you just need to have a column called ‘Number’.

It’s about time to get down to some coding. The only tricky part in this whole process will be connecting to the Google Spreadsheet. Rather than bog down the tutorial with a lot of confusing semi-advanced code, I’ll let you download this sample sketch which has the Google Spreadsheet machinery in place.

Got it? Great. Open that sketch in Processing, and let’s get started. Just to make sure we’re all in the same place, you should see a screen that looks like this:

At the top of the sketch, you’ll see three String values that you can change. You’ll definitely have to enter your own Google username and password. If you have your own spreadsheet of number data, you can enter in the key for your spreadsheet as well. You can find the key right in the URL of any spreadsheet.

The first thing we’ll do is change the size of our sketch to give us some room to move, set the background color, and turn on smoothing to make things pretty. We do all of this in the setup enclosure:

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();
};

Now we need to get our data from the spreadsheet. One of the advantages of accessing the data from a shared remote file is that the remote data can change and we don’t have to worry about replacing files or changing our code.

We’re going to ask for a list of the ‘random’ numbers that are stored in the spreadsheet. The most easy way to store lists of things in Processing is in an Array. In this case, we’re looking for an array of whole numbers – integers. I’ve written a function that gets an integer array from Google – you can take a look at the code on the ‘GoogleCode’ tab if you’d like to see how that is done. What we need to know here is that this function – called getNumbers – will return, or send us back, a list of whole numbers. Let’s ask for that list:

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
};

OK.

World’s easiest data visualization!

 fill(255,40);
 noStroke();
 for (int i = 0; i < numbers.length; i++) {
   ellipse(numbers[i] * 8, width/2, 8,8);
 };

What this does is to draw a row of dots across the screen, one for each number that occurs in our Google list. The dots are drawn with a low alpha (40/255 or about 16%), so when numbers are picked more than once, they get brighter. The result is a strip of dots across the screen that looks like this:

Right away, we can see a couple of things about the distribution of our ‘random’ numbers. First, there are two or three very bright spots where numbers get picked several times. Also, there are some pretty evident gaps (one right in the middle) where certain numbers don’t get picked at all.

This could be normal though, right? To see if this distribution is typical, let’s draw a line of ‘real’ random numbers below our line, and see if we can notice a difference:

fill(255,40);
 noStroke();
 //Our line of Google numbers
 for (int i = 0; i < numbers.length; i++) {
   ellipse(numbers[i] * 8, height/2, 8,8);
 };
 //A line of random numbers
 for (int i = 0; i < numbers.length; i++) {
   ellipse(ceil(random(0,99)) * 8, height/2 + 20, 8,8);
 };

Now we see the two compared:

The bottom, random line doesn’t seem to have as many bright spots or as evident of gaps as our human-picked line. Still, the difference isn’t that evident. Can you tell right away which line is our line from the group below?

OK. I’ll admit it – I was hoping that the human-picked number set would be more obviously divergent from the sets of numbers that were generated by a computer. It’s possible that humans are better at picking random numbers than I had thought. Or, our sample set is too small to see any kind of real difference. It’s also possible that this quick visualization method isn’t doing the trick. Let’s stay on the track of number distribution for a few minutes and see if we can find out any more.

Our system of dots was easy, and readable, but not very useful for empirical comparisons. For the next step, let’s stick with the classics and

Build a bar graph.

Right now, we have a list of numbers. Ours range from 1-99, but let’s imagine for a second that we had a set of numbers that ranged from 0-10:

[5,8,5,2,4,1,6,3,9,0,1,3,5,7]

What we need to build a bar graph for these numbers is a list of counts – how many times each number occurs:

[1,2,1,2,1,3,1,1,1,1]

We can look at this list above, and see that there were two 1s, and three 5s.

Let’s do the same thing with our big list of numbers – we’re going to generate a list 99 numbers long that holds the counts for each of the possible numbers in our set. But, we’re going to be a bit smarter about it this time around and package our code into a function – so that we can use it again and again without having to re-write it. In this case the function will (eventually) draw a bar graph – so we’ll call it (cleverly) barGraph:

void barGraph( int[] nums ) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };
};

This function constructs an array of counts from whatever list of numbers we pass into it (that list is a list of integers, and we refer to it within the function as ‘nums’, a name which I made up). Now, let’s add the code to draw the graph (I’ve added another parameter to go along with the numbers – the y position of the graph):


void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

We’ve added a function – a set of instructions – to our file, which we can use to draw a bar graph from a set of numbers. To actually draw the graph, we need to call the function, which we can do in the setup enclosure. Here’s the code, all together:


/*

 #myrandomnumber Tutorial
 blprnt@blprnt.com
 April, 2010

 */

//This is the Google spreadsheet manager and the id of the spreadsheet that we want to populate, along with our Google username & password
SimpleSpreadsheetManager sm;
String sUrl = "t6mq_WLV5c5uj6mUNSryBIA";
String googleUser = "YOUR USERNAME";
String googlePass = "YOUR PASSWORD";

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 barGraph(numbers, 400);
};

void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

void draw() {
  //This code happens once every frame.
};

If you run your code, you should get a nice minimal bar graph which looks like this:

We can help distinguish the very high values (and the very low ones) by adding some color to the graph. In Processing’s standard RGB color mode, we can change one of our color channels (in this case, green) with our count values to give the bars some differentiation:


 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   fill(255, counts[i] * 30, 0);
   rect(i * 8, y, 8, -counts[i] * 10);
 };

Which gives us this:

Or, we could switch to Hue/Saturation/Brightness mode, and use our count values to cycle through the available hues:

//Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255);
   rect(i * 8, y, 8, -counts[i] * 10);
 };

Which gives us this graph:

Now would be a good time to do some comparisons to a real random sample again, to see if the new coloring makes a difference. Because we defined our bar graph instructions as a function, we can do this fairly easily (I built in an easy function to generate a random list of integers called getRandomNumbers – you can see the code on the ‘GoogleCode’ tab):

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 barGraph(numbers, 100);

 for (int i = 1; i < 7; i++) {
 int[] randoms = getRandomNumbers(225);
 barGraph(randoms, 100 + (i * 130));
 };
};

I know, I know. Bar graphs. Yay. Looking at the graphic above, though, we can see more clearly that our humanoid number set is unlike the machine-generated sets. However, I actually think that the color is more valuable than the dimensions of the bars. Since we’re dealing with 99 numbers, maybe we can display these colours in a grid and see if any patterns emerge? A really important thing to be able to do with data visualization is to

Look at datasets from multiple angles.

Let’s see if the grid gets us anywhere. Luckily, a function to make a grid is pretty much the same as the one to make a graph (I’m adding two more parameters – an x position for the grid, and a size for the individual blocks):

void colorGrid(int[] nums, float x, float y, float s) {
 //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 0; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

//Move the drawing coordinates to the x,y position specified in the parameters
 pushMatrix();
 translate(x,y);
 //Draw the grid
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255, counts[i] * 30);
   rect((i % 10) * s, floor(i/10) * s, s, s);

 };
 popMatrix();
};

We can now do this to draw a nice big grid:

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 colorGrid(numbers, 50, 50, 70);

I can see some definite patterns in this grid – so let’s bring the actual numbers back into play so that we can talk about what seems to be going on. Here’s the full code, one last time:


/*

 #myrandomnumber Tutorial
 blprnt@blprnt.com
 April, 2010

 */

//This is the Google spreadsheet manager and the id of the spreadsheet that we want to populate, along with our Google username & password
SimpleSpreadsheetManager sm;
String sUrl = "t6mq_WLV5c5uj6mUNSryBIA";
String googleUser = "YOUR USERNAME";
String googlePass = "YOUR PASSWORD";

//This is the font object
PFont label;

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Create the font object to make text with
 label = createFont("Helvetica", 24);

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 colorGrid(numbers, 50, 50, 70);
};

void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255);
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

void colorGrid(int[] nums, float x, float y, float s) {
 //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 0; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 pushMatrix();
 translate(x,y);
 //Draw the grid
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255, counts[i] * 30);
   textAlign(CENTER);
   textFont(label);
   textSize(s/2);
   text(i, (i % 10) * s, floor(i/10) * s);
 };
 popMatrix();
};

void draw() {
  //This code happens once every frame.

};

And, our nice looking number grid:

BINGO!

No, really. If this was a bingo card, and I was a 70-year old, I’d be rich. Look at that nice line going down the X7 column – 17, 27, 37, 47, 57, 67, 77, 87, and 97 are all appearing with good frequency. If we rule out the Douglas Adams effect on 42, it is likely that most of the top 10 most-frequently occurring numbers would have a 7 on the end. Do numbers ending with 7s ‘feel’ more random to us? Or is there something about the number 7 that we just plain like?

Contrasting to that, if I had played the x0 row, I’d be out of luck. It seems that numbers ending with a zero don’t feel very random to us at all. This could also explain the black hole around the number 50 – which, in a range from 0-100, appears to be the ‘least random’ of all.

Well, there we have it. A start-to finish example of how we can use Processing to visualize simple data, with a goal to expose underlying patterns and anomalies. The techniques that we used in this project were fairly simple – but they are useful tools that can be used in a huge variety of data situations (I use them myself, all the time).

Hopefully this tutorial is (was?) useful for some of you. And, if there are any teachers out there who would like to try this out with their classrooms, I’d love to hear how it goes.