Tag Archives: teaching

Your Random Numbers – Getting Started with Processing and Data Visualization

Over the last year or so, I’ve spent almost as much time thinking about how to teach data visualization as I’ve spent working with data. I’ve been a teacher for 10 years – for better or for worse this means that as I learn new techniques and concepts, I’m usually thinking about pedagogy at the same time. Lately, I’ve also become convinced that this massive ‘open data’ movement that we are currently in the midst of is sorely lacking in educational components. The amount of available data, I think, is quickly outpacing our ability to use it in useful and novel ways. How can basic data visualization techniques be taught in an easy, engaging manner?

This post, then, is a first sketch of what a lesson plan for teaching Processing and data visualization might look like. I’m going to start from scratch, work through some examples, and (hopefully) make some interesting stuff. One of the nice things, I think, about this process, is that we’re going to start with fresh, new data – I’m not sure what kind of things we’re going to find once we start to get our hands dirty. This is what is really exciting about data visualization; the chance to find answers to your own, possibly novel questions.

Let’s Start With the Data

We’re not going to work with an old, dusty data set here. Nor are we going to attempt to bash our heads against an unnecessarily complex pile of numbers. Instead, we’re going to start with a data set that I made up – with the help of a couple of hundred of my Twitter followers. Yesterday morning, I posted this request:

Even on a Saturday, a lot of helpful folks pitched in, and I ended up with about 225 numbers. And so, we have the easiest possible dataset to work with – a single list of whole numbers. I’m hoping that, as well as being simple, this dataset will turn out to be quite interesting – maybe telling us something about how the human brain thinks about numbers.

I wrote a quick Processing sketch to scrape out the numbers from the post, and then to put them into a Google Spreadsheet. You can see the whole dataset here: http://spreadsheets.google.com/pub?key=t6mq_WLV5c5uj6mUNSryBIA&output=html

I chose to start from a Google Spreadsheet in this tutorial, because I wanted people to be able to generate their own datasets to work with. Teachers – you can set up a spreadsheet of your own, and get your students to collect numbers by any means you’d like. The ‘User’ and ‘Tweet’ columns are not necessary; you just need to have a column called ‘Number’.

It’s about time to get down to some coding. The only tricky part in this whole process will be connecting to the Google Spreadsheet. Rather than bog down the tutorial with a lot of confusing semi-advanced code, I’ll let you download this sample sketch which has the Google Spreadsheet machinery in place.

Got it? Great. Open that sketch in Processing, and let’s get started. Just to make sure we’re all in the same place, you should see a screen that looks like this:

At the top of the sketch, you’ll see three String values that you can change. You’ll definitely have to enter your own Google username and password. If you have your own spreadsheet of number data, you can enter in the key for your spreadsheet as well. You can find the key right in the URL of any spreadsheet.

The first thing we’ll do is change the size of our sketch to give us some room to move, set the background color, and turn on smoothing to make things pretty. We do all of this in the setup enclosure:

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();
};

Now we need to get our data from the spreadsheet. One of the advantages of accessing the data from a shared remote file is that the remote data can change and we don’t have to worry about replacing files or changing our code.

We’re going to ask for a list of the ‘random’ numbers that are stored in the spreadsheet. The most easy way to store lists of things in Processing is in an Array. In this case, we’re looking for an array of whole numbers – integers. I’ve written a function that gets an integer array from Google – you can take a look at the code on the ‘GoogleCode’ tab if you’d like to see how that is done. What we need to know here is that this function – called getNumbers – will return, or send us back, a list of whole numbers. Let’s ask for that list:

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
};

OK.

World’s easiest data visualization!

 fill(255,40);
 noStroke();
 for (int i = 0; i < numbers.length; i++) {
   ellipse(numbers[i] * 8, width/2, 8,8);
 };

What this does is to draw a row of dots across the screen, one for each number that occurs in our Google list. The dots are drawn with a low alpha (40/255 or about 16%), so when numbers are picked more than once, they get brighter. The result is a strip of dots across the screen that looks like this:

Right away, we can see a couple of things about the distribution of our ‘random’ numbers. First, there are two or three very bright spots where numbers get picked several times. Also, there are some pretty evident gaps (one right in the middle) where certain numbers don’t get picked at all.

This could be normal though, right? To see if this distribution is typical, let’s draw a line of ‘real’ random numbers below our line, and see if we can notice a difference:

fill(255,40);
 noStroke();
 //Our line of Google numbers
 for (int i = 0; i < numbers.length; i++) {
   ellipse(numbers[i] * 8, height/2, 8,8);
 };
 //A line of random numbers
 for (int i = 0; i < numbers.length; i++) {
   ellipse(ceil(random(0,99)) * 8, height/2 + 20, 8,8);
 };

Now we see the two compared:

The bottom, random line doesn’t seem to have as many bright spots or as evident of gaps as our human-picked line. Still, the difference isn’t that evident. Can you tell right away which line is our line from the group below?

OK. I’ll admit it – I was hoping that the human-picked number set would be more obviously divergent from the sets of numbers that were generated by a computer. It’s possible that humans are better at picking random numbers than I had thought. Or, our sample set is too small to see any kind of real difference. It’s also possible that this quick visualization method isn’t doing the trick. Let’s stay on the track of number distribution for a few minutes and see if we can find out any more.

Our system of dots was easy, and readable, but not very useful for empirical comparisons. For the next step, let’s stick with the classics and

Build a bar graph.

Right now, we have a list of numbers. Ours range from 1-99, but let’s imagine for a second that we had a set of numbers that ranged from 0-10:

[5,8,5,2,4,1,6,3,9,0,1,3,5,7]

What we need to build a bar graph for these numbers is a list of counts – how many times each number occurs:

[1,2,1,2,1,3,1,1,1,1]

We can look at this list above, and see that there were two 1s, and three 5s.

Let’s do the same thing with our big list of numbers – we’re going to generate a list 99 numbers long that holds the counts for each of the possible numbers in our set. But, we’re going to be a bit smarter about it this time around and package our code into a function – so that we can use it again and again without having to re-write it. In this case the function will (eventually) draw a bar graph – so we’ll call it (cleverly) barGraph:

void barGraph( int[] nums ) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };
};

This function constructs an array of counts from whatever list of numbers we pass into it (that list is a list of integers, and we refer to it within the function as ‘nums’, a name which I made up). Now, let’s add the code to draw the graph (I’ve added another parameter to go along with the numbers – the y position of the graph):


void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

We’ve added a function – a set of instructions – to our file, which we can use to draw a bar graph from a set of numbers. To actually draw the graph, we need to call the function, which we can do in the setup enclosure. Here’s the code, all together:


/*

 #myrandomnumber Tutorial
 blprnt@blprnt.com
 April, 2010

 */

//This is the Google spreadsheet manager and the id of the spreadsheet that we want to populate, along with our Google username & password
SimpleSpreadsheetManager sm;
String sUrl = "t6mq_WLV5c5uj6mUNSryBIA";
String googleUser = "YOUR USERNAME";
String googlePass = "YOUR PASSWORD";

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 barGraph(numbers, 400);
};

void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

void draw() {
  //This code happens once every frame.
};

If you run your code, you should get a nice minimal bar graph which looks like this:

We can help distinguish the very high values (and the very low ones) by adding some color to the graph. In Processing’s standard RGB color mode, we can change one of our color channels (in this case, green) with our count values to give the bars some differentiation:


 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   fill(255, counts[i] * 30, 0);
   rect(i * 8, y, 8, -counts[i] * 10);
 };

Which gives us this:

Or, we could switch to Hue/Saturation/Brightness mode, and use our count values to cycle through the available hues:

//Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255);
   rect(i * 8, y, 8, -counts[i] * 10);
 };

Which gives us this graph:

Now would be a good time to do some comparisons to a real random sample again, to see if the new coloring makes a difference. Because we defined our bar graph instructions as a function, we can do this fairly easily (I built in an easy function to generate a random list of integers called getRandomNumbers – you can see the code on the ‘GoogleCode’ tab):

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 barGraph(numbers, 100);

 for (int i = 1; i < 7; i++) {
 int[] randoms = getRandomNumbers(225);
 barGraph(randoms, 100 + (i * 130));
 };
};

I know, I know. Bar graphs. Yay. Looking at the graphic above, though, we can see more clearly that our humanoid number set is unlike the machine-generated sets. However, I actually think that the color is more valuable than the dimensions of the bars. Since we’re dealing with 99 numbers, maybe we can display these colours in a grid and see if any patterns emerge? A really important thing to be able to do with data visualization is to

Look at datasets from multiple angles.

Let’s see if the grid gets us anywhere. Luckily, a function to make a grid is pretty much the same as the one to make a graph (I’m adding two more parameters – an x position for the grid, and a size for the individual blocks):

void colorGrid(int[] nums, float x, float y, float s) {
 //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 0; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

//Move the drawing coordinates to the x,y position specified in the parameters
 pushMatrix();
 translate(x,y);
 //Draw the grid
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255, counts[i] * 30);
   rect((i % 10) * s, floor(i/10) * s, s, s);

 };
 popMatrix();
};

We can now do this to draw a nice big grid:

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 colorGrid(numbers, 50, 50, 70);

I can see some definite patterns in this grid – so let’s bring the actual numbers back into play so that we can talk about what seems to be going on. Here’s the full code, one last time:


/*

 #myrandomnumber Tutorial
 blprnt@blprnt.com
 April, 2010

 */

//This is the Google spreadsheet manager and the id of the spreadsheet that we want to populate, along with our Google username & password
SimpleSpreadsheetManager sm;
String sUrl = "t6mq_WLV5c5uj6mUNSryBIA";
String googleUser = "YOUR USERNAME";
String googlePass = "YOUR PASSWORD";

//This is the font object
PFont label;

void setup() {
  //This code happens once, right when our sketch is launched
 size(800,800);
 background(0);
 smooth();

 //Create the font object to make text with
 label = createFont("Helvetica", 24);

 //Ask for the list of numbers
 int[] numbers = getNumbers();
 //Draw the graph
 colorGrid(numbers, 50, 50, 70);
};

void barGraph(int[] nums, float y) {
  //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 1; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 //Draw the bar graph
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255);
   rect(i * 8, y, 8, -counts[i] * 10);
 };
};

void colorGrid(int[] nums, float x, float y, float s) {
 //Make a list of number counts
 int[] counts = new int[100];
 //Fill it with zeros
 for (int i = 0; i < 100; i++) {
   counts[i] = 0;
 };
 //Tally the counts
 for (int i = 0; i < nums.length; i++) {
   counts[nums[i]] ++;
 };

 pushMatrix();
 translate(x,y);
 //Draw the grid
 for (int i = 0; i < counts.length; i++) {
   colorMode(HSB);
   fill(counts[i] * 30, 255, 255, counts[i] * 30);
   textAlign(CENTER);
   textFont(label);
   textSize(s/2);
   text(i, (i % 10) * s, floor(i/10) * s);
 };
 popMatrix();
};

void draw() {
  //This code happens once every frame.

};

And, our nice looking number grid:

BINGO!

No, really. If this was a bingo card, and I was a 70-year old, I’d be rich. Look at that nice line going down the X7 column – 17, 27, 37, 47, 57, 67, 77, 87, and 97 are all appearing with good frequency. If we rule out the Douglas Adams effect on 42, it is likely that most of the top 10 most-frequently occurring numbers would have a 7 on the end. Do numbers ending with 7s ‘feel’ more random to us? Or is there something about the number 7 that we just plain like?

Contrasting to that, if I had played the x0 row, I’d be out of luck. It seems that numbers ending with a zero don’t feel very random to us at all. This could also explain the black hole around the number 50 – which, in a range from 0-100, appears to be the ‘least random’ of all.

Well, there we have it. A start-to finish example of how we can use Processing to visualize simple data, with a goal to expose underlying patterns and anomalies. The techniques that we used in this project were fairly simple – but they are useful tools that can be used in a huge variety of data situations (I use them myself, all the time).

Hopefully this tutorial is (was?) useful for some of you. And, if there are any teachers out there who would like to try this out with their classrooms, I’d love to hear how it goes.

Fall Processing Workshops – Sept. 12 & Oct. 3

Circle patterns
Circle Patterns, by Alex Beim. See more of Alex’s Processing experiments in this Flickr set.

I was as surprised as anyone to look at the calendar yesterday and see that we’ve left summer behind. It’s September, the days are getting shorter, school is back in session, and soon the shorts will go back into the drawer. This (former) summer saw my first experiments with holding day-long Processing workshops in my house/church/studio. I’m happy to say that they were a success. We managed to cover a lot of ground in a short period of time, and I like to think that everyone who attended left with some excitement about creating with code. I’ve been very happy to see a lot attendees continuing their work with Processing. You can see one example from Alex Beim at the top of this post, and a few more from Dave Shea below.

I’ve decided to carry on with the workshops into the Fall. The next Introduction to Processing workshop will be held on Saturday, September 12th. The next Processing for Web Developers workshops (or Processing Level 2) will be held on Saturday, October 3rd.


A collection of some of Dave Shea‘s processing sketches. Dave is documenting his experiences with Processing on his blog, Ex Nihilo.

Worth noting is the discount system – if you are a student, you get 10% off. If you refer someone else, you get 10% off (refer 2, get 20%). If you’ve already taken a workshop with me, you get another 10% off. I feel like I should make an infomercial!

Space is limited for both workshops, so please get in touch with me if you’d like to reserve a spot. For more details, please check out the workshops page.

Summer Creative Coding Workshops in Vancouver: Processing, Processing, Processing!

For years I have been teaching programming courses of various varieties at Langara College, as part of their Electronic Media Design program. The course I teach is great – I get to introduce programming to groups of designers over 14 weeks, starting with Processing and moving into ActionScript. The problem is, this course is only available as part of the full-time EMD program. I’ve always wanted to teach creative coding & Processing to a wider audience in Vancouver, but have never really had the venue.

Main room, 2005, looking down toward altar

Now, I do. Our studio and living space is a renovated mission church in Vancouver’s historic Strathcona neighbourhood. I think it will be the perfect place to teach to small groups, so I am going to do just that! As a test run this summer, I’ll be running a series of one-day workshops about Processing. If you have every wanted to get into creative coding, this is a perfect opportunity for you to learn how to use a computer program to build visual and multi-media projects. If you are already a programmer, but want to learn Processing as a prototyping tool, for tangible computing projects, or for data visualizaiton, we have workshops for you, as well.

Here is the schedule for the first few workshops:

  • Introduction to Processing – Saturday May 2nd, Saturday June 27th
  • Processing for Web Programmers – Sunday May 3rd, Sunday June 28th
  • Processing & Data Visualization – Dates TBA

Processing is an electronic sketchbook for developing ideas. Since its simple beginnings at the MIT Media Lab, it has emerged as an invaluable tool for media artists, designers, and programmers around the world. Processing is a friendly language for beginners – and at the same time a powerful tool for coders of all levels.

For prices, dates, and other information, or to book your place in a session, read the workshops page or get in touch with me.