I’ve been working on a project at the New York Times R&D Lab that involves rendering a fair amount of text to the screen and animating it smoothly in real time. I’ve never had much luck working with large amounts of text in Processing using the traditional approach. But I’ve found a nice way, using an external library and an old trick, to render large amounts of text at high framerates (the image above shows 1000+ text blocks running at ~50fps on my two-year-old MacBook).
How does it work?
Well, let’s first talk about how we’d typically approach this problem. Because they need to be animated independently, each of the text blocks is a member of a custom Class. Once per frame, we loop through all of the text objects, and ask them to render. In the old-school way, each of these text blocks would render using Processing’s text() method, and all of the shapes would be drawn to screen. While this works well with limited amounts of text, if we add more text objects to the sketch, things start to slow down at an alarming rate:
Here, I’ve barely added 200 objects, and the frame rate has crawled down to a measly 10fps. Clearly not good.
The first part of the solution is to use an old-school technique called ‘blitting’. Now, before all of you demo-scene kids get all worked up, I know this isn’t really blitting, but I like the sound of the word, and it’s my tutorial, so deal with it. Instead of drawing every letter of every text block every frame, what we’re going to do is this:
Render the text once to an off-screen graphics object
Transfer the pixels from this off-screen object to a PImage
Render that PImage every frame
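To make those three steps concrete, here is a minimal sketch of the idea in plain Java rather than a real Processing sketch. Everything in it is an assumption made for illustration: `TextBlock`, `rasterize()` and the `int[]` frame buffer are hypothetical names, with `rasterize()` standing in for the expensive text() call and the cached pixel array standing in for the PImage.

```java
import java.util.Arrays;

// Hypothetical sketch of the render-once, draw-many idea in plain Java.
// rasterize() stands in for an expensive text() call; the cached int[]
// stands in for the PImage. None of these names come from Processing.
class TextBlock {
    static int rasterizeCalls = 0;         // counts how often the slow path runs

    final String text;
    int x, y;                              // position, free to change every frame
    private int[] cached;                  // pixels rendered once, reused afterwards
    private int w;                         // width of the cached image in pixels
    private static final int H = 8;        // assumed glyph height in pixels
    private static final int GLYPH_W = 6;  // assumed glyph width in pixels

    TextBlock(String text, int x, int y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }

    // Steps 1 and 2: draw the text off-screen and keep the pixels.
    private int[] rasterize() {
        rasterizeCalls++;
        w = text.length() * GLYPH_W;
        int[] pixels = new int[w * H];
        for (int i = 0; i < pixels.length; i++) {
            // Fake "glyph" pattern derived from the character code.
            char c = text.charAt((i % w) / GLYPH_W);
            pixels[i] = 0xFF000000 | (c * 31 + i);
        }
        return pixels;
    }

    // Step 3: copy (blit) the cached pixels into the frame buffer.
    void draw(int[] frame, int frameW, int frameH) {
        if (cached == null) cached = rasterize();
        for (int row = 0; row < H; row++) {
            int fy = y + row;
            if (fy < 0 || fy >= frameH) continue;
            for (int col = 0; col < w; col++) {
                int fx = x + col;
                if (fx < 0 || fx >= frameW) continue;
                frame[fy * frameW + fx] = cached[row * w + col];
            }
        }
    }
}

class BlitSketch {
    // Draws every block for the given number of frames, moving each one.
    static int[] run(TextBlock[] blocks, int frames, int frameW, int frameH) {
        int[] frame = new int[frameW * frameH];
        for (int f = 0; f < frames; f++) {
            Arrays.fill(frame, 0);             // clear the screen
            for (TextBlock b : blocks) {
                b.x += 1;                      // animate each block independently
                b.draw(frame, frameW, frameH);
            }
        }
        return frame;
    }

    public static void main(String[] args) {
        TextBlock[] blocks = new TextBlock[200];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new TextBlock("block " + i, i % 50, (i * 3) % 100);
        }
        run(blocks, 60, 320, 240);
        System.out.println("rasterize calls: " + TextBlock.rasterizeCalls);
    }
}
```

However many frames are drawn, the slow path runs only once per block; every later frame is just a pixel copy.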
The core reason this works is that Processing is much better at rendering hundreds of images than it is at rendering hundreds of text objects. With this implemented, we can get many, many more objects onto our screen:
Here, we’ve got about 550 objects on the screen, at full framerate. But, at least on my Mac, this system seems to collapse after about 600 objects. So, how can we get up to the 1000 objects that I showed in the first image in this post?
Let’s push it a little.
The last step of this process owes its existence to Andres Colubri, who has built an excellent library called GLGraphics. It lets us do all kinds of things in Processing that get accelerated by the super-fancy graphics cards (GPUs) which most of us are lucky enough to have in our computers. (Andres is also responsible for all of the tasty Processing for Android 3D stuff which I’ll be posting about later this week.) In this case, the final step involves rendering all of the text blocks as GL Textures, instead of as PImages (note that this will give you a performance boost in any case where you are using images – yay!).
Around now you’re probably asking about the code. Well, here it is. I’ve tried to make it as simple as possible – just remember that you need the GLGraphics library installed.
This solution isn’t perfect – it doesn’t allow for perfect scaling, or dynamic effects within the text objects. But it does solve the basic problem, and also gives us a set of GLTextures which are ready to be used in whatever wacky 3D system we want to put them in. Let me know how it works for you.
I spent a little bit of time today working on my text comparison tool, which I built last weekend to satisfy my curiosity about the similarities and differences between two very similar articles published on head injuries in the NFL (you can read the post here). I wanted to test out the tool with a different kind of content, and settled on something more political: two high profile foreign policy speeches by US President Barack Obama.
The shared words – ‘america’, ‘world’, ‘common’, ‘human’, ‘responsibility’, etc. – don’t offer much in the way of analysis. Things start to get interesting, though, when we look towards the edges (click on the images to see larger versions):
At the far extremes, the speech in Cairo was about Islam, about Palestinians, about peace, faith, and communities. The Tokyo address was about China, North Korea, security, agreement and growth. If we look at some of the common words that were used in both speeches, we can see some more interesting patterns emerge.
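The idea behind this kind of comparison can be sketched as counting words in each text and scoring which side each word leans towards. The Java below is a minimal illustration under stated assumptions, not the tool’s actual code: the `WordCompare` class, its method names, the naive split on non-letters, and the -1 to +1 score are all hypothetical choices.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch: which words lean towards text A, which towards text B?
class WordCompare {
    // Counts lowercase words, splitting on anything that is not a-z.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!w.isEmpty()) counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    // Scores each word from -1 (only in A) through 0 (equally frequent)
    // to +1 (only in B), using frequencies relative to each text's length.
    static Map<String, Double> lean(String a, String b) {
        Map<String, Integer> ca = count(a);
        Map<String, Integer> cb = count(b);
        double totalA = Math.max(1, ca.values().stream().mapToInt(Integer::intValue).sum());
        double totalB = Math.max(1, cb.values().stream().mapToInt(Integer::intValue).sum());

        Set<String> words = new HashSet<>(ca.keySet());
        words.addAll(cb.keySet());

        Map<String, Double> scores = new TreeMap<>();
        for (String w : words) {
            double fa = ca.getOrDefault(w, 0) / totalA;
            double fb = cb.getOrDefault(w, 0) / totalB;
            scores.put(w, (fb - fa) / (fb + fa));
        }
        return scores;
    }

    public static void main(String[] args) {
        // Toy inputs, not the real speeches.
        String a = "peace faith peace world";
        String b = "security growth world world";
        lean(a, b).forEach((word, score) -> System.out.println(word + " " + score));
    }
}
```

Words scoring near -1 or +1 are the ones out at the edges; words near 0 are the shared vocabulary in the middle.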
It seems, for instance, that the Egyptian address was more about people, whereas the speech in Tokyo was directed towards nations:
Obama makes many more mentions of peace in Cairo (in Japan, this word seems to have been replaced by ‘security’), and far more mentions of prosperity in Tokyo:
There was a lot of speculation prior to Obama’s speech in Asia about how much focus the President would put on human rights. In the speech, Obama mentions ‘rights’ only five times – once at the beginning of the speech and four times near the end. This weighting is interesting when we compare it to Obama’s reference to China during the speech, which is heavily concentrated at the beginning (China is not mentioned at all past the half-way point of the speech):
Though one of the five occurrences of ‘rights’ is in reference to China, it appears from this analysis that there may have been a deliberate plan to keep the ‘human rights part’ of the speech separated from the ‘China part’.
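Where a word falls in a speech can be expressed as a fraction of the way through the text. Here is a minimal Java sketch of that measurement, with hypothetical names (`WordPositions`, `positions`, `onlyInFirstHalf`) and a naive split on non-letters as the assumed tokenisation; it is an illustration, not the tool’s code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: where in a text does a given word occur?
class WordPositions {
    // Returns the position of each occurrence of `word` as a fraction
    // between 0.0 (first word of the text) and 1.0 (last word).
    static List<Double> positions(String text, String word) {
        List<String> words = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) words.add(t);
        }
        String target = word.toLowerCase(Locale.ROOT);
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (words.get(i).equals(target)) {
                out.add(words.size() > 1 ? i / (double) (words.size() - 1) : 0.0);
            }
        }
        return out;
    }

    // True if no mention of `word` falls past the half-way point
    // (also true when the word never appears at all).
    static boolean onlyInFirstHalf(String text, String word) {
        for (double p : positions(text, word)) {
            if (p > 0.5) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Toy input, not the real speech.
        String text = "china trade china growth rights security rights";
        System.out.println("china:  " + positions(text, "china"));
        System.out.println("rights: " + positions(text, "rights"));
    }
}
```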
There are likely many interesting things in this data set, a lot of which are open to interpretation. While it’s doubtful that one can steer entirely clear of political biases during this kind of comparison, the quantitative nature of the data makes it a little bit easier to make an attempt at nonpartisan analysis. I will be including these speeches as sample texts when I release the tool to the public (hopefully next week).
Going against the grain of most of my usual blog content, this post is political, opinionated, and locally-focused. It is, however, also about data visualization, open government, and accountability. Consider yourself warned.
I live in British Columbia, where our ‘liberal’ government has announced plans to make staggering cuts to arts funding over the next year. I was interested in seeing how the cuts to arts funding stacked up against the rest of the spending reductions. I also wanted to get involved with some of the groups and organizations protesting these cuts, and wondered how I could offer the most useful assistance. I thought I’d start by taking a closer look at the budget figures that were released, to put the Arts & Culture issue in context.
To get at this information, I created a dataset from the September Budget Update. It took about an hour of cutting and pasting – BC isn’t exactly on the Open Government wagon yet, but at least all of the .PDFs were formatted the same way. This is DIY transparency – I’ll be posting the full data set as a Google Doc later today.
Once I had the data (stored in a tab-delimited text file for now), I built a lightweight visualization tool in Processing which let me organize and view the data in some useful ways (this took about 4 hours). The main question that I wanted to investigate was this: How do cuts to Arts & Culture funding stack up against cuts in other government business areas?
In the images above and below, we see the 114 items in the budget with expenditures of $1M or higher (bars represent money spent). Arts & Culture funding moves from the 57th highest expenditure at 19.5M in 2008/2009 to the 100th highest expenditure in 2009/2010 with less than 3.7M in funding. This is clearly a significant drop. Not only does Arts & Culture lose a lot of money, it loses much more money in comparison to other programs.
When the 114 expenditures are arranged to display gain (in blue) or loss (in red), the picture becomes even more clear (here, bars represent percentage loss or gain). With a loss of more than 80%, Arts & Culture suffers the second worst cuts – with the worst being another Arts & Culture-related line item!
Compared to other business areas with similar budgets, this decline is particularly drastic. For example, Asia Pacific Trade & Investment falls only 28% (from 16.179M to 11.593M) and Small Business, Research & Competitiveness falls only 21% (from 21.966M to 17.263M). Tourism, overseen by the same ministry as Arts & Culture, enjoys a rise in funding (due to the 2010 Olympics) of 12% (from 18.305M to 20.505M).
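The percentages here are plain year-over-year changes. A small Java sketch of the arithmetic, using figures quoted in this post (in millions of dollars), looks like this. The `BudgetChange` class, its method names, and the three-column tab-delimited row layout are assumptions made for illustration; they are not the format of the actual data file or the code of the visualization tool.

```java
import java.util.Locale;

// Hypothetical sketch of the percentage gain/loss calculation.
class BudgetChange {
    // Percentage change from one year to the next; negative means a cut.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    // Parses one tab-delimited row: name, 2008/2009 amount, 2009/2010 amount.
    // This column layout is an assumption, not the layout of the real file.
    static double changeForRow(String row) {
        String[] cols = row.split("\t");
        return percentChange(Double.parseDouble(cols[1]), Double.parseDouble(cols[2]));
    }

    public static void main(String[] args) {
        String[] rows = {
            "Arts & Culture\t19.5\t3.7",  // 3.7 is the upper bound quoted in the post
            "Small Business, Research & Competitiveness\t21.966\t17.263",
            "Tourism\t18.305\t20.505"
        };
        for (String row : rows) {
            String name = row.split("\t")[0];
            System.out.printf(Locale.ROOT, "%-45s %+.1f%%%n", name, changeForRow(row));
        }
    }
}
```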
Let’s look at the same set of graphics, this time limiting to expenditures between 10M and 30M (the visualization tool allows us to restrict the view to any monetary bracket). There are 40 expenditures in the 10-30M range, shown in the charts below. First, the 2008/2009 expenditures, with the bars representing money spent:
Now, the same 40 expenditures in 2009/2010:
And those line items showing the amount of loss or gain, with the bars this time representing percentage loss or gain (there will be scale lines in the final tool):
These graphs, even more than the first set, show that Arts & Culture has been singled out for much larger cuts than any other similar government business area. Why?
It may be that the Liberal government doesn’t consider a thriving arts & culture industry to be part of their plans for our province. By making such drastic cuts, they also appear to be ignoring studies (many of which they have referenced in their own documents) which demonstrate that investments in the arts tend to lead to an increase in GDP. Going further, some would suggest that this policy move is a purely political one – meant to curry favour with voters who are generally antagonistic towards anything ‘artsy’.
Mind you, we can’t rule out general fiscal incompetence.
When we visualize data we often get a chance to see patterns or anomalies that we might not otherwise notice. In the third image in this post, above, you may have seen the left-most blue bar, which actually stretches well past the top of the image. That same blue bar is shown, to scale, in the image above. The bar shows an increase in budgeted operating expenses for something called the RuralBC Secretariat, which apparently gets a one-year-only 693% increase in funding, from 4.154M in 2008/2009 to 32.951M in 2009/2010, nearly eight times the previous year’s figure (this increase is roughly 2x the cuts to arts & culture). This entry seems to have clerical error written all over it. The budget for the same department in 2011/12 is 3.951M, and in 2012/2013 is 2.951M. Do those numbers seem strange to you, too? Have a look at them together (from the service plan update):
Doesn’t it look like that ‘3’ (or 2) in 32,951 was added by accident? Whatever the source of this extra $30M in expenditures, it carries through in the main budget estimate document, and is figured into the main budget numbers that were announced to the press.
It’s entirely possible that this is a legitimate increase in expense. I could find no mention of the extra expense either in the service plan update or on the RuralBC Secretariat website, but it may be for some under-publicized rural Olympic-related initiative. [NOTE – please see the comments for some discussion about where this extra expense may have come from]
That said, I wouldn’t rule out the possibility that the BC Government can’t do simple math, and might have put an extra $30M into a line item where it didn’t belong. Oops.
Whether or not this anomaly turns out to be a mistake, the ease with which data can be gathered and analyzed by the public will hopefully make my government and others more accountable. This will be facilitated by large, organized open government movements (such as data.gov in the US) – as data is more freely available, these large projects offer more freedom to investigate and question the activities of our politicians. However, investigation and analysis can and will also happen on an individual level, by using tools like Processing or OpenFrameworks. Big brother may be watching – but we can watch them right back.
Finally, here is a video capture from the visualization tool in action, showing some of the interface and transitions between states. I plan on releasing a public version of the tool for online use before the end of the week. The graphics (click on each one to get to the Flickr page) are Creative Commons licensed and free for anyone to use. Please get in touch with me if you would like to get high-res versions for print, or would like to get access to the full data set.
The maps posted so far in the Flickr set are general ones – but these can also be generated for any refined keyword search. Similarly, while the current maps are for individual years, a map could be made for any given period of time (the Bush presidency, the Gulf War, September 2001), or indeed for the whole period of time available through the API (1981-present).
One of the best things about using Processing for these types of projects is that the final result can be output to different formats. These, for example, can be output very easily as .PDFs, and I do think they’d look particularly striking as wall-sized prints. 25 years * 10 feet… does anyone have a 250-foot long wall they can lend me?