Category Archives: twitter

GoodMorning!

GoodMorning! Pinheads

GoodMorning! is a Twitter visualization tool that shows about 11,000 ‘good morning’ tweets over a 24 hour period, rendering a simple sample of Twitter activity around the globe. The tweets are colour-coded: green blocks are early tweets, orange ones are around 9am, and red tweets are later in the morning. Black blocks are ‘out of time’ tweets which said good morning (or a non-english equivalent) at a strange time in the day. Click on the image above to see North America in detail (click on the ‘all sizes’ button to see the high-res version, which is 6400×3600), or watch the video below to see the ‘good morning’ wave travel around the globe:

GoodMorning! Full Render #2 from blprnt on Vimeo.

I’ll admit that this isn’t a particularly useful visualization. It began as a quick idea that emerged out of the discussions following my post about Just Landed, in which several commenters asked to see a global version of that project. This would have been reasonably easy, but I felt that the 2D map made the important information in that visualization a bit easier to see. I wondered what type of data would be interesting to see on a globe, and started to think about something time based; more specifically, something in which you might see a kind of a wave traveling around the earth. I have been neck-deep in the Twitter API for a couple of months now, and eventually the idea trickled up to look at ‘good morning’ tweets.

The first task was to gather 24 hours worth of good morning tweets. Querying the Twitter API is easy enough – I posted a simple tutorial about doing this with Processing and Twitter4J a couple of weeks ago. The issue with gathering this many tweets is that you can only get the most recent 1,500 tweets with any search request. I needed many more than that – there are about 11,000 in the video, and those were only the ones from users with a valid location in their Twitter profile. All told, I ended up receiving upwards of 50,000 tweets. The only way to get this many results was to leave a ‘gathering’ client running for 24 hours. I should have put this on a server somewhere, but I didn’t, so my laptop needed to run the client for a full day to get the results. It ended up taking 5 days – a few false starts (the initial scripts choked on some strange iPhone locations), along with a couple of bone-headed errors on my part. Finally, I ended up with a JSON file which held the messages, users, dates and locations for a day’s worth of morning greetings.

I’d already decided to embrace the visual cliche that is the spinning globe. It’s reasonably easy to place points on a sphere once you know the latitude and longitude values – the first thing I did was to place all 11,000 points on the globe to see what kind of a distribution I ended up with. Not surprisingly, the points don’t cover the whole globe. I tried to include some non-english languages to encourage a more even distribution, but I don’t think I did the best job that I could have (if you have ideas for what to search for for other languages – particularly Asian ones, please leave a note in the comments). Still, I thought that there should be enough to get my ‘good morning wave’ going.

In my first attempts, I coloured the tweet blocks according to the size of the tweet. In these versions, you can see the wave, but it’s not very distinct:

GoodMorning! First Render from blprnt on Vimeo.

I needed some kind of colouring that would show the front of the wave – I ended up setting the colour of the blocks according to how far the time block was away from 9am, local time. This gave me the colour spectrum that you see in the latest versions.

Originally, I wanted to include the text in the tweets, but after a few very messy renders, I dropped that idea. I still think it might be possible to incorporate text in some way that doesn’t look like a pile of pick-up-sticks – I just haven’t found it yet. Here’s a render from the text-mess:

GoodMorning!

There are some inherent problems with this visualization. As mentioned earlier, it’s certainly not a complete representation of global Twitter users. Also, I’m relying on the location that the user lists in their Twitter profile to plot the points on the globe. It’s very likely that a high proportion of these locations could be inaccurate. Even if the location is correct, it might not be accurate enough to be useful. If you look at the bottom right of the images above, you’ll see a big plume of blocks in the middle of South America. This isn’t some undiscovered Twitter city in the middle of the jungle (El Tworado? Sorry.) – it’s the result of many users listing their location as simply ‘South America’. There’s one of these in every country and continent (this explains the cluster of Canadians tweeting from somewhere near Baker Lake).

On the other hand, it provides a model for how similar visualizations might be made – propagation maps of trending topics, plotting of followers over time, etc. Even in its current form, the tool does provide some interesting data – for example it seems that East Coasters tweet earlier than West Coaters (there’s more green in the East than in the West). I’m guessing that in the hands of people with more than my rudimentary statistics skills, these kinds of data sets could tell us some interesting – and (heaven forbid) useful things.

Too much information? Twitter, Privacy, and Lawrence Waterhouse

A few months ago, Twitter was in the news over concerns that people might be sharing a little bit too much information in their social networking broadcasts. Arizona resident Israel Hyman left for a vacation – but not before sending out a Tweet telling his friends and followers that he’d be out of town. While he was away, Hyman’s house was burgled. He suspected that his status update could have motivated the thieves to give his house a try.

Personally, I’m not convinced there are too many Twitter-saavy burglars (TweetBurglars?) out there. It’s probably much more likely that the would-be thief saw Mr. Hyman drive off in his station wagon with a canoe on top. In any case, I should be safe: I don’t broadcast much personal information in my Twitter feed.

At least not too much implicit personal information.

How much ‘hidden’ information am I sharing my Tweets? This is a question that has come up recently as I’ve been digging around with the Twitter API. I think that curious parties, armed with the Twitter API and some rudimentary programming skills might be able to find out more about you than you might think.

Here is a simple graphic showing all of my 1,212 Tweets since October of last year:

Twittergrams: All Tweets

The graph is ordered by day horizontally and by time vertically – tweets near the top are close to midnight, while those in the lower half were broadcast in the morning. I’ve also indicated the length of the tweet with the size of the circles – longer tweets show up larger and darker. You can see some trends from this really simple graph. First, it’s clear that my tweets have gotten longer and more frequent since I started using Twitter. Also, my first Tweets of the day (the bottom-most ones) seem to start on average at the same time – which a few notable anomalies. To investigate this a bit further, let’s highlight just the first tweets of the day (I’ve ignored days with 3 or fewer tweets):

Twittergrams: Morning Tweets

You can see from this plot that my morning messages tend to fall around 9:30am. There are a few outliers where I (heaven forbid) haven’t been around the computer – but there are also some deviations from the 9:30 norm that aren’t just statistical anomalies. If we plot some lines through the morning points starting in January this year, we’ll see the three areas where my twitter behaviour is not ‘normal’:

Twittergrams: Morning Tweet Graph

The yellow line in this graph is the 9:30 mean. The red line shows (with a bit of cushioning) the progression of  first tweet times over the 8 months in question. At the marked points, my first tweets deviate from the average. Why? In two of the tree cases, I’ve changed time zones. In March, I was in Munich, and in May/June I was in Boston, New York, and Minneapolis (you can actually see the time shift between EST & CST). In the zone marked with a 1, I was commuting in the morning to a residency in a Vancouver suburb – hence the later starts until February.

This is very simple example – certainly there isn’t a lot of useful (or incriminating) information to be found. But in the hands of a more capable investigator, it’s possible that the information underneath all of the Tweets, Facebook updates, Flickr comments, etc. that I am broadcasting everyday could reveal a lot more that I would want to share. Some nefarious party could quite easily set up a ‘tracker’ to watch my public broadcast, and to be notified if my daily behaviour deviates from the norm. TweetBurglars, are you listening?

Of course, all of this information can be useful for the good guys, too. With millions of people active on Twitter, the store of data – and what it can reveal – gets more and more interesting every day. We are already seeing scientists using web data to measure public happiness, but I think we have just scraped the surface of what could be uncovered. (To see a model of how Twitter updates could be used to track travel and disease spread, see my post on Just Landed).

In Neal Stephenson’s Cryptonomicon, he decribes a scenario in which an accurate map of London might be generated by tracking the heights of people’s heads as they stepped on and off of curves:

The curbs are sharp and perpendicular, not like the American smoothly molded sigmoid-cross-section curves. The transition between the side walk and the street is a crisp vertical. If you put a green lightbulb on Waterhouse’s head and watched him from the side during the blackout, his trajectory would look just like a square wave traced out on the face of a single-beam oscilloscope: up, down, up, down. If he were doing this at home, the curbs would be evenly spaced, about twelve to the mile, because his home town is neatly laid out on a grid.

Here in London, the street pattern is irregular and so the transitions in the square wave come at random-seeming times, sometimes very close together, sometimes very far apart.

A scientist watching the wave would probably despair of finding any pattern; it would look like a random circuit, driven by noise, triggered perhaps by the arrival of cosmic rays from deep space, or the decay of radioactive isotopes.

But if he had depth and ingenuity, it would be a different matter.

Depth could be obtained by putting a green light bulb on the head of every person in London and then recording their tracings for a few nights. The result would be a thick pile of graph-paper tracings, each one as seemingly random as the others. The thicker the pile, the greater the depth.

Ingenuity is a completely different matter. There is no systematic way to get it. One person could look at the pile of square wave tracings and see nothing but noise. Another might find a source of fascination there, an irrational feeling impossible to explain to anyone who did not share it. Some deep part of the mind, adept at noticing patterns (or the existence of a pattern) would stir awake and frantically signal the dull quotidian parts of the brain to keep looking at the pile of graph paper. The signal is dim and not always heeded, but it would instruct the recipient to stand there for days if necessary, shuffling through the pile of graphs like an autist, spreading them out over a large floor, stacking them in piles according to some inscrutable system, pencilling numbers, and letters from dead alphabets, into the corners, cross-referencing them, finding patterns, cross-checking them against others.

One day this person would walk out of that room carrying a highly accurate street map of London, reconstructed from the information in all of those square wave plots.

Stephenson tells us that success in such an endeavour requires depth and ingenuity. Depth, the internet has in spades. Millions of Twitter users are adding to a public dataset every second of every day. Ingenuity may prove a bit tougher, but with open APIs and thousands of clever curious investigators, it will be interesting to see what kinds of maps will be made – and to what means they will be used.

- I’ll be following up this post over the weekend with a preview of a new Twitter visualization that I have been working on – stay tuned!