
Highway to the Danger Zone

WordPlay: A Tool for Freeform Language Exploration

When text becomes data, it opens up a phenomenal amount of possibility for insight and creative exploration. The problem is that most Natural Language Processing (NLP) tools are hard to use unless you have a good foundation in programming to begin with. We use a lot of NLP in our work at The Office for Creative Research, and I’ve often wondered what it would mean to make a language tool designed for open-ended exploration. I’ve also thought a lot about the role a tool like this could play in the classroom. English teachers could use it to explore both basic language (what is an adverb?) and more complex questions (how did Shakespeare use hyphenates?). History classrooms could examine patterns in language over time (how does Twitter compare to Moby Dick?). Because the tool would be open source and would have an API, Computer Science instructors could use it as a basis to teach about computation and language.

With all of this in mind I built WordPlay, a freeform NLP tool that lets you search across various bodies of text without having to know how to code or to learn any kind of strange syntax.

It’s really easy to use. Whatever you type into the query box will be used as a pattern to search across the current corpus. For example, we might search for text similar to ‘a terrible fate’ across Shakespeare’s collected works:

[Screenshot: WordPlay results for ‘a terrible fate’ in Shakespeare’s collected works]

Here is the same search, performed across the Bible:

[Screenshot: WordPlay results for ‘a terrible fate’ in the Bible]

And within Sigmund Freud’s A General Introduction to Psychoanalysis:

[Screenshot: WordPlay results for ‘a terrible fate’ in A General Introduction to Psychoanalysis]

The system works by trying to find a match to the query text in two different ways:

First, it tries to match parts of speech. ‘A terrible fate’ contains a determiner, followed by an adjective, followed by a noun, so the best results will also have these parts of speech, in the same order.

Next, it attempts to match how the query sounds. Specifically, it considers the stress pattern of the word or phrase: in the case of ‘a terrible fate’ it’ll look for other phrases with the same syllable counts (1-3-1) and the same stress pattern.
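For the curious, here is a rough Python sketch of how those two matching strategies could fit together. It is not WordPlay’s actual code. The tiny hand-written `LEXICON`, the `WordInfo` type, and the `signature`, `score`, and `rank` functions are all assumptions made up for illustration, as is the equal weighting of the two signals. A real tool would presumably get its tags from a part-of-speech tagger and its stresses from a pronouncing dictionary rather than from a lookup table like this one.

```python
from typing import NamedTuple


class WordInfo(NamedTuple):
    pos: str     # coarse part-of-speech tag
    stress: str  # one digit per syllable: 1 = stressed, 0 = unstressed


# Toy, hand-written lexicon (hypothetical; for illustration only).
LEXICON = {
    "a": WordInfo("DET", "0"),
    "the": WordInfo("DET", "0"),
    "terrible": WordInfo("ADJ", "100"),
    "dangerous": WordInfo("ADJ", "100"),
    "gentle": WordInfo("ADJ", "10"),
    "fate": WordInfo("NOUN", "1"),
    "night": WordInfo("NOUN", "1"),
    "zone": WordInfo("NOUN", "1"),
    "runs": WordInfo("VERB", "1"),
}


def signature(phrase):
    """Return (POS tags, per-word stress strings) for a phrase."""
    infos = [LEXICON[word] for word in phrase.lower().split()]
    return [i.pos for i in infos], [i.stress for i in infos]


def score(query, candidate):
    """Fraction of positions where POS and stress agree (0.0 to 1.0)."""
    q_pos, q_stress = signature(query)
    c_pos, c_stress = signature(candidate)
    if len(q_pos) != len(c_pos):
        return 0.0  # this sketch only compares phrases of equal length
    pos_hits = sum(a == b for a, b in zip(q_pos, c_pos))
    # Equal stress strings imply equal syllable counts as well.
    stress_hits = sum(a == b for a, b in zip(q_stress, c_stress))
    return (pos_hits + stress_hits) / (2 * len(q_pos))


def rank(query, candidates):
    """Order candidates from best match to worst."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)
```

With this toy lexicon, `rank("a terrible fate", ["a night runs", "the gentle night", "the dangerous zone"])` puts ‘the dangerous zone’ first, since it matches both the determiner-adjective-noun order and the 1-3-1 stress pattern. ‘The gentle night’ comes second because it has the right parts of speech but a two-syllable adjective.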

The result of these two strategies is that the top search results in WordPlay should really sound like the query. And it works: go ahead and sing the results to the Shakespearean Danger Zone (Kenny Loggins, eat your heart out!).

WordPlay is meant to be a simple tool, so I’ll end the explanation here. Go play with it.