Tweetcatcha uses the New York Times Timeswire API to load the news from the last 24 hours. We use the title and URL of each article on nytimes.com to search Twitter for related tweets. There is a lot of data, so please be patient with the load time. Searching Twitter by URL was made much easier by BackTweets, a service of BackType. I wrote an AS3 class to wrap the BackTweets API; more information is in this blog post. The tweets are arranged around the center based on the time between the article's posting and the tweet's creation. If a tweet was posted less than an hour after the article, it sits very close to the innermost ring; if it was posted 20 or more hours later, it sits closer to the last ring (there are 24 rings, one for each hour in the day). Bruce Drummond and I collaborated on this project.
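The ring placement described above can be sketched roughly as follows. This is an illustration in Python, not the project's actual AS3 code: the names `ring_index` and `ring_radius`, the pixel values, and the clamping of tweets outside the 24-hour window are all assumptions made for the example.

```python
from datetime import datetime, timedelta

RING_COUNT = 24  # one ring per hour, as described above


def ring_index(article_time: datetime, tweet_time: datetime) -> int:
    """Map a tweet to a ring: 0 is the innermost, RING_COUNT - 1 the outermost."""
    hours_after = (tweet_time - article_time).total_seconds() / 3600
    # Assumption: tweets dated before the article land on ring 0, and
    # tweets more than 24 hours later land on the last ring.
    return max(0, min(RING_COUNT - 1, int(hours_after)))


def ring_radius(index: int, inner_radius: float = 50.0,
                ring_spacing: float = 12.0) -> float:
    """Distance from the center for a ring (hypothetical pixel values)."""
    return inner_radius + index * ring_spacing
```

For example, a tweet posted 30 minutes after its article gets `ring_index` 0 and is drawn at the innermost radius, while one posted 20 hours and 15 minutes later gets index 20, close to the outer edge.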
We began collecting data on November 13, 2009 and continued until February 9, 2010. We set up a cron job on one of our computers to pull and store the data locally. Over this time period, the database grew to 107 MB, with 15,327 NYTimes articles and 311,885 tweets related to those articles. That is a lot of data, so please be patient if it takes a while to load!