Mining the Spirit of the Times – The Zeitgeist and The Buzz

I got a bit excited when I found out about Yahoo Buzz (http://buzz.yahoo.com/) and Google Zeitgeist (“Search patterns, trends, and surprises according to Google”, http://www.google.com/press/zeitgeist.html), which also comes in a country-by-country version at http://www.google.com/press/intl-zeitgeist.html .

Zeitgeist == the general intellectual, moral, and cultural climate of an era.

With so much going on around the web, wouldn’t it be nice if we could tap into what people are e-talking about, the sorts of searches that are happening … and maybe do some multivariate statistical analysis and pattern recognition that would give us an insight into “the spirit of the times”?

Well, it is technically feasible to write software to grab this data day by day (some historical data is also available) from the search engine query summaries (Buzz and Zeitgeist, above), shove it into a dataset and do some creative analysis, even though the data is messy. But is the data good enough to make the effort worthwhile?

I can’t decide. I guess if I did not have other things to do I would certainly start collecting the data on an ongoing basis (build some so-called “screen scraping” software and store the results in a flexible database, pending analysis) just to keep a watching brief, and maybe to get some ideas started.
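To make the collection idea a bit more concrete, here is a minimal sketch in Python of the "scrape and store" step. It is only an illustration under loud assumptions: I am pretending the summary page shows its ranking as plain list items, which the real Buzz and Zeitgeist pages may well not do, and the table layout and the names RankedListParser, parse_ranked_terms and store_snapshot are all made up for this example. Fetching the page each day is left out; the sketch starts from HTML you already have in hand.

```python
import sqlite3
from html.parser import HTMLParser


class RankedListParser(HTMLParser):
    """Collect the text of each <li> element, in page order.

    Assumption: the ranking appears as list items. The real pages
    may use different markup and would need a different parser.
    """

    def __init__(self):
        super().__init__()
        self.terms = []
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_item = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            text = "".join(self._buffer).strip()
            if text:
                self.terms.append(text)
            self._in_item = False


def parse_ranked_terms(html):
    """Return the list-item texts from an HTML string, top rank first."""
    parser = RankedListParser()
    parser.feed(html)
    parser.close()
    return parser.terms


def store_snapshot(conn, source, country, day, terms):
    """Save one day's ranked list; re-running the same day overwrites it."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS rankings (
               source TEXT, country TEXT, day TEXT,
               pos INTEGER, term TEXT,
               PRIMARY KEY (source, country, day, pos))"""
    )
    conn.executemany(
        "INSERT OR REPLACE INTO rankings VALUES (?, ?, ?, ?, ?)",
        [(source, country, day, i, t) for i, t in enumerate(terms, start=1)],
    )
    conn.commit()


# Made-up sample page, standing in for a downloaded summary.
sample_html = "<ol><li><a href='#'>world cup</a></li><li>office spouse</li></ol>"
conn = sqlite3.connect(":memory:")
store_snapshot(conn, "buzz", "us", "2006-03-27", parse_ranked_terms(sample_html))
rows = conn.execute("SELECT pos, term FROM rankings ORDER BY pos").fetchall()
```

Keeping one row per (source, country, day, position) is about as "flexible" as I can think of for rank-order data: nothing is summarized away at collection time, so the analysis can be decided later.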

Maybe the data will get better, but I am concerned that it is currently

a) Censored

b) Difficult to analyze, being mostly rank order (and ranks of changes) stuff .. although comparisons across time and across country will introduce variations and help us spot trends

c) Captured by a tool with unknown properties .. we don’t know what the engineers behind this summarization were “looking for”

d) Rather trivial (actors, movies, sports, music..).

Of course “trivial” rather depends on your industry and your application.. if you were in the radio business, maybe the Buzz and Zeitgeist will help you to get a handle on what is “hot” now, rather than waiting for some popularity surveys.
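On point (b) above, rank-order data is awkward but not hopeless. As a rough sketch of what comparing two lists across time (or across countries) might look like, the Python below computes how far each shared term moved, plus a Spearman rank correlation over the terms that appear in both lists. The function names and the sample terms are invented for illustration, and restricting the correlation to shared terms is my own simplification: it ignores terms that enter or drop off the list, which may be exactly the interesting ones.

```python
def rank_changes(earlier, later):
    """For terms in both lists, return earlier rank minus later rank.

    A positive number means the term climbed toward the top.
    """
    before = {term: i for i, term in enumerate(earlier, start=1)}
    after = {term: i for i, term in enumerate(later, start=1)}
    return {term: before[term] - after[term] for term in before if term in after}


def spearman_on_shared(earlier, later):
    """Spearman rank correlation over the terms present in both lists.

    Shared terms are re-ranked 1..n within each list so there are no
    gaps or ties. Returns None if fewer than two terms are shared.
    """
    later_set = set(later)
    shared = [term for term in earlier if term in later_set]
    n = len(shared)
    if n < 2:
        return None
    rank_a = {term: i for i, term in enumerate(shared, start=1)}
    rank_b = {
        term: i
        for i, term in enumerate((t for t in later if t in rank_a), start=1)
    }
    d_squared = sum((rank_a[t] - rank_b[t]) ** 2 for t in shared)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up lists, top rank first.
monday = ["movie star", "football final", "pop singer", "tax forms"]
tuesday = ["football final", "movie star", "pop singer", "new phone"]

moves = rank_changes(monday, tuesday)
rho = spearman_on_shared(monday, tuesday)
```

A correlation near 1 says the two lists order their shared terms almost identically; a low or negative value says the ordering has shuffled. Run day against day, or country against country, that gives at least one number to track over time.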

Note that some of the commentary on Yahoo Buzz is occasionally useful, at least as a source of ideas and maybe insights.

I guess it would be worth (programmatically) clipping the occasional piece and building a “Buzz Dossier” which could be reviewed from time to time.. that way, at least, we could step back a bit from the tyranny of the now and let our brains have a chance to synthesize the material.

Just as an aside, there was a Buzz article on the “office spouse” phenomenon, http://buzz.yahoo.com/buzz_log/entry/2006/03/27/1100/ , which I found kind of interesting.

OK, so I went to the competition (Google), fed it that URL and did a “related” search looking for similar pages. Nothing. But googling on “office spouse” brought up some useful links. Maybe if you were building a questionnaire or a question concourse, the zeitgeist-buzz would give you a starting point to elicit “lines of communication”.

Have a look, see what you think. Is there a there there?

Update:

I came across an application of Frank and Witten’s Pace Regression (more another time, but it basically has embedded methods for doing feature selection) called “Moodwatch”. The technical paper is “Capturing Global Mood Levels Using Blog Posts”, and their site is at moodviews.com
