The Daily Need

Trending secret sneaker lust, and other uses of Google Correlate

One of many styles of the Retro Jordan and a screenshot of the Google Correlate application in action

Phlegm. Temperature. Sneezing. Apparently sick people are predictable. Before consulting a physician or shopping for DayQuil, we Google our symptoms – or, at least, we do so consistently enough to provide statistically valuable information.

Google researchers recognized this after comparing search query data to confirmed influenza data from the Centers for Disease Control and Prevention (CDC), and released Google Flu Trends in 2008 to much acclaim – and some alarm. The tool allows Regular Joes to view global flu trends, as illustrated by the frequency of these searches, geographically, and in near real-time.

This is exciting because the search engine data is processed in about a day, while the CDC has a one- to two-week reporting lag, according to Nature (pdf). The faster a potential epidemic is detected, the faster it can be contained. Google Flu Trends has been shown to be effective for many strains of the flu, including H1N1. It also tracks bacterial infections such as methicillin-resistant Staphylococcus aureus (MRSA).

The surprising success of this tool was largely the inspiration for another application released this week – Google Correlate. The idea is that you can expand your own personal search for meaning by including datasets. Provide it with one word or phrase, and it will chart others that have been requested with similar frequency over the past eight years. Provide it with an unrelated dataset (frog mating habits? consumer price index? — anything that can be quantified into weekly averages will do) and it will show you the search terms that correlate. Individuals’ querying patterns, once considered private, are aggregated to make a trend, and then compared with other measurable aspects of our world, in the hope that this could explain a few things.

I tried out this tool with a straightforward example — the Dow Jones industrial average, downloaded from Yahoo! Finance.
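For anyone repeating the experiment: a daily series such as closing prices has to be collapsed into weekly averages before it can be compared with weekly search data. Below is a minimal Python sketch of that step. The function name, the (date, value) input shape and the choice of Monday as the start of each week are all assumptions made for illustration, not anything Google Correlate or Yahoo! Finance prescribes.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekly_averages(daily):
    """Collapse (date, value) pairs into one average per week.

    Assumption: weeks are keyed by their Monday. Use whatever week
    boundary the data you are comparing against actually uses.
    """
    buckets = defaultdict(list)
    for day, value in daily:
        # weekday() is 0 for Monday, so this steps back to the week's Monday.
        monday = day - timedelta(days=day.weekday())
        buckets[monday].append(value)
    return {week: mean(values) for week, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Made-up numbers, purely to show the call shape.
    sample = [
        (date(2011, 5, 23), 10.0),
        (date(2011, 5, 24), 20.0),
        (date(2011, 5, 31), 30.0),
    ]
    for week, average in weekly_averages(sample).items():
        print(week, average)
```

Weeks with no trading days simply do not appear in the result, so the output may need to be aligned with the search data week by week before comparing.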

Correlating positively to the Dow (moving in the same direction, with the same relative amplitude), one finds a strikingly high percentage of search terms relating to finance and investment tools, and several seeming oddities, including: Biagio’s Restaurant (fine dining), TSP Talk (a government chat service), and Jordan Retros (a good-looking shoe).

Correlating negatively to the Dow (moving in the opposite direction, with the same amplitude – a mirror image) one finds terms related directly to unemployment and bankruptcy, with one outlier being Air Terra Humara (another good-looking, but, actually, less pricey sneaker). If you shift those values one week (look for which searches mirror changes in the Dow a week later) you see, at the very top of the list of things Googlers wanted, a “funny site.” Indeed.
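To make "positive," "negative" and "shifted one week" concrete, here is a minimal Python sketch that scores two weekly series against each other. It assumes the Pearson correlation coefficient as the measure, where +1 means the series move in lockstep and -1 means a perfect mirror image. The function names and the toy numbers are invented for illustration and are not taken from the tool.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def lagged_pearson(index, searches, lag_weeks):
    """Correlate the index at week t with searches at week t + lag_weeks."""
    if lag_weeks < 0:
        raise ValueError("lag_weeks must be zero or positive")
    if lag_weeks == 0:
        return pearson(index, searches)
    # Drop the last index weeks and the first search weeks so the
    # remaining pairs line up one lag apart.
    return pearson(index[:-lag_weeks], searches[lag_weeks:])


if __name__ == "__main__":
    dow = [1.0, 2.0, 3.0, 4.0, 5.0]        # toy weekly index values
    invest = [10.0, 20.0, 30.0, 40.0, 50.0]  # rises with the index
    jobless = [5.0, 4.0, 3.0, 2.0, 1.0]      # falls as the index rises
    print(pearson(dow, invest))
    print(pearson(dow, jobless))
    print(lagged_pearson(dow, [9.0, 4.0, 3.0, 2.0, 1.0], 1))
```

Because the coefficient is computed from deviations scaled by each series' own spread, a search term measured in thousands of queries can still score close to +1 or -1 against an index measured in points. That is the sense in which only the relative amplitude matters.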

In short, people are always searching for all these terms, but to varying degrees. When the Dow is up, they search more frequently for ways to invest and spend their money, and when it is down, they are more often looking for ways to cope with economic hardship.

These scenarios all describe phenomena graphed against time. One can, alternatively, use geography, as the flu application does. This is where privacy issues become thorniest. Possible paths of exploration include identifying movement of memes, consumer tendencies or perceived geographical and meteorological shifts. Plausible reactions to anything found there include targeted marketing, complex policy decisions and improved emergency responses.

To explain its tool to the non-statistician, Google uses a comic, which admonishes, “Correlation is not causation!” This may be true, but inevitably some of us will be unable to resist trying to find a meaningful connection. Please, share your results.

 

Comments

  • http://www.facebook.com/milly.hansen3 Milly Hansen

    I don’t believe I have sneaker lust, but I am a shoe whore!!!

  • Raw10luck

    that Google comic was very informative!

    I now know that everyone at Google is white.

  • Skoglund

    Methicillin is synthetic penicillin. We can figure out how to cure a staph infection. The synthetic is not working. And I want the green shoes.

  • http://pulse.yahoo.com/_3BAPRW6FHATPIRDWZJQNN3WSIE Eric

    Huh.  So when people are sick, they try to find what malady they have.  Okay.  When the economy is good, they try to find expensive crap to buy.  Okay.  When the economy is bad, they try to find cheaper crap to buy.  Mmm hmm.   This is nothing new.  Ms. Henze, I’d say you wrote a very long and needlessly complicated article about nothing.  However, if you could tell me how to get the minute of my life back that I wasted reading this article, you’d have my full attention.

  • Spikerola

    George Orwell is spinning in his grave.

  • http://www.facebook.com/moore.jeff.a Jeff Moore

    Eric^ will get left in the primordial slime pool. Thanks for bringing Google correlate to my attention. I know what to do with it…

  • ECameron78

    Google is just another way for “them” to track us.  Pretty soon, you will not be able to search beyond your own interest profile…which they create based on your search history. It is of course a marketing tool regardless of the economy.

  • Powerinpolitics

    Also, correlating negatively to the Dow Jones is “painless ways to commit suicide.” It really is so, so sad to see the search trends in 2008 for this…

  • Haughtnfast

    Nice to know correlative data grows and grows. The tabulative possibilities intrinsic to such data collections are virtually unending. What is already accomplished by any number of private investment firms with their financial predictors that are tabulated constantly, even incorporating the demographic of competing firms’ financial modeling predictors outcomes on the investment markets.