The Limits of Dr. Google
There has been some chatter in recent years about how Google could be used as a barometer for major epidemics. The thought is that the world’s most popular search engine is able to do something few physicians can: build a fairly reliable snapshot of what humanity is thinking about at any given time. Sometimes this means queries about Newtown or the Mars Curiosity rover; other times it means a spike in medical queries. These are the harbingers of coming epidemics – or so the theory goes.
Last year, several news outlets reported that Google’s data crunching had correctly predicted that season’s flu trends based on the geographic pattern of searches. By watching when and where clusters of questions about fever and cough appeared, Google’s engineering whizzes could track the spread of influenza in real time, and predict its eventual penetration.
This year, a flaw emerged.
It turns out that we search for information about the flu for two different reasons. The first is when we are suffering from flu-related symptoms. The second is when everybody keeps talking about how deadly this year’s flu outbreak will be. All the media coverage this year appears to have “wagged the dog,” as it were, leading Google to report a surge in flu queries that didn’t comport with the eventual numbers:
But how did Google’s algorithm fare during this year’s fierce outbreak? Flu reached epidemic levels in January and–pertinent for the Flu Trends algorithm–was widely covered in the news. Would the widespread news coverage cause healthy people to enter influenza keywords into their search bars, thereby skewing Google’s results? We updated our graphic from last year to find out. Indeed, Google’s methods seem to wildly overstate the outbreak’s severity, outstripping the CDC’s figures by nearly a factor of two.
The moral: the Internet is good for many things, but when it comes to medicine, the best data we have are those supplied by actual human beings in the real world. Merely pondering the flu doesn’t make it so. When it comes to accurate predictive tools, the search goes on.