Google is offering an interesting data source as part of their social responsibility activity. The Google Swine Flu activity monitor looks at the types of searches that people are making and infers the level of swine flu activity from them. I do want to encourage Google to do more of this sort of thing. But…
You knew there was a “but” coming, didn’t you?
The UK, where I live, is shown as having the lowest possible activity. When I look at the European Centre for Disease Prevention and Control (ECDC) analysis of the 2009 swine flu outbreak, the UK has a medium incidence of swine flu and a high incidence of fatalities.
So how does one source show us as essentially free from swine flu and the other show us as severely affected, at least as far as swine flu goes? What have I misunderstood?
I’m guessing that the apparent mismatch is caused by the way the data is collected and presented. Instead of showing a colour that looks like the lowest incidence, the map should be showing a colour that indicates “insufficient data collected”, explicitly called out in the colour key. Now, I can see why an explicitly named colour for insufficient data might not appeal. The Google Swine Flu monitor is only tracking 20 countries (up from 16 a few weeks ago). That’d turn the bulk of the map into “no data”. It doesn’t look good to show that the majority of countries aren’t covered, especially the huge populations of China and India.
But showing where there is a lack of data is more useful for casual viewers than the implication that most of the world is unaffected. It’s really the old scientific dictum:
Absence of evidence is not evidence of absence
Google’s map confuses a low incidence with a lack of measurement, breaking that basic rule and giving the misleading impression that only a few countries are affected. I don’t think it’s intentional. I think the focus of thought went into the places for which data was being shown, and the implication for the casual browser wasn’t considered. Or perhaps I’m just picky about data presentation.
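To make the point concrete, here is a minimal sketch of a colour key that keeps “not measured” separate from “measured and low”. Everything in it is my own assumption for illustration: the function name, the colours, and the thresholds are hypothetical and have nothing to do with how Google actually builds its map.

```python
from typing import Optional

# Hypothetical colour key: names and thresholds are illustrative only.
NO_DATA = "grey (insufficient data)"
LEVELS = [
    (0.25, "pale yellow (minimal)"),
    (0.50, "yellow (low)"),
    (0.75, "orange (moderate)"),
    (float("inf"), "red (high)"),
]


def map_colour(activity: Optional[float]) -> str:
    """Return the colour key entry for a country's activity estimate.

    None means the country was never measured, which is a different
    statement from a measured value of zero.
    """
    if activity is None:
        return NO_DATA
    for upper_bound, colour in LEVELS:
        if activity < upper_bound:
            return colour
    return LEVELS[-1][1]


# Made-up readings: "A" was measured and is quiet, "B" was never measured.
readings = {"A": 0.0, "B": None, "C": 0.9}
for country, activity in readings.items():
    print(country, "->", map_colour(activity))
```

The only design choice that matters here is that a missing reading never falls through into the lowest band; it gets its own entry in the key.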
So Why Is the UK Not Shown?
I’m wondering whether the lack of data is caused by our National Health Service, and particularly by NHS Direct - a free phone service that gives you access to an escalation chain from call centre to nurse to doctor. If people worried about swine flu are searching for NHS Direct rather than for swine flu itself, then there could be a low signal for swine flu queries.
Where Next?
The other thing I’d love to see is a progress chart showing the historical evolution of the interest levels, ideally down to province or even town level. I know that’d be a lot of work to do, though! There are suggestions on the site that this is under way, and the link with early warning of disease spread is a really interesting use of search data.
Kudos to the team that developed this. As this kind of work progresses, I’ll be looking forward to seeing better presentation of the data and some deeper insights.
