Good news - Google is allowing more insight to be gathered from AdWords Search Query Reports, by exposing more search queries to scrutiny. This reduces the need to use third party click redirectors or web analytics tools to extract search queries. However, there’s strangely spurious logic - or I’ve failed to grasp a fundamental point about internet privacy.
The article cites the following reason for failing to offer more information about searches that users conduct:
the Search Query Performance report will show all queries that resulted in a click, where the user has not specifically blocked their referrer URL.
At first glance, this seems reasonable. If a user wants to withhold the search query, that’s fine. Or is it? Just how does a search engine decide to deliver your advert and what role does the search user play in that?
A user types in a search query. In parallel with the organic search activity (in most major search engines from about 1997 to date) a separate process matches search queries to candidate keywords. That separate activity delivers the advert impressions that the search engine most expects to deliver revenue.
So if a user types the search query “red shoes” and we have the keyword “red shoes”, then we have an exact match. If we have the budget, and our bid or AdRank is high enough, our advert can have an impression delivered.
But what if the user types “red shoe diaries” as the search query? Should our keyword match? If we have selected (in AdWords) Phrase Match, then it won’t match (plurals are not automatically matched); it will probably match Yahoo’s Standard Match. It should match Google’s Broad Match.
Now what about “shoe store”? Should that match? Well, if our advert or landing pages talk about shoe stores, and online sales, that’s probably a good candidate for a Broad Match on Google, and Advanced Match on Yahoo.
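The differences between the match types above can be sketched in a few lines of Python. This is a toy model only: the function names, the naive plural stripping and the "any shared word" rule for broad matching are all assumptions made for illustration, and none of it reflects how AdWords or Yahoo actually match queries.

```python
def tokens(text):
    """Lower-case a keyword or query and split it into words."""
    return text.lower().split()


def stem(word):
    """Naive singularisation ('shoes' -> 'shoe'); an assumption, not a real stemmer."""
    return word[:-1] if word.endswith("s") and not word.endswith("ss") else word


def exact_match(keyword, query):
    """The query is the keyword, word for word."""
    return tokens(keyword) == tokens(query)


def phrase_match(keyword, query):
    """The keyword appears inside the query as an unbroken, ordered phrase."""
    kw, q = tokens(keyword), tokens(query)
    return any(q[i:i + len(kw)] == kw for i in range(len(q) - len(kw) + 1))


def toy_broad_match(keyword, query):
    """Toy rule: the keyword and the query share at least one stemmed word."""
    return bool({stem(w) for w in tokens(keyword)} & {stem(w) for w in tokens(query)})


for query in ["red shoes", "red shoe diaries", "shoe store", "evening dress"]:
    print(query,
          exact_match("red shoes", query),
          phrase_match("red shoes", query),
          toy_broad_match("red shoes", query))
```

Under these toy rules only "red shoes" is an exact or phrase match, while "red shoe diaries" and "shoe store" are broad matches. A word-overlap rule like this one can never match a query that shares no words with the keyword; a real broad match system may go further than that.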
Then, what about “evening dress” as a search query? Should that match “red shoes” as a keyword? This is where it gets interesting. If Google think that specific user is really interested in a red shoe seller instead of, or as well as, an evening dress seller, AdWords may show the advert. We get an impression. Now, if we were actually targeting that keyword, and we were a red shoe seller, we’d probably want different advert copy for someone who was searching for “evening dress”. So it’d be really nice to know that Google thinks that this type of searcher is possibly interested. Alternatively, I might know that my shoes would be completely unsatisfactory for someone with that search - so I might want to add a negative keyword to prevent showing under conditions where I expect a poor ROI as a consequence of these low purchase likelihood clicks. I’d really, really like to know that these search queries are being matched *by Google* in response to user searches.
Now, I *can* use web server logfiles and tracking tags to identify the IP addresses of users that have a search query of “evening dress” for my keyword “red shoes”, Broad Matched, *IF* the user has allowed (as is the default) the referer_info field to be passed to my server, *AND* if the user has clicked.
If the user withholds that data, then my server doesn’t see it. If the user doesn’t click, I don’t see that the user has even searched. That is, if there’s an impression, but no click, Google will know what they have tried, but I won’t. That means that I can’t either create a more relevant advert, or use a negative keyword to improve my targeting.
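As a rough illustration of the logfile approach, the sketch below pairs the keyword tag on a Destination URL with the search query carried in the referrer. It assumes the common Apache/nginx "combined" log format, a hypothetical `kw` tag that the advertiser has added to the Destination URL, and a search engine that puts the query in a `q` parameter; the sample log lines are invented.

```python
import re
from urllib.parse import urlsplit, parse_qs

# "Combined" log format: the last two quoted fields are the referrer and user agent.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def first_param(url, name):
    """Return the first value of a query-string parameter, or None if absent."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


def keyword_and_query(line):
    """Return (tagged keyword, search query) for one log line.

    Either element is None when the data is missing, for example when
    the browser sent no referrer and the log shows "-".
    """
    m = LOG_LINE.match(line)
    if m is None:
        return (None, None)
    keyword = first_param(m["target"], "kw")   # hypothetical Destination URL tag
    query = first_param(m["referer"], "q")     # assumed search query parameter
    return (keyword, query)


# Invented sample lines: one click with a referrer, one with it withheld.
with_referrer = ('203.0.113.7 - - [24/May/2009:11:18:00 +0000] '
                 '"GET /landing?kw=red+shoes&mt=broad HTTP/1.1" 200 5120 '
                 '"http://www.google.com/search?q=evening+dress" "Mozilla/5.0"')
no_referrer = ('203.0.113.9 - - [24/May/2009:11:19:00 +0000] '
               '"GET /landing?kw=red+shoes&mt=broad HTTP/1.1" 200 5120 '
               '"-" "Mozilla/5.0"')

print(keyword_and_query(with_referrer))
print(keyword_and_query(no_referrer))
```

The first line yields both the keyword and the query that triggered it; the second yields only the keyword. An impression without a click never reaches the server, so it produces no log line at all.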
But hang on… The *USER* was never consulted about the degree of matching. The extent to which Google spreads Broad Match affects the *advertiser* but the user isn’t consulted. Why should the user’s preference for referer_info affect what an advertiser learns about the search queries that are matched? Especially since, in the usual case, there is no way to tie a click (much less an impression) to a search query when referer_info is not passing on the data.
What About Exact Match And Privacy
If a user types in a search query that matches an exact matched keyword, and clicks, and I have tagged my keywords for analytics purposes, I *don’t care* about the referer_info. I know what they’ve typed. If a user withholds the referer_info, it doesn’t matter. Their privacy has been breached.
What will Google do to ensure their privacy? If privacy is so important, then *don’t* serve adverts to users who have no referer_info, or *only* serve phrase and broad matched adverts - treating users with no referer_info as if they have an implicit negative exact match for their search query.
If the *user* is to be respected, then the place to act is in ad serving, *not* in withholding information to the advertisers *paying* for the whole system to work. Google has it backwards…
Motivation?
Google’s reasons for suppressing the search query data are not, I believe, for privacy. Except for very low volume advertisers, pairing up search queries with IP addresses is tedious. And for impressions that have not resulted in clicks, or for the 30% or so of browsers that have no usable referer_info, there’s no data, anyway.
Google’s Search Query Reports are, in any case, already anonymised. There’s no tracking data in the reports. You can’t pair the existing reports with any specific user - a properly anonymised design.
So, I call “foul” on Google. They have published an excuse that *I* at least, don’t think stands up to minimal scrutiny. If I’m wrong, I’d love to be educated in how providing Google’s existing anonymised aggregate reports of clicks leading to my site, can yield a breach in an individual’s privacy. Please explain. I don’t get it. At all.
Alternative Explanation
I think, instead, this is an excuse for Google to hide how broad the matches are that they do.
Why is that important?
Because the more bidders there are in an auction, the higher the price. If Google wants to increase revenue, they can widen the mouth of Broad Match. A small increase in the willingness to match otherwise unwanted search queries, massively increases the number of bidders in the auction, driving up the value of the auction and contributing to Google’s bottom line. Advertisers will see declining ROI - but search marketing ROI fluctuates like a mad thing for most companies outside the Fortune 2000 (or thereabouts - that *is* a finger in the air estimate). Google can make large revenue increases from slightly increased pain for a large number of advertisers - the old Nash Equilibrium thingie.
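The effect of extra bidders can be shown with a deliberately simplified second-price auction. This is a sketch under loud assumptions: the bids and the reserve are invented, and the real AdWords auction ranks on more than the raw bid (AdRank, as mentioned earlier), which this model ignores.

```python
def clearing_price(bids, reserve=0.05):
    """Price the top bidder pays in a simple second-price auction.

    The winner pays the second-highest bid, or the reserve when
    nobody else is bidding. Quality scores are ignored here.
    """
    if not bids:
        return None  # no bidders, so no advert is shown
    ranked = sorted(bids, reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else reserve
    return max(runner_up, reserve)


# Invented bids, in the account currency.
exact_bidders = [1.00, 0.60]   # advertisers who actually targeted the query
broad_extras = [0.85, 0.40]    # advertisers pulled in by a wider Broad Match

before = clearing_price(exact_bidders)
after = clearing_price(exact_bidders + broad_extras)
print(before, after)
```

In this model, adding bidders can never lower the price the winner pays, and it raises it whenever a newcomer outbids the previous runner-up. Here the top bidder's cost per click goes from 0.60 to 0.85 without any change to their own bid.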
Suddenly seeing a lot of new, strange search queries turn up? That’d be something that advertisers would want to control. And that would possibly negatively affect Google revenues.
Summary
I believe that Google has offered a specious and deliberately misleading reason, one that stands up only to superficial scrutiny, to avoid giving more detailed search query reports. The real reason is not user privacy, but that the exposed information would reveal how Google manipulates Broad Match to generate increasing revenue for itself and its search partners.
I’d rather that Google said that “increased details would provide insight into Google’s valuable trade secrets, the advanced search matching technology at the heart of our search engine”. I can’t argue with that, and few courts would be willing, I think, to require a company to expose trade secrets… unless those were about how to fraudulently increase clicks with little likelihood of a sale. However… IANAL.
If Google *really* want to tackle privacy issues for users, the right way is to deliver adverts that don’t give away the critical information. Withholding already anonymised data about *Google’s* decision on search query matching, is there to protect Google revenues, and is directed at preventing advertisers from controlling spend.
The extra information now available is equivalent to using web server logfiles to analyse the referer_info field - except that it has a 24 hour delay. So Google is now letting us know what we could already know, if we used our web server logfiles properly and tagged our Destination URLs with sufficient detail. That *is* useful. But claiming to withhold more information about *Google’s* decisions to match, has nothing to do with user privacy.

Dean wrote,
I doubt it’s about privacy. My guess is that G prefers to hide this info but figured that the policy in this case is not effective due to the ability to get this info from your server logs or analytics package. So they chose to relax the restriction but only as far as necessary. If the referrer data is cut, marketers can’t collect the data so readily, and so, G isn’t going to give it up either.
Link | May 24th, 2009 at 11:18 am
Charlotte SEO Company wrote,
I never knew that your ad will show once you pay more.. I always thought it’s all about how your rank is doing on its keyword and also by paying to subscribe the advert. Well, somehow this is a wake up call to people wanting for their ads to be posted on the first page.
Link | May 25th, 2009 at 8:11 am
Jeremy Chatfield wrote,
Erm, not sure where you picked up something about bids and adverts showing. Google strictly disentangles paid search and SEO activity - though there are obviously collateral effects from the use of paid search on SEO.
Link | May 26th, 2009 at 8:57 am
Tomas wrote,
other alternative:
it could be that those reports are not generated from the matching engine, but are (like external tools) based on the actual referer field.
you’ll notice that the first hop of every ad goes back to google, and it could be that they use that hop to extract the query from the referer url.
It would be odd if they did it that way, but it wouldn’t be the first time engineers come up with a shortcut in order to achieve something. Perhaps doing it that way is technically simpler or it is done on purpose in order not to expose the matching algorithm.
Link | May 26th, 2009 at 9:46 am
Jeremy Chatfield wrote,
Hi Tomas - I’m pretty sure that G does collect this data. It is crucial for determining how to extend Broad Match. Low CTRs on poorly chosen search queries will result in the search query no longer being matched, IMO. That means that G is recording the search query… doesn’t it?
Your general point about unexpected simple optimisations is appreciated though. Seen it :)
Link | May 27th, 2009 at 8:22 am
Tomas wrote,
Hi Jeremy,
I agree with you that Google is most definitely collecting that data during matching. But from what i’ve seen and experienced over the years (and i’m sure you have too) adwords is a fairly disconnected system that collates data from various sources (e.g. clicks and impressions are not recorded/stored at the same moment cause there’s sometimes a time delay between them - i.e. clicks get updated but impressions aren’t).
the search query report is a relatively late addition to adwords and i think it’s just another layer/module they put on top of the core. That’s why i think it may just be mining the referer urls instead of talking to the matching algorithm (which would prob be a lot more complicated/risky to implement)
Link | May 27th, 2009 at 12:22 pm
Jeremy Chatfield wrote,
Yeah, I’ll definitely agree that there’s at least five “layers” - matching service, impression server, click service, reporting/billing and invalid click service. Could add conversion tracking to that, I guess.
However, I’m fairly sure that the matching service runs a regular training update based on the search query matching and clicking. That layer learns - you can see “twitches” typical of an AI with a new training set, every so often, and it adjusts to “low volume keywords”, suddenly presenting them as distinct impressions when the CTR is high enough.
And yes, age of the system seems relevant. The AdWords API took forever to engage with zero-impression reporting; hugely useful and more cost effective. Probably because it needed to develop hooks deeper into the system.
I shall mutter to myself about the likely system design again. Thanks for the provocation :)
It still seems that “User Privacy” is a distinctly red herring, though.
Link | May 28th, 2009 at 11:50 am
Robert Brady wrote,
I think the whole privacy thing is blown out of proportion. If you think Google is violating your privacy, STOP USING GOOGLE. This isn’t like a doctor or lawyer where laws exist to protect privacy. You chose to use Google, so you also choose to accept the possible consequences.
As for Google using broad match to increase their revenues, I totally agree. I manage several accounts and I noticed a significant change in impressions and CTR a few months ago that would indicate a change in the algorithm that AdWords uses. Conspicuous how it matches with their recent earnings slump isn’t it?
Link | May 28th, 2009 at 5:48 pm
Jeremy Chatfield wrote,
I can understand the privacy thing. With a geek background, you worry about information flow; use of globals, leakage, security. Most people haven’t got a clue, all they know is that someone else knows something, and that is a threat for many people. The unknown is always fearful. Look at the fuss over cookies, for example.
Marketing dictum: perception is reality. So Google has to deal with the reality that they are perceived as interfering with privacy. But with an already fully anonymised, aggregated search query report - WTF is the issue?
Link | May 28th, 2009 at 8:31 pm
Richard Ball wrote,
My goodness! For the sake of privacy, Yahoo had better drop OVRAW from their tracking URLs. ;-)
Link | May 28th, 2009 at 8:49 pm
Coach Bags wrote,
I completely agree with the person above that “I doubt it’s about privacy. My guess is that G prefers to hide this info but figured that the policy in this case is not effective due to the ability to get this info from your server logs or analytics package.”
So Google is now letting us know what we could already know, and it is helpful for us in managing our web server logfiles properly.
coach bags
Link | August 25th, 2009 at 5:21 pm
Dr Acne wrote,
It is crucial for determining how to extend Broad Match. Low CTRs on poorly chosen search queries will result in the search query no longer being matched
Link | August 26th, 2009 at 9:53 pm
Jeremy Chatfield wrote,
Dr Acne - great name, BTW, how original! [sarcasm] I *must* change my name to “Internet Marketing Strategy” and post on a lot of nofollowed blog comment areas. [/sarcasm] Yes, and I see you, too, Ms Coach Bags. Sheesh. Install some FireFox plugins or set up GreaseMonkey/GreaseKit to highlight blogs that aren’t worth trying to seed with links…
Anyway… Dr Acne misses the key point. Keyword Quality Score is apparently determined by CTR on the exact match, on Google’s search results only. It isn’t affected by CTR on partner sites or by phrase and broad matches… Google is already offering a substantial fraction of the data needed to manage the CTR driven portion of the QS. It’s the *extra* range of augmented enhanced super-wide investor-pleasing revenue-projection-matching Broad Match that is being obfuscated.
Link | August 31st, 2009 at 10:31 am