Oh, this is witty. Here’s your invite to join Google’s Orkut without needing to be invited by a member. I’m interested in social networking software, so I gleefully joined up. I went looking for communities about AdWords and AdSense. What do you think I’ll find? Something that equals the quality of the Google AdWords Help Forum, perhaps?
Surely this trust-based community will be full of people who share an ideal of making the internet a better place. A place where these trusted members respect each other and provide a supporting environment?
Well, kind of.
What do these guys do to support their communities? They solicit and participate in click fraud, mostly.

What’s all this then?
These guys have taken a pretty simple idea and run with it.
- Make a (free) site
- Sign up to AdSense
- Solicit friends and Orkut users to click on adverts, enough to increase income and too little to trigger Google’s detection of fraud (three clicks is safe enough, six clicks is a risk, according to the fraudsters’ tradition and lore)
And remember, these are the stupid thieves. The smart ones are not posting in public or in shared spaces. The smart ones have got decently password protected communities and are doing much more clever things with their fraud.
What if the Orkut community clicked on your adverts? Well, if they were all to click on a single, well paying advertiser, it might cause around $5,000 damage per day. In practice of course, this $5,000 will be taken from hundreds of advertisers. That’s part of the insidious nature of this problem. Because it is spread around advertisers, no one advertiser takes a large cost. The pattern of abuse is easy to lose in the randomness on the internet. Add obfuscatory technology such as dialup connections and dynamic IP addresses for ADSL and the signal of misbehaviour is hard to find. Hard, but not impossible.
What can you do?
Well, now I’ve taken the St John’s Wort and calmed down, the very first thing to do is to join Orkut and add each one of the sites soliciting clicks to your Site Exclusions on every campaign. I’ve done that for our monthly service clients already, and we’re offering to do it for the other accounts joined to our MCC.
Here’s a Google Notebook containing a list of the guys asking for fraudulent clicks. Hey, I figured if Google was supporting people who attack advertisers, there must be a free support mechanism for advertisers. Google shared Notebooks. Probably the first time you’ve seen one…
Now, like any other “busting” resource, this is subject to a problem… If someone wants to run a spoiler, they post a site that they don’t own, causing damage to a legitimate AdSense publisher; in turn, that causes the fraud buster a problem. So this is a one-off list. No maintenance, at least not for free… If I can think of a way to run this as a free service, like a real time black hole mail spam service, that offers only tested information, I’ll do so.
Take a look at the names of these sites. Anything jump out at you? I see a lot of free hosting services, including Google Page Creator. So if you were instituting a click quality control policy, you might want to consider denying all those free sites. The counter-problem is that many of the free sites have people who offer expert content sub-sites, on which you do want to appear. Ejecting all Google Page Creator sites, rather than just the cheats, may put you at a competitive disadvantage.
Now look at your web server log files and claim back any clicks from these sites. We’re doing that for our monthly service web server log file analysis clients. And claim back any clicks for that IP address, especially if you have a first party cookie and can identify the same user - if they knowingly did click fraud on one site, they probably knowingly defrauded on a different site. And add that new site to the list of likely click fraud solicitation sites and claim their clicks, too.
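As a rough illustration of that log check, here is a minimal Python sketch. It assumes your server writes the common Apache/Nginx "combined" log format. The host names in `SOLICITING_HOSTS` and the function name `suspect_hits` are hypothetical placeholders, not real offenders or part of any tool. It flags hits referred by a listed site, widens the net to every hit from the same IP address, and reports the other referrers those IPs arrived from as candidates to review.

```python
import re
from urllib.parse import urlparse

# Hypothetical placeholder hosts; substitute the sites you saw soliciting clicks.
SOLICITING_HOSTS = {"cheat-site.example", "clickme.example"}

# "combined" log format: ip ident user [time] "request" status bytes "referrer" "agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def referrer_host(hit):
    """Return the lower-cased host of the referrer, or '' if there is none."""
    return (urlparse(hit["referrer"]).hostname or "").lower()


def suspect_hits(lines):
    """Return (hits from flagged IPs, flagged IPs, other referrers to review)."""
    hits = [m.groupdict() for m in map(LOG_LINE.match, lines) if m]
    # Step 1: any IP that arrived from a soliciting site is flagged.
    flagged_ips = {h["ip"] for h in hits if referrer_host(h) in SOLICITING_HOSTS}
    # Step 2: collect every hit from those IPs, whatever the referrer.
    suspects = [h for h in hits if h["ip"] in flagged_ips]
    # Step 3: other sites the same IPs came from are candidates, not proof.
    candidates = {
        referrer_host(h) for h in suspects
        if referrer_host(h) and referrer_host(h) not in SOLICITING_HOSTS
    }
    return suspects, flagged_ips, candidates
```

Treat the candidate list as something to check by hand. A shared or dynamic IP address can put an honest publisher on it, which is the same spoiler problem described earlier. A first party cookie, where you have one, is a better key than the IP address.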
If Google disputes that these are invalid clicks, refer them to their own information resources on Orkut and ask them to explain why clicks from a site that solicits click fraud should not be regarded as invalid, and why someone who has been identified as an invalid clicker shouldn’t be identified on all their clicks as invalid usage. The answer should be educational.
A sense of proportion
The sites on this list would only account for around 600 clicks on content match, if your advert happened to show on all the sites, and if the clickers were stupid enough to click on the same advert rather than three different adverts. This is peanuts. Large accounts can see more invalid clicks detected by Google than this, in a single day.
So these hundred or so sites represent a tiny fraction of the problem. Detecting them, well, there’s little that a small advertiser can do. Small advertisers may get only a few dozen clicks on the content network, each day. Even if all of these clicks were from fraudulent users, the low repetition rate makes it unlikely that an advertiser could detect two clicks from the same fraudulent clicker.
Who could detect that these low volume clicks were fraudulent? Actually, no one. The problem is that lots of web site owners have the information that clicks from this site are much less likely than normal to lead to a sale or signup. Google only knows that the advert has been clicked on. There is no way to correlate the multiplicity of small advertisers’ web logs and nail the cheats. Google can’t do it - it doesn’t have the site usage information. You can’t do it - you only see a fraction of the activity of the thieves.
Large advertisers might catch some of these thieves, but if you’ve ever tried to get action from a paid search engine that might result in a decreased publisher base, you’d realise how truly unlikely this is. The cost of detecting the clicks, organising the logfiles into a complaint and pushing the complaint through, set against the low likelihood of success, makes this a pretty fruitless task.
Pre-emption
So if you can’t detect that you are getting fraudulent clicks, can you do anything to prevent yourself from appearing on fraudulent sites? If you can’t, can you do anything that prevents you from appearing on a low quality site, again?
We believe that there are a few resources that let you identify “Made For Adsense” and other low quality sites. Listing them all would quickly fill the 500 sites that Google offers for site exclusion. On the other hand, if enough advertisers stuffed their campaign site exclusions with MFA sites, Google may realise that these publishers are undesired by advertisers… and remove them. That would allow advertisers and agencies to select the next least desirable tranche of sites.
We also know that you can use your web server log files to support the detection and automatic exclusion of poor content match sites, using a combination of techniques. Some of the techniques are as basic as simple ROI and statistical analysis. Others rely on more complex detection of whether the publisher site is a likely good citizen offering relevant content.
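To make the "simple ROI and statistical analysis" idea concrete, here is a minimal sketch and not our production method. It assumes you have already totalled clicks, conversions, cost and revenue per referring site from your logs. The function names, the 5% significance level and the baseline rate are all illustrative assumptions. A site is flagged only when it loses money and its conversion count is improbably low under a one-sided binomial test against your account-wide conversion rate.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def sites_to_exclude(stats, baseline_rate, alpha=0.05):
    """stats maps site -> (clicks, conversions, cost, revenue).

    Returns (site, roi, p_value) for sites that lose money AND convert
    significantly worse than baseline_rate at significance level alpha.
    """
    flagged = []
    for site, (clicks, conversions, cost, revenue) in stats.items():
        roi = (revenue - cost) / cost if cost else 0.0
        # Chance of seeing this few conversions if the site were "normal".
        p_value = binom_cdf(conversions, clicks, baseline_rate)
        if roi < 0 and p_value < alpha:
            flagged.append((site, roi, p_value))
    return flagged


# Illustrative figures only; the host names are placeholders.
stats = {
    "mfa-site.example": (200, 0, 100.0, 0.0),      # many clicks, no sales
    "expert-site.example": (200, 5, 100.0, 400.0),  # profitable
    "tiny-site.example": (20, 0, 10.0, 0.0),        # too few clicks to judge
}
for site, roi, p in sites_to_exclude(stats, baseline_rate=0.02):
    print(f"{site}: ROI {roi:.0%}, p={p:.3f}")
```

With these made-up numbers only the first site is flagged. The twenty-click site loses money too, but there is not enough data to call it, which is the small advertiser's problem in miniature.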
Official Action
These thieves operate in multiple countries. Getting someone to pay attention to thefts of a few dollars or less from each affected advertiser will be hard - it’s the aggregate of all the advertisers from whom they steal that makes the activity worthwhile. I doubt that any of them will end up in jail. I mean, for clicking an advert? You can see that the press might not take the advertisers’ side in this, but find it shocking that “a careless click cost this man his career”. Yeah. Right.
We could all report these guys as fraudsters to Google. Google will close down the AdSense account, perhaps. Then these scumbags will open another AdSense account and another free site and start all over again. That just means tracking them down all over again. So, a temporary relief. Should last all of about 48 hours (just about time to notice they’ve been shut down, register with AdSense and start up a free site to take those fraudulent clicks).
Permanent banning? Look at the countries that these guys are coming from. I don’t think you’ll find a way to even be sure of identity, much less to ban them from the system.
Nope, it is vigilance, web server log files and information security forensics that will nail these thieves.
And these are the stupid thieves. The smarter ones aren’t soliciting in public places. That’s why you need to look at web server log files and identify not just IP addresses, but geolocations.
Is there a way forward?
We’ll see what Google does and says, shall we? I don’t imagine that it will be helpful for advertisers. Google’s priorities will be to protect Orkut and to protect publishers from any loss of confidence in contextual advertising by advertisers. I expect that if they pay any attention to this article, it’ll be a complaint that I copied a fragment of a message from Orkut, against the T’s and C’s, and my use of Google Notebook to publish addresses lifted off Orkut.
Yeah, that’ll stop Google AdSense click fraud solicitations on their own systems, no doubt of it.
Google needs to wake up. Advertisers are clients, too. Just like searchers, just like publishers. We’re no less deserving of respect and protection. However, Google holds on to information that would help advertisers actively manage click quality, rather than passing it on. Advertisers are second rate citizens in the Googoloply.
In the meantime, we’re improving our analysis of content matching for our larger customers. This investigation, together with some other research we’ve been doing, has been a depressing indication of the sheer number of likely fraudulent clicks from content matched sites.
There are detection mechanisms that will work for larger advertisers and more stupid fraudsters. The main information resource is Google, and Google’s privacy orientation protects thieves as well as the innocent. While Google continues with the belief that advertisers can be abused while thieves must be protected, advertisers and their agents must be vigilant and must press for changes at Google that offer more assurance of click quality.
One way to exert pressure is to stop funding content match until Google offers more information about publishers. You should also be keeping budgets on content matched campaigns to low levels, because that focuses impressions on sites with higher CTRs - and a high CTR from fraudulent clickers will be easier to identify.
The good news is that smaller advertisers, who tend to have lower budgets, should already be inadvertently selecting sites that are less fraudulent. Google will try to get that budget by offering the fewest impressions, which means targeting the higher CTR sites for the inferred conceptual page match.
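The arithmetic behind that budget argument can be sketched in a few lines of Python. The budget, cost per click and CTR figures are invented for illustration, and the function name `impressions_needed` is hypothetical. How Google actually allocates impressions is not public, so this only shows why a small budget is used up in fewer impressions on a higher CTR site.

```python
def impressions_needed(daily_budget, cpc, ctr):
    """Impressions required to spend daily_budget at a given CPC and CTR."""
    clicks = daily_budget / cpc   # clicks the budget can pay for
    return clicks / ctr           # impressions needed to earn those clicks


# Illustrative numbers: a 10.00 daily budget at 0.25 per click.
for ctr in (0.02, 0.005):
    needed = impressions_needed(10.0, 0.25, ctr)
    print(f"CTR {ctr:.1%}: about {needed:.0f} impressions")
```

Under these assumptions the budget buys the same number of clicks either way, but the lower CTR site needs four times as many impressions to deliver them.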
The other big lesson is that you should only be using strictly measurable results against content matched clicks. If you can’t measurably detect that you are getting benefit, then stop. The economic pressure on keyword search is such that it focuses on higher quality sites. The obfuscation, intentional or otherwise, of content match publishers and clickers means that fraudulent publishers are significantly protected by Google and other channels. That means that the economic pressure to generate revenue through fraudulent clicks matches a weakness in detection. So, if you can’t measure that a site offers value, you shouldn’t be using it.
There is another cause of budget consumption on content match that we’ll look at in a future post. This set is even more insidious, and has an unexpected group of causes.

James wrote,
Jeremy, how about forming a network of sites that verifiably agree to a set of publishing standards as regards AdSense and the like?
Perhaps something like a Better Business Bureau whereby the participating publishers receive ratings based upon standards and advertiser feedback. Maybe this is something Google would be interested in?
As is now, there is no way Google’s content network will survive the ever increasing amount of fraud. The internet is still growing and a certain percentage of people coming online daily are catching on to and seeking to exploit the MFA ruse.
Do you think some sort of publisher rating system may be the answer?
Link | November 20th, 2006 at 12:12 am
Jeremy Chatfield wrote,
Hi James
Good suggestion, but for one teensy flaw… Google offers no mechanism to select from such a virtuous group. :)
Ratings - the nature of the industry is against doing something like that. It’s extra work. It is sharing a competitive advantage - I haven’t offered a list of all the sites we exclude. What’s in it for me? Access to your dubious list? What if you include some good sites to prevent me from getting on them?
This is a nasty, intricate problem. I think there are paths for larger advertisers that limit the downsides sufficiently that they can effectively use contextual search. But it requires a chunk of knowhow :) Small advertisers - limit the budget for now. That’s all you can sensibly do.
Link | November 20th, 2006 at 6:18 pm