<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Merjis Internet Marketing Blog &#187; click fraud</title>
	<atom:link href="http://blog.merjis.com/category/click-fraud/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.merjis.com</link>
	<description>Effective Internet Marketing Strategy and Tactics Through Test</description>
	<lastBuildDate>Thu, 12 Jan 2012 09:18:42 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>SEO: Click Through Rate and Bounce Rate</title>
		<link>http://blog.merjis.com/2010/03/26/seo-click-through-rate-and-bounce-rate/</link>
		<comments>http://blog.merjis.com/2010/03/26/seo-click-through-rate-and-bounce-rate/#comments</comments>
		<pubDate>Fri, 26 Mar 2010 21:16:53 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[intent]]></category>
		<category><![CDATA[SEO]]></category>
		<category><![CDATA[spamfighting]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/?p=340</guid>
		<description><![CDATA[I&#8217;m going to take issue with Rand Fishkin of SEOmoz. I think his most recent White Board Friday video is just plain wrong. Normally, I have a lot of respect for what SEOmoz does, but I think the advice and implications are not just wrong, but dangerously wrong. How Does Google Rank Results I don&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m going to take issue with Rand Fishkin of SEOmoz. I think his most recent <a href="http://www.seomoz.org/blog/whiteboard-friday-influence-of-usage-data#ergabbj-threttuy">White Board Friday</a> video is just plain wrong. Normally, I have a lot of respect for what SEOmoz does, but I think the advice and implications are not just wrong, but dangerously wrong.</p>
<h2>How Does Google Rank Results</h2>
<p>I don&#8217;t know all the details. Rand doesn&#8217;t know all the details. Some guys at Google know a lot of the factors. Matt Cutts, Google&#8217;s head of the search quality team, claims over 200 factors go into ranking. </p>
<p><object width="580" height="360"><param name="movie" value="http://www.youtube.com/v/muSIzHurn4U&#038;hl=en_US&#038;fs=1&#038;rel=0&#038;border=1"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/muSIzHurn4U&#038;hl=en_US&#038;fs=1&#038;rel=0&#038;border=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="580" height="360"></embed></object></p>
<p>What we do know is that backlinks &#8211; credible links regarded by Google as likely for a search user to visit &#8211; are important. We know that anchor text is important. There are some other factors that we know influence Google ranking.</p>
<h2>What *else* do we know?</h2>
<p>We (professional search engine optimisation people) know that on-page content is valuable. For low competition keywords &#8211; keywords where there aren&#8217;t a lot of links and anchor text, and hardly anyone searches &#8211; then page content is enough. Look at the example in the graphic below. There&#8217;s precisely one page on the internet, with that text for something that I can&#8217;t find on Google. When I wrote that, it was true; if you search now, you&#8217;ll find that page. Well, until some spoiler copies it elsewhere&#8230;</p>
<div><a href="http://skitch.com/jezchatfield/n59q8/jeremy-chatfield-google-profile"><img src="http://img.skitch.com/20100326-f3g482f6c6a8ew6js18eey13ry.jpg" alt="Jeremy Chatfield - Google Profile" width=600 /></a></div>
<p>However, try putting the phrase &#8220;bad credit loan&#8221; on a page on a new web site with some other relevant and unique content, valuable to a user, and see how high you rank for the term. You can wait. And wait. And wait. You&#8217;re not going to show up on the first page of results, just by having a great page alone. It&#8217;s not just the content, it&#8217;s the backlinks that make the difference. </p>
<p>So we now know, as a result of this test, that while Google does pay attention to on-page factors, they also pay attention to backlinks. And in competitive spaces, *effective* <a href="http://www.google.com/support/forum/p/Webmasters/thread?tid=1c377284e24be6db&#038;hl=en" title="Amusing thread about 'Best SEO Company Search Engine Placement'">backlinks count for more than the page content</a>. </p>
<p>The important message to understand from this is that different factors apply under different conditions. Content alone won&#8217;t put you on page one. Backlinks alone won&#8217;t keep you there.</p>
<h2>Click Through Rate and Bounce Rate</h2>
<p>So, at some scale, do CTR (Click Through Rate) and Bounce Rate make any difference? I believe they do, and this blog is a testament to that. Look at this screenshot.</p>
<div><a href="http://skitch.com/jezchatfield/n59t5/content-detail-google-analytics"><img src="http://img.skitch.com/20100326-qf7ibj264gmipd2fxg512jxd1m.jpg" alt="Content Detail: - Google Analytics" width=600 /></a></div>
<p>That&#8217;s a Google Analytics shot of the last 15 months activity for a specific page on the Merjis blog. It&#8217;s all about &#8220;<a href="http://blog.merjis.com/2007/07/16/click-fraud-google-adwords-and-gclid/">gclid</a>&#8221; &#8211; something you&#8217;ll probably care about if you do paid search and look in web server logfiles. </p>
<p>I&#8217;m using this blog as an example, because I&#8217;ve been using it for tests for years &#8211; I know how it works, and it isn&#8217;t confidential client data. I can reveal the usage, because I have my own reasons for running a blog, and few of them directly have anything to do with making money.</p>
<p>Most other pages on this site get a profile like this other example:</p>
<div><a href="http://skitch.com/jezchatfield/n5917/content-detail-google-analytics"><img src="http://img.skitch.com/20100326-rs3mrt1rbg886putn3jitpmdnh.jpg" alt="Content Detail: - Google Analytics" width=600 /></a></div>
<p>This is pretty typical for a &#8220;newsy&#8221; blog article. Usage on the day that it is written, and a dribble thereafter. It then usually dries up after a few weeks, because the rank has decayed with time. </p>
<p>So why, with a higher bounce rate, does the older article do better than the newer article in rankings? If Bounce Rate is important, then surely the lower bounce rate in a newer article must mean that Google should drop the older article?</p>
<p>I suspect that Google doesn&#8217;t have a rigid number. They look at how well you do relative to other sites. And especially, they look to see whether search users search again for the same or very similar searches. Read that article on SideWiki, and it&#8217;s lightweight. No real information. No real recommendations. The long lived article on gclid has a much higher bounce rate <i>and longer reading time</i>. It&#8217;s the reading time that&#8217;s the clue. When you&#8217;ve read my article on gclid, you probably don&#8217;t want to read another article about gclid. It&#8217;s reasonably definitive.</p>
<p>Google sustains that old article in search results, despite its great age, and despite a high bounce rate, because those users who do read it, value it. It&#8217;s there, because it helps Google to deliver a page of search results that users value more than they would *without* that article present. </p>
<h2>Uh &#8211; You Didn&#8217;t Mention CTR</h2>
<p>Again, I don&#8217;t think it is actually CTR that Google is looking for. It is user satisfaction. So a high CTR, caused by a misleading piece of copy, won&#8217;t help. You have to deliver what you offer. Again, I don&#8217;t think that Google is measuring conversion, either. But a high CTR message with a high conversion rate, meaning that users are highly satisfied &#8211; that&#8217;s what Google wants you to make. </p>
<p>You won&#8217;t be directly rewarded for high CTR &#8211; but you can measure it (especially if you also run PPC and can get the impression rate). You won&#8217;t be rewarded directly by Google for high conversion rates. But Google does appear to prefer sites that answer the question posed by the search query. And the proxy that can be used by Webmasters, who don&#8217;t have access to Google&#8217;s richer data, is their own performance, as CTR and Conversion Rate. Increase those, and you are more likely to increase position.</p>
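<p>As a rough illustration of the proxies mentioned above, here is a minimal Python sketch that turns raw counts into CTR, Bounce Rate and Conversion Rate. The <code>PageStats</code> fields and the sample numbers are hypothetical; they stand in for whatever your own analytics and PPC reports provide, and none of this says anything about what Google itself measures.</p>

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    """Counts a webmaster can observe for one page (hypothetical field names)."""
    impressions: int   # times the listing was shown in search results
    clicks: int        # visits that arrived from those listings
    bounces: int       # visits that left after viewing a single page
    conversions: int   # visits that completed the goal (sale, signup...)


def ratio(part: int, whole: int) -> float:
    """Return part / whole, or 0.0 when there is nothing to divide by."""
    return part / whole if whole else 0.0


def proxy_metrics(stats: PageStats) -> dict:
    """Compute the three webmaster-side proxies for user satisfaction."""
    return {
        "ctr": ratio(stats.clicks, stats.impressions),
        "bounce_rate": ratio(stats.bounces, stats.clicks),
        "conversion_rate": ratio(stats.conversions, stats.clicks),
    }


# Made-up numbers, purely for illustration.
example = proxy_metrics(PageStats(impressions=2000, clicks=100, bounces=60, conversions=5))
print(example)
```

<p>Tracking these three numbers per page over time gives you something to compare before and after a change, which is about as close as a webmaster can get to the richer data that Google holds.</p>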
<h2>Interaction of Factors</h2>
<p>If you have a good site, with highly relevant content, you tend to get more links. So disentangling backlinks, and the immeasurable relative user satisfaction, is difficult. Pretty much the only way that I know it can be done, is when you have web sites with accidental misbehaviours that create the right conditions for a test. The technical problems that create the conditions are rare &#8211; and recreating them in a real website is likely to decrease the performance. It&#8217;s unlikely that anyone will give you the opportunity to mess up their site, just to prove what works.</p>
<p>However, if you want to go about it&#8230; Here&#8217;s what I think you&#8217;ll need:</p>
<ul>
<li>A visibly horrible page, with a low conversion &#8211; as your starting point</li>
<li>Weak Title and Meta Description as a starting point</li>
<li>A lot of visitors per day &#8211; it takes a long time to demonstrate, otherwise</li>
<li>The ability to make sitewide link changes to the page under test</li>
<li>Good backlinks &#8211; you&#8217;ll want to know that you *could* rank well on page one</li>
</ul>
<p>Change the URL for your horrible page, sitewide. Wait for Google to find it and rank it again. Note the position. Watch the position fall over a period of a week or two (depending on visitor volume). Now improve the page, and switch the URL again and wait for Google to find and rank it. Then watch the rankings change and note which way they go. Now revert the page and switch URLs again, and this time change the Title and Meta Description. Now watch the ranking changes. Now fix up the page again and once more switch the URL and watch. </p>
<p>You should, IME, find that you achieve a higher long term position when you have a better title and description, and a higher converting page with a lower bounce rate. If you can explain why you *shouldn&#8217;t* get a higher position with a site that is better for users, I&#8217;d love to know the reasons. But don&#8217;t make your explanation involve &#8220;gaming&#8221; the system. </p>
<p>And, FWIW, I don&#8217;t believe that the Title and Description are important, as direct factors for SEO. You can rank perfectly well for keyword free pointless titles, and descriptions without keywords that are positively turgid and rambling. However, show the user that you are focused on solving their problem, and your CTR increases; and if you are focused on the user, you&#8217;ll probably have a reasonable landing page, which will engage and convert better. Google&#8217;s not going to reward you for a better snippet, directly, but for a better user experience. Your only measures though, will be what you can observe &#8211; CTR, Bounces, Conversions. If I could tell you to look at the &#8220;re-query rate&#8221;, I&#8217;d tell you to do so &#8211; instead, you&#8217;ll have to use the information you can get.</p>
<h2>Implications For SEO</h2>
<p>If a blog article can decay to little traffic in a few weeks, or sustain rankings for years, on the same blog, with the same blogging software, then the difference must be backlinks? Well, not substantially. Over the years, I&#8217;ve had more backlinks to newsy stories, but still this &#8220;gclid&#8221; article keeps ranking. And all the time, the other lighter weight articles just keep falling out of the listings. </p>
<p>There are a few other similar articles on this blog that rank, and stay high for years and years and years. Non-competitive searches, but of long lasting traffic value. And the other sites that I&#8217;m competing with, for attention, are large forums. High weight. Much more frequently updated content. I&#8217;m *deliberately* not trying to place links for articles. Just letting what happens, happen &#8211; so I can understand why it happens. So there are no contamination effects here from deliberate link placements. </p>
<p>What are the articles? They all tend to be like that gclid article. Something that is detailed, informative, and means that you can go away and do something. Useful articles, in other words. Harder to write than &#8220;straight news&#8221; articles, as you need unique content, written to address the audience. That&#8217;s part of my reason for writing &#8211; attempting to develop clearer communication.</p>
<p>The clear implication is, I think, that useful content matters. And how do we know it is useful? It&#8217;ll show up in search engine rankings, usability data and other disturbingly hidden and arcane resources. Google will reward useful content with a better sustained rank &#8211; but won&#8217;t put you on page one just because you have a great article, unless you have some backlinks to create credibility. </p>
<h2>But How?</h2>
<p>Rand makes the point that data about use can be gamed. But so can backlinks. That&#8217;s the major part of undeclared paid backlinks, small world building, and other &#8220;black hat&#8221; techniques. We know that Google sees through most black hat techniques, given time. </p>
<p>We also know, or can find out about, Google&#8217;s interest in invalid impressions and invalid clicks. For example, invalid impressions are generated when search engine ranking tools are run &#8211; they reduce the effective CTR. Invalid clicks are generated when users double click, or are paid to click. Just as with paid search, these two types of invalid activity are measurable by Google. In fact, Google can measure a lot more than a webmaster can see. </p>
<p>We webmasters only get to see bounce rates and conversions. Google gets to look at whether users search again. Much more valuable. If you want to build the world&#8217;s best search engine, then you want to feature the results that tell you that you&#8217;ve got a winning page &#8211; pages where users don&#8217;t need to search any more. Results that have users positively selecting that site again, when they see it in listings. Webmasters just don&#8217;t have that detail, directly. We just don&#8217;t know if the other guy answers better &#8211; unless we expend effort to learn our customers&#8217; minds and make sure we have the best answer.</p>
<h2>User Experience</h2>
<p><a href="http://www.google.com/corporate/tenthings.html">Google&#8217;s Ten Things</a> lists, first, &#8220;Focus on the user&#8221;. The results from this blog, and from other client activities that I&#8217;m not going to reveal in any detail, are fairly clear. Content that Google can measure as being liked by users ranks better and longer than content that is spammy, tedious and weak. The factors that lead to better rankings will include appropriate Titles and Descriptions and engaging content. It has to be that way, or rule 1 is broken.</p>
<p>We know that Google has experience of measuring impression and click data to look for invalid activity. We know that Google is pretty good at it &#8211; or there&#8217;d be more click fraud problems with AdWords. So, if it can be done, and it is an important indication of quality, why wouldn&#8217;t Google use searchers&#8217; behaviour to modify results, not just personally, but across the index?</p>
<p>Why can&#8217;t you improve the results when you click on your own listings? Because it is identical behaviour to the banned AdSense practice of clicking on adverts on your own site. Detectable. Invalid. Not counted. And for reasons that I don&#8217;t want to go into, I believe the same will be true of botnets and eLance and Mechanical Turk attacks. There will be a signature associated with them, that doesn&#8217;t match normal user behaviour. The signatures can be spotted and countered, by assigning the activities as invalid &#8211; just as it is in AdWords. Since AdWords continues to run without being infested with click fraud to unusable levels, we have a working system, on a global scale, that shows that user behaviour can be extracted from noisy fraudulent behaviour. </p>
<p>It isn&#8217;t perfect, true, but it turns AdWords from what would be a system that solely acts to transfer advertising funds to thieves into a system that, more often than not, delivers prospective buyers to an advertiser&#8217;s site. It isn&#8217;t perfect, but it works well enough. <b>AdWords only works because it identifies and categorises user behaviour.</b></p>
<p>User behaviour categorisation works in one system that Google has, worldwide, on a service with measurable economic value. Why wouldn&#8217;t it be usable in organic search results?</p>
<h2>Conclusions</h2>
<p>Failing to identify and understand user interests is an SEO mistake. These are reflected by (but are not completely explained by) CTR and Bounce Rates &#8211; because that&#8217;s about the best that Webmasters can get. Google doesn&#8217;t have to use those &#8211; they have better numbers that are more meaningful to user experience. But saying that &#8220;Google doesn&#8217;t use bounce rates&#8221; is not the same as saying &#8220;Google doesn&#8217;t take account of user behaviour&#8221;.</p>
<p>Unlike Rand, I believe that Google cares very deeply about the user experience, and that Google has very sophisticated technology, probably shared with the Google AdWords guys, to identify unusual search behaviours and exclude them from consideration. </p>
<p>Given enough data, probably gained from multivariate testing on all the different data centres, Google can identify whether users are more, or less, satisfied by different ordering in search results than a pure backlinks-plus-content model would give.</p>
<p>Small scale tests probably won&#8217;t show anything about user interaction &#8211; because the activity doesn&#8217;t have statistical significance or because the signature of strange search activity is too obvious. So, don&#8217;t try faking it &#8211; if you&#8217;ve read this far, you probably aren&#8217;t smart enough to outwit Google&#8217;s teams of click-fraud defence guys. They are really pretty good, as anyone with a rational assessment of AdWords click fraud levels will tell you. Not perfect, but good enough to make the effort of using AdWords worthwhile, rather than primarily a way of siphoning your advertising funds to fraudsters. :)</p>
<p>Why do I say &#8220;if you&#8217;ve read this far&#8221;? Because if you really knew how to hide click streams, you&#8217;d be doing it with AdSense. And you&#8217;d have stopped reading at that point &#8211; because you own the game already. If you can&#8217;t own that game, you can&#8217;t own the game of spoofing user behaviour in organic search &#8211; it is (not identical to, but close enough to) the same game. At the moment I don&#8217;t understand why you&#8217;d bother with SEO behavioural spoofing, if you&#8217;d gamed AdSense, because the revenue is a lot more direct&#8230; Maybe that&#8217;s why Rand hasn&#8217;t spoken with any black hatters that have cracked it? </p>
<p>And if Google can detect unusual impression and click data, then they can fulfil their primary mission, with respect to <b>modifying</b> organic rank based on real user data about preferences and satisfaction. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2010/03/26/seo-click-through-rate-and-bounce-rate/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Privacy and AdWords Extended Search Query Reports</title>
		<link>http://blog.merjis.com/2009/05/23/privacy-and-adwords-extended-search-query-reports/</link>
		<comments>http://blog.merjis.com/2009/05/23/privacy-and-adwords-extended-search-query-reports/#comments</comments>
		<pubDate>Sat, 23 May 2009 17:26:23 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[intent]]></category>
		<category><![CDATA[trust]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/?p=293</guid>
		<description><![CDATA[Good news &#8211; Google is allowing more insight to be gathered from AdWords Search Query Reports, by exposing more search queries to scrutiny. This reduces the need to use third party click redirectors or web analytics tools to extract search queries. However, there&#8217;s strangely spurious logic &#8211; or I&#8217;ve failed to grasp a fundamental point [...]]]></description>
			<content:encoded><![CDATA[<p>Good news &#8211; Google is allowing more <a href="http://adwords.blogspot.com/2009/05/enhanced-search-query-performance.html">insight to be gathered from AdWords Search Query Reports</a>, by exposing more search queries to scrutiny. This reduces the need to use third party click redirectors or web analytics tools to extract search queries. However, there&#8217;s strangely spurious logic &#8211; or I&#8217;ve failed to grasp a fundamental point about internet privacy.</p>
<p>The article cites the following reason for failing to offer more information about searches that users conduct:</p>
<blockquote><p>the Search Query Performance report will show all queries that resulted in a click, where the user has not specifically blocked their referrer URL.</p></blockquote>
<p>At first glance, this seems reasonable. If a user wants to withhold the search query, that&#8217;s fine. Or is it? Just how does a search engine decide to deliver your advert and what role does the search user play in that? </p>
<p>A user types in a search query. In parallel with the organic search activity (in most major search engines from about 1997 to date) a separate process matches search queries to candidate keywords. That separate activity delivers the advert impressions that the search engine most expects to deliver revenue. </p>
<p>So if a user types the search query &#8220;red shoes&#8221; and we have the keyword &#8220;red shoes&#8221;, then we have an exact match. If we have the budget, and our bid or AdRank is high enough, our advert can have an impression delivered. </p>
<p>But what if the user types &#8220;red shoe diaries&#8221; as the search query? Should our keyword match? If we have selected (in AdWords) Phrase Match, then it won&#8217;t match (plurals are not automatically matched); it will probably match Yahoo&#8217;s Standard Match. It should match Google&#8217;s Broad Match. </p>
<p>Now what about &#8220;shoe store&#8221;? Should that match? Well, if our advert or landing pages talk about shoe stores, and online sales, that&#8217;s probably a good candidate for a Broad Match on Google, and Advanced Match on Yahoo.</p>
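<p>The match types described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: exact and phrase matching are modelled as simple token comparisons, and <code>naive_broad_match</code> is a hypothetical stand-in that strips a trailing &#8220;s&#8221; and looks for any shared word. The real Broad Match and Advanced Match algorithms are not public and certainly do more than this.</p>

```python
def tokens(text):
    """Lower-case a keyword or search query and split it into words."""
    return text.lower().split()


def exact_match(keyword, query):
    """The query is the keyword, word for word."""
    return tokens(keyword) == tokens(query)


def phrase_match(keyword, query):
    """The keyword appears inside the query as an unbroken run of words."""
    kw, q = tokens(keyword), tokens(query)
    n = len(kw)
    return any(q[i:i + n] == kw for i in range(len(q) - n + 1))


def naive_stem(word):
    """Crude plural stripping (assumption: real engines are far smarter)."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def naive_broad_match(keyword, query):
    """Hypothetical stand-in for broad matching: any shared stemmed word."""
    kw = {naive_stem(w) for w in tokens(keyword)}
    q = {naive_stem(w) for w in tokens(query)}
    return not kw.isdisjoint(q)


keyword = "red shoes"
for query in ["red shoes", "red shoe diaries", "shoe store", "evening dress"]:
    print(query, exact_match(keyword, query),
          phrase_match(keyword, query), naive_broad_match(keyword, query))
```

<p>With these toy rules, &#8220;red shoe diaries&#8221; fails the phrase test (no plural handling) but passes the broad test, as does &#8220;shoe store&#8221;. A query with no words in common never matches here, so any matching beyond shared words needs signals that this sketch does not have.</p>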
<p>Then, what about &#8220;evening dress&#8221; as a search query? Should that match &#8220;red shoes&#8221; as a keyword? This is where it gets interesting. If Google thinks that specific user is really interested in a red shoe seller instead of, or as well as, an evening dress seller, AdWords may show the advert. We get an impression. Now, if we were actually targeting that keyword, and we were a red shoe seller, we&#8217;d probably want different advert copy for someone who was searching for &#8220;evening dress&#8221;. So it&#8217;d be really nice to know that Google thinks that this type of searcher is possibly interested. Alternatively, I might know that my shoes would be completely unsatisfactory for someone with that search &#8211; so I might want to add a negative keyword to prevent showing under conditions where I expect a poor ROI as a consequence of these low purchase likelihood clicks. I&#8217;d really, really like to know that these search queries are being matched *by Google* in response to user searches. </p>
<p>Now, I *can* use web server logfiles and tracking tags to identify the IP addresses of users that have a search query of &#8220;evening dress&#8221; for my keyword &#8220;red shoes&#8221;, Broad Matched, *IF* the user has allowed (as is the default) the referer_info field to be passed to my server, *AND* if the user has clicked. </p>
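<p>For the curious, here is a minimal Python sketch of that logfile approach. It assumes an Apache-style &#8220;combined&#8221; log line, a search engine that puts the query in a <code>q</code> parameter of the referring URL, and a hypothetical <code>kw</code> tracking tag on the Destination URL. Your own log format and tag names will probably differ.</p>

```python
import re
from urllib.parse import urlparse, parse_qs

# Combined log format: ip ident user [time] "request" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*" \d{3} \S+ "([^"]*)" "[^"]*"'
)


def first_param(url, name):
    """Return the first value of a query-string parameter, or None."""
    values = parse_qs(urlparse(url).query).get(name)
    return values[0] if values else None


def click_details(line):
    """Pair the tagged keyword with the search query for one logged click."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None  # not a combined-format line
    ip, path, referrer = m.group(1), m.group(2), m.group(3)
    return {
        "ip": ip,
        "keyword": first_param(path, "kw"),          # hypothetical tracking tag
        "search_query": first_param(referrer, "q"),  # None if referrer withheld
    }


sample = ('192.0.2.10 - - [23/May/2009:17:26:23 +0000] '
          '"GET /shoes?kw=red+shoes HTTP/1.1" 200 5120 '
          '"http://www.google.com/search?q=evening+dress" "Mozilla/5.0"')
print(click_details(sample))
```

<p>When the browser withholds the referrer, the log normally records <code>-</code> in that position and <code>search_query</code> comes back as <code>None</code>, while the keyword tag on the Destination URL is still visible.</p>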
<p>If the user withholds that data, then my server doesn&#8217;t see it. If the user doesn&#8217;t click, I don&#8217;t see that the user has even searched. That is, if there&#8217;s an impression, but no click, Google will know what they have tried, but I won&#8217;t. That means that I can&#8217;t either create a more relevant advert, or use a negative keyword to improve my targeting. </p>
<p>But hang on&#8230; The *USER* was never consulted about the degree of matching. The extent to which Google spreads Broad Match affects the *advertiser* but the user isn&#8217;t consulted. Why should the user&#8217;s preference for referer_info affect what an advertiser learns about the search queries that are matched? Especially since in the usual case, there is no way to tie a click (much less an impression) to a search query in the case that referer_info is not passing on the data.</p>
<h3>What About Exact Match And Privacy</h3>
<p>If a user types in a search query that matches an exact matched keyword, and clicks, and I have tagged my keywords for analytics purposes, I *don&#8217;t care* about the referer_info. I know what they&#8217;ve typed. If a user withholds the referer_info, it doesn&#8217;t matter. Their privacy has been breached. </p>
<p>What will Google do to ensure their privacy? If privacy is so important, then *don&#8217;t* serve adverts to users who have no referer_info, or *only* serve phrase and broad matched adverts &#8211; treating users with no referer_info as if they have an implicit negative exact match for their search query.</p>
<p>If the *user* is to be respected, then the place to act is in ad serving, *not* in withholding information to the advertisers *paying* for the whole system to work. Google has it backwards&#8230;</p>
<h3>Motivation?</h3>
<p>Google&#8217;s reasons for suppressing the search query data are not, I believe, for privacy. Except for very low volume advertisers, pairing up search queries with IP addresses, is tedious. And for impressions that have not resulted in clicks, or for the 30% or so of browsers that have no usable referer_info, there&#8217;s no data, anyway.</p>
<p>Google&#8217;s Search Query Reports are, in any case, already anonymised. There&#8217;s no tracking data in the reports. You can&#8217;t pair the existing reports with any specific user &#8211; a properly anonymised design. </p>
<p>So, I call &#8220;foul&#8221; on Google. They have published an excuse that *I* at least, don&#8217;t think stands up to minimal scrutiny. If I&#8217;m wrong, I&#8217;d love to be educated in how providing Google&#8217;s existing anonymised aggregate reports of clicks leading to my site, can yield a breach in an individual&#8217;s privacy. Please explain. I don&#8217;t get it. At all. </p>
<h3>Alternative Explanation</h3>
<p>I think, instead, this is an excuse for Google to hide how broad the matches are that they do. </p>
<p>Why is that important? </p>
<p>Because the more bidders there are in an auction, the higher the price. If Google wants to increase revenue, they can widen the mouth of Broad Match. A small increase in the willingness to match otherwise unwanted search queries, massively increases the number of bidders in the auction, driving up the value of the auction and contributing to Google&#8217;s bottom line. Advertisers will see declining ROI &#8211; but search marketing ROI fluctuates like a mad thing for most companies outside the Fortune 2000 (or thereabouts &#8211; that *is* a finger in the air estimate). Google can make large revenue increases from slightly increased pain for a large number of advertisers &#8211; the old Nash Equilibrium thingie. </p>
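<p>To make the bidder-count argument concrete, here is a toy Python sketch. It assumes a single-slot second-price auction with a small reserve price, which is a deliberate simplification and not a description of how AdWords actually ranks and prices adverts; the bids are invented.</p>

```python
def clearing_price(bids, reserve=0.05):
    """Price paid by the winner of a single-slot second-price auction.

    The winner pays the runner-up's bid, or the reserve if they bid alone.
    """
    if not bids:
        return 0.0  # nobody in the auction, nothing is paid
    ranked = sorted(bids, reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else reserve
    return max(runner_up, reserve)


# Two advertisers whose keywords match the query under tight matching.
narrow = [1.20, 0.40]
# Wider matching pulls two more advertisers into the same auction.
broad = narrow + [0.90, 0.75]

print(clearing_price(narrow))  # winner pays the second bid of 0.40
print(clearing_price(broad))   # same winner, same bid, now pays 0.90
```

<p>The top bidder has not changed their bid, yet the price of the click more than doubles simply because more bidders were matched into the auction. Adding bidders can never lower the runner-up bid in this model, which is the whole point.</p>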
<p>Suddenly seeing a lot of new, strange search queries turn up? That&#8217;d be something that advertisers would want to control. And that would possibly negatively affect Google revenues. </p>
<h3>Summary</h3>
<p>I believe that Google has offered a specious and deliberately misleading reason, one that stands up only to superficial scrutiny, to avoid giving more detailed search query reports. The real reason is not user privacy, but that the exposed information would reveal how Google manipulates Broad Match to generate increasing revenue for itself and its search partners. </p>
<p>I&#8217;d rather that Google said that &#8220;increased details would provide insight into Google&#8217;s valuable trade secrets, the advanced search matching technology at the heart of our search engine&#8221;. I can&#8217;t argue with that, and few courts would be willing, I think, to require a company to expose trade secrets&#8230; unless those were about how to fraudulently increase clicks with little likelihood of a sale. However&#8230; IANAL. </p>
<p>If Google *really* wants to tackle privacy issues for users, the right way is to deliver adverts that don&#8217;t give away the critical information. Withholding already anonymised data about *Google&#8217;s* decision on search query matching, is there to protect Google revenues, and is directed at preventing advertisers from controlling spend.</p>
<p>The extra information now available is equivalent to using web server logfiles to analyse the referer_info field &#8211; except that it has a 24 hour delay. So Google is now letting us know what we could already know, if we used our web server logfiles properly and tagged our Destination URLs with sufficient detail. That *is* useful. But claiming to withhold more information about *Google&#8217;s* decisions to match, has nothing to do with user privacy.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2009/05/23/privacy-and-adwords-extended-search-query-reports/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Search Engine Marketing 2009 Projections</title>
		<link>http://blog.merjis.com/2009/01/05/search-engine-marketing-2009-projections/</link>
		<comments>http://blog.merjis.com/2009/01/05/search-engine-marketing-2009-projections/#comments</comments>
		<pubDate>Mon, 05 Jan 2009 09:57:58 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[analytics]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[internet strategy]]></category>
		<category><![CDATA[malware]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[microeconomics]]></category>
		<category><![CDATA[paid search]]></category>
		<category><![CDATA[trust]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2009/01/05/search-engine-marketing-2009-projections/</guid>
		<description><![CDATA[The main trends that will be visible in 2009: Google will struggle to retain revenues using a variety of techniques Searchers will spend more time browsing and convert after more clicks Online revenues will generally increase &#8211; but business margins will be squeezed Internet Theft Scandals &#8211; Click Fraud, Phishing and Account Theft Details and [...]]]></description>
			<content:encoded><![CDATA[<p>The main trends that will be visible in 2009:</p>
<ul>
<li>Google will struggle to retain revenues using a variety of techniques</li>
<li>Searchers will spend more time browsing and convert after more clicks</li>
<li>Online revenues will generally increase &#8211; but business margins will be squeezed</li>
<li>Internet Theft Scandals &#8211; Click Fraud, Phishing and Account Theft</li>
</ul>
<p>Details and the consequences? Read on&#8230;</p>
<h3>Google Will Adjust To Preserve Revenues</h3>
<p>The basic idea of paid search is simple. Searchers submit search queries, and advertisers pay to have their adverts shown to more or less interested searchers. This meshes with the purchasing process (buying model) in several ways. At the most basic, when someone searches for your unique trademarked business or product name, it is likely that this searcher is intending to buy something. More likely than if they&#8217;d typed a competitor&#8217;s name. When they search for something that describes the product category, they are less likely to buy &#8211; they are in an earlier stage of the buying process.</p>
<p>Auctions have some basic characteristics. As a broad generalisation, the more bidders there are in an auction, the higher the revenue for the sellers. So Google&#8217;s goals have to include increasing the number of bidders in an auction &#8211; Broad Match and adding ever more advertisers both help with this. The second goal is convincing advertisers to bid high. Traffic scales heavily with position, so Google needs advertisers to believe that position and volume are related, while preventing them from becoming aware of any relationship between position and click quality. Finally, Google needs more outlets, so that advertisers see more reasons (more impressions) to be part of the ecosystem; however, that largely means adding new, smaller volume publishers &#8211; some will have niche specialist interests, but overall this encourages more Made For AdSense sites, which, IME, have an appallingly poor conversion rate and value (reducing click quality by adding more outlets).</p>
<p>You&#8217;d expect that in retaliation, advertisers would then want to stick to Exact Match and use only Google&#8217;s Search Pages. There are other factors at play, though.</p>
<p>Users miskey &#8211; they can&#8217;t remember the name of the company properly. So someone looking for the travel company Thomas Cook might key &#8220;thomson cooke&#8221;, &#8220;tomas cuik&#8221; and all sorts of other variations. This means that Exact Match isn&#8217;t enough to identify all people looking for the trademark or brand name. Broad Match helps by allowing Google to send these near misses to the right advertiser. Google do a phenomenal job of matching; if an advertiser like Travelocity doesn&#8217;t use &#8220;cheap holidays&#8221; as a search term, then Google will match a high bidding Broad Match keyword to that search.</p>
<p>People will also type more specific searches, longer searches; most searches, by a long margin, have more than two words in the search query. Users will type &#8220;logitech mac support&#8221; or similar, to more rapidly jump to stuff they know exists somewhere in the vendor&#8217;s site. Some of these longer searches won&#8217;t lead to sales &#8211; as in this example, some searches are support inquiries. So Exact Match also fails as being too specific, because these search queries wouldn&#8217;t be matched. Phrase Match is useful to find unexpected variations and more specific searches &#8211; leading to the idea that you can help users to jump deeper into the site with the right combination of technology and advert. Again, typos and miscomprehension by users will mean that Broad Match captures additional searchers who intended to find the business, but would have failed with solely Exact or simple Phrase Matches in the campaign.</p>
<p>Google&#8217;s problem is that when someone has decided on &#8220;Honda&#8221;, or &#8220;iPhone&#8221;, the searcher is unlikely to be deflected by alternatives. The consequence is that there will be few competitors for a brand name. Brand names are usually the best return on investment &#8211; so the average costs per click will tend to decrease for brand names. This is part of what you&#8217;d expect from a basic analysis of the buying process &#8211; you generally type a specific company name after you have researched and typically just before buying. So there is an inbuilt pressure to decrease advertising on exact matched competitors&#8217; names &#8211; resulting in lower revenues from trademark terms. That&#8217;s a problem for Google.</p>
<p>The earlier phase searches, which tend to be naturally higher in volume as people look for alternatives from which to choose, naturally imply a worse ROI &#8211; you need to pay for more clicks to support the research, and the conversion rate is lower because searchers are researching. This too tends to drive Average Cost Per Click down, in order to retain a positive return on investment. </p>
<p>Google needs to keep advertisers focused on Broad Match. This lets Google place unsold inventory, offer competitor adverts on trademark searches, and so on. The goal for Google is to increase the number of participants in the auction (helping increase average cost per click and hence increased revenues), to find ways to increase advertising on unsold inventory and to affect the minimum bid.</p>
<p>Some techniques that are likely to be used by Google?</p>
<p>Expect to hear more about the difficulty of reaching interested searchers with organic search, now that personal search is becoming more deeply embedded. If advertisers think that they are missing audience because of personal search, that they can only reach with paid search, then there will be more advertisers and increased competition. This will be partially driven because of the effects of recession&#8230; in some market segments there will be reduced interest, and this will naturally translate as reduced search volumes, resulting in fears that personal search is sapping share of voice.</p>
<p>Expect to hear more about Quality Score changes, probably surrounding the calculation of the minimum cost to appear on a page. This is not the same as, but is confusingly similar in name to, the estimated cost to appear on the first page. The intention will probably be that if the CTR is lower than some target value derived from the rest of the network, the minimum price is adjusted to meet some revenue goal &#8211; it won&#8217;t be written like that &#8211; I can&#8217;t guess how Google will describe the technique, other than that it is likely to involve some description about &#8220;improving user search experience&#8221;. There will still have to be a way to allow $0.01/click &#8211; a feature that attracts many advertisers, few of whom have any real chance of achieving this&#8230; but it remains an important differentiator over Yahoo and the other competitors in this space. I think it is possible to offer both, because you only get $0.01 on high volume, high CTR keywords &#8211; IOW, not things that affect most advertisers.</p>
<p>Expect more dilution of &#8220;real search&#8221;. Google still wants additional outlets and will continue to re-present domain parks, &#8220;fake search&#8221;, selected content match outlets and other &#8220;opportunities&#8221; as if they were what naive advertisers expect from keyword search (that is, that the advert is shown directly in response to a real user search that leads to a directly relevant site &#8211; much as happens with organic search results). Google will want the increased impression volume and the *overlap* of keywords with different intent that allows a single outlet to have larger counts of competing advertisers. </p>
<p>Expect more messaging about the synergy of paid and organic search &#8211; that even when you have achieved page dominance for the targeted keyword, you should still be advertising as well as topping organic rankings. This increases competition, ensures that there are at least ten plausible advertisers, etc. I expect the messaging around this to increase as advertising volume and value shrink.</p>
<p>Expect more overseas advertisers. Google doesn&#8217;t offer USD bidding to smaller overseas bidders. The exchange rate isn&#8217;t notified to overseas bidders. So, especially when currency exchange rates are in flux, Google can make margin on exchange rates. I expect that Google can easily and all-but-undetectably collect additional revenue on overseas bidders, by tweaking exchange rates &#8211; AFAICS, the auction is held in USD, so exchange rate changes are needed for anyone not using USD. This would be non-US income &#8211; no-one in the USA will blink at using exchange rates to improve revenues from non-US companies. </p>
<p>Radically &#8211; if Google were to remove Exact Match, they could make dramatic transformations of revenue expectation. Would they do this? I think they would, if they could claim that it improved the search user&#8217;s experience. If advertisers fail to appear on a substantial fraction of searches, especially when those clicks turn up in organic search clicks, it would allow Google to say that Exact Match was preventing searchers from seeing the results that they want to see. This would be a huge step for Google, and they&#8217;d probably need a lot of research. It&#8217;ll be piloted by some large accounts and evidence will probably be drawn from wide-category vendors, like eBay. It&#8217;ll turn out to be complete rubbish for narrowly focused vendors, but Google&#8217;s objectives are to maintain share of search, and only secondarily to satisfy advertisers &#8211; the auction will take care of some of that advertiser anxiety.</p>
<h3>Increased Search Time &#038; Reduced Conversion Rates</h3>
<p>Most people are planning on reducing spend. They&#8217;ll do so by spending more carefully. They&#8217;ll look around more. They&#8217;ll be taking more personal recommendations. Trust in the stability of large organisations and well known brands will continue to be eroded &#8211; which means that the right businesses, with the right trust validations, can emerge from nowhere; just because your business has been running for 5, 10, 50 or 150 years is no guarantee that you haven&#8217;t recently based your business on false expectations of investments and earnings. A new business with the right accreditations can play on the same field as a 200 year old business, and can raise fear and doubt about the financial stability of well established players. </p>
<p>So, shoppers will be more wary. You&#8217;ll need to improve your sites to have online messages address current concerns. The fears, uncertainties and doubts that were addressed in marketing communications last year won&#8217;t work so effectively this year; the concerns are different. Your customers will be concerned that even buying from a major brand will damage them, or is at least risky. Reassurance and validation will be important &#8211; especially with the media focusing on bad news. Bad news sells and media outlets will want more viewers. The at-risk media outlets, old broadcast media, seeing a declining share of advertising budgets, will want to stimulate sales through increasing interest in controversy, further weakening consumer confidence. Failure to address offline media reports online will impact conversion rates &#8211; though you will want to avoid directly feeding the controversy.</p>
<p>Minor brands should be able to gain, if they are cash flow positive or can find investors with imagination, foresight and cash. The openings are in guarantees and warranties, future-safe products and ways to reduce service costs for consumers; ways to manage domestic finances effectively; ways to cut back while still having luxuries.</p>
<p>The results of these will be to put some strange pressures on paid search. Companies that have previously survived on word of mouth and low or zero advertising, will need new customers. They&#8217;ll advertise. As new advertisers, they&#8217;ll make a lot of mistakes and won&#8217;t be willing to pay for expert assistance. That will increase claims of click fraud, and apply upward pressures on bids, sometimes from advertisers that shouldn&#8217;t be in the auction. </p>
<p>Existing advertisers will want to improve performance and will decrease budgets and target the spend on the most effective keywords and adverts. This will increase pressures on the paid search companies to dilute the inventory &#8211; the result is likely to be that overall clicks/conversion will continue to worsen, otherwise the paid search companies will report losses, and decreases in search volumes. It&#8217;s always bigger news if a big company loses, than a bunch of smaller businesses &#8211; so expect that larger businesses will use their asymmetric control of information to manage smaller customer businesses&#8217; expectations and margins, through manipulating click quality. Expect the search engines to protect themselves at the expense of smaller advertisers.</p>
<p>ROI will generally worsen, but there will be islands where specific companies have struck the right messages that resonate with users. The result will be that paid search competition will continue to heat up. I expect that the average CPC will decline, the total value of paid search will decline, but some advertisers in various niches will have justifiably higher Average Cost Per Click and increased spending. </p>
<h4>Affiliates</h4>
<p>The affiliate industry will also become even more heated. Out of a job? Looking to make money with a low capital investment? Then you could become an affiliate&#8230;</p>
<p>However, who needs more novice affiliates? What does an influx of untrained, cash starved and at least initially ineffective affiliates do, when the real super affiliates (not the one man and a dog operations blogging from &#8220;super affiliates are us&#8221;, but the real, quiet, and highly effective super affiliates) already handle about 80% of affiliate traffic? The answer is that these new affiliates will provide free advertising for marginally effective businesses, by spending their own money essentially as an investment. Expect more companies to switch to affiliate advertising models to control their in-house marketing costs and to reduce spend on advertising agencies. Expect a lot more annoyed novice affiliates. </p>
<p>The affiliate industry will boom, but leave a lot of disillusioned &#8220;internet advertisers&#8221; in its wake. There will be some emergent new stars &#8211; not everyone who comes in will fail, and some existing leaders will cash out and move on. Expect some churn in the top affiliate products &#8211; but that&#8217;s pretty standard in the industry anyway.</p>
<p>I expect that the real super affiliates will be the same next year as this year &#8211; they have developed their techniques, and they are effective. They might retrench a little and change their focus on the businesses they are interested in, but they&#8217;ll survive and probably continue modest growth. Their main obstacle to growth has historically been cash; 60 and 90 day payment cycles on sales conversions mean that working cash is tied up, effectively limiting them to &#8220;four to six inventory turns&#8221; a year. That reduces their rates of growth, and with banks not in lending mode, these guys are bottled up, at least in paid search.</p>
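<p>The &#8220;four to six inventory turns&#8221; figure falls out of simple division. As a rough sketch (the helper name and the 365 day year are assumptions made for illustration, not anything an affiliate network publishes), cash that is locked up for a full payment cycle can only be re-spent this many times a year:</p>
<pre><code class="language-python"># Hypothetical helper: how often can the same cash be re-spent in a year
# if it is tied up for a whole payment cycle after each round of sales?
DAYS_PER_YEAR = 365  # assumption: calendar days, no allowance for delays


def turns_per_year(payment_cycle_days):
    """Whole number of complete payment cycles that fit into one year."""
    return DAYS_PER_YEAR // payment_cycle_days


for cycle in (60, 90):
    print(cycle, "day cycle:", turns_per_year(cycle), "turns a year")
</code></pre>
<p>A 90 day cycle gives four turns and a 60 day cycle gives six, which is the range quoted; any late payment shortens it further.</p>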
<h3>Online Revenues Will Increase</h3>
<p>Well, they will unless the internet theft scandals are outrageously large. In a search for the best value, comparison shopping online and internet supported purchases over the phone will become more important. Where businesses aren&#8217;t already selling online, there will be pressures now to do so, to reduce the costs of sales. The final result will be that more is sold online &#8211; even though total sales from all sources will decline. There may be some countries where there is a decline in online sales, but it will be less than the decline in total sales &#8211; the internet will be seen as one of the brighter spots. Just. </p>
<p>There&#8217;s only one reason for internet sales to increase. If executed effectively, you can reduce the costs of making the sale. That tends to be less and less true the lower the volume of sales; so expect continued outsourcing to places like Yahoo! Merchant Stores, where the infrastructure is already in place and the development costs already paid, which lowers the cost of sales. The interim step between often dreadful sites with no compelling call to action and a full online sales site is a phone number; but placed on an inactive site that doesn&#8217;t reflect your best price and the unique value in buying from you rather than a competitor, this will have no effect. So expect the more savvy small business to look for CMS based websites where they can update their own content. I&#8217;m expecting that smarter SMEs will be investing more in templated CMS backed websites than the current static crop.</p>
<p>And metrics. Web analytics &#8211; free tools like Google Analytics &#8211; properly used, can reveal a lot. The problem for SMEs is setting up and understanding the data. There should be an increased demand for skills in setting up analytics, and interpreting what is happening. </p>
<h3>Internet Theft Scandals</h3>
<p>With money tight and some smart people out of a job&#8230; expect internet fraud to increase. With businesses looking for excuses to write off problem debts, expect more disclosure of problems, probably triggered by an inadvertently exposed online scam. This is obviously a tentative projection, as it depends on unforeseeable circumstances. However, this year is the first year since 1994 (when I started internet sales seriously), when I can see reasons to add excuses to the balance sheet, some understanding of the risks to businesses and that even corporations are affected by fraudsters, and a large value of online business that probably will increase (at least in relative terms), and an increase in technological capability by scammers and fraudsters. The combined pressures might make it attractive for the first time to attribute losses to technologically sophisticated thieves. </p>
<p>Online security for the average consumer has not improved in any seriously identifiable way since 1994, other than the progressive plugging of vulnerabilities in web browsers and servers. The introduction of Secure HTTP (https, or &#8220;secure server&#8221; technology) back then, provided users with a somewhat artificial degree of confidence. Many web sites offer access to financially significant resources with only an account name and password. Account names and passwords are insufficient for best security practices. Banks now often offer multiple levels of password and CAPTCHA, and may require authentication through encrypted PIN checkers &#8211; beginning to approach the holy trinity of security (&#8220;something that you are, something that you have and something that you know&#8221; &#8211; at least two of those are addressed by the better banks). However, there are still far too many ways to spend money online that are protected only by guessable account names and passwords, and by essentially unprotected mechanisms to send money to scammers. </p>
<p>There will be scandals about this. Probably soon. If there&#8217;s a sniff of a problem, then the offline media will pounce. These media need to pounce and will argue forcefully; their businesses are in decline, and they&#8217;ll need to savage the failures of online businesses to help protect their own. While this would be a significantly negative sum game for all online sales, the benefit of defection by an attacked business will be high; &#8220;We really made money, and we only show a loss because of internet theft we couldn&#8217;t control&#8221; will be an attractive excuse to some business, at some point. And then the skies will open and the extent of internet crime will be an issue. </p>
<p>That will damage sales. Hugely. Even if the effects are already part of current accounting systems, reported as part of business accounts and effectively managed by all online businesses, the *perception* of reductions in safety will be very damaging. I don&#8217;t see any sign of widespread adoption of even &#8220;best practice&#8221; account management &#8211; internet wide consistent login methods and messages, proper understanding of what a &#8220;secure server&#8221; means, and so on. While users are prepared to pay money to sites they can&#8217;t properly identify in return for promises to deliver products they don&#8217;t receive, across national borders (losing nationally accountable and interested police enforcement), this problem will continue to grow. The individual losses are small; the investigations complex. At some point, this will become a more serious problem to deal with than Enron and Madoff. I think it&#8217;ll be this year.</p>
<h3>Consequences</h3>
<p>New advertisers will start by thinking about paid search &#8211; increasing pressures there. This will result in disappointment and claims of click fraud, especially if new advertisers discover that a click is not a click&#8230; different clicks have different values, even if Google conflates clicks from multiple sources. Expect a lot more noise and heat from new advertisers who feel they have been misled.</p>
<p>Existing advertisers will mostly reduce and focus spend. AvCPC will have a downward pressure, counteracted by the SEs including low value inventory &#8211; weakening conversion rates and hiding it in a generally slower market with inbuilt tendencies for longer latency and reduced likelihood of sales. ROI will get worse; the question is whether the business can take a few quarters of bad ROI in order to survive to an upturn. There should be increased attention to improving the web site; however, this will be diluted by the feelings of senior management that enough work has been done on the web site &#8211; &#8220;it&#8217;s a finished product&#8221; &#8211; and that there have been sales, so all that is needed is more visitors. IME, that&#8217;s usually wrong &#8211; there are usually a lot of ways to make a site more effective in selling, even for long latency sales.</p>
<p>There will be a switch to increase SEO efforts. If purchasers will spend more time in research anyway, increasing the clicks and site visits per sale, then organic search results become more interesting. This will increase interest in spammy linking as people learn their way into search engine marketing. There will also be increased interest in affiliate advertising &#8211; if you aren&#8217;t skilled in internet advertising, then getting affiliates to advertise on your behalf will be interesting. However, this again should result in increased interest in web site design improvements; it won&#8217;t, because once a business has &#8220;put its brochure online&#8221; the naive perception is that there&#8217;ll be nothing else to do. There are still far too many naive advertisers with ineffective web sites and a management team that focuses on volume of traffic, not the quality of that traffic and the quality of the response from the website.</p>
<p>There will be a decrease in search volumes for products. Despite the increased research for purchases, search volumes for commercial products will be negatively affected. That&#8217;s partially because the economy is slowing, but mostly because people are learning the online environment; they know where to go to get various things at decent prices, so they don&#8217;t have to do so much searching and researching &#8211; I haven&#8217;t heard anyone complain for the last 18 months that they are &#8220;useless at searching&#8221;. This year may be the first year where the combined effects of knowing your internet neighbourhood and slowing commercial activity actually dent the growth of both paid and organic search.</p>
<p>A significant internet theft scandal will cause a dramatic decrease in online consumer spending, if it bursts. Expect new web browser technology and new web site authentication services. At the moment, I&#8217;m pretty sure that the only way to tackle the identity theft problem is solutions that, as an industry, we&#8217;re nowhere near considering. The short term consequences will be to push the internet clock back towards use in supporting research for purchases, not for actual online purchases.</p>
<p>CMSs will become more important for SMEs. They&#8217;ll finally be able to build customised landing pages, important for improving the effectiveness of paid search.</p>
<p>Web analytics, neglected or misinterpreted, will increase in significance &#8211; and Google Analytics&#8217; low price point makes it a very attractive platform to use, despite the poor general knowledge about how to best use it. The downside of &#8220;free&#8221; is that expectations of the value are low, and users are unwilling to pay a lot to understand the benefits; smarter businesses will invest in learning how to use GA effectively (IOW, not as it comes out of the box).</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2009/01/05/search-engine-marketing-2009-projections/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Google Throws Away The Rules, Again</title>
		<link>http://blog.merjis.com/2008/10/29/google-throws-away-the-rules-again/</link>
		<comments>http://blog.merjis.com/2008/10/29/google-throws-away-the-rules-again/#comments</comments>
		<pubDate>Wed, 29 Oct 2008 06:52:53 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[content match]]></category>
		<category><![CDATA[trust]]></category>
		<category><![CDATA[web analytics]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/10/29/google-throws-away-the-rules-again/</guid>
		<description><![CDATA[Google&#8217;s user base may be built on high reputation with organic search visitor volume, but that doesn&#8217;t prevent the search giant from leading users a merry dance in the pursuit of profit. Here&#8217;s a real world example. I&#8217;ve taken this from a real Google Account, with real web analytics data. I&#8217;ve concealed the precise search [...]]]></description>
			<content:encoded><![CDATA[<p>Google&#8217;s user base may be built on high reputation with organic search visitor volume, but that doesn&#8217;t prevent the search giant from leading users a merry dance in the pursuit of profit. Here&#8217;s a real world example. I&#8217;ve taken this from a real Google Account, with real web analytics data. I&#8217;ve concealed the precise search queries and the exact nature of the client; you can assume that it isn&#8217;t a TV company, so anything that strikes you as bizarre about the scenario, is probably bizarre because that&#8217;s *not* the client :)</p>
<p>Pretend that you advertise for a TV network. The TV company offers live scores on their web site. So you pay for the keywords &#8220;football live scores&#8221; and &#8220;live football scores&#8221;, &#8220;live scores&#8221;, etc. You&#8217;ll bid less for &#8220;live scores&#8221; than for the keywords that include &#8220;football&#8221; &#8211; people also look for &#8220;nba live scores&#8221;, &#8220;cricket live scores&#8221;, etc., and their CTR and conversion rates are lower on your football-only pages.</p>
<p>For this account, you disable Content Match &#8211; this is initially focused on keyword search; eventually you&#8217;ll operate another campaign with differently constructed AdGroups and adverts for the AdSense network. So, no content match for now. </p>
<p>You use Broad, Phrase and Exact Match. Broad match will catch &#8220;cricinfo.com&#8221;, &#8220;crickinfo.com&#8221; and other stuff that is related to sports scores &#8211; so you need to start building up a good long list of negative keywords. But whatever else, at least you know that you have excluded the Content Network.</p>
<p>Despite that&#8230; your advert could appear here:</p>
<p><img id="image236" src="http://blog.merjis.com/wp-content/uploads/2008/10/picture-20.png" alt="&quot;liver scores&quot; matches &quot;live score&quot;" /></p>
<p>Can you tell the difference between this page&#8217;s presentation of a keyword search targeted advert and AdSense? No? I can&#8217;t. If that&#8217;s not content network, then what is it? It isn&#8217;t keyword search in any way that I understand the term. Of course, Google don&#8217;t actually define &#8220;search pages&#8221; and &#8220;content pages&#8221; in their advertising contracts&#8230; So if Google defines this as a &#8220;Search Page&#8221;, no advertiser has a leg to stand on in complaining to them.</p>
<p>Here are the three live campaigns for this client, this month, with identifying details and exact keyword data blocked out. This is just to show that this client has not been paying for contextual adverts:</p>
<p><img id="image241" src="http://blog.merjis.com/wp-content/uploads/2008/10/picture-21-edited.png" alt="Campaigns have no content match enabled" width="600" /></p>
<p>How can I tell that the account has been displaying adverts here? Referral Information. Web server log files and web analytics reveal the source:</p>
<p><img id="image238" src="http://blog.merjis.com/wp-content/uploads/2008/10/picture-22.png" alt="Referral Information Shows Where Clicks Come From, Kinda" width="600" /></p>
<p>See that third line? &#8220;RightHealth.com&#8221;? That&#8217;s the one showing &#8220;Liver Scores&#8221; adverts. You can&#8217;t tell from this snippet *when* that data was collected. I can&#8217;t see a good way to show you that this was collected recently, but I observed this in the client&#8217;s web analytics between the dates 23rd October and 26th October.</p>
<p>Note that referral information can be missing &#8211; some browsers don&#8217;t send it, bots don&#8217;t usually set it, and it can be withheld intentionally under various conditions. Referrer information can also be wrong &#8211; intentionally or unintentionally. The Referrer information comes from the web browser of the user, not Google. So if this referral source has been faked by a user, they&#8217;ve gone to some considerable effort to make the click look like a plausible one &#8211; they&#8217;ll have had to use the keyword to find a page with a spelling variation that could have triggered a content match advert, and then stuffed the referrer with that. The question is&#8230; why would they bother? Until I can understand an economic motive to go to this level of work, I&#8217;m going to assume that, for this instance, there is a low incentive for the user to lie about the referral source.</p>
<p>Had this genuinely been a content network click, there *is* an economic incentive to fake the browser&#8217;s information &#8211; it deflects attention from a real source of low quality clicks. Only the &#8220;<a href="http://blog.merjis.com/2007/07/16/click-fraud-google-adwords-and-gclid/">gclid</a>&#8221; can reveal where the impression was actually served, and Google appear to be the only organisation that can decode the meaning of the gclid value.</p>
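<p>A minimal sketch of the referral check described above, in Python. The function names and the sample URLs are invented for illustration, and the only assumption about AdWords is that a tagged click arrives with a <code>gclid</code> parameter on the landing URL. It pairs each paid click with the host named in the browser-supplied referrer; it cannot decode the gclid itself, and it inherits every weakness of referrer data noted above.</p>
<pre><code class="language-python">from collections import Counter
from urllib.parse import urlsplit, parse_qs


def describe_click(landing_url, referrer):
    """Summarise one logged request: referring host and gclid presence."""
    # The referrer is supplied by the browser, so it may be empty or faked.
    ref_host = urlsplit(referrer).hostname if referrer else None
    query = parse_qs(urlsplit(landing_url).query)
    return {"referrer_host": ref_host, "has_gclid": "gclid" in query}


def paid_click_sources(log_rows):
    """Count clicks that carry a gclid, grouped by referring host."""
    counts = Counter()
    for landing_url, referrer in log_rows:
        click = describe_click(landing_url, referrer)
        if click["has_gclid"]:
            counts[click["referrer_host"]] += 1
    return counts


# Invented sample rows, not data from the client account.
sample = [
    ("http://www.example.com/scores?gclid=TEST1",
     "http://www.google.com/search?q=live+scores"),
    ("http://www.example.com/scores?gclid=TEST2",
     "http://www.righthealth.com/liver-scores"),
    ("http://www.example.com/scores", ""),
]
print(paid_click_sources(sample))
</code></pre>
<p>Any host in that count that is not a search results page deserves a closer look.</p>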
<p>There is one additional clue as to what is happening, and the dates. I was initially running this campaign only on Google Search. I enabled the Partner Search Network on Thursday 23rd this week. Screenshot from the Account History with identifying data removed:</p>
<p><img id="image242" src="http://blog.merjis.com/wp-content/uploads/2008/10/picture-23-edited.png" alt="History - opting into Search Network." width="600" /></p>
<p>These types of click only happen when the Search Network is enabled. They disappear when using only Google Search.</p>
<p>Additionally note that I have excluded 404 pages, war pages, domain parks, etc. from these campaigns. So there&#8217;s no reason for anything remotely resembling a content network advert to appear. I&#8217;ve done my best to remove any reason to show these adverts for anything other than keyword search.</p>
<p>A completely inappropriate context (liver scores), on a targeting method I&#8217;ve denied.</p>
<p>Does it get worse than that?</p>
<p>Could Google have already excluded this click from my client&#8217;s bill? Quite possibly. But who can tell &#8211; Google won&#8217;t offer a click by click itemisation of which clicks are billed and which are excluded. So you have to vet everything and appeal everything. Extra costs for the client &#8211; adding to advertising costs. </p>
<h3>Ignoring the arbitrage</h3>
<p>I&#8217;m not going to get started on the <a href="http://www.antezeta.com/blog/adsense-arbitrage/">arbitrage</a> &#8211; see that Ask advert? <a href="http://www.marketingpilgrim.com/2008/01/back-door-arbitrage-at-google.html">Ask resells Google Keyword Search adverts</a>. So the low cost Content Match advert brings people to Ask for Keyword Search, amplifying Ask revenues. </p>
<p>The value to *advertisers* of this type of arbitrage is a completely different question. </p>
<h3>Questions</h3>
<p>Wall Street Analysts &#8211; want to know why Google&#8217;s results are better? Does this example suggest a mechanism? </p>
<p>Advertisers? Want to know why your ROI goes down as well as up? Do you check your referral sources exist and are relevant to the type of advertising that you paid for? </p>
<p>Search users? Ever wondered why some adverts seem so poorly aimed? It may not be the advertiser choosing inappropriate keywords and sites, it might be Google&#8217;s greed. OTOH, I see so many badly designed AdWords accounts that I wouldn&#8217;t be surprised if you did see a lot of irrelevant adverts.</p>
<p>Why is Google so nasty about landing page quality scores and so ineffectual at matching search queries and adverts? Is it that they have a large stock of unsold inventory and they throw any advert at that to see what sticks? Should advertisers treat Google&#8217;s apparent interest in Quality Scores as a way to raise expectations that Google is as diligent on search queries &#8211; a marketing technique to raise advertisers&#8217; bids in the face of variable quality searches? </p>
<h3>Conclusions</h3>
<p>I&#8217;ve fired off a pretty stroppy email to my account rep. I&#8217;ll let you know how it goes.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/10/29/google-throws-away-the-rules-again/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Google&#8217;s Approach To Click Fraud &#8211; 2007</title>
		<link>http://blog.merjis.com/2008/10/18/googles-approach-to-click-fraud-2007/</link>
		<comments>http://blog.merjis.com/2008/10/18/googles-approach-to-click-fraud-2007/#comments</comments>
		<pubDate>Sat, 18 Oct 2008 10:29:58 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[content match]]></category>
		<category><![CDATA[conversion]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[microeconomics]]></category>
		<category><![CDATA[paid search]]></category>
		<category><![CDATA[trust]]></category>
		<category><![CDATA[web analytics]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/10/18/googles-approach-to-click-fraud-2007/</guid>
		<description><![CDATA[Well, I&#8217;m a year late finding this PDF about Click Fraud, ROI and Advertiser Response by Kourosh Gharachorloo of Google. I was doing a periodic scan to see if anyone else has published how to interpret the autotagged gclid in AdWords. It&#8217;s nice to see that my old article anticipated many of the arguments &#8211; [...]]]></description>
			<content:encoded><![CDATA[<p>Well, I&#8217;m a year late finding this PDF about <a href="http://www.google.com/adwords/adtrafficquality/files/adfraud_anecdotes.pdf">Click Fraud, ROI and Advertiser Response by Kourosh Gharachorloo of Google</a>. I was doing a periodic scan to see if anyone else has published <a href="http://blog.merjis.com/2007/07/16/click-fraud-google-adwords-and-gclid/">how to interpret the autotagged gclid in AdWords</a>. It&#8217;s nice to see that my old article anticipated many of the arguments &#8211; though this is a better paper than my old article in some important ways. More diagrams. Fewer words. </p>
<h3>Weaknesses in the paper</h3>
<p>There are a few embedded misperceptions in the &#8220;AdFraud Anecdotes&#8221; PDF. </p>
<p>I&#8217;ll probably add to this list as I think more about it.</p>
<ul>
<li>Advertisers do not react rationally. That&#8217;s partially a consequence of information holding asymmetry.</li>
<li>Google controls Broad Match and Auction Quorum Sizes to improve Google&#8217;s returns &#8211; and hides the crap search queries in reports as &#8220;18,0000 other unique searches&#8221;</li>
<li>Google is in complete control of the quality of matching, by default. Google controls click quality, advertisers choose bids and can forgo volume by selecting phrase match and exact match, or accept Google&#8217;s decisions about matching. Most of the variation in CPA &#038; ROI that I find can be directly attributed to Google changing the nature of search queries that are being matched.</li>
<li>Advertisers are remarkably reluctant to measure &#8211; because for a small business, developing an understanding of measurement is expensive; so is hiring in the talent to understand the data. I infer that small businesses have a higher percentage of undetected and undetectable click fraud and greater difficulty in establishing a &#8220;useful for management purposes&#8221; CPA (internet noise means data has to be collected over a long period &#8211; I may write up the details of this).</li>
<li>The paper addresses large advertisers quite well. Typical of economists and governments &#8211; but single large entities are *not* representative of the larger numbers of smaller businesses &#8211; scale changes impact.</li>
</ul>
<h3>Strengths of the paper</h3>
<p>This shows that some people within Google clearly understand the role that Google has. </p>
<p>It is interesting that the thinking and insights exposed in this paper *haven&#8217;t* percolated to the AdWords Sales Teams, at least in the UK. I&#8217;ve recently had a rather unsatisfactory meeting with a major account team, who said that Google has no insight into conversion&#8230; Despite our mutual client having AdWords Conversion Tracking enabled for several years. If ROI was important to Google, you&#8217;d have thought they&#8217;d have noticed, and been rather more interested in how recent changes have affected ROI. Wouldn&#8217;t you? Instead, I got the usual lecture about increasing spend&#8230; instead of them giving me techniques to manage AdWords to improve ROI in the face of Google&#8217;s efforts to undermine that. I may write about that more, when I can work out how to disentangle any explanation from any specific client&#8217;s data.</p>
<p>The PDF provides some rather Delphic oracular premonitions of the recent user interface changes, for example, separating metrics between Google Search and the Search Partners and the Content Network; allowing (some) control over domain parks, 404 page advertising, etc. Careful reading is required and may be rewarded. </p>
<p>I may make some predictions about likely further AdWords UI changes, when I&#8217;ve had a bit more of a think&#8230; Frankly, stumbling on this has been a bit of a shock&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/10/18/googles-approach-to-click-fraud-2007/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>AdWords Phishing &#8211; Another Type Of Click Fraud?</title>
		<link>http://blog.merjis.com/2008/07/02/adwords-phishing-another-type-of-click-fraud/</link>
		<comments>http://blog.merjis.com/2008/07/02/adwords-phishing-another-type-of-click-fraud/#comments</comments>
		<pubDate>Wed, 02 Jul 2008 15:07:00 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[malware]]></category>
		<category><![CDATA[phishing]]></category>
		<category><![CDATA[trust]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/07/02/adwords-phishing-another-type-of-click-fraud/</guid>
		<description><![CDATA[I now know what at least one of the scammers is doing. Screenshots of the fraudulent activity signature of this scammer are shown below. If you&#8217;ve clicked on a URL in an email apparently from Google, recently, then you might want to check your AdWords History &#8211; as shown below. If these international fraudsters have [...]]]></description>
			<content:encoded><![CDATA[<p>I now know what at least one of the scammers is doing. Screenshots of the fraudulent activity signature of this scammer are shown below. If you&#8217;ve clicked on a URL in an email apparently from Google, recently, then you might want to check your AdWords History &#8211; as shown below. If these international fraudsters have gained access to your account &#8211; get hold of Google support ASAP, and change both the account you use and the password. </p>
<p>If this is what I think it is, then this is another variety of click fraud &#8211; clicks paid from your money, in your account, that don&#8217;t benefit you but indirectly benefit the scammers and Google. Let&#8217;s have a look at what these scammers do, and see if they are detectable. </p>
<h3>Cautionary Note</h3>
<p>Be careful when you visit a site like this. A classic malware attack is to get people to visit a compromised site, that hosts malware that will enlist your machine in a botnet. If you go looking at stuff like this you need to be very careful that you don&#8217;t get compromised. </p>
<p>I think I&#8217;m pretty safe. I used an old email address that hasn&#8217;t been used for a Google Account previously. I used a unique password for that address &#8211; never previously used. I also changed everything immediately afterwards. I didn&#8217;t connect it with our MCC, and I used an account with no adverts and no funds from any source &#8211; so there&#8217;s no trace of my identity or connection with our business. I also used Flock for the first time &#8211; a social networking variant of Mozilla Firefox &#8211; to avoid any residual cookies and so on. I did most of the initial work in a virtual machine running Windows on my Mac &#8211; making it pretty difficult to penetrate the security &#8211; and when I saw no malware, switched back to Mac OS X for screenshots. I really wouldn&#8217;t advise looking at these criminal activities unless you take at least the steps I used. I expect that someone who has been involved in InfoSec more recently would suggest even more protective measures. </p>
<h3>Domains</h3>
<p>The domains used for this phishing attempt were <em>source-adwords.com</em> and <em>ads-source.com</em>. Like previous domains, these are registered to French mailing addresses. Not the same addresses as the previous round of messages &#8211; so it may be that they are abusing the identity of otherwise innocent parties. Since the scammers aren&#8217;t counting on the domain lasting for enough time to be fully registered, they don&#8217;t really need a real physical address that reaches them. I&#8217;m excluding legal requirements &#8211; these guys are criminals after all, so expecting them to obey any European laws about registering correct business addresses is excessively optimistic. </p>
<p>The name servers they use do seem to be consistent. This may suggest a relationship with some kind of hosting service. I must check that out and see whether these servers are all in the same facility. </p>
<h3>What I Did</h3>
<p>When I got this round of phishing emails, I checked the &#8220;whois&#8221; records, and captured info about the claimed domain owner. I then attempted to log in with a fake password &#8211; looking like a typo of the real password. If it was a malware download, I figured they&#8217;d go for both valid and invalid logins. They don&#8217;t appear to be delivering a malware load, or at least not on the range of sites that I&#8217;ve seen. </p>
<p>Another common Trojan technique is to put up a fake login page, and then issue an error message, even if the right data is submitted, redirecting to the right site &#8211; so at the point at which you become suspicious, you are now looking at the real site. When you rekey your details, they work. Most people assume that they miskeyed the blanked out password. The scammers meanwhile have collected your details and can now login safely. </p>
<p>With a twinge of doubt, I submitted a real account name and password &#8211; knowing that the account was pretty much vanilla, having been just set up and being completely unfunded. That let me in to their stumpy site. Half the links don&#8217;t go anywhere. What it did give me was this offer:</p>
<p><img id="image189" src="http://blog.merjis.com/wp-content/uploads/2008/07/picture-46-anon.png" width="600" alt="SMS Alerts - New! Or, Actually, Not That New And Genuinely Fake." /></p>
<p>Google&#8217;s been offering SMS alerts for some time. I&#8217;ve signed up to them for many of my client accounts, and I know what the screen looks like. This isn&#8217;t it &#8211; and notice the weird check box with pseudo-English offering &#8220;I agree with security types&#8221;? </p>
<p>This screen may give the scammers another revenue opportunity, if you give your cellphone number &#8211; but I don&#8217;t know much about mobile fraud mechanisms, yet. They obviously don&#8217;t care about that mechanism though, because they&#8217;ll gladly accept an empty phone number while giving a message that, yes, they&#8217;ll be giving me alerts. A note for non-US users &#8211; this page may strike you as odd, because it is clearly configured for a US phone number. Inside the US, of course, you won&#8217;t see the number format as jarring. </p>
<p><img id="image190" src="http://blog.merjis.com/wp-content/uploads/2008/07/picture-49-anon.png" width="600" alt="Yay! Register a Null Phone Number, Successfully!" /></p>
<p>If you think you&#8217;ve seen this on your screen, you&#8217;re probably at risk.</p>
<h3>The Evidence Trail</h3>
<p>I briefly did some work in Information Security a few years ago, working with <a href="http://www.uk.capgemini.com/services/technology/security/">CAP Gemini&#8217;s InfoSec teams</a> in the UK, and others. This data is not up to the standards of their digital forensics, but there are some interesting pieces of information we can pick out. </p>
<p><img id="image191" src="http://blog.merjis.com/wp-content/uploads/2008/07/picture-53-anon.png" width="600" alt="Google AdWords History Tool Shows MonetaAccount." /></p>
<p>Oh ho &#8211; here&#8217;s the hot clue &#8211; <em><a href="https://www.monetacorp.com/">MonetaAccount</a></em>? Not something associated with anything I&#8217;ve been doing. Obviously, just as they use multiple people&#8217;s names for the Domains they use, this name may not be unique. I&#8217;d have to look at a few more phishing attempts before seeing the pattern here. Moneta does seem to be associated with mobile phone topups and instant charging. Perhaps this is a way to send clicks to Moneta, or Moneta is being used to extract funds (e.g. asking Google to close the account and send funds to Moneta?). Remember that Moneta may not be directly involved. If they run an affiliate program, this could be a, hrrm, &#8220;excessively enthusiastic&#8221; affiliate, using someone else&#8217;s money.</p>
<p>These guys apparently aren&#8217;t using the AdWords API. If they were, there&#8217;d be a clue in the Access tab of my compromised account. It would probably also be easier for Google to detect and track them down.</p>
<p>However, the speed of checking the account name and password means that the Phishing Server is passing data back to the malicious software pretty quickly. There may be a signature that *Google* could recognise, of attempted access to an account from a known suspicious IP address. I&#8217;ve certainly had no warning that my account IDs have had attempted use, and that I should check my account. </p>
<p>I can think of other techniques they might use, but I&#8217;m not sure that they are using them. I only like to document stuff that I&#8217;m pretty sure they&#8217;ve thought of already. They&#8217;ll be spending a lot longer thinking about and doing this activity than I can afford to spend pre-emptively working out what they do &#8211; an old dilemma for InfoSec. The baddies only have to break your site once to count a win and you have to defend against all the baddies, all the time, and can&#8217;t count coup on a successful defence. </p>
<h3>What Is Google Doing?</h3>
<p>They are clearly working with domain administrators. Neither of these latest sites is now working. </p>
<p>OTOH, Google sent a pretty bland and generic message when I told them that this account was compromised.</p>
<p>Their email also didn&#8217;t clearly explain how to create a new account ID and to change the password, though it said that you should. This sounds like classic advice from a technical wizard, who has no idea that ordinary users have problems translating the words into actions. I&#8217;ve been playing with some screen capture toys for the Mac, so I may make a video about adding accounts and removing access for the old account. It&#8217;s a good reason for all that playtime (well, there&#8217;s another reason, too&#8230; and you might find out about it!)</p>
<p>There appear to be at least two different levels of response that Google offer. If I have an ordinary account, not linked to my MCC, and I use the support contact information, I get a response offering the right suggestions, slightly more slowly than I&#8217;d expect for a security/financially related response, with information that is hard to parse for non-IT/InfoSec literate users. If I have a similar problem with an agency linked account, I can phone, and get specific immediate advice within about 10 minutes and an escalation to a security specialist. </p>
<h3>What Should You Be Doing?</h3>
<p>Since I wrote this article, another of my clients has had suspicious activity. We&#8217;re looking in to it, but it currently appears that a secondary user with a unique user account may have clicked on a phishing message, giving access to some third party who set up an AdSense account link. </p>
<p>So, what can you do to defend yourself?</p>
<ul>
<li>Don&#8217;t click on links in emails that lead to account name and password forms &#8211; type the name directly (PayPal, eBay or AdWords) or use a bookmark that you set up.</li>
<li>Read the URL of the site carefully before you do anything involving secrets or money.</li>
<li>Make sure the secondary account users know the hazards &#8211; and disable (remove) unused secondary accounts to reduce risk.</li>
<li>Read your History log every so often &#8211; exclude bid changes, which are probably the most common activity, and look for the weird events, such as new IDs being added, or new destination URLs being set for keywords and adverts, or new and unexpected campaigns.</li>
</ul>
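<p>The History log check in the last item above could be sketched roughly as follows, assuming the log has been exported as a list of rows. The <code>change_type</code> field and its labels are hypothetical; a real AdWords export may name these events differently.</p>

```python
# Hypothetical change-type labels; a real history export may use other names.
ROUTINE_CHANGES = {"bid_change"}
SUSPICIOUS_CHANGES = {"user_added", "destination_url_changed", "campaign_created"}

def flag_history_rows(rows):
    """Return the rows worth a manual look, skipping routine bid changes."""
    flagged = []
    for row in rows:
        change = row.get("change_type", "")
        if change in ROUTINE_CHANGES:
            continue  # the most common activity, ignored as noise
        if change in SUSPICIOUS_CHANGES:
            flagged.append(row)
    return flagged
```

<p>Anything this returns that you or your colleagues don&#8217;t remember doing deserves a closer look.</p>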
<h3>Summary</h3>
<p>At least one class of AdWords Phishing scam is gaining access to accounts. What they do to accounts with funds is not yet known &#8211; but I&#8217;ll guess that you find new keywords added, with a new destination URL or possibly even new adverts. You may find links to AdSense or new payment processors &#8211; possibly signalling funds being leached through fraudulent clicks or by shutting down the account and stealing residual funds. </p>
<p>Advertisers and agencies should always key in the name of the AdWords site or use a known good personal bookmark. Don&#8217;t use links in email.</p>
<p>Google could do more to authenticate their emails to users and establish that they have access to data that scammers would have to guess. EBay does this &#8211; using a personal name that I have registered so that my account email includes details that are only shared between me and eBay. This helps me to trust those emails more.</p>
<p>Google&#8217;s explanations are not yet clear enough for ordinary members of the public to manage a problem. A clearer, step by step description of the process to change account details would help. </p>
<p>Google could more positively warn users. I&#8217;m expecting that the scammers use a characteristic signature for account access &#8211; a server somewhere that is logging in. If they were really smart, they&#8217;d use a botnet and have a compromised home machine access the account. However, there is a pattern for legitimate users. Most AdWords users will use the same IP addresses or ranges. Google could establish the normal pattern and then send emails to warn of abnormal patterns, when those abnormal patterns have gained account access and performed a signature set of activities. The signature looks detectable&#8230; </p>
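<p>A minimal sketch of that &#8220;normal pattern&#8221; idea, assuming a list of the IPv4 addresses an account has logged in from before. The function names and the /24 grouping are illustrative assumptions, not a description of anything Google actually does.</p>

```python
import ipaddress

def build_baseline(known_ips, prefix=24):
    """Collapse past login IPs (IPv4 assumed) into the networks they usually come from."""
    return {ipaddress.ip_network(f"{ip}/{prefix}", strict=False) for ip in known_ips}

def is_unusual_login(ip, baseline):
    """True when the login address falls outside every network seen before."""
    addr = ipaddress.ip_address(ip)
    return not any(addr in net for net in baseline)
```

<p>A login flagged as unusual, followed by the signature set of account changes, is the point at which a warning email would be useful. A botnet using a compromised machine in the advertiser&#8217;s own range would slip past a check like this.</p>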
<p>I&#8217;d like to see some active blogging from Google about this threat to advertisers, and how Google is protecting us. After all, activity undetected by advertisers is cash in Google&#8217;s pocket &#8211; not fraud. It only becomes fraud when known. There is an incentive for Google to brush this under the carpet and recognise more spend as revenue and less as fraudulent activity. If Google want to be my friend, then they have to act like it, and not hide the evidence of third party malfeasance.</p>
<p>While the frequency of these scam messages has been increasing recently, I suspect that the volume of click fraud is low. So it is a low risk &#8211; but a likely high impact for each affected account. </p>
<p>I suspect that the main subjects of this scam will be small volume advertisers, who are not AdWords daily usage experts &#8211; they won&#8217;t know what normal Google messages are like or what the evolving UI now looks like.</p>
<p>Looks like Moneta should be involved &#8211; the AdWords History Tool shows the Moneta Account ID. Backtracing within Moneta should allow identifying the perpetrators.</p>
<h3>Updates</h3>
<p>2008-07-03 Added new details of a real attack. Language tidy up. New section on activity you can do for self protection. Clarity on investigative activity vs screenshots &#8211; sense should be unchanged but now clearer why the screenshots are Mac based. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/07/02/adwords-phishing-another-type-of-click-fraud/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Malware Detection Breaks Web Metrics?</title>
		<link>http://blog.merjis.com/2008/06/15/malware-detection-breaks-web-metrics/</link>
		<comments>http://blog.merjis.com/2008/06/15/malware-detection-breaks-web-metrics/#comments</comments>
		<pubDate>Sun, 15 Jun 2008 20:36:08 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[malware]]></category>
		<category><![CDATA[web analytics]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/06/15/malware-detection-breaks-web-metrics/</guid>
		<description><![CDATA[The Register has an article about web analytics problems caused by the malware detection of an anti-virus package. This may have implications for advertisers and SEO, too. I have not downloaded and tried this anti-virus package, yet. I didn&#8217;t see any white papers or clear explanation for the AVG LinkScanner technology on the AVG site. [...]]]></description>
			<content:encoded><![CDATA[<p>The Register has an <a href="http://www.theregister.co.uk/2008/06/13/avg_scanner_skews_web_traffic_numbers/">article about web analytics problems caused by the malware detection</a> of an anti-virus package. This may have implications for advertisers and SEO, too. I have not downloaded and tried this anti-virus package, yet. I didn&#8217;t see any white papers or clear explanation for the AVG LinkScanner technology on the AVG site. Come Monday, when I sit in front of a Mac with a Windows license, I&#8217;ll investigate further. </p>
<p>If <a href="http://nonbovine-ruminations.blogspot.com/2007/12/on-cade-metz.html">Cade Metz</a> is correct, then this software might be causing some interesting problems for Google and Google&#8217;s customers and possibly even for SEOs. Why? Because every time a customer of Version 8 of the AVG Technologies Linkscanner tool does a search, the tool checks all the links on the returned results page. It does so by emulating user behaviour. That means that web server log file based web analytics packages may have problems, because the interaction looks like users visit and mostly bounce immediately. The volume of apparent users would increase, and all the increase would be associated with increasing bounce rates. I can imagine spending a lot of time working on client web sites to try and solve that conversion decrease problem &#8211; only to discover that my time was wasted by this tool. </p>
<p>I can understand why AVG Technologies have gone this route &#8211; but I think there are better ways to implement what is needed than this. The current implementation apparently causes a lot of other problems. Let&#8217;s have a look at what the problems might be &#8211; admittedly this is speculative, until I actually test the wretched stuff&#8230; No, I don&#8217;t deeply trust reporters to get things completely accurate. I&#8217;ve been quoted in news myself, and so have some of my colleagues and clients; I know how the quest for a story can lead to implications that weren&#8217;t in the technical origins of the item. That link for Cade Metz above takes you to a page that says he&#8217;s pretty good and trustworthy, though. </p>
<h3>Things That Jump Out As Problems</h3>
<p>Apart from the suspected clicking on paid search adverts? Well, this software visits sites and tries to look like a real user. So if Google doesn&#8217;t have a threat signature for this activity, it&#8217;ll feed AdSense adverts to those pages. You won&#8217;t get clicks on those adverts &#8211; driving CTR down &#8211; but this will affect all advertisers, so you should not *relatively* suffer. But there will still be ugly questions asked of Google about decreasing CTR and to the web marketing team about why the latest innocuous changes cause response rates to collapse&#8230;</p>
<p>And, of course, the advertisers running CPM adverts (e.g. placement targeted adverts) will pay for every thousand impressions &#8211; even if they are just bogus web page loads generated by software. So advertisers will have to expect an increased vigilance and the possibility that Google might be missing a chunk of fake clicks. Now, while Google could find a signature for this (ten or twenty requests for links off the same page), advertisers can&#8217;t &#8211; they just get an extra hit each time there&#8217;s a search results page on which they appear. Advertisers have no clue that everyone else on that page also gets a hit&#8230; which means that Google could hide the problem and just reap higher impression rates and more clicks. </p>
<p>So the key advertiser problems are possible click-throughs on paid adverts, which might cost advertisers additional clicks that should be identifiable as invalid, and additional CPM payments for content match adverts on the pages that get visited.</p>
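<p>The signature suggested above (ten or twenty requests for links off the same results page) could be sketched like this. It assumes a click log that records which results page each request came from, which only the search engine has; the tuple layout, names and thresholds are all hypothetical, and the total time span stands in for a proper sliding window.</p>

```python
from collections import defaultdict

def find_prefetch_bursts(clicks, min_links=10, window_seconds=5.0):
    """clicks: iterable of (client_id, results_page_id, link, timestamp_seconds)."""
    grouped = defaultdict(list)
    for client_id, page_id, link, ts in clicks:
        grouped[(client_id, page_id)].append((ts, link))
    bursts = []
    for key, hits in grouped.items():
        hits.sort()
        span = hits[-1][0] - hits[0][0]            # first to last request
        distinct_links = {link for _, link in hits}
        # Many different links off one results page, all within a few seconds.
        if len(distinct_links) >= min_links and window_seconds >= span:
            bursts.append(key)
    return bursts
```

<p>An advertiser&#8217;s own logs only ever contain one of those ten or twenty requests, which is why this check is out of reach from the advertiser side.</p>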
<h3>Tough On Threats. Easy On You. Looks Like Malware.</h3>
<p>AVG has focused on servicing user needs. Normally that&#8217;s a good thing. <a href="http://www.google.com/corporate/tenthings.html">Google also focuses on user needs</a>. One of the golden goals of marketing is to satisfy an unstated need. &#8220;Needs&#8221; are crucial to effective marketing. </p>
<p>However, destroying the value of analytics systems, and causing additional advertiser costs and provoking web management teams worldwide to go into a tizzy over declining conversion rates and increased bounce rates&#8230; could be seen as less than helpful. </p>
<p>This smacks more of an unwillingness to look at the problem properly, or ignorance, than of malice. However, the effects on advertisers, web analytics providers, Google, hosting services and so on &#8211; well, that&#8217;s quite a substantial group of offended suppliers. </p>
<p>This section title, BTW, is derived from AVG&#8217;s corporate tagline&#8230; I thought it was mildly humorous, anyway, in a satirical way. </p>
<h3>Better Ways To Build A Mousetrap</h3>
<p>I&#8217;m a fan of Akismet. This is the tool that protects comments for this blog. Akismet has a central repository of known bad comments. When a comment is submitted to this blog, Akismet looks for the same comment in the repository. If found, then the comment is tagged as spam. Frequent and worthwhile commenters get passed by Akismet. However, I sometimes get new comments in my moderation queue from people that the software wants me to review. If I approve the comment, then it is added to the central repository as a positive for that commenter &#8211; but if it is marked as spam, then it joins the other spammy comments in the DB. That way the community gets to check and rate, and not everyone has to look at every comment &#8211; only those freshly exposed to new comments and commenters are asked to evaluate. It is an effective tool &#8211; though I can think of techniques that might undermine it. Of much more than 10,000 comments tagged as spammy, I&#8217;ve personally been asked to review fewer than a hundred. That&#8217;s an insignificant ratio and effective protection. </p>
<p>Using a similar technology might slow lookup for AVG (one query with ten items to a central DB, followed by some number of visits for sites with insufficient records; versus at least ten visits per search query) or speed it up &#8211; I haven&#8217;t done the math; it isn&#8217;t my product and I&#8217;m more concerned to make sure I have usable analytics. With my InfoSec-stuff hat on top of my AdWords and Internet Marketing hats, I&#8217;d prefer something that would allow a statistical sampling of a site by a range of browsers from different IP addresses, by means that reduce the load on web servers and minimise the perversion of web analytics.</p>
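<p>A rough sketch of that lookup pattern: one batched query against a central store, then visits only for the links with too little history. A plain dictionary stands in for the central DB here, and the record fields and the <code>min_reports</code> threshold are hypothetical; none of this describes how AVG or Akismet are actually built.</p>

```python
def plan_link_checks(result_links, reputation_db, min_reports=5):
    """Split links into those answered centrally and those still needing a visit."""
    known, to_visit = {}, []
    for link in result_links:
        record = reputation_db.get(link)
        if record is not None and record["reports"] >= min_reports:
            known[link] = record["verdict"]   # answered without touching the site
        else:
            to_visit.append(link)             # insufficient records, scan it directly
    return known, to_visit
```

<p>With a well populated store, most search results pages would need no site visits at all, so the target sites&#8217; analytics stay clean.</p>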
<h3>Things That I Need To Check</h3>
<p>Size of the problem&#8230; Cade Metz estimates up to 20 million users of AVG. AVG claim to be fourth in size on the AV market. In reasonably mature global markets like AntiVirus products, I usually assume that fourth typically means single figure percentage &#8211; somewhere in the 5-9% share range. That may not sound a lot, but 20 million seats is still a good sized business. If all 20 million used this link checker, then for search usages, it looks like 200 million or so users. That&#8217;s noticeable, especially given that for many businesses search (organic and paid) is a substantial fraction of all visitors. So this doesn&#8217;t look like a small problem &#8211; but I do want to invest a little time in confirming those numbers from some recent market share data and something that gives the market size. If it proves that AVG only have 1% of the market and less than 10% of all users use AV, the problem is a non-issue. :)</p>
<p>Cade Metz didn&#8217;t clearly state whether the AVG link checker executes JavaScript. If it is a browser plugin, I can imagine that it might hook to deep layers that allow it to look at the results of executing JavaScript. That would mean that it could also submit the image requests used by JS based web beacons/Page Bugs as part of its investigation to discover malware on the target site. So even Google Analytics, NedStat, CoreMetrics and Omniture would not be immune from perverted statistics. If the LinkChecker doesn&#8217;t check for images fed by JS based Page Bugs, then it misses a source of possibly compromised image files &#8211; so good InfoSec practice should be to check those servers, because if I were malicious, that&#8217;s where I&#8217;d hide a payload. </p>
<p>It isn&#8217;t clear from the article whether this LinkChecker does anything with Flash. <a href="http://blog.merjis.com/2006/11/01/tracking-with-flash-cookies/">Web Analytics and user tracking with Flash Cookies</a> are increasingly popular &#8211; users mostly don&#8217;t know about them, and web browsers don&#8217;t have mechanisms to clear them, unlike ordinary cookies. If this malware checker is to be effective, it probably should be looking at Flash Cookies, as I can imagine that these might be used as an attack vector. So even Flash Cookie based web analytics could be affected. </p>
<p>It isn&#8217;t clear from the article whether the AVG LinkChecker *only* looks at organic search results or whether it also clicks through paid search links. If it does, then advertisers will see unusually high CTR from users with the LinkChecker installed, and will see conversion rates decrease and conversion costs increase. This is&#8230; undesirable&#8230; it&#8217;s a form of invalid click, normally a result of some type of malware! </p>
<p>However, if this tool is supposed to defend against malware, then the programmers that make malware can adapt by using low cost adverts &#8211; because if the LinkChecker *doesn&#8217;t* check adverts, that&#8217;s where anyone with the wit to write effective code will hide the payload. Duplicate an OK site under a new URL, submit 2 cent adverts and hope to pay 1 cent if the volume and CTR can be made high enough. Takes a bit of work, but I can imagine doing it. </p>
<p>That&#8217;s a bit of a nasty problem. If you don&#8217;t check the advertised target site, then you might offer malware loaded sites to users. If you do check, you increase advertisers&#8217; costs and increase the accusations against Google that it is a rip off. Pragmatically &#8211; AVG is responsible for the implementation &#8211; if they click on my clients&#8217; adverts with no intention of purchasing, and cause cash to flow to Google&#8230; if the pattern is visible to Google it should be an invalid click and no cost should be paid by my client. So AVG&#8217;s implementation has a cost implication to Google (tracking and denying this stuff costs), and an indirect cost resulting from further increased mismatch between Google Analytics and Google AdWords, thereby increasing fears of click fraud. </p>
<p>More subtly &#8211; what is it that triggers this code into doing a malware check? Is it the name of the site (so have AVG hardcoded all the Google domain variants?) and how do they recognise stuff like Google Custom Search Engines? What about the Google Search Box on various sites &#8211; do they recognise those results pages? What about Yahoo and MSN Live? My guess is that if this has been implemented to work as widely as possible, then they&#8217;ll be looking at the URL parameters (tags) to see whether they look like Search Engine tags, and perhaps coupling it with some kind of wildcard (regex) matching for a built in list of major search engines. If they haven&#8217;t, then they are missing at least 30% of search activity. </p>
<p>If they do check on signatures of search, then they may also catch some non-search sites &#8211; e.g. CMS and product catalogue sites and in-site searches, identifying links on those pages as being worth checking for malware. Oh dear. </p>
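<p>To make that guess concrete, here is a minimal sketch of how such detection might work. It is not AVG&#8217;s implementation; the host patterns, the parameter names and the function name are all assumptions chosen for illustration.</p>

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical built-in list of major engines, matched with a regex.
SEARCH_HOSTS = re.compile(
    r"(^|\.)(google\.[a-z.]+|search\.yahoo\.com|search\.live\.com)$"
)
# Hypothetical query parameter names ("tags") that suggest a search.
QUERY_PARAMS = ("q", "p", "query")

def classify(url):
    """Label a URL by how much it resembles a search results page."""
    parts = urlparse(url)
    params = parse_qs(parts.query)
    has_query_tag = any(name in params for name in QUERY_PARAMS)
    known_host = bool(SEARCH_HOSTS.search(parts.hostname or ""))
    if known_host and has_query_tag:
        return "search engine"
    if has_query_tag:
        # In-site searches and catalogue pages land here: the false positives.
        return "possible search"
    return "not search"

print(classify("http://www.google.co.uk/search?q=cheap+holiday"))
print(classify("http://shop.example.com/catalogue?q=red+shoes"))
print(classify("http://example.com/about"))
```

<p>The middle case is the worrying one: a tool that trusts the parameter names alone would treat any catalogue or in-site search page as a results page worth scanning.</p>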
<p>We do have some tools for checking web server log files and we have clients with multi-GB of compressed log files per day&#8230; So I can check for the signature and get some clue as to the magnitude of the problem. However, chances are that AVG has focused on specific segments. If those segments don&#8217;t overlap with my clients&#8217; segments, then I will underestimate the global impact. If the segments overlap significantly, then I&#8217;ll overestimate the effect. </p>
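<p>A rough sketch of that kind of log check follows. It assumes Combined Log Format, where the user agent is the last quoted field, and it uses a placeholder signature string; substitute whatever signature is actually observed in the logs.</p>

```python
import re

# Placeholder only: replace with the user-agent signature seen in real logs.
SIGNATURE = "EXAMPLE-LINK-CHECKER"

# In Combined Log Format the user agent is the last quoted field on the line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')

def signature_share(lines, signature=SIGNATURE):
    """Fraction of parseable log lines whose user agent contains the signature."""
    total = hits = 0
    for line in lines:
        match = USER_AGENT.search(line)
        if not match:
            continue  # skip lines that do not look like Combined Log Format
        total += 1
        if signature in match.group(1):
            hits += 1
    return hits / total if total else 0.0

sample = [
    '1.2.3.4 - - [15/Jun/2008:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"http://www.google.com/search?q=x" "Mozilla/4.0 (EXAMPLE-LINK-CHECKER)"',
    '5.6.7.8 - - [15/Jun/2008:10:00:05 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "Mozilla/5.0 (Windows; U)"',
]
print(signature_share(sample))  # 0.5
```

<p>For compressed logs, the same function can be fed a file opened with <code>gzip.open(path, "rt")</code>, so multi-GB files are streamed line by line rather than loaded whole.</p>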
<h3>Actions?</h3>
<p>Apart from firing up Windows on Monday, I&#8217;m going to write to AVG. If their customer base is the size they claim (fourth largest AV solution) then this malware tool is likely to account for something in the range of 25% to 50% of search engine traffic from users with an active AV installed. That&#8217;s possibly quite a lot. </p>
<p>If The Register article is based on anything real, then this could have a significant adverse effect on metrics. Even worse, the effect will involve an escalating number of users. You can&#8217;t just apply a fixed offset (e.g. &#8220;subtract 5%&#8221;) that works for all time &#8211; it&#8217;ll need tweaking as the AVG customers upgrade. I need to check some log files on Monday and get some idea of the impact.</p>
<p>This may change some of the SEO and conversion improvement activities that I have ongoing. At least until we have retrospectively cleared recent months&#8217; analysis of user behaviour. Some analytics packages can&#8217;t (won&#8217;t) do this. For example, a Google Analytics filter added today to remove the signature takes effect from the time it was written, not retrospectively. So my client stats only work properly from today forwards, and until AVG change the signature. </p>
<h3>Conclusions</h3>
<p>The web is getting complex. Complex enough that efforts intended to achieve protection against malware can be interpreted by a different community also using the web, as malware. </p>
<p>More investigation is needed to see how this product handles JavaScript, Flash, other search engines and paid search adverts.</p>
<p>The original report suggests that this will make a difference to some advertising metrics &#8211; mostly making it look as though there are more searches on at least organic rankings than before, and that a very low percentage of these will convert. This could mean increased advertising costs, until Google add an invalid click pattern. Google has to do this as individual advertisers will be unable to see the coordinated clicking on Google. It would be good to hear from Google (Analytics Blog, Inside AdWords blog, Ghosemajumder&#8217;s blog) how malware like this is handled. </p>
<p>If Google handle this properly, and the impact is as large as it seems, advertisers should be looking for an increased invalid click and impression rate &#8211; possibly both on content match and search. </p>
<p>The link checker could imply increased CTR and decreased ROI, by reducing the likelihood of sale in response to a visit. </p>
<p>I&#8217;m still thinking about this one. It could account for some stuff that I&#8217;ve seen on some client sites over the last month or so. I&#8217;ll just have to wait until Monday to get a good start on it.  I&#8217;ll try to flag any updates&#8230;</p>
<h3>Related Links</h3>
<p><a href="http://www.webmasterworld.com/search_engine_spiders/3657953.htm">WebMasterWorld thread about signatures and webmaster responses to link scanners</a>.</p>
<p><a href="http://www.seomoz.org/blog/analytics-spam-coming-to-an-internet-near-you">SEOmoz article about malware abuses of Web Analytics</a>.</p>
<h3>Updates</h3>
<p>2008-06-15 &#8211; even before I publish&#8230; I&#8217;ve heard back from Pat Bitton, Head of Global Communications at AVG. I&#8217;m impressed that they are actively involved in the emerging news about this, rather than hiding till it blows over. </p>
<p>2008-06-16 &#8211; Fixed some typos. Edits for clarity. Heard back a second time from Pat Bitton &#8211; AVG have read this blog article. I&#8217;m beginning to think that Cade Metz should be on my watch list. Very interesting issue he dug up. AVG 8 Trial Version download now complete. Installation in progress.</p>
<p>2008-06-17 &#8211; Installed AVG 8 Trial. AVG 8 offers a toolbar with a Yahoo!Search default. Ran a few tests. LinkScanner shows site check mark in MSIE for Google, but doesn&#8217;t for Yahoo. Why would the preferred search engine not have link checks? Odd. Is this a way to penalise Google in some weird way? Or an oversight to neglect checking Yahoo? Must do some more thinking about what is happening here, and some more experiments and reading of log files. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/06/15/malware-detection-breaks-web-metrics/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Is AdWords Search History Permutation Fraudulent?</title>
		<link>http://blog.merjis.com/2008/03/17/is-adwords-search-history-permutation-fraudulent/</link>
		<comments>http://blog.merjis.com/2008/03/17/is-adwords-search-history-permutation-fraudulent/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 12:40:35 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[conversion]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[intent]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[microeconomics]]></category>
		<category><![CDATA[trust]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/03/17/is-adwords-search-history-permutation-fraudulent/</guid>
		<description><![CDATA[Update 2009/02 I can no longer detect Search History Permutation using the diagnostic tests that I previously used &#8211; I believe that this is no longer operating, or it has become more subtle in its effects. Original article My first article on Search History usage was experiential; you can do the searches yourself and see [...]]]></description>
			<content:encoded><![CDATA[<h3>Update</h3>
<p>2009/02 I can no longer detect Search History Permutation using the diagnostic tests that I previously used &#8211; I believe that this is no longer operating, or it has become more subtle in its effects. </p>
<h3>Original article</h3>
<p>My first article on Search History usage was experiential; you can <a href="http://blog.merjis.com/2008/03/11/adwords-died-2008-rest-in-peace/">do the searches yourself and see the strange results</a>. This article offers a different type of explanation with a lot more detailed argument. It raises the question for me &#8211; is Google&#8217;s use of AdWords Search History to generate adverts for unrequested keywords, fraudulent? The development of the argument travels through marketing, micro-economics and a bit of stats. I&#8217;ve tried to set a background that makes this easier to follow &#8211; but it is a bit lengthy. Sorry &#8217;bout that.</p>
<h3>Signs And Portents</h3>
<p>I treat AdWords a bit like cryptography, or black box control engineering. Messages go in (adverts, keywords, bids, geotargets, etc) and translated messages come out (impressions, clicks, conversions, paid prices, etc). Google does something in the middle. It&#8217;s part of my job to infer what Google does, and manipulate inputs to optimise the output. There&#8217;s a bunch of fancy techniques that could be used (Simultaneous Perturbation Stochastic Approximation, for example &#8211; what a wonderful name, eh?), but much of it comes down to, in my opinion, marketing messages, the buying process and micro-economics. The complex maths get their day only with large data sets and high volumes. </p>
<p>For small advertisers, the volume of data needed to make techniques like SPSA and Taguchi work is hard to achieve and pay for, and is subject to a lot of statistical noise caused by uncontrolled and/or unmeasured factors such as seasonality, market price fluctuations, confidence, organic search results, press releases, <a href="http://blog.merjis.com/2006/11/24/adwords-qs-is-bs/">Google&#8217;s changing policies and procedures</a>, <a href="http://blog.merjis.com/2006/11/21/google-adwords-editorial-review-hazards-and-workrounds/">the Editorial Review process</a>, <a href="http://blog.merjis.com/2007/05/25/trademarks-and-google-adwords/">trademarking protection</a>, and so on. Mostly I rely on techniques that appear to work in general cases, with tuning when enough historical records are established. </p>
<p>I determinedly read messages, signs, sigils and portents from Google &#8211; blogs, Wall Street Analysts reports, Nielsen, the occasional goat sacrificed at midnight, etc. But the most important pieces of evidence come from the AdWords account, from web analytics or web server log files, web site contents and visitor trajectory, and from looking at competitive advertising. </p>
<p>The consequence of a lot of thinking, experiments, and the experience derived from spending a few million of my clients&#8217; funds in advertising budgets, is that I&#8217;ve developed some techniques that give me the most data for interpretation. I&#8217;m quite happy to share these techniques &#8211; my primary value to my clients is &#8220;insight&#8221;. Paid search work is just one tool in that shed. </p>
<p>I&#8217;ve found that it is most useful to focus on Exact and Phrase matched AdGroups. There&#8217;s a specific way to set these up to get the greatest signal &#8211; I&#8217;ll detail it some other time. But the effect is that even if the bid for exact matched keywords drops, the phrase match and broad match adverts don&#8217;t run &#8211; think &#8220;negative exact keywords&#8221; and &#8220;negative phrase keywords&#8221;.  Also think &#8220;This&#8217;ll Stop Google From Abusing Expanded Broad Match And Destroying My ROI&#8221;. </p>
<p>This technique reveals some intriguing things about users and search. Most importantly, it used to show how and when users repeated searches. </p>
<h3>Characteristics Of Search</h3>
<p>If you have a rare search query, as an exact matched keyword, it might only attract a few hundred impressions per year. As Google have become more experienced, they&#8217;ve started aggregating small volume queries, so this signal is harder to see now, than it was in 2003-2007. However, old records show something interesting. If you have one of these &#8220;long tail&#8221; queries, then you see that they tend to get used in bursts. You get no activity in months, and then you get a run of 2-15 (usually around three to six) searches with the same query. Then it goes quiet again, perhaps for months. </p>
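<p>The burst pattern can be picked out of timestamped records with something like the sketch below. It assumes you have a timestamp for each occurrence of the query (from web server logs, say) and treats any quiet gap of more than a day as the end of a burst; both the data source and the 24 hour threshold are assumptions.</p>

```python
from datetime import datetime, timedelta

def find_bursts(timestamps, max_gap=timedelta(hours=24)):
    """Group search times into bursts separated by more than max_gap."""
    bursts = []
    for moment in sorted(timestamps):
        # Start a new burst at the first record or after a long quiet gap.
        if not bursts or moment - bursts[-1][-1] > max_gap:
            bursts.append([moment])
        else:
            bursts[-1].append(moment)
    return bursts

times = [
    datetime(2006, 3, 1, 9, 0),
    datetime(2006, 3, 1, 9, 20),
    datetime(2006, 3, 1, 14, 5),
    datetime(2006, 7, 10, 11, 0),
    datetime(2006, 7, 10, 11, 30),
]
print([len(burst) for burst in find_bursts(times)])  # [3, 2]
```

<p>A long tail query whose bursts are mostly three to six searches wide, with months between them, is the signature described above.</p>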
<p>Why? Because one searcher has decided to find something and is repeating their search. I know that I do this. I&#8217;ve seen other people do it. You are trying to research around a problem and you keep coming back with the same query, perhaps to look at more results on the page, or to get a second or third page. </p>
<p>Patterns of search query repetition appear to vary in some ways that are amenable to analysis. AFAICS, it changes with how the search is regarded by the user (what I call the &#8220;intent&#8221;), and the phase of the buying process. If the searcher is in the early phases, then they do a lot more repeat searches. If they are late phase, for example after making up their minds to buy from a vendor &#8211; then the number of repeated searches tends to be small &#8211; they know what they want and they just have to find it. A characteristic migration of phrases might be to start with &#8220;Cheap Holiday&#8221; multiple times, then some specific searches for types or locations of holiday which may have some repetitive components, and ending up with a single search query for &#8220;Expedia.com&#8221;, &#8220;travelocity&#8221; or some other site that was identified as having <strong>the</strong> product to buy. </p>
<p>This evolution of search queries interacts with the new search history permutation mechanism in a regrettable way. </p>
<p>When a user repeats the searches, the results stabilise, and are just like the results you&#8217;d normally get from AdWords. If there&#8217;s been a break in activity, then a new search after the break duration is treated as if there is no search history. For example, type &#8220;Brazilian Tax Credit&#8221; tonight as the last search you do, then come back after a night&#8217;s sleep and search for &#8220;US Vacation&#8221; and you *won&#8217;t* see the Brazilian Holiday and US Tax adverts (from the example in the previous article). Well, I don&#8217;t, anyway. </p>
<p>Prospective buyers, early in the buying process, appear to repeat searches. If these users walk up to a cold machine, boot and start searching, they see nothing different caused by the search history. However, early phase searchers are not, usually, close to buying &#8211; by definition. You can, of course, help accelerate them to purchase with the right advert copy and the right content on your site. But the general behaviour *appears* to be that multiple repeat searches are mostly coming from people who are unlikely to buy now, today. </p>
<p>Early phase searchers may not even be looking at paid search adverts seriously &#8211; I believe that I tend to see lower CTR&#8217;s for keywords where I regard the intent as being weak; early phase searchers often seem to use organic results rather than paid search, because organic results will often lead to pages with discussion of alternatives, while paid search typically leads to a specific solution. In other words: consistently targeted adverts are shown to the group who probably could benefit from some message variation (use those alternate creatives, marketeers!). These early phase searchers are also, typically, the lowest converting group &#8211; because they may get distracted by a different type of solution or decide to not continue with the purchase process, or the latency exceeds the 30 day AdWords Conversion Tracking or the default Google Analytics 6 month tracking or ritual cookie deletion.</p>
<p>In the later phases, I believe that people vary the search more. This is where they are comparing features, benefits, finding discussions and threads to justify their decision. These later phase prospects are the ones that see randomised adverts, because the search history is inviting bizarre extraneous keywords to the party. </p>
<p>The consequence of the evolution of search queries is that the most bizarre conjunctions of irrelevant adverts are shown to the users with the highest interest in buying right now. </p>
<p>This is not desirable for advertisers. This is not helpful for search users. Why do it?</p>
<h3>Permutation</h3>
<p>The nature of the Search History interaction appears to be a simple combinatorial permutation. That is, given &#8220;cheap holiday&#8221; and &#8220;us vacation&#8221;, it probably generates &#8220;cheap vacation&#8221;, &#8220;cheap us&#8221;, &#8220;us holiday&#8221; and &#8220;holiday vacation&#8221;. While this example generates some plausibly interesting candidate keywords for searchers, it is demonstrably weakening the answers to the most recent request. See the two examples in the previous article for details. </p>
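<p>As a sketch of the behaviour I think I&#8217;m seeing (this is inference from the outside, not Google&#8217;s code), pairing each word of the previous query with each word of the current one reproduces the four candidates above. Word order within each pair is arbitrary here, and the function name is made up for illustration.</p>

```python
from itertools import product

def cross_pairs(previous_query, current_query):
    """Pair every word of the previous query with every word of the current one."""
    previous_words = previous_query.lower().split()
    current_words = current_query.lower().split()
    return [" ".join(pair) for pair in product(previous_words, current_words)]

print(cross_pairs("cheap holiday", "us vacation"))
# ['cheap us', 'cheap vacation', 'holiday us', 'holiday vacation']
```

<p>None of the generated candidates is the query the user actually typed most recently, which is the heart of the complaint.</p>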
<p>The core question is whether permutations of the current and previous search query generate plausible searches that the user might have made. If this was a valid technique, might we not expect it to have been used for organic search, first? </p>
<p>That is to say, if Google, who are probably about the largest, most rational and experimentationist entrepreneurial organisation on the planet have *NOT* used permutation to improve organic search results, why would anyone imagine that it benefits paid search to do so? I&#8217;m not aware of any previous activity in paid search that has resulted in organic search following the behaviour. I can see that Google copies lessons learned from organic search into paid search. You want some examples?</p>
<ul>
<li>404 checking &#8211; 404s are lethal for organic search, so Google verifies that paid search pages are present.</li>
<li>Landing Page Quality Scores &#8211; appear to be based on what makes a good result for organic search (though I have a nasty suspicion that the nature of search evolution and the buying process implies that there *should* be a difference between organic and paid search optimal landing pages).
</li>
</ul>
<p>This paid search usage of the search history appears to be novel, and appears to offer a reduction in the overall effectiveness of searches. Combinatorial explosion also means that with lengthier queries, the number of combinations is increased markedly. Some of these may well overlap with keywords that have a higher bid and/or a better CTR &#8211; so these will be favoured over other adverts that are more directly relevant to the searcher&#8217;s intent. </p>
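<p>How fast the candidate list grows depends on a mechanism I can only guess at. The sketch below counts candidates under two assumed schemes: one word taken from each query, and any non-empty group of words taken from each query.</p>

```python
def pair_count(previous_len, current_len):
    """Candidates when one word is taken from each query."""
    return previous_len * current_len

def subset_count(previous_len, current_len):
    """Candidates when any non-empty group of words is taken from each query."""
    return (2 ** previous_len - 1) * (2 ** current_len - 1)

# Growth as both queries lengthen from two words to five.
for words in range(2, 6):
    print(words, pair_count(words, words), subset_count(words, words))
```

<p>Under the first scheme two four-word queries give 16 candidates; under the second they give 225. Either way, longer queries give the auction far more chances to collide with a high bid keyword.</p>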
<h3>Markets and Market Prices</h3>
<p>OK, now we&#8217;re in the dismal world of Economics. </p>
<p>If I&#8217;m an advert trader, I can set up multiple markets for bidding for placement. I might set up one market for &#8220;ford prefect&#8221; and another market for &#8220;edsel&#8221;. This is like Google &#8211; there used to be a market for each keyword. Google&#8217;s Broad Match extended the search queries that were involved in the market, but usually in identifiably sane ways. So a market for the keyword &#8220;cheap holiday&#8221; could include search queries for related concepts such as &#8220;cheap vacation&#8221;, &#8220;free holiday&#8221;, places known for or associated with inexpensive vacations such as &#8220;cancun vacation&#8221; or &#8220;holiday cyprus&#8221;, and even specific company names closely associated with cheap holidays. </p>
<p>The more of these other search queries that I can recruit to my market, the more price competition I can engender. This is why, I think, Google likes paid search users to use Broad Match by default &#8211; you participate in a larger market, and that makes it easier to obtain a higher price in the auction. More bidders implies a higher price from the market &#8211; I can&#8217;t currently think of a counterexample in which more participants recruited to an auction will reduce the price struck. The generalised second price auction apparently used by Google doesn&#8217;t seem to offer a way to allow larger volumes of bidders to reduce the price that is struck, compared to an auction with fewer participants. </p>
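<p>A much simplified sketch shows why. It assumes a single advert slot, no quality score and no reserve price, and it assumes existing bidders don&#8217;t change their bids when someone new joins; the real auction has several slots and quality adjustments, so treat this as an illustration only.</p>

```python
def price_paid_by_winner(bids):
    """Second price rule: the winner pays the next highest bid."""
    ranked = sorted(bids, reverse=True)
    return ranked[1] if len(ranked) > 1 else 0.0

original = [1.00, 0.40]
print(price_paid_by_winner(original))           # 0.4
print(price_paid_by_winner(original + [0.70]))  # 0.7
print(price_paid_by_winner(original + [0.10]))  # 0.4
```

<p>A recruited bidder either slots in below the current runner up and changes nothing, or slots in above and raises the price. Under these assumptions the price struck never falls.</p>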
<p>Indeed, this observation is the entire basis of an industry &#8211; keyword finders for SEO and paid search. Find the rare keywords in which the auction includes fewest bidders, and, the argument for this industry goes, you have found the auction in which you can pay the least to obtain traffic (or use SEO to get that &#8220;free traffic&#8221;). Of course, Broad Match negates this industry for paid search to a great extent. I have (seriously) considered setting up AdWords for well established brands, in which I simply used the brand name, with carefully selected negative keywords, to do all the work. Google will gleefully match well established brands with the primary and even secondary characteristics for their main search queries. Why do work you don&#8217;t need to, eh? </p>
<p>So, given Google&#8217;s ability to extend an inference for, for example, the keyword &#8220;thomas&#8221; to imply &#8220;cheap holiday&#8221;, or the search query &#8220;thompson cooke&#8221; to bring visitors to a major travel site (check your web server log files and observe the mismatch between keyword and search queries &#8211; that&#8217;s where the evidence is), why would Google need to use permutation to increase the auction? </p>
<p>As another example, look at what happens in organic search. You miskey &#8220;erlers danlos syndrome&#8221; and it offers to correct the spelling &#8211; because the history of interactions and the nature of searches have taught Google what the normalised spelling should be. Users have become used to the idea that Google will correctly guess the question that was intended. Why would paid search uniquely require a type of search query expansion that is *not* used for organic search conceptual extensions? </p>
<p>I believe that Google uses search history permutation to create an artificial market. As well as related search queries, the search history drags in unrelated queries. So a chain of &#8220;Brazilian Tax Credit&#8221; and &#8220;US Vacation&#8221; searches yields combinations including &#8220;us credit&#8221;, &#8220;brazilian vacation&#8221; and so on &#8211; and you can see these exact adverts shown in the primary article. *BUT* it is a false market. These advertisers had no strong intention to appear on those searches. If they did, they&#8217;d have used those keywords &#8211; just as an organic search would have extended the searches to include those results. If Google had seen a relationship, then Broad Match should bring in those queries to the advertiser. </p>
<p>Is this fraud? I&#8217;m no legal expert &#8211; I have no real idea if this is technically a fraud. But doing things to paid search that you wouldn&#8217;t do for organic search at least raises the question of whether this is deceptive advertising for Google &#8211; if advertisers were operating in the reasonable expectation that search for keywords worked like search for search queries, then might Google have implicitly broken the contractual expectation for advertisers?  If you create a market that recruits bidders who don&#8217;t really want to participate, so you can increase the paid price in the auction, is that fraudulent? </p>
<h3>Local Optimisation vs Global Optimisation</h3>
<p>I suspect that the essence of the problem is an attempt to locally optimise. For example, if you are a programmer, you can expend effort to refactor an algorithm for a local optimum, and later discover that the global performance has been damaged. </p>
<p>I suspect that this is what Google have done. They&#8217;ve apparently decided to stop showing adverts on searches where few people clicked on adverts. They&#8217;ve compensated for this by using a piece of information that they hold &#8211; your last search &#8211; and created a higher value market for the adverts. This increases revenue in that transaction. But I think that this creates a problem for global optimisation, and Google&#8217;s brand value &#8211; it also does very little for advertisers. Users seeing irrelevant adverts will also often blame the advertiser for stupidity, as much as they blame Google for poor quality matching. </p>
<p>Optimising my clients&#8217; accounts won&#8217;t help much, though. The problem is that I <em>can&#8217;t</em> optimise my clients. Google has created a strange new market, apparently to boost the Average CPC at the point at which people are most likely to have abruptly shifted search focus, and are generating the most random results at the point when someone is about to buy. I can&#8217;t control Google&#8217;s recruitment of irrelevant advertisers. Even if my adverts are exactly on focus for the intent, when pushed down the list of results, I get fewer impressions and a lower CTR &#8211; because people don&#8217;t want to go wading through random garbage results. I might even get pushed off the first page. It&#8217;s simply frustrating, because it is beyond my control. </p>
<p>The adverts that I most need to place, are the ones that Google has apparently decided are a reasonable place to extract additional revenue from advertisers.</p>
<h3>Monetisation</h3>
<p>Historically, Google has succeeded by consistently producing good search results. They resisted introducing paid search until they had a model that worked &#8211; no inline search results, and an apparently rigid wall between paid and organic search results. I am in awe at the subtlety that AdWords used to have. It was a fantastic tool that I used for market research, as well as lead generation or sales.</p>
<p>The virtuous cycle for Google has been that good search results meant more user recommendation and so a growth of the user base. Advertisers like being able to reach a large audience. </p>
<p>Note how this depends on the quality of the search results page. </p>
<p>Evans and Wurster&#8217;s 1999 book, &#8220;Blown To Bits&#8221; (<a href="http://harvardbusinessonline.hbsp.harvard.edu/b02/en/common/item_detail.jhtml;jsessionid=IQAK4QCVE30CWAKRGWDSELQBKE0YIISW?id=877X">Harvard Business Press</a> &#8211; no commission on this link, folks) was the first book that clearly explained, to me, the theoretical underpinnings of the value of Yahoo, and tied it to marketing principles that made sense for me. This book is also a great guide to monetisation and market share. In other words, it explains the economics and marketing principles behind what seems to have been a core Google strategy for the last decade:</p>
<p>By delivering the best page of search results, Google stays ahead of the competition, and thereby dominates search.</p>
<p>As we&#8217;ve seen, permutation damages the likelihood, for later phases of the buying process, that Google will give the *best* page of search results. It may be a better page than competitors, but it is not the absolute best page that Google could deliver. Arguably, when looking for a commercial solution, paid search is a better answer than organic. It depends on the market, and the state of organic results. I know that I have, and have had, over the years, clients with huge CTR&#8217;s and high conversion rates, because the organic results are wrong for final phase product oriented searches. </p>
<p>Does this damage the Google brand in the eyes of the search user? I think it does, and I have some clues that point to this. </p>
<p>Paid search vendors are intermittently criticised by industry watchdogs for failing to adequately clarify that paid search results are not chosen on the same basis as organic search results. In conversation, users will often hold that the number one advert shown by Google has been selected with the same or similar criteria as the number one organic search result. There is certainly a higher intrinsic CTR for the number one advert position, especially if it appears above the organic search results. </p>
<p>It is intriguing, isn&#8217;t it, that industry commentators and so many paid search booklets of &#8220;affiliate secrets&#8221;, emphasise that appearing as the Number One paid search result may actually dampen overall ROI. I believe that this is because it recruits searchers who are in the wrong phases. Instead of being at the point to consider purchase, these searchers are still discovering what it is they need to know, before they go shopping. They also trust Google to deliver relevant results. </p>
<p>What will happen to users as they approach the final phases of an attempted purchase? They will see a largely static list of advertisers for each repeated query turn into a set of adverts in which perhaps half or more of the adverts have no direct relationship to the search intent. I believe from purely personal observation and extensions of marketing psychology, that users will extend less trust to paid search results, especially when the top position from advertising is clearly off-topic &#8211; as it is, accidentally, in both of the test cases that I previously published. </p>
<p>Far from demonstrating Google&#8217;s long term commitment to improving search results, permuting the recent search history appears to weaken page relevance, and this drives down the long term monetisation of search &#8211; because it means that other vendors offer a relatively better search results page. This is really good news for Yahoo! and MSN. Their adverts will actually have a higher trust placed on them, so long as the auction is kept reasonably fair and uses only the contextual extensions that organic search would use to extend the keyword to similar *weighted* searches. If I was a user, looking for things to buy, and in final phase search, I&#8217;d change my strategy to use MSN or Yahoo! &#8211; because I&#8217;d get more consistently focused adverts, that help me achieve my ends. </p>
<h3>Advertiser Effects</h3>
<p>Because Search History Permutation enlists more advertisers, the effect is quite interesting. Painful. But definitely interesting. It seems to differ according to your normal bidding strategy. </p>
<p>If you typically bid below position 5, then you are likely to be pushed further down the page, or even off the first page. Impression rates will decline, sometimes markedly. CTR will typically decrease somewhat, though not hugely, I think. You should see that your average position is declining, as irrelevant adverts are brought in above you. </p>
<p>If you typically bid above position 5, you may be recruited to show adverts on irrelevant searches &#8211; your adverts have a high CPM, making them valuable for Google to show elsewhere. The result is booming impression volumes, as your adverts are shown on searches you don&#8217;t care about &#8211; and your CTR crashes down. I think that your average position will be largely unaffected. </p>
<p>Any way you slice and dice it, this looks like a poor move for advertisers. </p>
<h3>Summary</h3>
<p>Searchers evolve their search queries during the buying process. From my research, the greatest variation in search queries appears to happen as searchers near the actual purchase. The final search appears to consist of a single query &#8211; for the business that was identified as the right one to buy from. This is statistical inference &#8211; so for some users there will be a single search, and for other users there may be multiple repeat searches that lead to a sale; the argument is not invalidated by a single counterexample &#8211; you need a chunk of data to reveal that this is wrong. </p>
<p>Google&#8217;s current use of search history recruits new keywords to participate in the auction, resulting in the appearance of adverts for which the searcher has demonstrated no interest.</p>
<p>This appears to be something that is not in the best interests of searchers.</p>
<p>This appears to be something that is not in the best interests of advertisers.</p>
<p>This appears to be something that is intended to increase revenues to Google, by manipulating the conditions in which bids are evaluated. </p>
<p>The exposure to additional participants in the auction appears to allow a higher price to be returned from the auction. </p>
<p>This appears to damage the long term brand value of Google, at least for advertisers and probably for searchers. </p>
<p>It appears to offer comfort to Yahoo and MSN search &#8211; their page values are now relatively higher than they were before Google made this change. </p>
<p>If this isn&#8217;t evil, what is it? </p>
<p>If this isn&#8217;t fraudulent activity, what is it?</p>
<p>This appears to be yet another example of covert messages from Google that indicate how they really think. Ignore the babble about how much they love you. Google apparently thinks of advertisers as gullible stooges who are there for the money they yield. Advertisers, for Google, it seems, are a resource that demands no respect. They won&#8217;t tell you this, but it is how they treat you &#8211; sending you clicks that won&#8217;t convert is great for Google and lethal for your business. Removing Advertisers&#8217; ability to control click quality is great for Google and destructive for the relationship between Google and Advertiser. </p>
<p>Trust is earned, but it is fragile. Google keep destroying the trust that their organic search results have built, when dealing with advertisers.</p>
<h3>Update</h3>
<p>2008-03-18 Slightly clarified criticism of Google, bringing it back to AdWords, in the summary. I vacillate as to how much damage this does to the relationship between Google and Searcher, and between Advertiser and Searcher. Seeing <a href="http://blog.merjis.com/2006/11/03/wsj-google-content-match-advertisers-and-abuse/">irrelevant AdWords adverts</a> is often, but not always, blamed on stupid advertisers. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/03/17/is-adwords-search-history-permutation-fraudulent/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>AdWords Search History Permutation &#8211; Short Form</title>
		<link>http://blog.merjis.com/2008/03/17/adwords-search-history-permutation-short-form/</link>
		<comments>http://blog.merjis.com/2008/03/17/adwords-search-history-permutation-short-form/#comments</comments>
		<pubDate>Mon, 17 Mar 2008 11:32:42 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[adwords]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[google]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/03/17/adwords-search-history-permutation-short-form/</guid>
		<description><![CDATA[Update 2009/02 I can no longer detect Search History Permutation using the diagnostic tests that I previously used &#8211; I believe that this is no longer operating, or it has become more subtle in its effects. Original Article Are you suffering from a lower CTR recently? Have your conversion rates declined? Have your impression rates [...]]]></description>
			<content:encoded><![CDATA[<h3>Update</h3>
<p>2009/02 I can no longer detect Search History Permutation using the diagnostic tests that I previously used &#8211; I believe that this is no longer operating, or it has become more subtle in its effects. </p>
<h3>Original Article</h3>
<p>Are you suffering from a lower CTR recently? Have your conversion rates declined? Have your impression rates declined, or suddenly boomed? I think I know why. Here&#8217;s a simple description of the problem that Google has caused. I&#8217;ve a <a href="http://blog.merjis.com/2008/03/17/is-adwords-search-history-permutation-fraudulent/">longer article, with more of the background</a>.</p>
<p>At some point (I don&#8217;t have an exact date and I&#8217;m still digging through client logs) Google changed the basis on which adverts are shown. Until this change, Google would look at what the user typed in a search query, and deliver adverts where the advertiser had selected a keyword that matched the search query. Google has not announced that they are now doing Search History Permutation. </p>
<p>After the introduction of the Search History Permutation, Google takes all the words in the previous search, and all the words in the current search, and jumbles them to make new search queries.</p>
<p>The result is that adverts mix relevant results and bizarre conjoined searches &#8211; even if the series of searches conducted by a user is related, the result of permutation can deliver completely irrelevant adverts. </p>
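<p>A minimal sketch of the behaviour described above may help. It models only the inference made in this article, not any documented Google algorithm; the function name <code>permuted_queries</code> and the choice of two-word combinations are assumptions made for illustration.</p>

```python
from itertools import permutations


def permuted_queries(previous_search, current_search, size=2):
    """Return every ordered `size`-word query built from both searches.

    Hypothetical illustration only: the real matching logic is not public.
    """
    words = []
    for word in (previous_search + " " + current_search).lower().split():
        if word not in words:  # keep the first occurrence, drop repeats
            words.append(word)
    return [" ".join(combo) for combo in permutations(words, size)]


queries = permuted_queries("Brazilian tax credit", "US vacation")
print(len(queries))   # number of candidate two-word queries
print(queries[:5])    # a few of the jumbled queries
```

<p>Most of the generated queries, such as "tax vacation", match neither search, which is how an advertiser bidding on an unrelated keyword could be pulled into the auction.</p>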
<p>This is not AdWords, as I&#8217;ve used it from 2004 to 2007. This is a different AdWords. It is less effective. It is more expensive to use. It is less under my control &#8211; I can&#8217;t find any simple way for advertisers to prevent appearing on irrelevant search queries.</p>
<p><img id="image154" src="http://blog.merjis.com/wp-content/uploads/2008/03/us-vacation-after-brazil.png" alt="US Vacation Search, after &quot;Brazilian Tax Credit&quot;" width="600"/></p>
<p><img id="image155" src="http://blog.merjis.com/wp-content/uploads/2008/03/us-vacation-after-cheap-holiday.png" alt="US Vacation Search Result, after Cheap Holiday search." width="600"/></p>
<p>Look at these two search results clips. Both searches are for &#8220;US Vacation&#8221;, but the search that preceded each one was different. Notice that the Organic Search results are the same. Only paid search has changed &#8211; and it has changed so that the top position results are irrelevant. This decreases the CTR for what should have been the number one advert. It has pushed previously relevant adverts to lower positions, or off the first page &#8211; so for some advertisers, impression volumes will have crashed. The recruitment of irrelevant adverts means that *high bidding* adverts will get increased impressions, as they will be dragged into irrelevant searches more frequently.</p>
<p>The general effect of this though, will be plummeting CTRs. It&#8217;s almost as if Google was spamming its own pages!</p>
<h3>Summary</h3>
<p>I believe that Google has fundamentally changed the nature of AdWords, without any contractual variation or notice. If so, this is not ethical behaviour by Google, and it might not be legal &#8211; I don&#8217;t know enough about US law to say &#8211; but it really, really sucks.</p>
<p>I believe that the change benefits Google &#8211; by having more high bidding advertisers in every auction.</p>
<p>I believe that the change does not benefit users. </p>
<p>Longer term, I believe that this damages the brand value of Google, by destroying the basic proposition that Google offered to search users &#8211; that Google would deliver the best page of search results. By delivering adverts from irrelevant searches, Google has reduced the value of the page, for advertisers and for users. </p>
<p>I have a *much* longer article, that covers all of this, in more detail. :(</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/03/17/adwords-search-history-permutation-short-form/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Automating Content Network Management &#8211; Part 1</title>
		<link>http://blog.merjis.com/2008/02/12/automating-content-network-management-part-1/</link>
		<comments>http://blog.merjis.com/2008/02/12/automating-content-network-management-part-1/#comments</comments>
		<pubDate>Tue, 12 Feb 2008 00:00:01 +0000</pubDate>
		<dc:creator>Jeremy Chatfield</dc:creator>
				<category><![CDATA[advert automation]]></category>
		<category><![CDATA[adwords]]></category>
		<category><![CDATA[API]]></category>
		<category><![CDATA[click fraud]]></category>
		<category><![CDATA[content match]]></category>
		<category><![CDATA[conversion]]></category>
		<category><![CDATA[web analytics]]></category>

		<guid isPermaLink="false">http://blog.merjis.com/2008/02/12/automating-content-network-management-part-1/</guid>
		<description><![CDATA[About three years ago (2005), we started efforts to automatically improve performance of the AdWords Content Network, for advertisers. We were hoping to develop a product, but we were also doing some research to see how things worked and what lessons we could learn. This is intended to be part 1 of a multipart article, [...]]]></description>
			<content:encoded><![CDATA[<p>About three years ago (2005), we started efforts to automatically improve performance of the AdWords Content Network, for advertisers. We were hoping to develop a product, but we were also doing some research to see how things worked and what lessons we could learn. This is intended to be part 1 of a multipart article, focusing on the background for the most obvious techniques to improve performance. In part 2, I expect to cover more of the practical issues and some more of the observations we made. </p>
<p>The Content Network is Google&#8217;s name, in AdWords, for most of the AdSense network. There&#8217;s an additional component in AdWords that accounts for the rest &#8211; &#8220;Placement Targeting&#8221;. You need a different campaign type for Placement Targeting, and you may find that sites that are presented using the Content Network are not available using Placement Targeting, even if they deliver a good ROI for you. I do not know, and have not researched, why some Content Network sites are  unavailable for Placement Targeting. </p>
<p>Managing keyword search and content match in a single campaign is fraught with problems &#8211; detailed in other articles, <a href="http://blog.merjis.com/2006/12/15/twelve-ways-plus-a-bit-less-a-few/">here</a> and <a href="http://adwords.blogspot.com/2007/12/google-content-network-tips-part-3.html">elsewhere</a>. As long as three years ago, we&#8217;d recognised the issues and were already optimising separated Content Network campaigns. The problem that we were trying to resolve was that, even with separated content match campaigns, we had great difficulty in achieving a good volume of conversions at a competitive ROI. Every time we drove up the volume, the ROI got worse. Every time we drove to a great ROI, we lost volume. </p>
<p>After some investigation of the actual sites that drove business or drove unproductive costs, we became unhappy with the state of many of the sites on which adverts were published. Generally either because the content match yielded low relevance sites, or because the sites were constructed around steering traffic to adverts &#8211; and sites like either of these categories generally showed the lowest conversion rates. </p>
<p>The graphic below shows how a UK English user, with no search history in Chinese, a browser set to English, in receipt of Chinese language spam in Gmail, is shown Chinese adverts. This sort of &#8220;just the content&#8221; matching is partially why the Content Network has such a low CTR. There are other reasons to do with the Buying Process and Intent, but contextually irrelevant matches through word matching are probably one of the main causes: </p>
<p><img id="image143" src="http://blog.merjis.com/wp-content/uploads/2008/02/content-network-mismatch-chinese.png" alt="Chinese Language Advert Triggered By Chinese Spam in Gmail" /></p>
<h3>The Content Network &#038; Gilligan&#8217;s Island</h3>
<p>We decided that the only way to escape this trap (the inverse correlation of ROI and volume) was to exclude sites that absorbed money, but delivered no business. The measurement system was to be Google&#8217;s own <a href="http://blog.merjis.com/2007/04/03/macros-analytics-paid-search-performance-improvement/">AdWords Conversion Tracking</a> (GACT) and the metric would be unique measured conversions. </p>
<p>GACT does not offer impression tracking, so conversions would only be trackable if the visitor clicked on an advert. In addition, Google appears to prefer the first click in a 30 day window &#8211; so if the visitor clicked first on a paid search advert and later returned via content match adverts, then the conversion would be attributed to a paid search. It is *NOT* clear to me what Google does when the first click exceeds the 30 day cookie window &#8211; for example, perhaps the second click could now be counted as a conversion in the 30 day cookie window, providing new attribution of a previously ascribed sale, to a later click. Using measurements of this stuff is pretty complex, especially when you use someone else&#8217;s stats collection and reporting system and their documentation is kept simple. However, if the products that are sold are unlikely to be repeat sales over a 30 day window, GACT stats will be much more useful than not having them. </p>
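<p>The first-click reading of the 30 day window can be sketched as follows. This is an illustration of the behaviour as it appears from the outside, not GACT&#8217;s documented logic; the function name and the treatment of clicks older than 30 days (they are simply ignored, so a later click picks up the sale) are assumptions.</p>

```python
from datetime import datetime, timedelta

COOKIE_WINDOW = timedelta(days=30)  # window length quoted in the text


def attribute_conversion(clicks, converted_at):
    """Return the channel of the earliest click inside the cookie window.

    `clicks` is a list of (datetime, channel) tuples. Returns None when no
    click falls within the 30 days before the conversion.
    """
    in_window = [
        (when, channel)
        for when, channel in clicks
        if timedelta(0) <= converted_at - when <= COOKIE_WINDOW
    ]
    if not in_window:
        return None
    return min(in_window)[1]  # earliest timestamp wins


clicks = [
    (datetime(2008, 1, 1), "paid search"),
    (datetime(2008, 1, 20), "content match"),
]
print(attribute_conversion(clicks, datetime(2008, 1, 25)))
print(attribute_conversion(clicks, datetime(2008, 2, 10)))
```

<p>In the second call the paid search click is 40 days old, so under this assumed rule the sale moves to the content match click.</p>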
<p>There are other techniques to optimise the Content Network. For example, the choice of <a href="http://blog.merjis.com/2008/01/28/google-doing-less-evil/">geotarget, keywords, the budget and bid</a> all affect where the impressions are delivered. Google also probably has controls, hidden from the advertiser, that affect the choice of sites for publishing. We infer the likely presence of these Google-managed controls, from periodic changes in behaviour that would otherwise require seismic shifts in the way that the internet works. Groups of sites and the flexibility of content matching varying in huge ways over a period of days; I contend that the simplest explanation is that Google is testing ways to optimise their revenue, rather than that substantial sections of the internet change allegiances to publisher networks in such short time periods &#8211; especially when some of the sites coming and going clearly follow &#8220;Made For AdSense&#8221; templates. </p>
<p>To do the site optimisation research, we assumed that geotarget, advert copy, keywords and budgets had been optimised by another system. The product research area was to investigate whether it was possible to optimise site exclusion with automated techniques. Complicating it further was that at that time, Google restricted accounts to 500 site exclusions &#8211; this may sound like a lot, but it can be exhausted by a moderate sized advertiser in a month or so. This limit was relaxed in late 2006, IIRC. </p>
<p>If the project worked, then a major human exercise in identifying sites that were likely to perform poorly, could be automated. This would save costs for advertisers by denying unproductive spend with Google, and reduce management fees for users, allowing a larger user base for my business. The causes of the unproductive spend could be click fraud &#8211; the Content Network seems a particularly ripe target for fraud &#8211; or could be users who just weren&#8217;t anywhere near ready to purchase, outside the 30 day window allowed by Google&#8217;s cookie and that of most web analytics packages, or poor relevance. For the purposes of improving ROI and volume, the exercise is not specifically to tackle any single cause, but to maximise the profit whatever the problems are. However, when we get to the next article, we&#8217;ll see that the causes of low performance are intricately bound up with performance improvement. </p>
<p>We decided that there were two main techniques for optimising site exclusions, which need not be exclusive. The rational basis for decision making is to use measured ROI. The other technique was to train an Artificial Intelligence (we picked a neural network) to choose sites as likely to be effective, or ineffective, after a single click from that site.  Indeed, for the purposes of training a neural network to recognise low performing sites, we could use the ROI technique to identify how sites performed, and then use those sites as the learning sets for the AI. If the predictive powers of the AI are good, after training, then sites can be rapidly identified as being more or less likely to generate sales, saving serious costs. </p>
<p>So here, then, was our escape from the trap. Use economically justifiable techniques to identify sites. Build up a database of sites that work and sites that fail. Infer the characteristics, using an AI, that will allow us to select likely poor performing sites and high performing sites after just a single click. Would this be enough to get us off Gilligan&#8217;s annoying Island, or see us still trapped and ready for the next episode? </p>
<h3>Identifying Poorly Performing Sites Through ROI Targets</h3>
<p>The first technique is a basic &#8220;proof of incompetence&#8221; test. Assuming that the bid, budget and advert have been optimised, then ROI calculations will offer a count of clicks. Example:</p>
<p><code>Average Paid Price is $0.10</code></p>
<p><code>ROI Target is $10.00</code></p>
<p><code>Average number of clicks to achieve ROI = $10.00 / $0.10 = 100 clicks</code></p>
<p>There&#8217;s a particularly complicated calculation to work backwards from this, to the number of clicks you must see before you can assume with a specific confidence level that you are not going to achieve the target ROI. I&#8217;ll run through a simplified version of the thinking that leads to the calculation. </p>
<p>Imagine that the first 100 clicks do not lead to a sale&#8230; can you turn off the advertising to that site? No. Because click 101 could yield a sale. So long as we then get a second sale before click 300, we&#8217;re still achieving an average of 100 clicks per sale &#8211; to a definable confidence level. The more clicks and sales we see, the higher the confidence level. So if we got one sale in 1,000,000 clicks, we&#8217;d be very confident that we were not going to average 100 clicks per conversion. OTOH, if we saw 10,000 orders in 1,000,000 clicks, we&#8217;d be very confident that even if we saw 500 clicks and no sales, that it was not likely to be a sustained problem.</p>
<p>For simplicity, let&#8217;s double the target ROI click volume&#8230; In this example, we must see 200 clicks and no sales at all in order to decide that this site is unsuitable *for this offer*. It may be suitable for a different offer, of course&#8230; Because Content Network matching is literal, irrelevant advertising is a frequent hazard (look at the screenshot of a <a href="http://blog.merjis.com/2008/02/07/spam-in-comments-unattributed-content/">poor match to a page resulting from a search</a> for &#8220;akismet-admin&#8221;, offering Windows XP Registry tweaking &#8211; completely irrelevant and probably matched on &#8220;admin&#8221;). </p>
<p>Using the numbers above, the proof that the site under consideration will not convert, is a spend of $20.00 (twice the ROI target). The confidence level for this is pretty low; it&#8217;s better than random, but still quite low. However, stick with it and let&#8217;s see where it gets us. </p>
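<p>The simplified thinking above can be put into numbers. The sketch below assumes each click is an independent trial with a fixed conversion rate (a binomial model), which is a simplification and not the full calculation alluded to earlier; the function names are illustrative.</p>

```python
import math


def breakeven_clicks(target_cost_per_sale, average_click_price):
    """Clicks per sale you can afford at the target (100 in the example)."""
    return round(target_cost_per_sale / average_click_price)


def chance_of_no_sales(clicks, conversion_rate):
    """Probability that a site converting at `conversion_rate` shows zero
    sales in `clicks` independent clicks (binomial assumption)."""
    return (1.0 - conversion_rate) ** clicks


def clicks_for_confidence(conversion_rate, confidence):
    """Smallest number of sale-free clicks needed before rejecting the site
    at the requested confidence level."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - conversion_rate))


target_clicks = breakeven_clicks(10.00, 0.10)
rate = 1.0 / target_clicks          # one sale per 100 clicks
print(target_clicks)
print(chance_of_no_sales(2 * target_clicks, rate))
print(clicks_for_confidence(rate, 0.95))
```

<p>Under these assumptions, a site that really does convert once per 100 clicks would still show no sales in 200 clicks about 13% of the time, so the doubled threshold wrongly rejects a fair number of acceptable sites.</p>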
<p>If you have only a few sites appearing in content network AdGroups, then this additional payment over the ROI, to prove that sites *can&#8217;t* make the ROI, is a burden initially. Say that you have ten sites under consideration, and they have equal volumes of traffic. Overall you may be achieving a $20.00 ROI, and so you need to double the performance (remember that all other factors have been optimised &#8211; we&#8217;re now only looking at which sites to exclude). So we need to at least halve the number of sites involved. That means we need to waste 5 times $20.00 (the proof of incompetence level) to identify sites that definitely won&#8217;t work in the target ROI range &#8211; a &#8220;wasted&#8221; $100.00. However, that means that the remaining sites must achieve an ROI of $5.00 or the client&#8217;s target has been failed&#8230; </p>
<p>In practice, you need to allow a significant overspend on useless sites, in order to ensure that you end up with a collection of sites that achieve $10.00 average ROI. The overspend is a function of the number of sites to which you are exposed. The larger the publishing network, the less likely you are to see the same site repeatedly appear and the more sites that appear with low spend levels below the point at which you can reject them. </p>
<p>Now, that&#8217;s assuming an unrealistically simple model. Let&#8217;s make that model more complex and closer to reality. </p>
<p>The usual behaviour in the content network is a profile similar to a power law (Zipf&#8217;s law). The distribution of clicks and impressions will tend to follow a curve with a few sites that attract a lot of impressions and clicks, and a lot of sites that attract a few impressions and clicks. For a large client (think $10,000 per month, testing on the Content Network) this might translate to around 2,000 sites, of which fewer than ten will achieve or exceed the $20.00 target spend, and the vast majority of which have a handful of clicks. </p>
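<p>To get a feel for how a long-tailed distribution starves most sites of enough spend to judge them, here is a rough sketch. It assumes an idealised Zipf curve, where the share of the site ranked k is proportional to 1/k raised to an exponent. The exponent and the function names are assumptions, and the count it produces is very sensitive to that exponent, so do not expect it to reproduce the figures quoted above for a real account.</p>

```python
def zipf_spend(total_budget, site_count, exponent=1.0):
    """Split a budget across sites ranked 1..site_count, with the share of
    rank k proportional to 1 / k**exponent (an assumed Zipf-like curve)."""
    weights = [1.0 / rank ** exponent for rank in range(1, site_count + 1)]
    scale = total_budget / sum(weights)
    return [weight * scale for weight in weights]


def sites_reaching(spend_per_site, threshold):
    """Count sites whose spend meets or exceeds the decision threshold."""
    return sum(1 for spend in spend_per_site if spend >= threshold)


spend = zipf_spend(10_000.0, 2_000)
print(sites_reaching(spend, 20.0), "of", len(spend), "sites reach $20.00")
```

<p>Whatever exponent is chosen, only a small minority of the sites reach the decision threshold in a month, and the rest leave you with too few clicks to accept or reject them.</p>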
<p>This means that the client faces a first month cost of $10,000, but has only positively identified a handful of sites as meeting or failing to meet the success criterion. We can exclude these sites (make them a placement targeting target if they worked, otherwise we exclude them). We now have another month of spend&#8230; and a similar sort of ratio. A handful of sites will rise above the detection threshold, and in addition to the previous thousand sites, we&#8217;ll see four or five hundred new sites, and of the sites that we previously saw, we won&#8217;t see half of them again, this month. So the spend gradually increases, but the rate of accreting validated sites is just a few identified sites every month, with an ever increasing count of sites that we&#8217;ve not previously seen (though this rate of adding new sites declines &#8211; but is subject to factors outside the control of the advertiser). </p>
<p>Particularly note that the ROI is being measured with respect to each site. This means the overall campaign ROI will be much worse than the target. In the first month, there&#8217;ll be quite a bit of explaining to worried clients that this is just what was expected and that it will eventually get better. IMO, that&#8217;s not a message that most new clients will be happy to hear&#8230; so this technique isn&#8217;t very &#8220;client friendly&#8221;&#8230;. You&#8217;ve just charged them a bunch to set up this complex stuff and the first thing that happens is an overspend against the ROI target. OTOH, they could have stayed with the old system and have achieved pretty much the same result. This isn&#8217;t a good harbinger. This makes for a specific type of sales problem, that you probably recognise from products that you&#8217;ve used. </p>
<p>It takes many months to identify a large enough group of repeating conversions to present a useful collection of validated sites&#8230; but each month we see a $10k spend. It shouldn&#8217;t take any more explanation to reveal that payback times using this technique are *very* long.  Practically, the reliability is low because it depends on the sites with success continuing to succeed&#8230;. </p>
<p>I&#8217;ve developed a sincere scepticism of assumptions of sustained performance of a site in the content network, over the years I&#8217;ve been using it. I&#8217;ve found sites that provided repeat conversions and then after a while, the site changes tactics, or Google tweaks their hidden levers, and the performance falls through the floor. I&#8217;ve even had repeat converting sites simply drop out of the Content Network &#8211; for me&#8230; continuing to display adverts from competitors and completely irrelevant advertisers and having those sites unavailable in site targeting. It is an exercise in frustration. </p>
<p>Take Google&#8217;s own property, YouTube, for example. Last summer, I had a client with some conversions using a specific target on YouTube. Google changed the targeting and despite overspending the ROI target, I couldn&#8217;t achieve another sale &#8211; the location and the matching of video content were simply not working, for this client, any more. Achieving a single sale, or even repeat sales, does not guarantee that the site or the performance will be the same, next month. </p>
<h3>Second Technique &#8211; the AI</h3>
<p>OK, so we&#8217;ve seen that only using target ROI (plus some fudge to allow for the internet being noisy) can be a pretty fast way to lose large quantities of money and a slow way to identify good sites. If we can use the patterns that we find, perhaps we can use a pattern-recogniser to more quickly identify sites that don&#8217;t work? Then we can choose to dump the ineffective sites and focus spending on sites where we can&#8217;t determine an answer or know the returns to be fine.</p>
<p>AI techniques offer some help. A good tool for recognising unknown patterns is the Neural Network. We built a three layer neural network. You need to train a neural network. We believed that around a thousand sites would be needed to train and test the Neural Network&#8230; a thousand good sites, and a thousand bad sites, and an additional set of sites that had known value, but that were not part of either of the training sets &#8211; 3,000 sites in total.</p>
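<p>For readers who have not met one, a three layer network (inputs, one hidden layer, one output) can be sketched in a few dozen lines. This is a generic illustration, not the network we built: the two input features (advert density near the top of the page, share of the page that is original content), the toy training data and every name in the code are hypothetical.</p>

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input layer, one hidden layer, one sigmoid output (good vs bad site)."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        # One row of weights per hidden unit; the last entry is the bias.
        self.hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                       for _ in range(n_hidden)]
        self.output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _forward(self, features):
        inputs = list(features) + [1.0]  # constant 1.0 feeds the bias weight
        hidden_out = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
                      for row in self.hidden]
        hidden_out.append(1.0)  # bias input for the output unit
        score = sigmoid(sum(w * h for w, h in zip(self.output, hidden_out)))
        return inputs, hidden_out, score

    def predict(self, features):
        """Estimated probability that the site is a good one."""
        return self._forward(features)[2]

    def train(self, samples, epochs=2000, rate=0.5):
        """Plain stochastic gradient descent on cross-entropy loss."""
        for _ in range(epochs):
            for features, label in samples:
                inputs, hidden_out, score = self._forward(features)
                delta_out = score - label
                # Update hidden weights first, using the old output weights.
                for j, row in enumerate(self.hidden):
                    h = hidden_out[j]
                    delta_hidden = delta_out * self.output[j] * h * (1.0 - h)
                    for i, x in enumerate(inputs):
                        row[i] -= rate * delta_hidden * x
                for j, h in enumerate(hidden_out):
                    self.output[j] -= rate * delta_out * h


# Toy data: (advert density, content share) -> 1 for good, 0 for bad.
samples = [((0.9, 0.1), 0), ((0.8, 0.2), 0), ((0.2, 0.9), 1), ((0.1, 0.7), 1)]
net = ThreeLayerNet(n_inputs=2, n_hidden=3)
net.train(samples)
print(net.predict((0.15, 0.8)), net.predict((0.85, 0.15)))
```

<p>Four samples are only enough to show the mechanics; a usable classifier needs training and test sets on the scale discussed above, plus a held-back set to check that it generalises.</p>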
<p>Complicating this further, the results only hold true for a specific product. If a different product is offered, it is possible that a site that was previously failing, may now work &#8211; and vice versa. So the sites that were marked previously as &#8220;Good&#8221;, &#8220;Bad&#8221; and &#8220;Unknown&#8221;, could be marked after a second trial as &#8220;Always Good&#8221;, &#8220;Always Bad&#8221;, &#8220;Sometimes Good&#8221;, &#8220;Sometimes Bad&#8221; and &#8220;Still Undecided&#8221;. That is, there is probably a group of sites that, whatever the advert, will tend to have a lower than usual conversion rate. Conversely it is unlikely that there will be many sites that, whatever the advert offers, have a high conversion rate. </p>
<p>Another way of categorising would be &#8220;Sites that seem unlikely to convert, whatever the offer&#8221;, &#8220;Sites that might convert for a relevant offer&#8221; and &#8220;Sites where we need more evidence before deciding about this offer&#8221;.</p>
<p>Categorising the sites is the most difficult part &#8211; because content matching is so dependent on the use of words, not of context. The result is that a site that may not convert for one offer is not always poor at converting for other offers &#8211; maintaining a universal list of sites that don&#8217;t convert is made harder by having to expose those sites to multiple offers, until proving that *nothing* sells on them. </p>
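<p>One way to keep such a cross-offer list is sketched below. The rule that a site must fail for a minimum number of distinct offers before it is written off, and the default of three offers, are assumptions chosen for illustration; the per-offer labels are the &#8220;Good&#8221;, &#8220;Bad&#8221; and &#8220;Unknown&#8221; marks described earlier.</p>

```python
def classify_site(results_by_offer, min_offers=3):
    """Combine per-offer labels into one of the three cross-offer categories.

    `results_by_offer` maps an offer name to "Good", "Bad" or "Unknown".
    `min_offers` is a hypothetical threshold: how many different offers must
    fail on a site before we assume nothing sells there.
    """
    outcomes = list(results_by_offer.values())
    if "Good" in outcomes:
        return "might convert for a relevant offer"
    if outcomes.count("Bad") >= min_offers:
        return "unlikely to convert, whatever the offer"
    return "need more evidence"


print(classify_site({"offer A": "Bad", "offer B": "Good"}))
print(classify_site({"offer A": "Bad", "offer B": "Bad", "offer C": "Bad"}))
print(classify_site({"offer A": "Bad", "offer B": "Unknown"}))
```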
<p>AIs need good categories &#8211; the clearer the signal in training the AI, the more likely you are to get a decent result when applying the AI to real world data. So, for example, if the criterion is &#8220;artistic&#8221; &#8211; then you&#8217;ll really have to train the AI to detect what &#8220;art&#8221; means. If, on the other hand, the decision is &#8220;half the adverts appear between top of page and the first content&#8221;&#8230; well, it is *likely* to be easier to train &#8211; if you can write the software to perform all the CSS and JS jiggery that is possible. </p>
<p>Part of the exercise, then, is to define the types of entity that are present in the input, and from which the AI will draw its inferences. </p>
<h3>Rounding Up Part 1</h3>
<p>ROI based techniques aren&#8217;t worse than standard management practice &#8211; but are more expensive than techniques that rely on identifying whether a site is likely to be effective, based on immediate inspection after the first click. </p>
<p>Automation could be used to identify sites on first click, and add them automatically to site exclusion, pending investigation (human or automaton). </p>
<p>There appears to be a potential to identify, especially across a wide range of advertisers, sites that generate clicks that don&#8217;t convert for any offer. Optimising for a single advertiser looks like a long, slow process, from the stats given above.</p>
<h3>Coming up, in Part 2</h3>
<p>Collecting the data, hazards of data collection, risks of site exclusion and the likely response of the Get Rich Quick guys who build websites that earn them money, but don&#8217;t earn anything for you. This has significant impact on whether a system is workable, and the investment involved to build and deploy it. It also ties into bot nets and human networks of fraudulent clickers, and Google&#8217;s undisclosed techniques for identifying clicks. </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.merjis.com/2008/02/12/automating-content-network-management-part-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

