Duplicate content is a major issue, particularly for e-commerce websites. At SES San Francisco 2011, leading lights of search engine optimization discussed how to tackle it.
Website owners worry that they might get penalized because of duplicate content. The agenda of this SES San Francisco 2011 session was Duplicate Content & Multiple Site Issues. The panelists analyzed the relevant issues and suggested solutions.
Danny Goodwin, Associate Editor, Search Engine Watch
- Katy Collins, Senior Product Manager, AOL
- Eric Enge, President, Stone Temple Consulting
- Chris Keating, Director, SEO, Performics
- David Naylor, SEO, Bronco
The discussion was initiated by Eric Enge, who began with the details of major Google patents. He joked that one needs serious medication to get through the Google patents, and presented an example of a typical website wireframe. Clarifying a common misconception about what goes into a duplicate content algorithm, he explained that everything is ignored except for the content block in the middle: search engines focus on the main content, not the secondary content.
Answering a question raised by Dave Naylor about whether elements like the left or right column or the footer, when randomized, count in the duplicate content equation, he said that it depends. Illustrating his point, he gave the example of an e-commerce website and the duplicate content situations that can occur on such sites. He emphasized that manufacturer-supplied content is what is typically treated as duplicate content.
Taking the discussion a level deeper, he talked about shingles, which are blocks of content. To make his point, he showed four pieces of content on one page and then the same content rearranged. Search engine algorithms can figure this out, so shuffling content blocks makes no difference. He also gave an e-commerce example involving shoes: to humans, the colors of shoes make a difference, but to search engines the content is essentially the same, so they see no reason to show multiple pages of the same information.
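The shingling idea Eric described can be sketched in a few lines of Python: break each page's text into overlapping word n-grams ("shingles") and compare the resulting sets. The function names, the 4-word shingle size, and the sample product text below are illustrative assumptions, not details from the talk.

```python
def shingles(text, k=4):
    """Return the set of overlapping k-word shingles in text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two product pages that differ only in the shoe color (hypothetical copy)
page_1 = "red running shoes with breathable mesh upper and rubber sole"
page_2 = "blue running shoes with breathable mesh upper and rubber sole"

# High similarity despite the color change: most shingles are shared
sim = jaccard(shingles(page_1), shingles(page_2))
```

Changing one word only disturbs the few shingles that contain it, which is why swapping a color (or 'biggest' for 'largest') leaves the pages looking nearly identical to this kind of algorithm.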
He suggested the rel=canonical tag as a solution to duplicate content. Search engines are also good at detecting simple database substitution: if you swap words out in a block of content, for instance replacing 'biggest' with 'largest', search engines will detect it. Another type of duplicate content on e-commerce websites is query-specific.
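As a rough illustration (the URL and page are hypothetical), the canonical tag goes in the `<head>` of each duplicate variant and points at the master copy:

```html
<!-- On every duplicate variant, e.g. a color or sort-order URL -->
<link rel="canonical" href="https://www.example.com/shoes/running-shoe" />
```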
Eric came up with several solutions for duplicate content such as:
- Delete the duplicate, and 301 the page to the master copy.
- Keep both copies but implement the rel=canonical tag.
- Go into webmaster tools and instruct Google to ignore certain parameters.
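The first option, deleting the duplicate and 301-redirecting it to the master copy, might look like this in an Apache .htaccess file (the paths are hypothetical):

```apache
# Permanently redirect the duplicate URL to the master copy
Redirect 301 /shoes/running-shoe-red /shoes/running-shoe
```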
He was followed by Dave Naylor, who picked up where Eric had left off. Clarifying a point made by the previous speaker, he noted that rel=canonical is a suggestion, not a directive. Using an AJAX call is another solution: you show users what they want while, at the same time, giving search engines what they want. However, he cautioned against using noindex, as it can really slow down the crawling of a page; using robots.txt and AJAX calls is better.
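A robots.txt rule for keeping crawlers out of parameter-driven duplicate URLs, the alternative Dave favored over noindex, might look like this (the path and parameter are illustrative):

```
# Block crawling of sort-order variants that duplicate the main listing
User-agent: *
Disallow: /shoes/*?sort=
```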
The next speaker was Chris Keating, who showcased some interesting stats from research on 100 websites. 93% of these sites had duplicate content of some form. The majority had duplicate content within their own websites, while a substantial chunk also had duplicate content on other websites. 73% of the sites had taken defensive measures against duplicate content: watermarks were used by none, 6% were using the Google Author tag, and others were using a copyright notice of some form at the bottom of the page.
Chris gave some useful tips regarding content distribution. For instance, if you give your RSS feed to AOL or WebProNews, you will lose: search engines will credit the article to AOL, not to you. If you really want to benefit from feeding RSS out, give AOL or any other site only a thousand words and retain the bulk of your content. Only then will you derive value from the article.
Another speaker to address the session was Katy Collins, who stated that if one has a good relationship with a specific syndication partner, one can request them to put canonical tags in place. She discussed various aspects of content syndication and what would be the better way to share content without putting search engines on alert.
There were a few questions from the audience regarding e-commerce websites. Eric stated that a website designed to rank in .co.uk results might not rank well in the .com results. Dave opined that one needs to get into the mindset of Google; in his view, search engines no longer care about the website itself so much as who is behind it.
He advised website owners to know the exact reason why they created the site, and how close it is to that objective. This is the parameter by which to judge the success of the endeavor.