I recently came across a very interesting blog post about the so-called "duplicate content penalty." It was written by Susan Moskwa, a well-known Webmaster Trends Analyst at Google. I would like to share some of its main points with my readers…
She states that:
In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."
There are some penalties that are related to the idea of having the same content as another site; for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:
- Don't create multiple pages, subdomains, or domains with substantially duplicate content.
- Avoid… "cookie cutter" approaches such as affiliate programs with little or no original content.
- If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.
"But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content, such as www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:"
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.
"Google tries hard to index and show pages with distinct information. This filtering means, for instance, that if your site has a "regular" and "printer" version of each article, and neither of these is blocked in robots.txt or with a noindex meta tag, we'll choose one of them to list. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we'll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results."
She continues:
"This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work."
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:
1. When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
2. We select what we think is the "best" URL to represent the cluster in search results.
3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
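To make these three steps concrete, here is a toy sketch in Python. It is only an illustration and not Google's actual algorithm: the helper names (`normalize`, `cluster_duplicates`), the `http://` scheme, the link counts, and the rule that the URL with the most inbound links becomes the representative are all my own assumptions.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize(url):
    """Build a comparison key: the same URL with its query parameters sorted."""
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, query, ""))


def cluster_duplicates(link_counts):
    """Map each representative URL to the pooled link count of its cluster.

    link_counts is a hypothetical dict of URL -> number of inbound links.
    """
    # Step 1: group URLs that differ only in query parameter order.
    clusters = defaultdict(list)
    for url in link_counts:
        clusters[normalize(url)].append(url)

    pooled = {}
    for urls in clusters.values():
        # Step 2: pick a representative (assumed rule: most links, then alphabetical).
        representative = min(urls, key=lambda u: (-link_counts[u], u))
        # Step 3: consolidate the link counts onto the representative.
        pooled[representative] = sum(link_counts[u] for u in urls)
    return pooled


example_counts = {
    "http://www.example.com/skates.asp?color=black&brand=riedell": 7,
    "http://www.example.com/skates.asp?brand=riedell&color=black": 3,
    "http://www.example.com/boots.asp": 2,
}
print(cluster_duplicates(example_counts))
```

With these made-up numbers, the two skates URLs collapse into a single entry carrying a pooled count of 10, while the boots URL stays on its own with 2.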
Here's how this could affect you as a webmaster:
- In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap.
- In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.
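As a rough illustration of the Sitemap suggestion, the sketch below builds a minimal Sitemap that lists only the preferred version of a URL. The function name `build_sitemap` and the `http://` scheme are my own assumptions; the XML namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(preferred_urls):
    """Return a minimal Sitemap XML string listing each preferred URL once."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in preferred_urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes "&" as "&amp;" when serializing the text.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical choice: list only the brand-first variant of the skates page.
print(build_sitemap(["http://www.example.com/skates.asp?brand=riedell&color=black"]))
```

Only one of the two parameter orderings appears in the output, which is the point: the Sitemap is a hint about which URL you would like to see represent the cluster.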
Here is a brief summary of her post for my readers…
Having duplicate content on your site can affect it in a variety of ways… But unless you have been duplicating deliberately, it's unlikely that one of those ways will be a penalty.
In practice, this means…
- You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
- If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to spend much energy worrying about duplicate content, since most search engines have ways of handling it.