How To Verify That The Googlebot Crawling Your Site Is Not A Fake

Mar 10, 2007 | by Navneet Kaushal

The Googlebot that just visited your website might not have come from Google. Rogue bots imitate the Googlebot to steal content. Barry Schwartz has compiled a few resources that explain how to verify the authenticity of a bot.
    
His post links to earlier coverage of the issue, a post by Matt Cutts, and discussions at Cre8asite Forums. Matt Cutts explains how to verify the bot with reverse DNS:

Telling webmasters to use DNS to verify on a case-by-case basis seems like the best way to go. I think the recommended technique would be to do a reverse DNS lookup, verify that the name is in the googlebot.com domain, and then do a corresponding forward DNS->IP lookup using that googlebot.com name; eg:

> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

I don't think just doing a reverse DNS lookup is sufficient, because a spoofer could set up reverse DNS to point to crawl-a-b-c-d.googlebot.com.
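The two-step check described above can be sketched in Python. This is a minimal illustration, not Google's or CrawlWall's implementation: the function name `is_verified_googlebot`, the helper names, and the injectable `reverse`/`forward` parameters are assumptions made for this example. It uses only the standard `socket` module, checks IPv4 addresses only, and accepts only the googlebot.com domain named in the quote.

```python
import socket


def _reverse_dns(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None


def _forward_dns(hostname):
    """Return the IPv4 addresses hostname resolves to (empty list on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_googlebot(ip, reverse=_reverse_dns, forward=_forward_dns,
                          allowed_suffixes=(".googlebot.com",)):
    """Hypothetical helper: reverse lookup, domain check, then forward lookup."""
    # Step 1: reverse DNS (IP -> hostname).
    hostname = reverse(ip)
    if not hostname:
        return False
    hostname = hostname.rstrip(".").lower()

    # Step 2: the name must sit inside the expected domain. The leading dot
    # in the suffix rejects look-alikes such as "evilgooglebot.com".
    if not hostname.endswith(allowed_suffixes):
        return False

    # Step 3: forward DNS (hostname -> IPs) must lead back to the same IP.
    # This is what defeats a spoofer who controls only their own PTR record.
    return ip in forward(hostname)
```

Calling `is_verified_googlebot("66.249.66.1")` performs live DNS queries, so the result depends on the network. The `reverse` and `forward` parameters exist so that a cache or stub resolvers can be swapped in, which is useful because running two DNS lookups on every request is slow.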

Barry also recommends using CrawlWall to automate the process.
 


Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.