Debugging blocked URLs

Confused by "blocked by robots.txt" errors? Read this post on the Official Google Webmaster Central Blog about debugging robots.txt problems. It includes a handy checklist for debugging a blocked URL.

For example, if you are looking at crawl errors for your website and notice a URL restricted by robots.txt that you did not intend to block, work through these steps:

1. Check the robots.txt analysis tool
"The first thing you should do is go to the robots.txt analysis tool for that site. Make sure you are looking at the correct site for that URL, paying attention that you are looking at the right protocol and subdomain."

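To make the protocol and subdomain point concrete, here is a minimal Python sketch using the standard library's urllib.robotparser. The helper names (robots_url_for, is_allowed) and the example.com URLs and rules are made up for illustration. Python's parser is not Googlebot's matcher, so treat the result as a rough local check, not a substitute for Google's analysis tool.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url):
    """Return the robots.txt URL that governs page_url.

    robots.txt applies per protocol and host, so http:// vs https://
    and www vs non-www each have their own file.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def is_allowed(robots_txt, page_url, user_agent="Googlebot"):
    """Check page_url against robots.txt content supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, page_url)


# Hypothetical rules and URLs, for illustration only.
rules = "User-agent: *\nDisallow: /private/"
print(robots_url_for("http://blog.example.com/a/b?x=1"))
print(is_allowed(rules, "https://www.example.com/private/page.html"))
print(is_allowed(rules, "https://www.example.com/public/page.html"))
```

If the robots.txt URL printed for the blocked page is not the file you were inspecting, you were probably looking at the wrong protocol or subdomain.
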
2. Check for changes in your robots.txt file
"If these look fine, you may want to check and see if your robots.txt file has changed since the error occurred by checking the date to see when your robots.txt file was last modified. If it was modified after the date given for the error in the crawl errors, it might be that someone has changed the file so that the new version no longer blocks this URL."

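The date comparison in this step can be sketched in Python as well. This assumes your server sends a Last-Modified header for robots.txt, which not every server does, and that you copy the error date from the crawl errors report by hand. The function names are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def fetch_last_modified(robots_url):
    """Return the Last-Modified header for robots_url, or None if absent.

    Makes a network request; assumes the server answers HEAD requests.
    """
    request = Request(robots_url, method="HEAD")
    with urlopen(request) as response:
        return response.headers.get("Last-Modified")


def changed_after_error(last_modified_header, error_date):
    """True if robots.txt was modified after the reported crawl error.

    error_date must be a timezone-aware datetime.
    """
    modified = parsedate_to_datetime(last_modified_header)
    if modified.tzinfo is None:
        # Treat a header without zone information as UTC (an assumption).
        modified = modified.replace(tzinfo=timezone.utc)
    return modified > error_date


# Hypothetical header value and error date, for illustration only.
header = "Wed, 21 Oct 2015 07:28:00 GMT"
error_seen = datetime(2015, 10, 20, tzinfo=timezone.utc)
print(changed_after_error(header, error_seen))
```

A result of True suggests the file you see now may not be the one Googlebot saw when the error was recorded.
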
3. Check for redirects of the URL
"When Googlebot fetches a URL, it checks the robots.txt file to make sure it is allowed to access the URL. If the robots.txt file allows access to the URL, but the URL returns a redirect, Googlebot checks the robots.txt file again to see if the destination URL is accessible. If at any point Googlebot is redirected to a blocked URL, it reports that it could not get the content of the original URL because it was blocked by robots.txt."

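The redirect behaviour described in this step can be approximated offline. The sketch below assumes you have already collected the redirect chain (the original URL followed by each destination) and the robots.txt text for every host involved. The function name, URLs, and rules are hypothetical, and Python's urllib.robotparser only approximates how Googlebot matches rules.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def first_blocked_hop(chain, robots_by_origin, user_agent="Googlebot"):
    """Return the first URL in chain that robots.txt disallows, else None.

    chain: URLs in the order they are visited.
    robots_by_origin: maps "scheme://host" to that host's robots.txt text.
    An origin with no entry is treated as having an empty robots.txt.
    """
    for url in chain:
        parts = urlsplit(url)
        origin = f"{parts.scheme}://{parts.netloc}"
        parser = RobotFileParser()
        parser.parse(robots_by_origin.get(origin, "").splitlines())
        if not parser.can_fetch(user_agent, url):
            return url
    return None


# Hypothetical chain: the original URL is allowed, its destination is not.
chain = [
    "http://example.com/old",
    "http://www.example.com/private/new",
]
robots_by_origin = {
    "http://example.com": "User-agent: *\nDisallow:",
    "http://www.example.com": "User-agent: *\nDisallow: /private/",
}
print(first_blocked_hop(chain, robots_by_origin))
```

Here the original URL passes its own robots.txt, but the redirect destination is blocked, which is the situation in which the original URL gets reported as blocked.
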
If you still can’t pinpoint the problem, you can post on Google’s forum for help.