Google Webmaster Central Introduces New Robots.txt Analysis Tool & Unavailable After Meta Tag

Aug 16, 2007 | 2,362 views | by Navneet Kaushal

John Blackburn announces that the refreshed robots.txt analysis tool can now recognize sitemap declarations and relative URLs.

"Earlier versions weren't aware of sitemaps at all, and understood only absolute URLs; anything else was reported as Syntax not understood. The improved version now tells you whether your sitemap's URL and scope are valid. You can also test against relative URLs with a lot less typing.

Reporting is better, too. You'll now be told of multiple problems per line if they exist, unlike earlier versions which only reported the first problem encountered. And we've made other general improvements to analysis and validation."

To let search engine bots index everything on your site except the images folder, your robots.txt file would look like this:

User-agent: *
Disallow: /images/
Sitemap: http://www.example.com/sitemap.xml

Then visit Webmaster Central and test your site with the robots.txt analysis tool using these two test URLs:

http://www.example.com

/archives
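A quick way to sanity-check rules like these offline is Python's standard `urllib.robotparser` module. The sketch below is illustrative and not part of the original post; it feeds in a rule set matching the example above (everything crawlable except the images folder) and checks the same two test URLs:

```python
from urllib.robotparser import RobotFileParser

# Rules matching the example above: index everything except /images/
rules = [
    "User-agent: *",
    "Disallow: /images/",
    "Sitemap: http://www.example.com/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(rules)

# The two test URLs from the Webmaster Central example
print(parser.can_fetch("*", "http://www.example.com"))           # True
print(parser.can_fetch("*", "http://www.example.com/archives"))  # True

# The blocked folder, for contrast
print(parser.can_fetch("*", "http://www.example.com/images/logo.png"))  # False

# The Sitemap declaration is picked up too (Python 3.8+)
print(parser.site_maps())  # ['http://www.example.com/sitemap.xml']
```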

Previous version of the tool:

[Screenshot: robots.txt analysis tool, previous version]

The updated version of the tool looks like this:

[Screenshot: robots.txt analysis tool, updated version]

For more information, read the official Google Webmaster Central blog.

In other news, Google confirms the new unavailable_after META tag, which we reported on last week.

"Let's assume you are running a promotion that expires at the end of 2007. In the headers of page www.example.com/2007promotion.html, you can use the following:

<META NAME="GOOGLEBOT" CONTENT="unavailable_after: 31 Dec 2007 23:59:59 EST">

The second exciting news: the new X-Robots-Tag directive, which adds Robots Exclusion Protocol (REP) META tag support for non-HTML pages! Finally, you can have the same control over your videos, spreadsheets, and other indexed file types. Using the example above, let's say your promotion page is in PDF format. For www.example.com/2007promotion.pdf, you would use the following in the file's HTTP headers:

X-Robots-Tag: unavailable_after: 31 Dec 2007 23:59:59 EST

Remember, REP META tags can be useful for implementing noarchive, nosnippet, and now unavailable_after tags for page-level instruction, as opposed to robots.txt, which is controlled at the domain root."
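Since non-HTML files have no head section to carry a META tag, the X-Robots-Tag directive travels in the HTTP response headers instead. As a rough sketch of how a server-side script might build that header (the helper function and its name are illustrative assumptions, not from the post):

```python
from datetime import datetime

def unavailable_after_header(expiry: datetime, tz_abbrev: str = "EST") -> tuple[str, str]:
    """Build an X-Robots-Tag header telling crawlers when to drop
    the file from the index (format mirrors the example above)."""
    value = expiry.strftime("%d %b %Y %H:%M:%S") + f" {tz_abbrev}"
    return "X-Robots-Tag", f"unavailable_after: {value}"

# The promotion expiry from the example above
name, value = unavailable_after_header(datetime(2007, 12, 31, 23, 59, 59))
print(f"{name}: {value}")
# X-Robots-Tag: unavailable_after: 31 Dec 2007 23:59:59 EST
```

A web framework or server configuration would then attach this header to responses for the PDF in question.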


Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.