Aug 16, 2007, by Navneet Kaushal

John Blackburn announces that the refreshed robots.txt analysis tool now recognizes sitemap declarations and relative URLs.

"Earlier versions weren't aware of sitemaps at all, and understood only absolute URLs; anything else was reported as Syntax not understood. The improved version now tells you whether your sitemap's URL and scope are valid. You can also test against relative URLs with a lot less typing.

Reporting is better, too. You'll now be told of multiple problems per line if they exist, unlike earlier versions which only reported the first problem encountered. And we've made other general improvements to analysis and validation."
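The difference between reporting only the first problem and reporting all of them can be sketched in a few lines of Python. This is a hypothetical linter, not Google's tool: the names `lint_line` and `lint_file`, the set of recognized directives, and the wording of each message are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Directive names this sketch recognizes (an assumption, not Google's list).
KNOWN_DIRECTIVES = {"user-agent", "disallow", "allow", "sitemap"}


def lint_line(line):
    """Return every problem found on one robots.txt line (possibly none)."""
    text = line.split("#", 1)[0].strip()  # drop comments and padding
    if not text:
        return []
    problems = []
    if ":" in text:
        name, value = text.split(":", 1)
    else:
        # Keep going after the first problem instead of stopping here.
        problems.append("missing ':' after directive name")
        name, _, value = text.partition(" ")
    name, value = name.strip().lower(), value.strip()
    if name not in KNOWN_DIRECTIVES:
        problems.append(f"unknown directive '{name}'")
    elif name == "sitemap" and urlparse(value).scheme not in ("http", "https"):
        problems.append("sitemap URL is not absolute")
    elif name in ("allow", "disallow") and value and not value.startswith(("/", "*")):
        problems.append("path does not start with '/'")
    return problems


def lint_file(robots_txt):
    """Map 1-based line numbers to the list of problems on that line."""
    report = {}
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        problems = lint_line(line)
        if problems:
            report[number] = problems
    return report
```

Calling `lint_line("disalow images")` on a line with a misspelled directive and no colon returns two messages rather than one, which is the behavior the quote describes.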

Suppose you want search engine bots to index everything on your site except the images folder, and you also want to point them to your sitemap. A hastily written robots.txt file might look like this, mistakes included:

disalow images

user-agent: *

Disallow:

sitemap: http://www.example.com/sitemap.xml

You visit Webmaster Central to test your site against the robots.txt analysis tool using these two test URLs:

http://www.example.com
/archives

The previous version of the tool would have reported this:

[Image: gwtbefore.gif]

The improved version of the tool reports this:

[Image: gwtafter.png]
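A similar check can be run locally with Python's standard `urllib.robotparser` module. This is only a sketch under stated assumptions: Python's parser is not Google's analysis tool, `site_maps()` requires Python 3.8 or later, and the `/images/logo.gif` URL is a hypothetical path added here to probe the images folder. The robots.txt content and the two test URLs are taken from the example above.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the article, including its misspelled first line.
ROBOTS_TXT = """\
disalow images

user-agent: *
Disallow:

sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The two test URLs from the article, plus a hypothetical image URL.
test_urls = ("http://www.example.com", "/archives", "/images/logo.gif")

# can_fetch() answers: may this user agent crawl this URL?
results = {url: parser.can_fetch("*", url) for url in test_urls}

# site_maps() lists the sitemap declarations the parser recognized.
sitemaps = parser.site_maps()

for url, allowed in results.items():
    print(f"{url}: {'allowed' if allowed else 'blocked'}")
print("sitemaps:", sitemaps)
```

Every URL comes back as allowed, including the one under the images folder. The parser silently skips the line it cannot read, so the intended block never takes effect, which is why a tool that reports the problem explicitly is useful.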

For more information, read the official Google Webmaster Central blog.

In other news, Google confirms the new unavailable_after META tag, which we reported on last week.

"Let's assume you are running a promotion that expires at the end of 2007. In the headers of page www.example.com/2007promotion.html, you can use the following:

[Image: untitled.jpg]

The second exciting news: the new X-Robots-Tag directive, which adds Robots Exclusion Protocol (REP) META tag support for non-HTML pages! Finally, you can have the same control over your videos, spreadsheets, and other indexed file types. Using the example above, let's say your promotion page is in PDF format. For www.example.com/2007promotion.pdf, you would use the following in the file's HTTP headers:

X-Robots-Tag: unavailable_after: 31 Dec 2007 23:59:59 EST

Remember, REP META tags can be useful for implementing noarchive, nosnippet, and now unavailable_after tags for page-level instruction, as opposed to robots.txt, which is controlled at the domain root."
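Because X-Robots-Tag is sent as an HTTP response header, the server or application has to generate it. The following Python sketch builds the header quoted above from an expiry date. The function name `unavailable_after_header` and the default time zone abbreviation are assumptions for illustration; how the header gets attached to a response depends on your web server or framework.

```python
from datetime import datetime

# English month abbreviations, spelled out to avoid locale-dependent output.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def unavailable_after_header(expires, tz_abbrev="EST"):
    """Return a (name, value) pair for an X-Robots-Tag expiry header.

    `expires` is a naive datetime already expressed in the zone that
    `tz_abbrev` names; this sketch does no time zone conversion.
    """
    stamp = (f"{expires.day:02d} {MONTHS[expires.month - 1]} {expires.year} "
             f"{expires:%H:%M:%S} {tz_abbrev}")
    return ("X-Robots-Tag", f"unavailable_after: {stamp}")


# The promotion from the quoted example expires at the end of 2007.
name, value = unavailable_after_header(datetime(2007, 12, 31, 23, 59, 59))
print(f"{name}: {value}")
```

Run as written, this prints the same header line shown in the quote.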

Navneet Kaushal


Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.