Latest Crawler Improvements for Live Search Announced!

Feb 13, 2008 | 5,031 views | by Navneet Kaushal

Microsoft today announced several improvements to the crawler for Live Search, stating that the changes “should significantly improve the efficiency” with which the MSN Bot crawls and indexes web sites.

The latest improvements entail:

HTTP Compression: HTTP compression speeds up transmission by compressing static files and application responses, reducing network load between a site's servers and Microsoft's crawler. The crawler supports the most common compression methods, gzip and deflate, as defined by RFC 2616 (see sections 14.11 and 14.39). HTTP compression is supported by all major browsers and search engines. You can check your server for HTTP compression support using this online tool.
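To make the negotiation concrete, here is a minimal Python sketch of what a server might do when a crawler sends an "Accept-Encoding" header. The function names parse_accept_encoding and encode_response are hypothetical and are not taken from Microsoft's announcement; the sketch also assumes the server prefers gzip over deflate, which is a choice rather than a requirement.

```python
import gzip
import zlib


def parse_accept_encoding(header):
    """Return the set of content codings the client accepts (q > 0)."""
    accepted = set()
    for item in header.split(","):
        parts = item.strip().split(";")
        coding = parts[0].strip().lower()
        if not coding:
            continue
        q = 1.0  # a coding listed without a q-value is fully acceptable
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        if q > 0:
            accepted.add(coding)
    return accepted


def encode_response(body, accept_encoding):
    """Compress body if the client accepts gzip or deflate.

    Returns (payload, extra_headers). Assumption: gzip is preferred
    when both codings are acceptable.
    """
    accepted = parse_accept_encoding(accept_encoding)
    if "gzip" in accepted:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    if "deflate" in accepted:
        # The HTTP "deflate" coding is the zlib format, which zlib.compress emits.
        headers = {"Content-Encoding": "deflate", "Vary": "Accept-Encoding"}
        return zlib.compress(body), headers
    return body, {}  # no supported coding requested: send the body as-is


# Example: a repetitive HTML page and a crawler that accepts both codings.
page = b"<html>" + b"<p>hello crawler</p>" * 200 + b"</html>"
payload, extra_headers = encode_response(page, "gzip, deflate")
```

A client that sends no "Accept-Encoding" header, or one that lists neither coding, simply receives the uncompressed body, so supporting compression does not break older clients.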

Conditional Get: Microsoft supports the conditional GET as defined by RFC 2616 (Section 14.25). In general, the crawler will not download a page unless it has changed since the last time it was crawled. In accordance with the standard, the crawler includes the "If-Modified-Since" header with the time of the last download in the GET request and, when available, the "If-None-Match" header with the ETag value. If the content hasn't changed, the web server responds with a 304 HTTP response.
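The server side of this exchange can be sketched in a few lines of Python. The function conditional_status below is a hypothetical helper, not part of any crawler or server product. It makes several simplifying assumptions: header names are looked up with exact case, ETags are compared as plain strings (no weak-validator handling), and a supplied "If-None-Match" header takes precedence over "If-Modified-Since".

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(request_headers, last_modified, etag=None):
    """Return 304 if the client's cached copy is still current, else 200.

    request_headers: dict of request header names to values.
    last_modified:   timezone-aware datetime of the page's last change.
    etag:            the page's current ETag string, if the server uses one.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag is not None:
        # Simplification: split on commas and compare tags as plain strings.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if ("*" in candidates or etag in candidates) else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: serve the full page
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304
    return 200


# Example: the crawler last fetched the page on 12 Feb 2008,
# and the page last changed on 10 Feb 2008.
last_crawl = format_datetime(
    datetime(2008, 2, 12, 8, 0, tzinfo=timezone.utc), usegmt=True
)
page_changed = datetime(2008, 2, 10, 12, 0, tzinfo=timezone.utc)
status = conditional_status({"If-Modified-Since": last_crawl}, page_changed)
print(status)  # 304
```

When the function returns 304, the server sends only headers and no body, which is where the bandwidth saving comes from.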

Furthermore, several other performance improvements should help further optimize Live's crawling. To reflect these changes, Microsoft's user agent has also been updated; it is now "msnbot/1.1".
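For webmasters who filter logs by crawler, a small Python sketch like the following could pick the version out of a User-Agent value. The helper msnbot_version and its pattern are assumptions for illustration; only the "msnbot/1.1" token comes from the announcement, and a User-Agent string can be spoofed, so this is not a way to verify that a request really came from Microsoft.

```python
import re

# Assumed pattern: the token "msnbot/" followed by a major.minor version.
MSNBOT_TOKEN = re.compile(r"\bmsnbot/(\d+)\.(\d+)", re.IGNORECASE)


def msnbot_version(user_agent):
    """Return (major, minor) if the string contains an msnbot token, else None."""
    match = MSNBOT_TOKEN.search(user_agent)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


version = msnbot_version("msnbot/1.1")
print(version)  # (1, 1)
```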

If you have any issues with MSNbot, or any questions, you may want to use the Crawler Feedback & Discussion form.

While the improvements are welcome, some of them are being viewed with skepticism. As Vanessa Fox notes, the other search engines (Google, Yahoo! and Ask) have already had features similar to these so-called “improvements”, in some cases for more than two years.

Google: Google made changes to its crawler to reduce bandwidth usage in 2006 as part of the "Bigdaddy" infrastructure change. As a result, Googlebot increased its support for HTTP compression and started using a crawl caching proxy. In addition, the Google webmaster help center describes Googlebot's handling of conditional GET, which is similar to MSNbot's.

Yahoo!: Yahoo! announced support for both HTTP compression and conditional GET in 2005.

Ask: While Ask's webmaster documentation has information about HTTP compression, it doesn't mention conditional GET support.

Webmasters seem to share that view of the “improvements”, as zekele states:

“At last they are catching up with some basic functionality! Google and Yahoo have been supporting this for years. It's a rather sad testament to Microsoft's progress that they have taken this long to get to their crawler up to standard.”

Microsoft seems to be working on many fronts in addition to its most coveted possession for now.


Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.