Meet The Crawlers: SES, San Jose 2007

Aug 24, 2007 | 3,129 views | by Navneet Kaushal

Representatives from major crawler-based search engines cover how to submit and feed them content, with plenty of Q&A time to cover issues related to ranking well and being indexed.


  • Danny Sullivan, Conference Co-Chair, Search Engine Strategies San Jose


  • Peter Linsley, Sr. Product Manager, Ask.com
  • Evan Roseman, Software Engineer, Google, Inc.
  • Sean Suchter, Yahoo! Search
  • Eytan Seidman, Microsoft

First to speak is Eytan Seidman from Microsoft. He presents Microsoft's Live Webmaster Portal and explains how Microsoft's crawler will index your site. The portal supports sitemap submissions and lets webmasters view statistics for their websites. Microsoft has many search engine crawlers, and all of their names begin with "MSNBot":

  • web search
  • news
  • academic
  • multimedia
  • user agent

Microsoft also supports "NOODP" and "NOCACHE" tags.
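As a rough illustration of what such tags look like on a page, the sketch below renders a robots meta tag carrying both directives. The session summary does not show any markup, so the tag layout is an assumption based on the common robots meta tag convention, and the helper name `robots_meta` is made up for this example.

```python
from html import escape


def robots_meta(directives, crawler="robots"):
    """Render a robots meta tag for the given directives.

    `crawler` is the value of the name attribute; "robots" addresses
    all crawlers. This is an illustrative helper, not an official API.
    """
    content = ", ".join(directives)
    return f'<meta name="{escape(crawler)}" content="{escape(content)}">'


if __name__ == "__main__":
    # Ask engines not to use the Open Directory description (noodp)
    # and not to show a cached copy of the page (nocache).
    print(robots_meta(["noodp", "nocache"]))
```

The resulting tag would go in the page's head section.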

Next is Yahoo! Search's Sean Suchter, who also gives a presentation about Yahoo!'s crawler.

He covers dynamic URL rewriting via Site Explorer and the "robots-nocontent" tag. Yahoo! has also made crawler load improvements (reduction and targeting): the new Yahoo! search engine crawler targets better and crawls at a comparatively low volume.

Google's Evan Roseman steps up to explain and discuss Webmaster Central's features. He recommends taking advantage of Webmaster Central's "submit a site" option so that Google's search engine crawler can index all your content.

Next up is Ask.com's Peter Linsley, who discusses catering to the search engine robot, which is often forgotten while catering to the human visitor. Problems include requiring cookies. He notes that Ask does accept sitemap submissions but would rather be able to crawl sites naturally.

Peter uses the Adobe site to demonstrate some issues that can arise with multiple domains and duplicate content. He then uses the site and shows that it disallows crawlers from indexing the root page, which creates problems with crawling.
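To make the root-page problem concrete, here is a minimal sketch that checks a robots.txt rule set with Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent and the example.com URL are all invented for illustration; the actual file shown in the session is not reproduced here.

```python
from urllib.robotparser import RobotFileParser


def crawl_allowed(robots_txt, user_agent, url):
    """Return True if the robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule sets. Rules are prefix matches, so "Disallow: /"
# blocks the root page and everything beneath it.
BLOCKING = "User-agent: *\nDisallow: /\n"
# An empty Disallow value places no restriction on crawling.
OPEN = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    for label, rules in (("blocking", BLOCKING), ("open", OPEN)):
        allowed = crawl_allowed(rules, "ExampleBot", "http://www.example.com/")
        print(label, allowed)
```

Running a check like this against a site's own robots.txt is one way to confirm that the home page is reachable before worrying about deeper pages.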

Q & A

  • Q: The first question is for the Google rep: will Google let users see supplemental results within Webmaster Central, now that it no longer tags them in search results?
  • A: Evan states that being in the supplemental index is not a penalty, but does not give a definite answer as to whether users will be able to find out which results are supplemental.

    Danny interjects that all engines have a two-tier system, and Eytan, Sean and Peter confirm that. So… they all have supplemental indices, but people only seem to be concerned with Google's, most likely because Google used to identify supplemental results as such in the regular search results.

  • Q: What, if anything, can a competitor actually do to hurt your site?
  • A: Evan says it is possible for a competitor to hurt your site, but that it is extremely difficult. Hacking and domain hijacking are some of the things that can occur.
  • Q: When you re-publish content to places such as eBay, the sites you re-publish to may rank better than the original. How can a webmaster identify the original source of the information?
  • A: Peter answers that one could try to get the sites that re-publish the content to block spidering of it with robots.txt. Another option is to have them link back to the original site. On a site such as eBay, however, that is not always possible. In that case, the answer is to create unique content for the sites the content is being re-published on.
  • Q: Robert Carlton asks whether all engines are moving towards offerings like Webmaster Central. He also asks how they treat 404s and 410s.
  • A: As for 404s and 410s, Ask, Google and Yahoo! treat them the same. Robert points out that they should be treated differently, as a 410 indicates the file is gone whereas a 404 is an error.
  • Q: How can one get content crawled more frequently?
  • A: Evan suggests using the sitemap feature in Webmaster Central and keeping the sitemap up to date. He also suggests promoting the content by linking to it from the site's home page.
  • Q: How can one use sitemaps more effectively for very large sites whose information changes on a regular basis? And how can one get more pages indexed when only a portion of them are being indexed?
  • A: Submitting a sitemap to Google will not cause other URLs to go uncrawled. Evan also points out that Google cannot crawl and include ALL the pages that are out there. He again suggests that webmasters promote their pages, for example by listing them on the home page. When dealing with hundreds of thousands of pages, however, that is not always feasible.
  • Q: How do engines interpret things like AJAX, JavaScript, etc.?
  • A: Eytan answers that if webmasters want content interpreted, they have to present it in a format the engine can understand, and AJAX and JavaScript are currently not among those formats.
  • Q: Rankings in Yahoo! disappeared for three weeks and then came back. Is this due to an update?
  • A: Sean answers that it certainly could be, and suggests using Site Explorer to see whether there is some kind of issue.
  • Q: How many links will engines actually crawl per page? How much is too much?
  • A: Peter says there is no hard and fast rule but keep the end user in mind. Evan echoes the same feeling.
  • Q: Do the engines use meta descriptions?
  • A: All of the engines read them and may use them if the algorithm finds them relevant.
  • Q: For sites that are designed completely in Flash, can you use content in a "noscript" tag or would that be considered as some type of cloaking?
  • A: Sean says IP delivery is a no-no, but if the content is the same as in the Flash, he'd rather see content in noscript than traditional cloaking. Evan suggests avoiding sites built entirely in Flash and using Flash components instead.
  • Q: Is the meta keywords tag still relevant?
  • A: Microsoft – no, Yahoo! – not really, Google – not really, and Ask – not really. All of them read it, but it has very little bearing. For a really obscure keyword that appears only in the keywords tag and nowhere else on the web, Yahoo! and Ask are the only ones that will show a search result based on it.
  • Q: How do engines view automated submission/ranking software?
  • A: Evan – don't use them.
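Several answers above come back to keeping a sitemap up to date. As a minimal sketch of what that could look like in practice, the function below builds a sitemap in the sitemaps.org XML format using Python's standard library. The page URLs and dates are invented, and the function name `build_sitemap` is hypothetical; none of the speakers described a specific implementation.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod tells crawlers when the page last changed (YYYY-MM-DD).
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages; a real site would pull these from its CMS or database.
    example_pages = [
        ("http://www.example.com/", date(2007, 8, 24)),
        ("http://www.example.com/products.html", date(2007, 8, 20)),
    ]
    print(build_sitemap(example_pages))
```

Regenerating the file whenever pages change, rather than editing it by hand, is one way to keep it current on a large site.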



Navneet Kaushal

Navneet Kaushal is the founder and CEO of PageTraffic, an SEO Agency in India with offices in Chicago, Mumbai and London. A leading search strategist, Navneet helps clients maintain an edge in search engines and the online media. Navneet's expertise has established PageTraffic as one of the most awarded and successful search marketing agencies.