In a recent post on the Bing Webmaster Blog, Fabrice Canel, Principal Program Manager at Bing, highlighted some best practices for sitemaps and offered a few tips on scaling sitemaps up to very large websites.
Here’s a rundown of the best practices:
- Bing asks webmasters to follow the sitemaps reference at www.sitemaps.org. According to Bing, common mistakes include malformed XML sitemaps, XML sitemaps that exceed the limits of 50,000 links and 10 megabytes uncompressed, and incorrectly encoded links in sitemaps.
- Sitemaps should link to the most relevant content on the site. Bing recommends avoiding duplicate and dead links. To minimize the number of broken links in sitemaps, webmasters should regenerate them at least once a day.
- In addition, use an RSS feed to list all the new and updated content posted on your site in the last 24 hours. Do not list only the 10 latest links, because search engines do not fetch RSS feeds very frequently and may miss new URLs. You can also use XML sitemap files and a sitemap index file to create a snapshot of all relevant URLs on your site every day.
- Don’t use too many XML sitemaps and RSS feeds per site. You should use only one sitemap index file listing all relevant sitemap files and sitemap index files, and one RSS feed listing the newest content on your site.
- Use sitemap properties and RSS properties as appropriate.
- Reference the URLs of your XML sitemaps and RSS feeds in your robots.txt file to help search engines find them. You can also submit the location of your sitemaps in search engines’ Webmaster Tools.
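To make the points about size limits, link encoding, the single index file and robots.txt more concrete, here is a minimal Python sketch. It is an illustration only, not Bing's tooling: the function names, the example.com URLs and the feed location are hypothetical, and it checks only the 50,000-link limit, not the file size.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file link limit mentioned above
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_SITEMAP):
    """Drop duplicate URLs (keeping first-seen order) and split into groups of at most `size`."""
    unique = list(dict.fromkeys(urls))
    return [unique[i:i + size] for i in range(0, len(unique), size)]


def build_sitemap(urls):
    """Return one <urlset> document; escape() entity-encodes &, < and > in each URL."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<urlset xmlns="{SITEMAP_NS}">\n{entries}\n</urlset>\n')


def build_index(sitemap_urls):
    """Return a single <sitemapindex> document that lists every sitemap file."""
    entries = "\n".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}\n</sitemapindex>\n')


def robots_lines(index_url, feed_url):
    """Return robots.txt lines pointing crawlers at the index file and the RSS feed."""
    return f"Sitemap: {index_url}\nSitemap: {feed_url}\n"


if __name__ == "__main__":
    # Hypothetical URLs, with one duplicate and one ampersand that needs encoding.
    pages = [
        "https://example.com/a?x=1&y=2",
        "https://example.com/b",
        "https://example.com/a?x=1&y=2",
    ]
    groups = chunk(pages, size=1)  # tiny size just to force two files
    files = [f"https://example.com/sitemap-{n}.xml" for n in range(1, len(groups) + 1)]
    print(build_sitemap(groups[0]))
    print(build_index(files))
    print(robots_lines("https://example.com/sitemap-index.xml",
                       "https://example.com/feed.xml"))
```

Removing dead links would still require checking each URL against the live site, which this sketch leaves out.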
Tips on Scaling Up Sitemaps to Very Large Sites:
- Sites that have billions of links should first consider whether they really need that many links. Bing recommends linking only to relevant web pages to ensure these pages are discovered, crawled and indexed.
- You can also manage two sets of sitemap files to ensure search engines discover all the links on your site. This not only gives search engines sufficient time to discover all your site’s URLs, but also lets them download a set of sitemaps that is not currently being modified.
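One way to run two sets of sitemap files is to regenerate them on alternating days, so that one set is always left untouched while the other is rewritten. The Python sketch below assumes that alternation scheme; the set names "A" and "B", the directory layout and the function names are hypothetical and are not taken from the Bing post.

```python
from datetime import date


def set_to_update(day: date) -> str:
    """Pick the set to regenerate on `day`; the other set stays unmodified.

    Assumption: sets alternate daily, based on whether the day's ordinal is even or odd.
    """
    return "A" if day.toordinal() % 2 == 0 else "B"


def sitemap_path(set_name: str, part: int) -> str:
    """Hypothetical file layout: one directory per set, numbered sitemap files."""
    return f"sitemaps/{set_name}/sitemap-{part}.xml"


if __name__ == "__main__":
    today = date.today()
    active = set_to_update(today)
    print(f"Regenerate set {active} today, e.g. {sitemap_path(active, 1)}")
```

Both sets would still be listed in a sitemap index file so that search engines can find them.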