XML Sitemap

Posted on Jul 10, 2014

The Sitemaps protocol allows a webmaster to inform search engines about URLs on a website that are available for crawling. A Sitemap is an XML file that lists the URLs for a site. It allows webmasters to include additional information about each URL: when it was last updated, how often it changes, and how important it is in relation to other URLs in the site. This allows search engines to crawl the site more intelligently. Sitemaps are a URL inclusion protocol and complement robots.txt, a URL exclusion protocol.
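As a rough illustration of the file described above, the following Python sketch builds a small Sitemap using only the standard library. The function name `build_sitemap`, the dictionary layout, and the example.com URLs are assumptions made for this example and are not part of the protocol. The tag names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemaps.org protocol. Only `loc` is required; the other three tags are optional hints to the crawler.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return Sitemap XML as a string for a list of URL entry dicts.

    Hypothetical helper: each dict needs a "loc" key and may also carry
    "lastmod", "changefreq", and "priority".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child of <url>.
        ET.SubElement(url, "loc").text = entry["loc"]
        # Optional hints: last update, change frequency, relative importance.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    pages = [
        {
            "loc": "http://www.example.com/",
            "lastmod": "2014-07-10",   # when the page was last updated
            "changefreq": "monthly",   # how often it changes
            "priority": 0.8,           # importance relative to other URLs (0.0 to 1.0)
        },
        {"loc": "http://www.example.com/catalog?item=12&desc=vacation"},
    ]
    print(build_sitemap(pages))
```

ElementTree escapes special characters such as `&` in the URLs, which the protocol requires. The resulting file is typically saved as sitemap.xml on the site it describes.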

Sitemaps are particularly beneficial on websites where:

  • Some areas of the website are not reachable through the browsable interface, or
  • Webmasters use rich Ajax, Silverlight, or Flash content that search engines do not normally process.

Search Engine Indexing

Sitemaps supplement, rather than replace, the crawl-based mechanisms that search engines already use to discover URLs. Using this protocol does not guarantee that web pages will be included in search indexes, nor does it influence how pages are ranked in search results. (Added by Jan: a sitemap is nevertheless the best insurance for getting a search engine to learn about your entire site.) Specific examples are provided below:

  • Google – Webmaster Support on Sitemaps: “Google doesn’t guarantee that we’ll crawl or index all of your URLs. However, we use the data in your Sitemap to learn about your site’s structure, which will allow us to improve our crawler schedule and do a better job crawling your site in the future. In most cases, webmasters will benefit from Sitemap submission, and in no case will you be penalized for it.”
  • Bing – Bing uses the standard sitemaps.org protocol, and its handling is very similar to Google's, described above.
  • Yahoo – After the search deal between Yahoo and Bing took effect, Yahoo Site Explorer was merged into Bing Webmaster Tools.

Ref: Wikipedia: Sitemaps
