Google Testing New Indexing Approach
Since its inception, Google has tried to make sense of billions of Web documents using advanced in-house technology. But now, Google is experimenting with a new approach to improve its search crawlers: asking webmasters for help. The program, called Google Sitemaps, could revolutionize how the Web is indexed.
Specifically, Sitemaps will direct Google's Web crawlers to content that has been changed or added, removing the need for Google to spider an entire site. Sitemap files are written in XML and list a site's URLs along with parameters, such as when each page was last modified, that aid the search indexing process.
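To make the format concrete, here is a minimal sketch of generating such a file with Python's standard library. The urlset, url, loc, and lastmod elements follow the Sitemap protocol; the namespace string and the example.com pages are assumptions for illustration.

# Minimal sketch: build a Sitemap XML file with the standard library.
# Element names follow the Sitemap protocol; URLs are hypothetical.
import xml.etree.ElementTree as ET

# Schema namespace assumed for illustration.
NAMESPACE = "http://www.google.com/schemas/sitemap/0.84"

def build_sitemap(pages):
    """pages: list of (url, last_modified) tuples."""
    urlset = ET.Element("urlset", xmlns=NAMESPACE)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.ElementTree(urlset)

# Hypothetical pages; lastmod uses a simple date format.
tree = build_sitemap([
    ("http://www.example.com/", "2005-06-03"),
    ("http://www.example.com/news.html", "2005-06-02"),
])
tree.write("sitemap.xml", encoding="utf-8", xml_declaration=True)

Running this produces a sitemap.xml that a crawler can read to learn which pages exist and which have changed, rather than re-spidering the whole site.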
Engineering Director Shiva Shivakumar says the project "will either fail miserably, or succeed beyond our wildest dreams, in making the web better for webmasters and users alike."
"Initially, we plan to use the URL information webmasters supply to further improve the coverage and freshness of our index. Over time that will lead to our doing an even better job of delivering more search results from more websites," explained Shivakumar in the Google Blog.
To aid in the creation of Sitemap files, Google has developed an open source generator utility that runs on Web servers. Sitemaps are then submitted to Google, which uses them to build a better index of the site. Google says the end result will be a search engine that crawls more pages and stays current with any changes.
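Once a Sitemap is generated or updated, Google has to be told about it. The sketch below assumes an HTTP "ping" endpoint that accepts the Sitemap's URL as a query parameter; the exact endpoint URL and the example site are assumptions, not confirmed details of the program.

# Hedged sketch: notify Google that a Sitemap has been updated via an
# assumed HTTP GET "ping" endpoint. Endpoint URL and site are illustrative.
import urllib.parse
import urllib.request

SITEMAP_URL = "http://www.example.com/sitemap.xml"  # hypothetical site
PING = "http://www.google.com/webmasters/sitemaps/ping?sitemap="

with urllib.request.urlopen(PING + urllib.parse.quote(SITEMAP_URL, safe="")) as resp:
    print(resp.status)  # 200 would indicate the notification was received

A webmaster could run a call like this from a post-publish hook, so the index learns about new or changed pages without waiting for the next full crawl.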
But Sitemaps aren't restricted to Google. The project has been released under the Creative Commons Attribution/Share Alike license, so other search engines, such as Yahoo or MSN, can easily implement the same functionality.
Google hopes Web servers such as Microsoft IIS and Apache will eventually include native Sitemap support, removing the need for manual work by webmasters.