
From my understanding, the premise is that the 'index' generated from each crawled site would be a set of metadata smaller than the site's actual content. So instead of many robots each crawling through all the data on your site, there could be one bot that updates a single, smaller index that all search engines can access.

I agree that Google's index is probably optimized to work with their search algorithm. According to the author, though, this doesn't mean Google would lose anything by letting other engines use the index, since "all the value is in the analysis" of the index.
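
To make the idea concrete, here is a rough sketch of what one record in such a shared index might look like. The structure and field names are invented for illustration; the point is only that the record is much smaller than the raw page and carries no ranking signal, so each engine can layer its own analysis on top.

    # Hypothetical per-page record in a shared crawl index.
    # Fields are illustrative, not any real search engine's schema.
    from dataclasses import dataclass, field

    @dataclass
    class PageRecord:
        url: str
        fetched_at: str                 # ISO-8601 timestamp of the crawl
        content_hash: str               # lets consumers detect unchanged pages
        title: str
        terms: dict[str, int] = field(default_factory=dict)   # term -> frequency
        outlinks: list[str] = field(default_factory=list)      # links found on the page

    # Each search engine would pull these records from the shared index and
    # apply its own analysis (ranking, link graph, spam detection) on top.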



There's also significant value in knowing when to re-index sites as conditions change. For example, if some new iSomething is announced, re-indexing apple.com as well as a number of popular Apple-related news sites would be very helpful in keeping the index fresh. There's also feedback from the ranking algorithm in determining how often to re-index, how deep to index, and so on.
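
A toy version of that re-crawl feedback might look like the sketch below. The backoff factors and bounds are made up for illustration; the idea is simply to re-crawl a page roughly as often as it actually changes.

    # Adaptive re-crawl interval: check frequently-changing pages more often,
    # back off on pages that stay the same. Factors and bounds are arbitrary.
    def next_crawl_interval(current_interval_hours: float, changed: bool,
                            min_hours: float = 1.0,
                            max_hours: float = 24 * 30) -> float:
        if changed:
            # Page changed since the last crawl: check back sooner next time.
            interval = current_interval_hours / 2
        else:
            # Page unchanged: back off and spend the crawl budget elsewhere.
            interval = current_interval_hours * 1.5
        return max(min_hours, min(interval, max_hours))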



