Google’s New Caffeine Index
Carrie Grimes, one of Google’s software engineers, said in an official blog post that the new system had been designed because web users now expected search results to be fresher.
She said: ‘Caffeine provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered.
‘Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was possible ever before.’
In the past, Google updated its search results by refreshing different ‘layers’ of content.
Want to know how big Google's new search backend is? Google tries to put its growth in perspective:
- Every second, Caffeine processes hundreds of thousands of pages in parallel.
- If this were a pile of paper it would grow three miles taller every second.
- Caffeine takes up nearly 100 million gigabytes of storage in one database.
- It adds new information at a rate of hundreds of thousands of gigabytes per day.
- You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles.
-Seth Weintraub, Cable News Network

Google Caffeine
"Every second Caffeine processes hundreds of thousands of pages in parallel," Google said. "If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles."
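The iPod comparison can be sanity-checked with a little arithmetic. The sketch below uses only the two figures Google quoted (100 million gigabytes and 625,000 iPods); the device height of 4.1 inches is an assumption that does not come from the source, chosen as a plausible height for a 160 GB iPod classic, and the function names are illustrative.

```python
# Figures quoted by Google in the passage above.
TOTAL_STORAGE_GB = 100_000_000   # "nearly 100 million gigabytes"
IPOD_COUNT = 625_000             # "625,000 of the largest iPods"

# Assumption (not from the source): height of one iPod, in inches.
ASSUMED_IPOD_HEIGHT_IN = 4.1
INCHES_PER_MILE = 63_360         # 5,280 feet * 12 inches


def capacity_per_ipod_gb(total_gb: float, count: int) -> float:
    """Storage each device must hold for `count` devices to store `total_gb`."""
    return total_gb / count


def stack_length_miles(count: int, height_in: float) -> float:
    """Length in miles of `count` devices laid end-to-end."""
    return count * height_in / INCHES_PER_MILE


if __name__ == "__main__":
    print(capacity_per_ipod_gb(TOTAL_STORAGE_GB, IPOD_COUNT))
    print(round(stack_length_miles(IPOD_COUNT, ASSUMED_IPOD_HEIGHT_IN), 1))
```

Under that assumed height, the quoted numbers hang together: each iPod would need to hold 160 GB, and the stack comes out slightly over 40 miles.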
Caffeine scans the Web, attempting to pick up and index data as quickly as possible, to return the most relevant links as quickly as possible. According to the company, the new engine features "50 percent fresher results for Web searches than our last index."
"Whether it's a news story, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was possible ever before," Grimes wrote.
Google is now analyzing smaller portions of the Web, then immediately updating its search index, Grimes wrote. Before, Google might analyze the entire Web before updating its results, so a lag of several weeks could be possible before some new Internet pages were included.
Under the old index, when you did a search, Google would scan the various layers of its index, prioritized by importance. It would search one group of high priority sites, then another less prioritized group of sites, and so on. Each layer was updated on a schedule.
“Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and made it available to you,” Carrie Grimes, a Google software engineer, explained on the Official Google Blog.
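The difference Grimes describes, refreshing a whole layer at once versus updating the index as pages are found, can be illustrated with a toy inverted index. This is only a minimal sketch under simple assumptions, not Google's implementation: the class names `BatchIndex` and `IncrementalIndex`, the whitespace tokenizer, and the example URL are all hypothetical.

```python
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


class BatchIndex:
    """Old-style layer: crawled pages become searchable only after refresh()."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {}       # url -> text, crawled so far
        self.index: dict[str, set[str]] = {}  # term -> urls, as of last refresh

    def crawl(self, url: str, text: str) -> None:
        self.pages[url] = text                # stored, but not yet searchable

    def refresh(self) -> None:
        # Re-analyze every known page, then swap in the new index.
        index: dict[str, set[str]] = defaultdict(set)
        for url, text in self.pages.items():
            for term in tokenize(text):
                index[term].add(url)
        self.index = dict(index)

    def search(self, term: str) -> list[str]:
        return sorted(self.index.get(term.lower(), set()))


class IncrementalIndex:
    """Caffeine-style idea: each crawled page updates the index immediately."""

    def __init__(self) -> None:
        self.pages: dict[str, str] = {}
        self.index: dict[str, set[str]] = defaultdict(set)

    def crawl(self, url: str, text: str) -> None:
        # Drop postings from the previous version of this page, if any.
        for term in tokenize(self.pages.get(url, "")):
            self.index[term].discard(url)
        self.pages[url] = text
        for term in tokenize(text):
            self.index[term].add(url)

    def search(self, term: str) -> list[str]:
        return sorted(self.index.get(term.lower(), set()))


if __name__ == "__main__":
    batch, incremental = BatchIndex(), IncrementalIndex()
    for idx in (batch, incremental):
        idx.crawl("example.com/news", "Caffeine index launches")
    print(batch.search("caffeine"))        # empty until refresh() runs
    print(incremental.search("caffeine"))  # found right after crawling
```

In the batch version a new page stays invisible until the next full refresh, which is the delay the quote describes; the incremental version trades that delay for more frequent, smaller updates.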
Basically, Google was sometimes missing the latest, breaking search results - tweets, recently updated news items, etc.
“Content on the web is blossoming. It's growing not just in size and numbers but with the advent of video, images, news and real-time updates, the average webpage is richer and more complex. In addition, people's expectations for search are higher than they used to be. Searchers want to find the latest relevant content and publishers expect to be found the instant they publish,” said Grimes.
