Google is the undisputed King of Scalability. The High Scalability blog collected the open secrets of the Google Architecture. Two important components are:
The open source Hadoop project implements these functions so you can build a Google like cluster in your data center. In case you do not have 100s of servers at your disposal ask Amazon for help. The Amazon EC2 and S3 utility computing services can host your cluster at a reasonable price. Check out this tutorial by Tom White who illustrates how to use Hadoop and Amazon Web Services together using a large collection of web access logs:
announced their support for Hadoop.
"Looking ahead and thinking about how the economics of large scale computing continue to improve, it's not hard to imagine a time when Hadoop and Hadoop-powered infrastructure is as common as the LAMP (Linux, Apache, MySQL, Perl/PHP/Python) stack that helped to powered the previous growth of the Web."