BackRub is a "web crawler" designed to traverse the web.
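
To give a sense of what such a traversal looks like, here is a minimal sketch in modern Python of a breadth-first crawl. This is an illustration only, not BackRub's actual code; the seed URL and the page limit are arbitrary placeholders.

from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collect the href targets of <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, max_pages=10):
    """Breadth-first traversal of the web starting from a seed URL."""
    seen = {seed}
    frontier = deque([seed])
    fetched = 0
    while frontier and fetched < max_pages:
        url = frontier.popleft()
        try:
            with urlopen(url, timeout=10) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # socket or connection error: skip this URL
        fetched += 1
        print(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)
            if urlparse(absolute).scheme in ("http", "https") and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)


if __name__ == "__main__":
    crawl("http://example.com/")  # placeholder seed URL

A real crawler adds much more on top of this skeleton: politeness delays per host, robots.txt checks, and a persistent queue so the crawl can be distributed across machines.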

We are currently developing techniques to improve web search engines, and we will make various services available as soon as possible.

BackRub is a research project of the Digital Library Project in the Computer Science Department at Stanford University.

Some Rough Statistics (as of about August 29th, 1996)

Total indexable HTML URLs: 75.2306 million
Total content downloaded: 207.022 gigabytes
Total indexable HTML pages downloaded: 30.6255 million
Total indexable HTML pages not yet attempted: 30.6822 million
Total URLs excluded by robots.txt: 0.224249 million (see the sketch below)
Total socket or connection errors: 1.31841 million
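
The robots.txt figure above counts URLs the crawler declined to fetch because site owners disallowed them under the robots exclusion protocol. As a rough illustration of such a check (not BackRub's actual code; the URLs and user-agent string below are placeholders), Python's standard library includes a robots.txt parser:

from urllib.robotparser import RobotFileParser

# Fetch and parse a site's robots.txt file (placeholder URL).
robots = RobotFileParser("http://example.com/robots.txt")
robots.read()

# A polite crawler fetches a URL only if robots.txt permits it
# for its user-agent.
if robots.can_fetch("BackRub", "http://example.com/some/page.html"):
    print("allowed to fetch")
else:
    print("excluded by robots.txt")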

BackRub is written in Java and Python and runs on several Sun Ultras and Intel Pentiums running Linux. The primary database is kept on a Sun Ultra II with 28GB of disk. Scott Hassan and Alan Steremberg have provided a great deal of very talented implementation help. Sergey Brin has also been very involved and deserves many thanks.

Before emailing, please read the FAQ. Thanks.

-Larry Page page@cs.stanford.edu