This is an index of 78,535 documents on Harvard Web servers, totaling 500 megabytes and containing 1.3 million distinct words.
Have a look at the source and a description of how it works.