Wednesday, March 30, 2005

Statistically Improbable Phrases

Amazon.com says: "...Statistically Improbable Phrases, or 'SIPs', show you the interesting, distinctive, or unlikely phrases that occur in the text of books in Search Inside the Book. Our computers scan the text of all books in the Search Inside program. If they find a phrase that occurs a large number of times in a particular book relative to how many times it occurs across all Search Inside books, that phrase is a SIP in that book.

Once we identify a phrase that is statistically improbable:

For books where the phrase is a SIP, we provide an exact count of and link to the occurrences in those books.
For books where the phrase merely appears in the book, we provide a link to those occurrences.
We also display a link to search A9.com for the phrase."
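
Amazon does not spell out how the scoring actually works, so the following is only a rough sketch of the general idea in the quote: compare how often a phrase turns up in one book with how often it turns up across all the books. Everything specific here is my own assumption, not Amazon's method: the function names, treating a "phrase" as two adjacent words, the crude whitespace tokenizing, and the add-one smoothing that avoids dividing by zero.

```python
from collections import Counter


def bigrams(text):
    """Lowercase the text, split on whitespace, and return adjacent word pairs."""
    words = text.lower().split()
    return [" ".join(pair) for pair in zip(words, words[1:])]


def improbable_phrases(book, corpus, top_n=3):
    """Rank two-word phrases in `book` by book rate relative to corpus rate.

    Hypothetical helper: `corpus` is a list of texts standing in for
    "all Search Inside books" and would normally include `book` itself.
    """
    book_counts = Counter(bigrams(book))
    corpus_counts = Counter()
    for text in corpus:
        corpus_counts.update(bigrams(text))

    book_total = sum(book_counts.values())
    corpus_total = sum(corpus_counts.values())

    scores = {}
    for phrase, count in book_counts.items():
        book_rate = count / book_total
        # Add-one smoothing (an assumption) keeps the divisor above zero
        # even if the phrase or the corpus is missing.
        corpus_rate = (corpus_counts[phrase] + 1) / (corpus_total + 1)
        scores[phrase] = book_rate / corpus_rate

    # Highest ratio first: frequent in this book, rare everywhere else.
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Calling `improbable_phrases(book_text, all_texts, top_n=5)` would return the five phrases that are most over-represented in `book_text`. A real system would presumably need longer phrases, proper tokenizing, and some minimum count so that one-off typos do not rank highly.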

Look up Seeing What's Next: Using Theories of Innovation to Predict Industry Change, or many other titles, on Amazon and the SIPs appear at the top of the book's page. They work rather like subject headings for books that have no subject metadata. Interesting.
