If you publish a lot of content online, having a good site search engine is important. But increasingly, it's not enough to simply allow visitors to search for text-string matches.
Recently I was talking to John Peterson Myers, CEO of Environmental Health Sciences, the nonprofit publisher of Environmental Health News -- a well-respected daily roundup of worldwide news on all sorts of matters related to environmental health. Each news summary entered into EHN's database gets categorized by a combination of automated and manual techniques, to offer very rich and specific results when a visitor clicks on a "related stories" link.
Here's an example: the "related stories" results page for a recent EHN blurb about research into PCB levels and endometriosis. The left-hand column offers a rich list of links to previously cited stories, categorized according to how they're related to the initial blurb.
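To make the idea concrete, here is a minimal Python sketch of how "related stories" might be grouped by shared category rather than by text-string match. It is not EHN's actual system: the `related_stories` function, the story titles, and the tags are all made up for illustration, and it assumes each story already carries a set of category tags.

```python
from collections import defaultdict

def related_stories(target, stories):
    """Group other stories under each category tag they share with `target`.

    Hypothetical helper: assumes every story is a dict with a "title"
    and a set of category "tags" assigned beforehand.
    """
    groups = defaultdict(list)
    for story in stories:
        if story["title"] == target["title"]:
            continue  # skip the story we started from
        # Only tags present on both stories count as a relationship.
        for tag in sorted(set(story["tags"]) & set(target["tags"])):
            groups[tag].append(story["title"])
    return dict(groups)

# Invented sample data, for illustration only.
stories = [
    {"title": "PCB levels and endometriosis", "tags": {"PCBs", "endometriosis"}},
    {"title": "PCBs found in river sediment", "tags": {"PCBs", "water"}},
    {"title": "New endometriosis study", "tags": {"endometriosis"}},
    {"title": "Air quality report", "tags": {"air"}},
]

result = related_stories(stories[0], stories)
for tag, titles in result.items():
    print(tag, "->", titles)
```

Each shared tag becomes its own heading in the results, so a reader sees not just that two stories are related but how.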
As if this weren't useful enough, Myers said that in a couple of months EHN will roll out an upgraded site index that uses the Teragram Categorizer. "This should reduce the incidence of false positives in our automated indexing -- which will save us a lot of time to check the generated index for errors."
Taxonomic tools such as the Teragram Categorizer and its competitors can be a boon to content producers and users alike. I would love it if more news sites offered the same sort of rich "related stories" results as EHN.