Contact: cdent@burningchrome.com
Batty, D. (1998). WWW -- Wealth, Weariness or Waste: Controlled vocabulary and thesauri in support of online information access. D-Lib Magazine, November 1998. Available at: http://webdoc.sub.gwdg.de/edoc/aw/d-lib/dlib/november98/11batty.html

-=-=-

How about this idea: a web interface that takes search terms, takes an online thesaurus selection (e.g. WordNet), takes a distance value (depth of traversal in the thesaurus), and generates queries to Google based on the logic described in the article. This is somewhat like what AltaVista used to do with its queries in that Java app. Would this increase precision at all, or just raise recall? At the moment recall is generally pretty high, but people make short queries because long queries sometimes ruin both recall and precision.

-=-=-

CDB Enterprises' decision to construct a dual interface to the Washington Post articles is, to me, an excellent solution. In a situation of that sort (an article archive), if there were only one option, I would choose whole-text indexing. Best would be both: whole-text indexing plus a system of tagging articles with terms from a controlled vocabulary, which creates an index.

-=-=-

See also http://www.burningchrome.com/~cdent/slis/l505/papers/slisessay12.htm for a (not fully formed) discussion of dynamic hierarchy systems, that is, delaying the creation of hierarchy until it is needed by the user.
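
-=-=-

A rough sketch of the thesaurus-expansion idea noted above, to make the "distance value" concrete. Everything here is an assumption for illustration: the toy thesaurus stands in for something like WordNet, the names `TOY_THESAURUS`, `expand_term` and `build_query` are made up, and the OR-grouping of related terms is one plausible reading of the query logic, not necessarily the scheme the article describes. It only builds the query string; it does not contact any search engine.

```python
from collections import deque

# Hypothetical toy thesaurus: each term maps to its directly related terms.
# A real version might pull these links from WordNet instead.
TOY_THESAURUS = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car", "motorcar"],
    "vehicle": ["transport"],
    "repair": ["fix", "mend"],
}


def expand_term(term, thesaurus, depth):
    """Return the term plus every term reachable within `depth` links."""
    seen = [term]                 # keeps discovery order, original term first
    queue = deque([(term, 0)])    # breadth-first walk over thesaurus links
    while queue:
        current, dist = queue.popleft()
        if dist == depth:
            continue              # distance limit reached, do not go further
        for neighbour in thesaurus.get(current, []):
            if neighbour not in seen:
                seen.append(neighbour)
                queue.append((neighbour, dist + 1))
    return seen


def build_query(terms, thesaurus, depth):
    """Build a query string: OR within each term's group, groups side by side."""
    groups = []
    for term in terms:
        variants = expand_term(term, thesaurus, depth)
        # Quote multi-word variants so they are treated as phrases.
        quoted = ['"%s"' % v if " " in v else v for v in variants]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " ".join(groups)


if __name__ == "__main__":
    for depth in (0, 1, 2):
        print(depth, build_query(["car", "repair"], TOY_THESAURUS, depth))
```

With a distance of 0 the query comes back unchanged; each extra step widens the OR groups. That should tend to raise recall. Whether precision improves is exactly the open question, and it would have to be tested against real result sets.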