20011028: Batty, WWW -- Wealth, Weariness or Waste

Contact: cdent@burningchrome.com


Batty, D. (1998). WWW -- Wealth, Weariness or Waste: Controlled
     vocabulary and thesauri in support of online information access.
     D-Lib Magazine, November 1998. Available at:
     http://webdoc.sub.gwdg.de/edoc/aw/d-lib/dlib/november98/11batty.html

-=-=-
How about this idea:

web interface
takes terms
takes an online thesaurus selection (e.g. WordNet)
takes a distance value (depth of traversal in the thesaurus)

generates queries to Google based on the logic described in the article

This is somewhat like what AltaVista used to do with their queries in
that Java app.

Would this increase precision at all or just raise recall? At the
moment recall is generally pretty high, but people make short queries
because long queries sometimes ruin both recall and precision.
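
A minimal sketch of the idea above, in Python. Everything named here
is an assumption for illustration: TOY_THESAURUS is a made-up stand-in
for whichever online thesaurus is selected, expand() and build_query()
are hypothetical helpers, and the way terms are combined (OR within a
term's expansions, AND between the original terms) is my guess at a
reasonable scheme, not necessarily the logic described in the article.
It also assumes the search engine accepts OR and parentheses.

```python
from collections import deque

# Hypothetical toy thesaurus: term -> directly related terms.
# A real version would query the selected online thesaurus instead.
TOY_THESAURUS = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car", "motorcar"],
    "vehicle": ["conveyance"],
    "repair": ["fix", "mend"],
}


def expand(term, thesaurus, distance):
    """Return the term plus every term reachable within `distance`
    hops in the thesaurus, in breadth-first order."""
    seen = [term]
    queue = deque([(term, 0)])
    while queue:
        current, depth = queue.popleft()
        if depth == distance:
            continue  # do not traverse past the requested depth
        for related in thesaurus.get(current, []):
            if related not in seen:
                seen.append(related)
                queue.append((related, depth + 1))
    return seen


def build_query(terms, thesaurus, distance):
    """OR together each term's expansions, then join the groups with
    spaces (assumed to mean AND to the search engine)."""
    groups = []
    for term in terms:
        variants = expand(term, thesaurus, distance)
        # Quote multi-word variants so they are searched as phrases.
        quoted = ['"%s"' % v if " " in v else v for v in variants]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " ".join(groups)


if __name__ == "__main__":
    print(build_query(["car", "repair"], TOY_THESAURUS, 1))
```

With a distance of 0 the query is just the original terms; each extra
step of distance adds more alternatives to each OR group, which is
where the recall versus precision question would show up.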
-=-=-

CDB Enterprises' decision to construct a dual interface to the
Washington Post articles is, to me, an excellent solution. In a
situation of that sort (an article archive), if there were only one
option, I would choose whole-text indexing. Best would be both:
whole-text indexing and a system of tagging articles with terms from a
controlled vocabulary that creates an index.
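
To make the "both" option concrete, here is a small Python sketch of
an archive that keeps the two indexes side by side. It is not how CDB
Enterprises built theirs; DualIndex, its methods, and the sample
articles and vocabulary in the usage example are all invented for
illustration.

```python
import re
from collections import defaultdict


class DualIndex:
    """Keeps a whole-text index and a controlled-vocabulary index
    for the same set of articles."""

    def __init__(self, vocabulary):
        self.vocabulary = set(vocabulary)   # allowed descriptor terms
        self.text_index = defaultdict(set)  # word -> article ids
        self.term_index = defaultdict(set)  # descriptor -> article ids

    def add(self, article_id, text, tags=()):
        # Reject tags outside the controlled vocabulary before indexing.
        for tag in tags:
            if tag not in self.vocabulary:
                raise ValueError("not in controlled vocabulary: %r" % tag)
        # Whole-text index: every lowercased word in the article.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.text_index[word].add(article_id)
        # Controlled-vocabulary index: only the assigned descriptors.
        for tag in tags:
            self.term_index[tag].add(article_id)

    def search_text(self, word):
        return sorted(self.text_index.get(word.lower(), set()))

    def search_term(self, tag):
        return sorted(self.term_index.get(tag, set()))


if __name__ == "__main__":
    idx = DualIndex(["Elections", "Transportation"])
    idx.add(1, "Voters went to the polls on Tuesday.", tags=["Elections"])
    idx.add(2, "The election board certified results.", tags=["Elections"])
    print(idx.search_text("election"))   # whole-text lookup
    print(idx.search_term("Elections"))  # descriptor lookup
```

In the made-up sample, article 1 never uses the word "election", so
only the descriptor lookup finds it; that gap is the case for having
the tagged index alongside the whole-text one.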

-=-=-

See also http://www.burningchrome.com/~cdent/slis/l505/papers/slisessay12.htm
for a (not fully formed) discussion of dynamic hierarchy systems. That
is, delaying the creation of hierarchy until it is needed by the user.

