Non-Topical Factors in Information Access
Jussi Karlgren, SICS, Sweden & Helsinki University, Finland
WebNet World Conference on the WWW and Internet, Honolulu, Hawaii. Publisher: Association for the Advancement of Computing in Education (AACE), Chesapeake, VA
Research in information retrieval has traditionally concentrated on making assumptions about the content of documents based on very shallow semantic analysis through word occurrence statistics of various kinds. But texts are more than bags of words, and the semantic analysis that information retrieval systems typically use is overly simple. There is ample reason to try to broaden the view of what text is and why. Better content analysis alone will not be enough. Texts are more than their meaning. Texts have structure and context; they are written in a style that conforms to or departs from the genre in which they are to be understood; they may be carefully written or hastily thrown together; and they are written by various types of agents for various reasons. Besides the information to be found in the text or obtained from the author, texts are used by readers of various backgrounds, for various reasons, and with varying degrees of satisfaction. This paper outlines a framework within which to extract more knowledge from texts than an approximation of their topic, and gives examples of how to use this knowledge to design useful tools for information access.
Karlgren, J. (1999). Non-Topical Factors in Information Access. In Proceedings of WebNet World Conference on the WWW and Internet 1999 (pp. 27-31). Honolulu, Hawaii: Association for the Advancement of Computing in Education (AACE).
© 1999 Association for the Advancement of Computing in Education (AACE)