Robust Techniques for Organizing and Retrieving Spoken Documents

Abstract

Information retrieval tasks such as document retrieval and topic detection and tracking (TDT) show little degradation when applied to speech recognizer output. We claim that this robustness stems from inherent redundancy in the problem: not only are words repeated, but semantically related words also provide support. We show how document and query expansion can enhance that redundancy and make document retrieval robust to speech recognition errors. We show that the same effect holds for TDT's tracking task, but that recognizer errors are more of an issue for new event detection and story link detection.
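
As a rough illustration of how query expansion can add redundancy, the sketch below uses pseudo-relevance feedback, a common expansion technique that is assumed here and not necessarily the method used in the article. The corpus, the simulated recognizer error ("election" transcribed as "a lection"), and the function names `score` and `expand_query` are all hypothetical.

```python
from collections import Counter

def score(query_terms, doc_terms):
    """Simple overlap score: total occurrences of the query terms in the document."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in set(query_terms))

def expand_query(query_terms, docs, top_docs=2, new_terms=2):
    """Pseudo-relevance feedback (assumed technique): add the most frequent
    non-query terms from the top-ranked documents to the query."""
    ranked = sorted(docs, key=lambda d: score(query_terms, d), reverse=True)
    pool = Counter()
    for doc in ranked[:top_docs]:
        pool.update(t for t in doc if t not in query_terms)
    # Sort by descending count, then alphabetically, so ties are deterministic.
    best = sorted(pool.items(), key=lambda kv: (-kv[1], kv[0]))[:new_terms]
    return list(query_terms) + [term for term, _ in best]

# Hypothetical clean text collection used as the expansion source.
clean_docs = [
    "election ballot votes candidate ballot votes".split(),
    "election votes ballot recount".split(),
    "weather storm rain forecast".split(),
]

# Hypothetical recognizer output: "election" was misrecognized as "a lection".
asr_doc = "a lection recount of ballot and votes".split()

query = ["election"]
expanded = expand_query(query, clean_docs)

print(score(query, asr_doc))     # original query misses the spoken document
print(expanded)                  # query plus related terms
print(score(expanded, asr_doc))  # related terms recover the match
```

In this toy setting the original query no longer matches the misrecognized document, while the expanded query matches through the related words that the recognizer transcribed correctly.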

Author information

Correspondence to James Allan.

About this article

Cite this article

Allan, J. Robust Techniques for Organizing and Retrieving Spoken Documents. EURASIP J. Adv. Signal Process. 2003, 980946 (2003). https://doi.org/10.1155/S1110865703211070

Keywords

  • spoken document retrieval
  • topic detection and tracking
  • information retrieval