Structuring Broadcast Audio for Information Access

Abstract

One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data, such as the need to handle a continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At the Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.

Author information

Corresponding author

Correspondence to Jean-Luc Gauvain.

About this article

Cite this article

Gauvain, J., Lamel, L. Structuring Broadcast Audio for Information Access. EURASIP J. Adv. Signal Process. 2003, 642019 (2003). https://doi.org/10.1155/S1110865703211033

Keywords and phrases

  • audio indexing
  • structuring audio data
  • multilingual speech recognition
  • audio partitioning
  • spoken document retrieval
  • topic tracking