A Prototype System for Selective Dissemination of Broadcast News in European Portuguese

Abstract

This paper describes ongoing work on selective dissemination of broadcast news. Our pipeline system comprises several modules: audio preprocessing, speech recognition, and topic segmentation and indexation. The main goal of this work is to study how errors in the earlier modules affect the later ones. Audio preprocessing errors have quite a small impact on the speech recognition module but a significant impact on topic segmentation. Speech recognition errors, on the other hand, have an almost negligible impact on the topic segmentation and indexation modules. Diagnosing the errors in these modules is an important step toward improving the prototype media watch system described in this paper.
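One way to picture this kind of error-propagation study is to run the cascade twice: once fully automatically, and once with an upstream module's output replaced by a reference annotation, then compare what the final module produces. The following is only a minimal sketch under stated assumptions. The stage functions, the `run_pipeline` helper, the `|` speaker-change marker, and the "next" cue word are all hypothetical toy stand-ins and do not reflect the modules described in the paper.

```python
# Toy sketch: every stage below is a hypothetical stand-in, not the paper's module.

def preprocess(audio):
    """Audio preprocessing stand-in: '|' marks a detected speaker change."""
    return audio.split("|")

def recognize(segments):
    """Speech recognition stand-in: one lowercase word list per segment."""
    return [segment.lower().split() for segment in segments]

def segment_topics(transcripts):
    """Topic segmentation stand-in: a story starts at each segment opening with 'next'."""
    return [i for i, words in enumerate(transcripts) if words and words[0] == "next"]

STAGES = [
    ("preprocess", preprocess),
    ("recognize", recognize),
    ("segment_topics", segment_topics),
]

def run_pipeline(source, stages, oracle=None):
    """Run stages in order; a stage named in `oracle` emits the given reference output instead."""
    oracle = oracle or {}
    data = source
    for name, stage in stages:
        data = oracle[name] if name in oracle else stage(data)
    return data

# The automatic preprocessing misses the speaker change before the second "Next".
audio = "Good evening|Next sports news Next weather"
reference_segments = ["Good evening", "Next sports news", "Next weather"]

automatic = run_pipeline(audio, STAGES)
with_reference = run_pipeline(audio, STAGES, oracle={"preprocess": reference_segments})
print(automatic, with_reference)
```

In this toy setup the missed speaker change hides one story boundary from the final stage, so the two runs disagree. A real study would score each run against reference story boundaries rather than compare raw outputs.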

Author information

Correspondence to R. Amaral.

Rights and permissions

Open Access: This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Keywords

  • Information Technology
  • Quantum Information
  • Speech Recognition
  • Prototype System
  • Indexation Module