On the effects of using speech transcripts and subtitles to detect topic shifts in news broadcasts

Mohamed, Sahra

In this research, topic segmentation of text (also known as text segmentation) is used as a proxy for topic segmentation of videos. The main application is automatically providing a topic transition structure for videos, which are difficult to scan quickly to find where a new subject starts. Topic models are used to locate the topic transition positions. The data for this research was provided by the Netherlands Institute for Sound and Vision and consists of 25,600 transcripts and subtitles of the same Dutch news broadcasts.
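As a rough illustration of how topic distributions can be turned into topic transition positions, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes that each sentence of a transcript or subtitle file has already been mapped to a topic distribution by some topic model. The function names, the window size, and the similarity threshold are all hypothetical choices made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equally long vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def find_topic_shifts(topic_vectors, window=2, threshold=0.5):
    """Return sentence indices at which a new topic is assumed to start.

    topic_vectors: one topic distribution per sentence (assumed input,
                   e.g. the output of a topic model).
    window:        number of sentences compared on each side (hypothetical default).
    threshold:     similarity below which a shift is considered (hypothetical default).
    """
    # Similarity between the window before and the window after each candidate position.
    sims = {}
    for i in range(window, len(topic_vectors) - window + 1):
        left = mean_vector(topic_vectors[i - window:i])
        right = mean_vector(topic_vectors[i:i + window])
        sims[i] = cosine(left, right)

    # Keep positions that are local minima of the similarity and fall below the threshold.
    boundaries = []
    for i, sim in sims.items():
        if sim < threshold and sim < sims.get(i - 1, 1.0) and sim <= sims.get(i + 1, 1.0):
            boundaries.append(i)
    return boundaries


# Toy input: four sentences dominated by topic 0, then four dominated by topic 1.
toy_vectors = [[0.9, 0.1]] * 4 + [[0.1, 0.9]] * 4
shifts = find_topic_shifts(toy_vectors)
```

Running the same function on the subtitle-derived and the transcript-derived topic distributions of one broadcast would give two sets of candidate positions that can then be compared.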

The research asks whether automatic speech recognition transcripts or subtitles are the better input when segmenting a video by topic. The subtitles and speech transcripts of the same news broadcasts were compared, and both qualitative and quantitative differences between them were found. However, no significant difference was found between the performance of the text segmentation algorithm on subtitles and on speech transcripts. The research presents the challenges and benefits of the developed text segmentation algorithm. It can give insight into the feasibility of applying text segmentation to help structure videos, and can serve as a starting point for future research.

Additional Metadata
Publisher	Utrecht University
Theme	Access
Citation	Mohamed, S. (2020, June 26). On the effects of using speech transcripts and subtitles to detect topic shifts in news broadcasts. Utrecht University.
