Speech to text: AI in libraries

Lasse Rogers Nielsen, Lars Flemming Mydtskov, Ditte Laursen

Publication: Conference contribution, conference abstract for conference, research, peer-reviewed

Abstract

This project aims to use advanced Automatic Speech Recognition (ASR) to enable text-based search within the Danish Royal Library’s extensive radio and television archives. Using the Whisper ASR model, we will demonstrate how transcribing audiovisual materials can make these vast archives significantly more accessible and searchable for research.
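
To illustrate why transcripts make an archive searchable, here is a minimal sketch of a word index over timestamped transcript segments. The segment data, identifiers, and function names are hypothetical and chosen for illustration; they are not taken from the project, whose actual search infrastructure is not described in this abstract.

```python
import re
from collections import defaultdict

# Hypothetical transcript segments: (broadcast_id, start_seconds, text).
# In practice these would come from an ASR system's timestamped output.
SEGMENTS = [
    ("radio-0001", 0.0, "Godaften og velkommen til radioavisen"),
    ("radio-0001", 4.5, "Folketinget har i dag vedtaget en ny lov"),
    ("tv-0042", 12.0, "Velkommen til aftenens udsendelse"),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def build_index(segments):
    """Map each word to the (broadcast_id, start_seconds) pairs where it occurs."""
    index = defaultdict(list)
    for broadcast_id, start, text in segments:
        for word in set(tokenize(text)):
            index[word].append((broadcast_id, start))
    return index


def search(index, query):
    """Return the segments that contain every word of the query, sorted."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


index = build_index(SEGMENTS)
print(search(index, "velkommen"))
```

Each hit points to a broadcast and a start time, so a researcher could jump directly to the relevant passage instead of relying on channel-supplied metadata.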

The State Media Archive was established in 1987 to collect and preserve Danish radio and television broadcasts for future historical research. The collection includes nationwide public service broadcasts from the mid-1980s onward, supplemented with older broadcasts. The 2005 revision of the legal deposit law extended it to radio and television, ensuring comprehensive digital collection from significant nationwide channels and selective collection from others. Today, the digital archive holds over three million broadcasts from around 80 Danish radio and television stations, with 1,500 broadcasts added daily, and efforts are ongoing to digitize over 150,000 analogue tapes from before 2005.

Traditional indexing methods cannot efficiently manage this immense and diverse corpus, because metadata varies significantly between channels and changes over time. In addition, some channels, broadcasts, or periods lack metadata altogether. Working with a selected part of the collection, primarily older material, this project focuses on fine-tuning the Whisper ASR model for Danish-language content in order to make the audiovisual resources text-searchable for research. The process includes several steps: feature extraction, tokenization, and low-rank adapters (LoRA).
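
Of the steps listed above, the low-rank adapter is the least self-explanatory, so the following sketch shows the idea in plain Python: a frozen weight matrix W is left untouched, and only two small matrices A and B are trained, giving an effective weight W + (alpha / r) * B * A. The class name, dimensions, and hyperparameters are assumptions made for illustration and do not reflect the project's actual configuration. Feature extraction and tokenization are not shown.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen during fine-tuning
        self.scale = alpha / rank
        # Standard LoRA initialisation: A is small and random, B is zero,
        # so the adapted layer initially behaves exactly like the base layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_parameters(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Illustrative 8x8 identity weight with a rank-2 adapter (assumed sizes).
dim = 8
W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
layer = LoRALinear(W, rank=2, alpha=4)
x = [float(i) for i in range(dim)]

print(layer.forward(x) == x)                      # adapter starts as a no-op
print(layer.trainable_parameters(), "trainable vs", dim * dim, "frozen")
```

Because only A and B receive updates, the number of trainable values grows with the rank rather than with the full size of the weight matrix, which is what makes adapting a large model to a specific language or archive comparatively cheap.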

Original language: English
Publication date: 26 Sep 2024
Number of pages: 26
Status: Published - 26 Sep 2024
Event: CENL webinar: AI in Libraries - Online, France
Duration: 26 Sep 2024 to 26 Sep 2024
https://www.cenl.org/network-group-ai-in-libraries-webinars-2024/

Conference

Conference: CENL webinar
Location: Online
Country/Territory: France
Period: 26/09/2024 to 26/09/2024