Speech Translation
Speech translation combines automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) to convert speech in one language into another language. It enables communication between speakers of different languages, often in real time. In the cascaded form of the task, ASR first transcribes the spoken input into text, an MT model then translates that text into the target language, and TTS finally converts the translated text back into speech if spoken output is required.
Speech translation presents challenges beyond those of text translation, including spontaneous speech, regional accents, and background noise. The system must also preserve the speaker's original meaning, cultural nuances, and tone.
Speech translation is widely used in international conferences, live broadcasts, travel applications, and multilingual customer service, where it helps break down language barriers in real-time conversations. Future research aims to improve quality through end-to-end models that bypass the intermediate transcription step, enabling faster and more accurate translation across a wider range of languages.
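The cascade described above (ASR, then MT, then optional TTS) can be sketched as a simple composition of three stages. This is a minimal illustration, not a reference implementation: the function and type names (`translate_speech`, `TranslationResult`, and the `toy_*` stand-ins) are hypothetical, and the toy components only mimic the interfaces that real ASR, MT, and TTS models would be wrapped to provide.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical stage signatures; any real backend could be wrapped to match.
Recognizer = Callable[[bytes], str]    # audio -> source-language transcript
Translator = Callable[[str], str]      # source text -> target-language text
Synthesizer = Callable[[str], bytes]   # target text -> synthesized audio


@dataclass
class TranslationResult:
    transcript: str                # ASR output
    translation: str               # MT output
    audio: Optional[bytes] = None  # TTS output, only if requested


def translate_speech(
    audio: bytes,
    recognize: Recognizer,
    translate: Translator,
    synthesize: Optional[Synthesizer] = None,
) -> TranslationResult:
    """Run the cascade: ASR, then MT, then TTS if a synthesizer is given."""
    transcript = recognize(audio)
    translation = translate(transcript)
    speech = synthesize(translation) if synthesize is not None else None
    return TranslationResult(transcript, translation, speech)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_recognize(audio: bytes) -> str:
    # Pretend the audio bytes already contain the spoken words.
    return audio.decode("utf-8")


TOY_LEXICON = {"good": "guten", "morning": "Morgen"}


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word.lower(), word) for word in text.split())


def toy_synthesize(text: str) -> bytes:
    # Pretend the encoded text is the synthesized waveform.
    return text.encode("utf-8")


result = translate_speech(
    b"good morning", toy_recognize, toy_translate, toy_synthesize
)
print(result.transcript)
print(result.translation)
```

Omitting the `synthesize` argument yields text output only, which corresponds to the speech-to-text setting. Because each stage consumes the previous stage's output, recognition errors propagate into the translation, which is one motivation for the end-to-end models mentioned above.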
ACL6060 test set
A collection of ACL 2022 paper presentations for which pre-recorded audio or
video was provided to the ACL Anthology.
The presentations cover a variety of native and non-native English accents.
They have been professionally transcribed and translated into ten target
languages, including four European languages (German, Portuguese, Dutch,
and French). The dataset is described in detail in Elizabeth Salesky,
Kareem Darwish, Mohamed Al-Badrashiny, Mona Diab, and Jan Niehues, 2023,
"Evaluating Multilingual Speech Translation under Realistic Conditions with
Resegmentation and Terminology", in Proceedings of the 20th International
Conference on Spoken Language Translation (IWSLT 2023), pages 62-78,
Toronto, Canada, Association for Computational Linguistics.
COVOST test set
TODO: please update test set description