From e5149440399056cb972edea4ac4271906977cf8e Mon Sep 17 00:00:00 2001
From: Ioannis Tsiamas
Date: Thu, 22 Sep 2022 10:38:12 +0200
Subject: [PATCH] updated arxiv to interspeech

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index e8f99fb..a6020d8 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
 # SHAS: Approaching optimal Segmentation for End-to-End Speech Translation
 
-In this repo you can find the code of the Supervised Hybrid Audio Segmentation (SHAS) method for End-to-End Speech Translation, proposed in [Tsiamas et al. (2022)](https://arxiv.org/abs/2202.04774). You can use our method with pre-trained models to segment a collection of audio files or train and fine-tune our method on your own segmented data. We provide instructions to replicate our results from the paper on MuST-C en-de and mTEDx es-en, fr-en, it-en, pt-en. You can also find easy-to-use implementations of other segmentation methods, like fixed-length, VAD, and the hybrid methods of [Potapczyk and Przybysz (2020)](https://aclanthology.org/2020.iwslt-1.9/), [Gállego et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/), and [Gaido et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/).
+In this repo you can find the code of the Supervised Hybrid Audio Segmentation (SHAS) method for End-to-End Speech Translation, proposed in [Tsiamas et al. (2022)](https://www.isca-speech.org/archive/interspeech_2022/tsiamas22_interspeech.html). You can use our method with pre-trained models to segment a collection of audio files or train and fine-tune our method on your own segmented data. We provide instructions to replicate our results from the paper on MuST-C en-de and mTEDx es-en, fr-en, it-en, pt-en. You can also find easy-to-use implementations of other segmentation methods, like fixed-length, VAD, and the hybrid methods of [Potapczyk and Przybysz (2020)](https://aclanthology.org/2020.iwslt-1.9/), [Gállego et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/), and [Gaido et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/).
 
 Follow the instructions [here](#usage) to segment a collection of audio files, or the instructions [here](#more-extensive-usage) to replicate the results of the paper.
 
@@ -22,13 +22,13 @@ Speech translation models are unable to directly process long audios, like TED t
 If you find SHAS or the contents of this repo useful for your research, please consider citing:
 
 ```
-@misc{tsiamas2022shas,
-  title={SHAS: Approaching optimal Segmentation for End-to-End Speech Translation},
-  author={Ioannis Tsiamas and Gerard I. Gállego and José A. R. Fonollosa and Marta R. Costa-jussà},
-  year={2022},
-  eprint={2202.04774},
-  archivePrefix={arXiv},
-  primaryClass={cs.SD}
+@inproceedings{tsiamas22_interspeech,
+  author={Ioannis Tsiamas and Gerard I. Gállego and José A. R. Fonollosa and Marta R. Costa-jussà},
+  title={{SHAS: Approaching optimal Segmentation for End-to-End Speech Translation}},
+  year=2022,
+  booktitle={Proc. Interspeech 2022},
+  pages={106--110},
+  doi={10.21437/Interspeech.2022-59}
 }
 ```
 