Commit

updated arxiv to interspeech
johntsi authored Sep 22, 2022
1 parent 4d61cb3 commit e514944
Showing 1 changed file with 8 additions and 8 deletions.
16 changes: 8 additions & 8 deletions README.md
@@ -1,7 +1,7 @@

# SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

-In this repo you can find the code of the Supervised Hybrid Audio Segmentation (SHAS) method for End-to-End Speech Translation, proposed in [Tsiamas et al. (2022)](https://arxiv.org/abs/2202.04774). You can use our method with pre-trained models to segment a collection of audio files or train and fine-tune our method on your own segmented data. We provide instructions to replicate our results from the paper on MuST-C en-de and mTEDx es-en, fr-en, it-en, pt-en. You can also find easy-to-use implementations of other segmentation methods, like fixed-length, VAD, and the hybrid methods of [Potapczyk and Przybysz (2020)](https://aclanthology.org/2020.iwslt-1.9/), [Gállego et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/), and [Gaido et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/).
+In this repo you can find the code of the Supervised Hybrid Audio Segmentation (SHAS) method for End-to-End Speech Translation, proposed in [Tsiamas et al. (2022)](https://www.isca-speech.org/archive/interspeech_2022/tsiamas22_interspeech.html). You can use our method with pre-trained models to segment a collection of audio files or train and fine-tune our method on your own segmented data. We provide instructions to replicate our results from the paper on MuST-C en-de and mTEDx es-en, fr-en, it-en, pt-en. You can also find easy-to-use implementations of other segmentation methods, like fixed-length, VAD, and the hybrid methods of [Potapczyk and Przybysz (2020)](https://aclanthology.org/2020.iwslt-1.9/), [Gállego et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/), and [Gaido et al. (2021)](https://aclanthology.org/2021.iwslt-1.11/).

Follow the instructions [here](#usage) to segment a collection of audio files, or the instructions [here](#more-extensive-usage) to replicate the results of the paper.

@@ -22,13 +22,13 @@ Speech translation models are unable to directly process long audios, like TED t
If you find SHAS or the contents of this repo useful for your research, please consider citing:

```
-@misc{tsiamas2022shas,
-title={SHAS: Approaching optimal Segmentation for End-to-End Speech Translation},
-author={Ioannis Tsiamas and Gerard I. Gállego and José A. R. Fonollosa and Marta R. Costa-jussà},
-year={2022},
-eprint={2202.04774},
-archivePrefix={arXiv},
-primaryClass={cs.SD}
+@inproceedings{tsiamas22_interspeech,
+author={Ioannis Tsiamas and Gerard I. Gállego and José A. R. Fonollosa and Marta R. Costa-jussà},
+title={{SHAS: Approaching optimal Segmentation for End-to-End Speech Translation}},
+year=2022,
+booktitle={Proc. Interspeech 2022},
+pages={106--110},
+doi={10.21437/Interspeech.2022-59}
}
```
