
Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing

[Paper]


Cite

This repository contains the source code for the paper Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing by E. Sarkar and M. Magimai Doss, accepted at IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2025.

Please cite the original authors in any publication that uses this work:

@INPROCEEDINGS{Sarkar_ICASSP_2025,
         author = {Sarkar, Eklavya and Magimai-Doss, Mathew},
          title = {Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing},
      booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025)},
           year = {2025},
}

Datasets

InfantMarmosetsVox is a publicly available marmoset dataset for multi-class call-type and caller identification. It also provides a ready-to-use PyTorch Dataset and DataLoader. Any publication (e.g. conference paper, journal article, technical report, book chapter) resulting from the use of InfantMarmosetsVox must cite its own release paper:

@inproceedings{sarkar23_interspeech,
  title     = {Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?},
  author    = {Eklavya Sarkar and Mathew Magimai.-Doss},
  year      = {2023},
  booktitle = {INTERSPEECH 2023},
  pages     = {1189--1193},
  doi       = {10.21437/Interspeech.2023-1968},
  issn      = {2958-1796},
}

The Watkins dataset is publicly available, and the Mescalina Bark ID Data Base can be obtained upon request from its original authors.
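Since InfantMarmosetsVox ships with a PyTorch Dataset and DataLoader, downstream code can iterate over it directly. As a rough illustration of the map-style Dataset protocol such a class follows (a pure-Python sketch with hypothetical field names, not the dataset's actual API):

```python
# Minimal sketch of the map-style Dataset protocol: __len__ and __getitem__
# are all a PyTorch DataLoader needs. Field names (waveform, call_type,
# caller_id) are hypothetical, not the dataset's actual schema.

class MarmosetCallDataset:
    def __init__(self, records):
        # records: list of (waveform, call_type, caller_id) tuples
        self.records = records

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        waveform, call_type, caller_id = self.records[idx]
        return {"waveform": waveform,
                "call_type": call_type,
                "caller_id": caller_id}


# Example usage with dummy data
data = [([0.0, 0.1, -0.1], "phee", 3), ([0.2, 0.0], "twitter", 1)]
ds = MarmosetCallDataset(data)
print(len(ds))  # number of calls in the dataset
```

A class with this interface can be passed to `torch.utils.data.DataLoader` as-is for batching and shuffling.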

Installation

This package has few requirements. To create a new environment, install conda and then mamba, and follow these steps:

# Clone project
git clone https://github.com/idiap/ssl-human-animal
cd ssl-human-animal

# Create and activate environment
mamba env create -f environment.yml
mamba activate animal_env

Usage

To use this repository, you must first configure the paths to your datasets in configs/paths/default.yaml. Make sure to edit only the variables listed under PATHS TO MODIFY.
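For example, the entries under PATHS TO MODIFY might look like the following (the keys shown here are illustrative; use the actual variable names listed in configs/paths/default.yaml):

```yaml
# PATHS TO MODIFY (illustrative keys -- edit to match your setup)
data_dir: /path/to/datasets      # root folder containing the datasets
log_dir: /path/to/logs           # where training logs are written
output_dir: /path/to/outputs     # where features/checkpoints are stored
```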

The experiments conducted in the paper can then be found in the scripts folder, which contains the feature extraction and training scripts.

Sample run:

$ ./scripts/feature_extraction/wavlm.sh
$ ./scripts/train/wavlm.sh

These use gridtk but can be reconfigured according to the user's needs.
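The feature-extraction scripts dump frame-level SSL representations that are then used for downstream training. A common way to turn such frame-level features into a fixed-size utterance embedding is mean pooling over time; a minimal pure-Python sketch of that operation (the repository's actual scripts operate on tensors from pretrained models and may pool differently):

```python
# Mean-pool a (num_frames x feat_dim) list of frame-level feature vectors
# into a single utterance-level embedding. Pure-Python sketch; real
# pipelines would do this on torch tensors.

def mean_pool(frames):
    """frames: list of equal-length feature vectors (lists of floats)."""
    if not frames:
        raise ValueError("no frames to pool")
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


# Example: 3 frames of 2-dimensional features
emb = mean_pool([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(emb)  # [3.0, 4.0]
```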

You can also train an individual model with a chosen experiment configuration from configs/experiment/:

python src/train.py experiment=experiment_name.yaml

You can override any parameter from the command line like this:

python src/train.py trainer.max_epochs=20
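Conceptually, a Hydra command-line override like `trainer.max_epochs=20` walks a dotted key path into the composed config and replaces the leaf value. A pure-Python sketch of that semantics (not Hydra's actual implementation, which also handles type coercion, interpolation, and sweeps):

```python
# Illustrates dotted-path override semantics, as in
# `python src/train.py trainer.max_epochs=20`.
# Pure-Python sketch, not Hydra/OmegaConf internals.

def apply_override(cfg, override):
    """Apply a single 'dotted.key=value' override to a nested dict in place."""
    path, _, raw = override.partition("=")
    keys = path.split(".")
    node = cfg
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    # naive type coercion: try int, else keep the string
    try:
        value = int(raw)
    except ValueError:
        value = raw
    node[keys[-1]] = value
    return cfg


cfg = {"trainer": {"max_epochs": 10, "accelerator": "gpu"}}
apply_override(cfg, "trainer.max_epochs=20")
print(cfg["trainer"]["max_epochs"])  # 20
```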

Directory Structure

The structure of this directory is organized as follows:

.
├── CITATION.cff            # Citation metadata
├── configs                 # Experiment configs
├── environment.yaml        # Environment file
├── hydra_plugins           # Plugins
├── img                     # Images
├── LICENSE                 # License
├── Makefile                # Setup
├── MANIFEST.in             # Setup
├── pyproject.toml          # Setup
├── README.md               # This file
├── requirements.txt        # Requirements
├── scripts                 # Scripts
├── setup.py                # Setup
├── src                     # Python source code
└── version.txt             # Version

Contact

For questions or to report issues with this software package, please contact the first author.
