Add new augmentations: lowpass filtering, lossy compression (opus/mp3/vorbis) #1451
base: master
Signed-off-by: Roman Korostik <[email protected]>
lhotse/augmentation/codec.py
Outdated
f.seek(0)
samples_compressed, rate_compressed = sf.read(
    f, always_2d=True
)  # TODO: handle possible sample rate change with the opus codec?
When you write OPUS files with soundfile, it adds extra information in the file header about the original sampling rate, so that when you load this file with soundfile later, it's resampled from 48k before returning the audio array (unlike most other tools).
Nice, did not know that
Noted
lhotse/augmentation/lowpass.py
Outdated
from typing import Literal

import numpy as np
import scipy.signal
Since scipy is an optional lhotse dependency, please move this import inside __call__, and add an import guard (from lhotse.utils) before it:
if not is_module_available("scipy"):
    raise ImportError("In order to use Lowpass transforms, run 'pip install scipy'")
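For illustration, here is a minimal sketch of the guarded lazy-import pattern requested above. It is not the code from this PR: the local `is_module_available` is a stand-in whose behavior is assumed to resemble the helper in lhotse.utils, and the `Lowpass` class, its `frequency` field, and the Butterworth filter design are hypothetical choices made only to keep the example self-contained.

```python
import importlib.util


def is_module_available(*modules: str) -> bool:
    # Stand-in for the lhotse.utils helper (assumed behavior):
    # True only if every named top-level module can be located.
    return all(importlib.util.find_spec(m) is not None for m in modules)


class Lowpass:
    # Hypothetical transform; the name and parameters are illustrative only.
    def __init__(self, frequency: float):
        self.frequency = frequency

    def __call__(self, samples, sampling_rate: int):
        # Guard first, so a missing optional dependency gives a clear message.
        if not is_module_available("scipy"):
            raise ImportError(
                "In order to use Lowpass transforms, run 'pip install scipy'"
            )
        # Import locally, so importing this module never requires scipy.
        import scipy.signal

        # Assumed filter design: 4th-order Butterworth in second-order sections.
        sos = scipy.signal.butter(
            4, self.frequency, btype="lowpass", fs=sampling_rate, output="sos"
        )
        return scipy.signal.sosfilt(sos, samples, axis=-1)
```

With this layout, constructing `Lowpass(frequency=3500.0)` works even when scipy is not installed. The ImportError is raised only when the transform is actually applied.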
Done
lhotse/augmentation/lowpass.py
Outdated
import numpy as np
import scipy.signal
import soundfile as sf
Can you also move all soundfile imports to local function scopes? IIRC importing this globally used to silently break documentation builds.
Done
Great work! Please also add unit tests for calling transform methods on Recording and Cut, and for calling cut_transforms on CutSet.
…pressing with opus" This reverts commit 479aaf4.
…uested during review
Thanks for the review, addressed comments. Will add tests a bit later.
This PR adds two operations on Cuts/Recordings and corresponding randomized CutSet transforms. Both would be useful for training robust ASR models, speech enhancement models, etc.
TODO
lhotse.audio.recording.Recording