Checks
This template is only for usage issues encountered.
I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
I have searched for existing issues, including closed ones, and couldn't find a solution.
I confirm that I am using English to submit this report in order to facilitate communication.
Environment Details
Tesla V100 (32 GB)
Steps to Reproduce
Clone the code.
Process the data.
Run training.
Run inference.
✔️ Expected Behavior
The generated audio should contain all of the words in the input text.
❌ Actual Behavior
I’m training a Cantonese F5-TTS model on 650 hours of Common Voice Cantonese data. The zero-shot audio quality is quite good, but the model always misses a few words in every sentence. I’ve checked both the data and the inference code. I’m training on 4 V100 GPUs and have gone from 150k to 300k steps (100 epochs), but the problem has not improved. Does anyone know what the issue might be? Should I wait for more steps or epochs, or should I stop training? Due to limited resources, continuing could take a few days.
Using "Sample" type batches instead of "Frame" type helped me to fix the issue with missing word. The "frame" type calculation works in mysteries ways in the current code, and its size is not adjusted dynamically, and ignores half of the dataset in my case.
Try training with the "Sample" batch size type and "Max Samples" set to Zero (0).
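For intuition, here is a minimal sketch of why a frame-budget batching scheme can ignore part of a dataset while a sample-count scheme cannot. This is not the repository's actual sampler code; the function names, the frame budget, and the skipping behavior are assumptions made purely for illustration.

```python
# Illustrative sketch only -- not F5-TTS's actual batch sampler.
# It shows how a fixed frame budget can silently drop long utterances,
# while a per-sample batch size keeps the whole dataset.

def frame_type_batches(frame_lengths, max_frames_per_batch):
    """Group sample indices so each batch stays under a frame budget.

    Utterances longer than the budget never fit in any batch and are
    skipped, which is one way part of a dataset can go unseen in training.
    """
    batches, current, used = [], [], 0
    for idx, n_frames in enumerate(frame_lengths):
        if n_frames > max_frames_per_batch:
            continue  # too long for the budget -> silently dropped
        if used + n_frames > max_frames_per_batch and current:
            batches.append(current)
            current, used = [], 0
        current.append(idx)
        used += n_frames
    if current:
        batches.append(current)
    return batches


def sample_type_batches(frame_lengths, samples_per_batch):
    """Group indices into fixed-size batches by sample count: nothing is dropped."""
    indices = list(range(len(frame_lengths)))
    return [indices[i:i + samples_per_batch]
            for i in range(0, len(indices), samples_per_batch)]


if __name__ == "__main__":
    # Hypothetical utterance lengths in mel frames (some long clips).
    lengths = [800, 1200, 3500, 900, 4200, 700]

    frame_batches = frame_type_batches(lengths, max_frames_per_batch=2400)
    sample_batches = sample_type_batches(lengths, samples_per_batch=2)

    covered = {i for batch in frame_batches for i in batch}
    print("frame-type covers", len(covered), "of", len(lengths), "samples")      # 4 of 6
    print("sample-type covers", sum(len(b) for b in sample_batches), "samples")  # 6
```

Note that with sample-type batching the memory per batch varies with utterance length, so you may need a smaller per-GPU batch size to stay within V100 memory.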