cantonese model missing word #744

fengping1 · 2025-01-23T07:28:23Z

Checks

This template is only for usage issues encountered.
I have thoroughly reviewed the project documentation but couldn't find information to solve my problem.
I have searched for existing issues, including closed ones, and couldn't find a solution.
I confirm that I am using English to submit this report in order to facilitate communication.

Environment Details

tesla-v100 32G

Steps to Reproduce

Clone the repository
Preprocess the data
Run training
Run inference

✔️ Expected Behavior

The generated audio contains all of the words in the input text.

❌ Actual Behavior

I'm training a Cantonese F5-TTS model on 650 hours of Common Voice Cantonese data. Zero-shot audio quality is quite good, but the model always drops a few words in every sentence. I've checked both the data and the inference code. I'm training on 4 V100 GPUs and have gone from 150k steps to 300k steps (100 epochs), but the problem has not improved. Does anyone know what the issue might be? Should I keep training for more steps or epochs, or should I stop? With my limited resources, further training could take a few days.
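
One way to put a number on "drops a few words" is to compare the input text with a transcript of the generated audio, character by character. The sketch below assumes you already have such a transcript (for example from an ASR model, which is not shown here); the function name `missing_spans` is hypothetical and is not part of F5-TTS.

```python
from difflib import SequenceMatcher
from typing import List


def missing_spans(reference: str, hypothesis: str) -> List[str]:
    """Return the spans of `reference` that have no counterpart in `hypothesis`.

    `reference` is the text sent to the TTS model; `hypothesis` is assumed to be
    a transcript of the generated audio. The comparison is character-level,
    which suits Cantonese text that has no spaces between words.
    """
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    spans = []
    for tag, a_start, a_end, _b_start, _b_end in matcher.get_opcodes():
        # "delete" means the characters exist in the reference only.
        if tag == "delete":
            spans.append(reference[a_start:a_end])
    return spans


if __name__ == "__main__":
    # Placeholder strings; substitute the real input text and transcript.
    print(missing_spans("abcdef", "abdf"))
```

Tracking the number of dropped characters per sentence across checkpoints would show whether additional steps are actually reducing the problem.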

ZhikangNiu · 2025-01-23T07:39:32Z

Maybe it needs more steps.

fengping1 · 2025-01-23T07:54:15Z

Thanks, I'll wait longer.

Alykasym · 2025-01-24T20:42:54Z

Using the "Sample" batch size type instead of "Frame" fixed the missing-word issue for me. The "Frame" type calculation works in mysterious ways in the current code: its size is not adjusted dynamically, and in my case it ignored half of the dataset.
Try training with the "Sample" batch size type and "Max Samples" set to zero (0).
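
To check whether a batching scheme is silently skipping data, it can help to count how many samples actually end up in a batch. The following is a minimal sketch of that check, not F5-TTS's sampler: the function names are hypothetical, and the rule that a sample longer than the frame budget is skipped is an assumption made for illustration.

```python
from typing import List


def frame_batches(lengths: List[int], frame_budget: int) -> List[List[int]]:
    """Group sample indices so each batch's total frame count fits the budget.

    Assumption for illustration: a sample longer than the whole budget can
    never fit in any batch and is skipped.
    """
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for i in order:
        n = lengths[i]
        if n > frame_budget:
            continue  # skipped: this sample alone exceeds the budget
        if used + n > frame_budget:
            batches.append(current)
            current, used = [], 0
        current.append(i)
        used += n
    if current:
        batches.append(current)
    return batches


def sample_batches(lengths: List[int], batch_size: int) -> List[List[int]]:
    """Group sample indices into fixed-size batches, regardless of length."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    return [order[k:k + batch_size] for k in range(0, len(order), batch_size)]


def coverage(batches: List[List[int]], n_samples: int) -> float:
    """Fraction of the dataset that appears in at least one batch."""
    seen = {i for batch in batches for i in batch}
    return len(seen) / n_samples


if __name__ == "__main__":
    lengths = [300, 500, 800, 1200, 2500, 3000]  # made-up frame counts
    print(coverage(frame_batches(lengths, 2000), len(lengths)))
    print(coverage(sample_batches(lengths, 2), len(lengths)))
```

Running the same kind of coverage count against the real dataset's frame lengths and the configured batch size would show whether a large share of samples is being left out of training.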

fengping1 added the "help wanted" label (Extra attention is needed) · Jan 23, 2025
