Very detailed tutorial up-to-date #8
Comments
Hello @FurkanGozukara, do you know if he is maintaining it? I tried opening the ozen env, then uninstalling torch and installing the version you suggested. It did not install successfully, so I went back to the latest torch. I did not touch the pyannote version, as someone here said for an unrelated project that you can apparently ignore the warnings: My problem is that it is making only 2 wav files from a 17-minute audio file. What can I do to solve this? I am getting only 2 wav files, not 300 files like you... |
Follow my tutorial, but most importantly use the GitHub repo I shared in the tutorial; it still works. About the training sound files: there are configuration settings in the ozen toolkit. Change them one by one and you will find the pattern in how it splits the sound. Sadly I didn't cover it in the tutorial at that time. Try to make the training sounds between 2 and 15 seconds long. |
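A quick way to check the 2 to 15 second target is to measure the clips after splitting. This is only a minimal sketch using Python's standard `wave` module, which reads PCM WAV files; the helper names `wav_duration_seconds` and `clips_outside_range` and the folder path are hypothetical and not part of the ozen toolkit.

```python
import wave
from pathlib import Path

MIN_SECONDS = 2.0   # lower bound suggested in this thread
MAX_SECONDS = 15.0  # upper bound suggested in this thread


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def clips_outside_range(folder, min_s=MIN_SECONDS, max_s=MAX_SECONDS):
    """List (file name, duration) pairs for clips outside the target range."""
    flagged = []
    for wav_path in sorted(Path(folder).glob("*.wav")):
        duration = wav_duration_seconds(wav_path)
        if not (min_s <= duration <= max_s):
            flagged.append((wav_path.name, round(duration, 2)))
    return flagged


if __name__ == "__main__":
    # Hypothetical path: point this at the folder holding the split wav files.
    for name, seconds in clips_outside_range("path/to/wavs"):
        print(f"{name}: {seconds} s")
```

If nearly all of the audio ends up in one very long clip, that clip will show up here right away, which suggests the splitting settings need changing rather than the training setup.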
Thanks a lot for the quick answer @FurkanGozukara. Do you think it has to do with the frequency of the guy's speech? Maybe he never stops, so the program thought it was one whole long unique sentence of 17 minutes lol? What are the config settings I could change, and where are they? Are they in \ozen-toolkit\config.ini? [DEFAULT] Do you think this is what I should try to modify, or were you talking about something else? |
Yes, this is accurate: "Do you think it has to do with the frequency of the guy's speech? Maybe he never stops, so the program thought it was one whole long unique sentence of 17 minutes lol?" Play with these settings to see if you can improve it: valid_ratio = 0.2 |
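For trying the settings one at a time, the file can be edited by hand or from a small script. Below is a minimal sketch with Python's standard `configparser`; it assumes only what is shown in this thread, namely a `[DEFAULT]` section containing `valid_ratio`. The helper name `set_valid_ratio` is hypothetical, and any other keys in the real config.ini are not shown here.

```python
import configparser
import io


def set_valid_ratio(config_text, new_ratio):
    """Return (old ratio, updated config text) after changing valid_ratio in [DEFAULT]."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    old_ratio = parser["DEFAULT"].getfloat("valid_ratio")
    parser["DEFAULT"]["valid_ratio"] = str(new_ratio)
    buffer = io.StringIO()
    parser.write(buffer)  # note: configparser drops comments when writing
    return old_ratio, buffer.getvalue()


if __name__ == "__main__":
    sample = "[DEFAULT]\nvalid_ratio = 0.2\n"  # the only key quoted in this thread
    old, updated = set_valid_ratio(sample, 0.1)
    print("old valid_ratio:", old)
    print(updated)
```

Changing one value per run and noting how many wav files come out makes it easier to see which setting affects the splitting.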
Also, what I don't understand is that one or two sentences go into the "train" folder, and the rest (the whole 17 minutes) goes into one very long sentence inside the "valid" folder? I am so confused. |
Yes, it splits the data so that the model is both trained and tested during training. That is expected. |
Can you try with this video (with him speaking) and see what works best for you, please? |
The problem is that it put 5% of the audio into train.txt and the remaining 95% of the audio (from the video) into valid.txt. That's strange, no? |
Btw, what training speech file did you use? I would use it to make sure it is working right for me, then try other types of training files. This will ensure I don't have any unrelated errors. |
Hey @FurkanGozukara, I tried a new video with a guy who speaks LESS RAPIDLY lol,
thanks |
1: Sadly I don't know. You can try those ozen configs. |
Hello, thanks for your answer. It allowed me to keep my general Python version and add this one. Btw, you should make a tutorial about how to manage multiple different versions of Python, especially how to manage the PATH. Now in my base env I can only use my previous general Python when typing python; I can't use py3.10.9 by just typing python, as this version only works inside the venv (env3109). Anyway, then I tried the next line of code you showed:
I got multiple errors such as: Then I continued following your instructions: This line: I finally tried the script: And it took very long, despite having a good video card. This was tiring. I am still on this tutorial and haven't finished it, and I am still having problems and errors. I think the problem might have been with the torch installation, but your line of code (pip3 install torch==1.13.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117) does not work INSIDE conda. Do you know what the right line of code would be inside conda? As I said, I tried several other alternatives but they did not work, such as:
|
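Before retrying different install commands, it can help to see what the active environment actually has. This is a minimal sketch, not taken from the tutorial: the helper name `describe_torch` is hypothetical, and it uses only `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`. Run it with the same interpreter (venv or conda env) that runs the training script.

```python
import importlib.util
import sys


def describe_torch():
    """Report whether torch is importable here and whether it can see a CUDA GPU."""
    info = {"python": sys.version.split()[0], "torch_installed": False}
    if importlib.util.find_spec("torch") is None:
        return info  # torch is not installed for this interpreter
    import torch

    info["torch_installed"] = True
    info["torch_version"] = torch.__version__
    info["cuda_build"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch().items():
        print(f"{key}: {value}")
```

If `cuda_available` comes back False on a machine with a good video card, a CPU-only torch build is one possible reason the script is very slow.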
I hate conda; I prefer plain Python. You can have multiple Pythons. I have a video for that; please watch this video for venvs. |
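When several Python versions are installed, a short check shows which interpreter a given `python` command really starts. A minimal sketch using only the standard library; the helper name `interpreter_report` is hypothetical. The `sys.prefix != sys.base_prefix` comparison detects environments made with `venv`; it does not detect conda environments.

```python
import sys


def interpreter_report():
    """Describe the interpreter that is running this script."""
    return {
        "executable": sys.executable,  # full path of the active python binary
        "version": "%d.%d.%d" % sys.version_info[:3],
        "in_venv": sys.prefix != sys.base_prefix,  # True inside a venv
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running it once in the base shell and once with the venv activated shows whether the expected version (for example 3.10.9 in the thread above) is the one being used.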
Thank you! |
It took more than a week :D I hope you consider supporting me on Patreon. |
OMG! A week, it must have been HELL lool. About DLAS being spyware? I don't understand; it got me worried a bit. Why do you guys think it looks like spyware? I don't get it. |
I made a pull request too, please accept it if possible: #7
Master Deep Voice Cloning in Minutes: Unleash Your Vocal Superpowers! Free and Locally on Your PC
This tutorial is based on
Ozen Toolkit for data preprocessing
DLAS for Training
Tortoise TTS Fast for speech synthesis