
Bark with DirectML + Conda + AMD in Windows 11 #271

Open · Milor123 opened this issue May 8, 2023 · 15 comments
Milor123 commented May 8, 2023

Hey guys!! I'm trying to use this project with DirectML on Windows 11, changing `.to(device)` to `.to(dml)` as described in the gpu-pytorch-windows docs, in `generation.py` both in the `bark` folder and in `build\lib\bark\`. When I run the project I can see that the GPU starts correctly, but then I get the error below.

I'm on Python 3.9.16.

Could you help me solve this bug? I don't know what else I could do.
`RuntimeError: Cannot set version_counter for inference tensor`
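
For context, the device swap described above looks roughly like this. This is a minimal sketch, not Bark's actual code: `torch_directml` and its `device()` call come from Microsoft's torch-directml package (not from this repo), and the CPU fallback is only there so the snippet runs on machines without DirectML:

```python
import torch

try:
    # torch_directml is Microsoft's DirectML plugin for PyTorch
    # (pip install torch-directml); it is NOT part of this repo.
    import torch_directml
    dml = torch_directml.device()
except ImportError:
    # Fallback so this sketch still runs without DirectML installed.
    dml = torch.device("cpu")

# Bark moves models and tensors with .to(device); the experiment above
# simply retargets those calls at the DirectML handle:
x = torch.ones(2, 2).to(dml)
print(x.shape)
```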

Console Output:

python .\run.py
No GPU being used. Careful, inference might be very slow!
  0%|                                                                                                                                                | 0/100 [00:00<?, ?it/s]Traceback (most recent call last):
  File "C:\Users\NoeXVanitasXJunk\bark\run.py", line 13, in <module>
    audio_array = generate_audio(text_prompt)
  File "C:\Users\NoeXVanitasXJunk\bark\bark\api.py", line 107, in generate_audio
    semantic_tokens = text_to_semantic(
  File "C:\Users\NoeXVanitasXJunk\bark\bark\api.py", line 25, in text_to_semantic
    x_semantic = generate_text_semantic(
  File "C:\Users\NoeXVanitasXJunk\bark\bark\generation.py", line 460, in generate_text_semantic
    logits, kv_cache = model(
  File "C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\bark\bark\model.py", line 208, in forward
    x, kv = block(x, past_kv=past_layer_kv, use_cache=use_cache)
  File "C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\bark\bark\model.py", line 121, in forward
    attn_output, prev_kvs = self.attn(self.ln_1(x), past_kv=past_kv, use_cache=use_cache)
  File "C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\bark\bark\model.py", line 50, in forward
    q, k ,v  = self.c_attn(x).split(self.n_embd, dim=2)
  File "C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Cannot set version_counter for inference tensor
  0%|                                                                                                                                                | 0/100 [00:00<?, ?it/s]
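
For anyone hitting the same RuntimeError: tensors created under `torch.inference_mode()` are "inference tensors" for which PyTorch skips version-counter bookkeeping, and an op that later tries to set that counter fails with exactly this message. A workaround sometimes suggested for this class of error (untested here with DirectML) is to run generation under `torch.no_grad()` instead; the difference is visible in plain PyTorch:

```python
import torch

# Tensors created inside inference_mode are "inference tensors": PyTorch
# skips version-counter bookkeeping for them, so any later op that tries
# to set the counter raises "Cannot set version_counter for inference tensor".
with torch.inference_mode():
    t = torch.ones(3)
assert t.is_inference()

# no_grad also disables gradient tracking but yields ordinary tensors,
# which is why swapping inference_mode for no_grad can sidestep the error:
with torch.no_grad():
    u = torch.ones(3)
assert not u.is_inference()

print("inference tensor:", t.is_inference(), "| no_grad tensor:", u.is_inference())
```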

NOTE for other users: the warning

C:\Users\NoeXVanitasXJunk\miniconda3\envs\tfdml_plugin\lib\site-packages\torchaudio\backend\utils.py:74: UserWarning: No audio backend is available. warnings.warn("No audio backend is available.")

was solved with `conda install -c conda-forge pysoundfile` and `pip install PySoundFile`.

gkucsko (Contributor) commented May 8, 2023

Sorry, not much experience with AMD...

gkucsko (Contributor) commented May 18, 2023

Gonna close for now since this is more of a feature request.

@gkucsko gkucsko closed this as completed May 18, 2023
Milor123 (Author) commented:
@gkucsko Feature request? Haha, no. Bark simply doesn't work on AMD; that's not a feature. I'm trying to make it functional on an AMD GPU. :/ I'd like to use it, but I don't know how to do a proper port to support AMD.

gkucsko (Contributor) commented May 18, 2023

Well, it's definitely not a bugfix haha. I don't have an AMD GPU, but if you could work on it, I'm sure the community would appreciate it a ton!

Milor123 (Author) commented:
I would love to help by testing or developing something for the community. However, my Python experience is limited to basic scripting, and I have too little expertise in this area to consider myself capable of more than being a tester.

That said, if you propose a way I can assist, I'll gladly try to see what I can do.

gkucsko (Contributor) commented May 18, 2023

Much appreciated. OK, let me re-open and see if anyone who has an AMD GPU has some time to help here!

@gkucsko gkucsko reopened this May 18, 2023
JonathanFly (Contributor) commented:
It's not ROCm, but DirectML seems to be better than nothing for Windows AMD:

JonathanFly#79 (comment)

https://github.com/JonathanFly/bark/tree/bark_amd_directml_test#-bark-amd-install-test-

Somebody who knows what they are doing can probably make it fast.

f1am3d commented Sep 18, 2023

An AMD 7900 XTX with proper optimization can run ML faster than an RTX 4090 (check the tests on Google).
Also, it's $1,000 vs. $1,700.

Any reason this project doesn't support AMD GPUs?

Milor123 (Author) commented:
@f1am3d Don't waste your money on AMD; it's all a mess. You either move to Linux or get a thousand bugs and errors. DirectML should be renamed BuggedML; working with it is awful. Across all the AI projects, nothing works well; everything is broken. And if you do move to Linux for ROCm, it works better than on Windows, but then you have to solve another thousand problems to make it stable.

Buying AMD was my worst disappointment. I was foolish to think it would be straightforward just to save a few bucks. Now, I'm living a nightmare with my graphics card; it's terrible everywhere. Don't waste your time. With an NVIDIA card with half the RAM, you work better and more stably; everything functions properly. With double the RAM on AMD, you're dealing with everything broken, and you'll be complaining, waiting for about 3 years for them to fix everything. By that time, NVIDIA will always have a vastly superior advantage and a much more pleasant user experience. So, if you're reading this and were thinking of buying an AMD graphics card, please don't make the same mistake I did. Greetings!

f1am3d commented Sep 18, 2023

@Milor123 Uh, no. Apparently you have a huge butthurt or "nVidia of the brain".

I already have a Radeon 7900 XTX and it's a great video card by all accounts, especially for gaming and video editing. I haven't had a single problem with drivers or anything else. And the software is much better and more user friendly than the nVidia has.
And the performance in ML on ONNX is the bomb. In Stable Diffusion the performance is 21 iterations/sec (this is while the 4090 only gives a maximum of 22). Compared to AMD, nVidia graphics cards look like garbage, especially considering their inadequate price and power issues.

So you can keep your complaints to yourself and keep giving extra money to a monopolist who will feed you marketing promises year after year.

Milor123 (Author) commented:
> @Milor123 Uh, no. Apparently you have a huge butthurt or "nVidia of the brain".
>
> I already have a Radeon 7900 XTX and it's a great video card by all accounts, especially for gaming and video editing. I haven't had a single problem with drivers or anything else. And the software is much better and more user friendly than the nVidia has. And the performance in ML on ONNX is the bomb. In Stable Diffusion the performance is 21 iterations/sec (this is while the 4090 only gives a maximum of 22). Compared to AMD, nVidia graphics cards look like garbage, especially considering their inadequate price and power issues.
>
> So you can keep your complaints to yourself and keep giving extra money to a monopolist who will feed you marketing promises year after year.

Surely you don't WORK with it and only use it for fun. For gaming AMD is okay, very nice, but only for that; for real work in real life it's terrible. ML on ONNX is buggy. If you want to use it for AI like Stable Diffusion, you can only generate images; if you're someone who just relaxes by generating images, fine, but you can't use all the core tools because everything is BUGGED, and the third-party tools have no support.

Comparing image generation is rather silly when you don't tell users that nothing really works. You can only generate images, and that's it. But you never inform them about the numerous issues you have just to run the DirectML rubbish. You never mention the problems it has in managing maximum RAM usage, which leads to constant crashes. Maybe you haven't noticed because you have a top-of-the-line graphics card with plenty of VRAM. However, for people like us with more modest VRAM, like 12 GB, it turns out that NVIDIA's half-VRAM cards perform much better in Windows and don't keep breaking. Not to mention that the real magic is in the additional tools, not just generating images with a prompt. It's about using all the models, half of which don't work or break while you're trying to generate images.

If you haven't noticed or haven't had any problems with your graphics card, then it's because you definitely don't use it for serious work, and you haven't been able to appreciate that the competition with our graphics cards does it much better. Just by looking at the comparisons made by mediocre and inexperienced YouTubers who only encourage purchases because the colorful bars in the graphics comparison charts look similar, it doesn't mean they are even close to the maturity of their technologies. AMD needs many years to mature ROCm, both on Linux and Windows (Not to mention on Windows, it will take who knows how long until they release a useful implementation with AI).

When I compare AMD to NVIDIA and say it's garbage, it's because I'm comparing it at a professional level, in a productivity context, not for silly games like 90% of YouTubers do. People who generate images and do other things with AI don't want to come and solve a thousand errors with their graphics cards. I have a 6750 XT and have been using a computer and solving problems in Windows and Linux since I was 12 years old; I'm not exactly a novice at this. I know how to solve problems, and even for me, AMD has been a headache.

You can't speak from your perspective just because you have the latest graphics card and don't have problems. Because the majority of users are indeed experiencing thousands of problems. Just sit down and read the threads of the Stable Diffusion fork with DirectML, and tell me how many errors users of the 6000 series, 5000 series, and older series haven't had. A 3060 for a long time worked better than our graphics cards. Even the 7000 series was born without support, and they are still halfway fixing some things now, and they still work like crap because it depends on DirectML, not just AMD. Compatibility will only be really good when they manage to fully implement ROCm. In the meantime, anyone looking for something at a professional level should be prepared to suffer.

In the multimedia section on Linux, working with AMF is a pain, so I don't have much to say. Someone who is focused on work and saving time and not wasting it solving problems definitely can't think about saving $50 and getting a headache.

f1am3d commented Sep 18, 2023

@Milor123
Listen, the same thing was said about the Vulkan API when it was released (AMD is a member of the Khronos Group, by the way).

So, when it appeared, there were a lot of complaints about it: that it was inconvenient, that there were a lot of errors, that it was unstable, that performance was worse, and in general "why do we need it when we have DirectX".

But what about now? Now Vulkan is a standard of graphic development, which is being integrated everywhere where it's possible, because developers learned how to work with it and realized how to use its advantages to the maximum.

The same will be with ROCm (and probably with DirectML).

p.s. Vulkan is the next step in the development of Mantle from AMD.

f1am3d commented Sep 18, 2023

@Milor123 And yep, it is possible to use AMD effectively for SD: https://youtu.be/t4J_KYp0NGM

> You can't speak from your perspective just because you have the latest graphics card and don't have problems. Because the majority of users are indeed experiencing thousands of problems. Just sit down and read the threads of the Stable Diffusion fork with DirectML, and tell me how many errors users of the 6000 series, 5000 series, and older series haven't had.

True, but most of the errors are because of poor code quality, not because of the API itself or a GPU.

JonathanFly (Contributor) commented Sep 22, 2023

AMD is getting better over time. It wasn't too long ago that there was no way to run any new AI project. Now they can sometimes run... with limitations, if somebody puts in the work. But if you're buying a GPU specifically for AI, it's a tough sell. If you know 100% that your main use case works great on AMD, then the 24GB AMD GPU is the one place it could make sense, because it's so much cheaper than NVIDIA. But for the common user who likes to try out random new AI projects, you just can't get by without CUDA.

Just today I checked on DirectML and it hasn't been updated since June when I first used it. If all they did was fix the bugs so you could really take any project and make it at least work on AMD, being slow would be a fine tradeoff to make. But it's not that easy in my experience, you do need to change the code too. I probably wouldn't have tried to make Bark work if I knew how much trouble it would actually end up being. It did work partially in the end, and even a non expert like me could get it working which is still kind of impressive in a way. But such a headache.

If I had time maybe I'd try Bark in ONNX but this type of work is way outside my wheelhouse, still just hoping some library like HuggingFace or GGML makes Bark on AMD effortless.

PABannier commented:

Hello all!

We have a port, bark.cpp, which runs fast on the CPU. We're still working hard on GPU support (95% complete for NVIDIA via ggml's CUDA kernels, and potentially CLBlast for AMD). I'd love to have your input if you could give it a try :)
