
Bump pytorch to 2.0 for AMD Users on Linux #10465

Merged
AUTOMATIC1111 merged 3 commits into AUTOMATIC1111:dev from
baptisterajaut:master
May 18, 2023

Conversation

@baptisterajaut
Contributor

@baptisterajaut baptisterajaut commented May 17, 2023

Describe what this pull request is trying to achieve.
This pull request gets PyTorch working again for AMD users. It seems the torch 1.13 and matching torchvision builds are no longer available on the PyTorch repos, so bumping to this version makes installation work again.

ERROR: Could not find a version that satisfies the requirement torch==1.13.1+rocm5.2 (from versions: 1.13.0, 1.13.1, 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==1.13.1+rocm5.2

Additional notes and description of your changes

Weirdly, torch 2 didn't seem to work before, but this version does. Maybe it requires ROCm 5.4.2 or above?
I also pushed this to master so the webui would work again for everyone coming in.
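For reference, the bump boils down to pointing the install command at the ROCm wheel index. A minimal sketch of an equivalent manual override — assuming the webui reads TORCH_COMMAND from the environment; the exact version pins are taken from a working setup reported later in this thread and may need adjusting for your card:

```shell
# Sketch: pin the torch 2.0 ROCm wheels via TORCH_COMMAND before launching.
# Versions and index URL are assumptions based on the rocm5.4.2 wheel index;
# adjust them for your card and ROCm install.
export TORCH_COMMAND="pip install torch==2.0.0+rocm5.4.2 torchvision==0.15.1+rocm5.4.2 --index-url https://download.pytorch.org/whl/rocm5.4.2"
echo "$TORCH_COMMAND"
```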

Environment this was tested in

  • OS: Arch Linux, up to date (kernel 6.2.10-xanmod1, rocm-cmake-5.4.3-1 and related packages)
  • Browser: Opera
  • Graphics card: AMD RX6900XT

So apparently it works now? Before you would get "PyTorch can't use the GPU", but not anymore.
If only I proofread what I wrote.
@AUTOMATIC1111
Owner

Since I cannot verify any of this I'd like some comments from AMD users.

@baptisterajaut
Contributor Author

[Screenshot: console output showing torch 2.x with ROCm in use.]

@AUTOMATIC1111
Owner

well, yeah, but maybe it works on your card and is fucked on another

@JeffreyBytes

I haven't tested this PR, but for what it's worth I've been using: TORCH_COMMAND="pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2"

Ubuntu 22.04.2 LTS
RX 6700 XT

pip list:

torch                   2.0.0+rocm5.4.2
torchaudio              2.0.1+rocm5.4.2
torchvision             0.15.1+rocm5.4.2

@AUTOMATIC1111 AUTOMATIC1111 changed the base branch from master to dev May 18, 2023 07:26
@AUTOMATIC1111 AUTOMATIC1111 merged commit 7fd8095 into AUTOMATIC1111:dev May 18, 2023
@Enferlain

Enferlain commented May 20, 2023

6800 XT works. Tried with --opt-split-attention, about 2x or a bit more faster than the Colab free GPU; hires fix is slow af tho.

I'm getting OOM when doing hires fix: DPM++ 2M Karras, 20 steps, 640x1024, 1.5x nearest-exact, 0.55 denoise.

torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 6.33 GiB (GPU 0; 15.98 GiB total capacity; 3.78 GiB already allocated; 6.47 GiB free; 9.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_HIP_ALLOC_CONF

Happens even when I use --opt-sdp-attention, which keeps my RAM usage hovering between 4-8 GB.
Tried using this line export PYTORCH_CUDA_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128" but same result.
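A side note on the variable name: the OOM message above points at PYTORCH_HIP_ALLOC_CONF, which is the name ROCm builds of PyTorch read, so — an untested guess, not a confirmed fix — the same options may need to go under that name instead:

```shell
# Untested guess: on ROCm builds the allocator options are read from
# PYTORCH_HIP_ALLOC_CONF (the name the OOM message above references),
# not PYTORCH_CUDA_ALLOC_CONF. Same option string as the attempt above.
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:128"
echo "$PYTORCH_HIP_ALLOC_CONF"
```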

This behavior reminds me of what happened on the colab gpus when I first tried torch 2.0 on them.

I can go through with this setup --opt-split-attention-invokeai --medvram

[Screenshot: generation speed readout for this setup.]

Compared to a Colab T4:

  • normal: 2 it/s
  • hires: 1.7 s/it
  • total: 1.17 s/it, 46 sec

If these results can be consistent, that would be great. Unfortunately, the VRAM usage during hires fix with this setup is ~14500 MB, meaning if I were to load extensions such as ControlNet, I would crash from running out of memory.

@olinorwell

well, yeah, but maybe it works on your card and is fucked on another

Unfortunately confirmed... for RX 5000 series owners, to be precise.

Manual downgrading isn't working: unfortunately the PyTorch webpage lists previous versions, but when you try to install them they aren't found. We need to figure out how to get a version of torch 1.13, I think, unless somebody can find a workaround to get torch 2 working on this range of cards.

@DGdev91
Contributor

DGdev91 commented Jun 5, 2023

Hi everybody, I'm the guy who wrote the comment "# AMD users will still use torch 1.13 because 2.0 does not seem to work." which was removed by this PR (see #9404).
I confirm that I made that PR because of the problems I had with my 5700XT, and I confirm that there are still issues on that series. Most likely it's because Navi1 and Navi2 cards only run thanks to the "HSA_OVERRIDE_GFX_VERSION=10.3.0" workaround.
That workaround is sadly still needed (I get a segmentation fault without it) but probably conflicts with pytorch 2.
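For anyone who hasn't used it, the workaround described above is just an environment variable set before launch. A minimal sketch — the webui.sh launch line is an assumption about your checkout, and the value is the one quoted in the comment above:

```shell
# Sketch of the Navi1/Navi2 workaround described above: spoof the gfx
# version so ROCm treats the card as gfx1030 (value from the comment).
export HSA_OVERRIDE_GFX_VERSION=10.3.0
# ./webui.sh   # launch afterwards; per the comment, omitting the override segfaults
echo "$HSA_OVERRIDE_GFX_VERSION"
```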

Anyway, I agree that we shouldn't prevent every AMD card from using pytorch 2 (...even if it's already possible to force TORCH_COMMAND manually) if the problem is only on older cards.

I've come up with a solution which should be fine for everyone, or at least I hope: #11048

@DGdev91
Contributor

DGdev91 commented Jun 6, 2023

Manual downgrading isn't working: unfortunately the PyTorch webpage lists previous versions, but when you try to install them they aren't found. We need to figure out how to get a version of torch 1.13, I think, unless somebody can find a workaround to get torch 2 working on this range of cards.

If you are on Python 3.11, that's probably because the older PyTorch builds are for Python 3.10 only.

You can try with a conda env:

conda create -p /your_condaenv_path python=3.10.11
ln -sf /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /your_condaenv_path/lib/libstdc++.so.6
conda activate /your_condaenv_path
./webui.sh

The ln command is a workaround to make the TCMalloc code work.
