Conversation
(cherry picked from commit 7eadbdc)
Change the order from `model.eval().to(device)` to `model.to(device).eval()` to ensure that the model is first moved to the correct device and then set to evaluation mode.
Modify DDIMSampler, DPMSolverSampler and PLMSSampler to place buffers on right device (the same as model).
Remove model.cuda() from model loading functions.
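The reordering in the first commit can be illustrated with a minimal sketch (the toy `nn.Linear` model and the `cpu` device are assumptions for illustration; on Gaudi the device would be `"hpu"`):

```python
import torch
import torch.nn as nn

device = torch.device("cpu")  # stand-in; on Gaudi this would be torch.device("hpu")

model = nn.Linear(4, 2)

# Before: model.eval().to(device) switched to eval mode first, then moved.
# After: move to the target device first, then switch to evaluation mode.
model = model.to(device).eval()

# eval() disables training-only behaviors (dropout, batch-norm running-stat updates)
print(model.training)  # False
```

Both orderings return the same module object (`.to()` and `.eval()` each return `self`), so the change only fixes the sequence of side effects, not the final state of this toy example.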
Feat/hpu support br
Running the code (we have also added it to README.md):

```python
from torch import autocast
import time

from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "CompVis/stable-diffusion-v1-4"
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipe = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion",
)

from habana_frameworks.torch.utils.library_loader import load_habana_module
from optimum.habana.transformers.modeling_utils import adapt_transformers_to_gaudi

load_habana_module()
# Adapt transformers models to Gaudi for optimization
adapt_transformers_to_gaudi()

pipe = pipe.to("hpu")

prompt = "a photo of an astronaut riding a horse on mars"
with autocast("hpu"):
    t1 = time.perf_counter()
    outputs = pipe(
        prompt=[prompt],
        num_images_per_prompt=2,
        batch_size=4,
        output_type="pil",
    )
    print(f"Time taken: {time.perf_counter() - t1:.2f}s")
```

gives us:

```
[INFO|pipeline_stable_diffusion.py:610] 2025-02-20 12:21:28,751 >> Speed metrics: {'generation_runtime': 99.9386, 'generation_samples_per_second': 0.735, 'generation_steps_per_second': 36.744}
Time taken: 168.90s
```
README.md (Outdated)

```python
        num_images_per_prompt=2,
        batch_size=4,
        output_type="pil",
    )
```
I do not see the generated image?
Also, the generation time is somewhat long. I recall it was faster in a demo I have seen!
Could you also compare to the CPU time?
When running the generation code a second time, the elapsed time looks okay.
Just compare with the CPU generation time.
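On the second run being faster: with `use_hpu_graphs=True` the first call includes graph compilation, so a hedged way to time steady-state throughput is to discard warm-up iterations. A generic sketch (the `time_call` helper and the stand-in workload below are illustrative, not part of the PR; in the PR's setting `fn` would wrap the `pipe(...)` call):

```python
import time

def time_call(fn, warmup=1, iters=3):
    """Average wall-clock time of fn(), discarding warm-up runs
    (on HPU the first run typically includes graph compilation)."""
    for _ in range(warmup):
        fn()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - t0) / iters

# Cheap stand-in workload; replace with lambda: pipe(prompt=[prompt], ...) to
# compare HPU vs. CPU pipelines under the same harness.
avg_s = time_call(lambda: sum(range(100_000)))
print(f"avg: {avg_s:.6f}s")
```

Running the same harness once with the HPU pipeline and once with a CPU-backed one would make the comparison requested above apples-to-apples.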
@orionsBeltWest Image generation has been added; the image is saved to a file with:

```python
with autocast("hpu"):
    t1 = time.perf_counter()
    upscaled_image = pipe(
        prompt=[prompt],
        num_images_per_prompt=2,
        batch_size=4,
        output_type="pil",
    ).images[0]
    upscaled_image.save("astronaut_rides_horse.png")
    print(f"Time taken: {time.perf_counter() - t1:.2f}s")
```
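Note that indexing `.images[0]` keeps only the first of the `num_images_per_prompt=2` results. A small sketch of naming every image instead, using a stand-in output object (the `FakeOutput` class and the placeholder strings are hypothetical; the real pipeline returns an object whose `.images` attribute is a list of PIL images):

```python
class FakeOutput:
    """Hypothetical stand-in for a diffusers-style pipeline output."""
    def __init__(self, images):
        self.images = images

outputs = FakeOutput(images=["<PIL image 0>", "<PIL image 1>"])

# Enumerate so each generated image gets its own filename rather than
# overwriting a single astronaut_rides_horse.png
filenames = [f"astronaut_rides_horse_{i}.png" for i, _ in enumerate(outputs.images)]
print(filenames)  # ['astronaut_rides_horse_0.png', 'astronaut_rides_horse_1.png']
```

With real pipeline output, each `img.save(name)` call would replace the single `upscaled_image.save(...)` above.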
Dockerfile.hpu (Outdated)

```dockerfile
    && pip install -e /workspace/sd/src/taming-transformers

# Clone and install CLIP
RUN git clone --depth 1 https://github.com/openai/CLIP.git /workspace/sd/src/clip \
```
Should you add a Gaudi-specific CLIP, or is this the standard CLIP?
This section has been removed from the Dockerfile. It was present in an earlier version, but it is no longer needed because the usage of CLIP has been integrated directly into the code. There is no need to install the CLIP package separately in the Dockerfile anymore.
```
python3 scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --precision full
```

The above command is throwing a segfault.
@orionsBeltWest Did you follow the README section below? Or could you show the error log? I can't reproduce the same problem.

> We provide a reference sampling script, which incorporates
> After obtaining the and sample with
The instructions are not clear. The weight links point to datasets, not checkpoints. Could you list the steps explicitly?
@orionsBeltWest
If it's still crashing, try using
Downloaded the weights using:

```
conda env create -f environment.yaml
```

But still segfault.
No more segfault, but now a `ModuleNotFoundError` even after `pip install pytorch-lightning`:

```
File "/workspace/sd/ldm/models/diffusion/ddpm.py", line 19, in
ModuleNotFoundError: No module named 'pytorch_lightning.utilities.distributed'
```
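That module path was removed in newer pytorch-lightning releases; `rank_zero_only`, for example, moved to `pytorch_lightning.utilities.rank_zero`. One hedged workaround (assuming `rank_zero_only` is the symbol `ddpm.py` needs at that line; adjust to whatever it actually imports) is a tolerant import that tries both locations:

```python
def import_rank_zero_only():
    """Try the old pytorch-lightning import path first, then the newer one."""
    try:
        # Older pytorch-lightning releases
        from pytorch_lightning.utilities.distributed import rank_zero_only
    except ImportError:
        # Newer releases moved the rank-zero utilities here
        from pytorch_lightning.utilities.rank_zero import rank_zero_only
    return rank_zero_only
```

Alternatively, pinning an older `pytorch-lightning` version in the environment file sidesteps the rename entirely.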
```
root@stable-docker-pod-basem:/workspace/sd# python3 scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
```
