Suggestion: include memory optimizations based on code from another fork #2

Open
sarseev opened this issue Aug 25, 2022 · 5 comments

sarseev commented Aug 25, 2022

https://github.com/basujindal/stable-diffusion has a version of SD with optimized memory usage; for some 8GB VRAM card owners (such as myself) this can mean being able to generate 512x512 images at all (both this script and the native SD code crash with an "out of memory" error). If it is possible to implement these or similar optimizations in this script, it would be highly appreciated.

user55050 commented Aug 26, 2022

Try it with half precision:

Add model.half() right after model = instantiate_from_config(config.model) and init_image = init_image.half() right after init_image = repeat(init_image, '1 ... -> b ...', b=batch_size).

Works on my 8GB GPU.
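
For reference, a minimal sketch of where the two added lines would sit; the surrounding statements are the ones quoted above, and the exact context in txt2imghd.py may differ slightly between versions:

    # 1) Cast the whole model to fp16 right after it is built
    model = instantiate_from_config(config.model)
    model.half()  # added: store and run all weights in half precision

    # 2) Cast the init image to fp16 right after it is tiled to the batch size,
    #    at the same indentation level as the repeat(...) call, so a float32
    #    tensor never reaches the now-float16 conv weights
    init_image = repeat(init_image, '1 ... -> b ...', b=batch_size)
    init_image = init_image.half()  # added: match the model's dtype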

TDiffff commented Aug 26, 2022

I can't get the chunk part to work in half mode @nickRJ
Solved, I had --strength set to 1.0 :^)

LuciferSam86 commented Aug 29, 2022

I've rewritten my message since I'm getting this error:

Traceback (most recent call last):
  File ".\scripts\txt2imghd.py", line 551, in <module>
    main()
  File ".\scripts\txt2imghd.py", line 366, in main
    text2img2(opt)
  File ".\scripts\txt2imghd.py", line 490, in text2img2
    init_latent = model.get_first_stage_encoding(model.encode_first_stage(init_image))  # move to latent space
  File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context    return func(*args, **kwargs)
  File "d:\stable_diffusion\stable-diffusion-main\ldm\models\diffusion\ddpm.py", line 863, in encode_first_stage
    return self.first_stage_model.encode(x)
  File "d:\stable_diffusion\stable-diffusion-main\ldm\models\autoencoder.py", line 325, in encode
    h = self.encoder(x)
  File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "d:\stable_diffusion\stable-diffusion-main\ldm\modules\diffusionmodules\model.py", line 439, in forward
    hs = [self.conv_in(x)]
  File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\Users\LuciferSam\.conda\envs\ldm\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

The command is: python .\scripts\txt2imghd.py --prompt "a photograph of an astronaut riding a horse" --strength=1.0 --ddim
What am I missing?

@blacklisteddev

Got the same error; for me it was the indentation of the init_image = init_image.half() line.
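
For anyone hitting the same RuntimeError: it just means a float32 tensor is being fed into layers whose weights were converted to float16. Below is a self-contained toy example (not code from txt2imghd.py) that reproduces the error and shows a defensive fix, casting the input to whatever dtype the weights actually use:

    import torch
    import torch.nn as nn

    # fp16 weights, as after calling model.half()
    conv = nn.Conv2d(3, 8, kernel_size=3).cuda().half()

    # fp32 input, as when the init_image.half() line is skipped
    # (e.g. because it sits at the wrong indentation level)
    x = torch.randn(1, 3, 64, 64, device="cuda")

    try:
        conv(x)
    except RuntimeError as e:
        # "Input type (torch.cuda.FloatTensor) and weight type
        #  (torch.cuda.HalfTensor) should be the same"
        print(e)

    # Defensive fix: match the input dtype to the weight dtype,
    # so it works whether or not the model was halved
    x = x.to(next(conv.parameters()).dtype)
    out = conv(x)
    print(out.dtype)  # torch.float16

In the script itself the equivalent would be something like init_image = init_image.to(next(model.parameters()).dtype) instead of a hard-coded .half(), so the cast stays correct whether or not model.half() was applied.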

mbwgh commented Sep 15, 2022

> https://github.com/basujindal/stable-diffusion has a version of SD with optimized memory usage; for some 8GB VRAM card owners (such as myself) this can mean being able to generate 512x512 images at all (both this script and the native SD code crash with an "out of memory" error). If it is possible to implement these or similar optimizations in this script, it would be highly appreciated.

The aforementioned fork allows generating 512x512 images on 4GB VRAM cards, which should be the baseline to compare against, IMO.
