
Speed has a massive effect on how willing I am to play around and develop better prompts. I can’t wait a full minute for an image, I just can’t.

What kind of computer specs would be required to generate typical SD images in less than a second?



I don't know about less than 1 second but I just picked up an RTX 3090 Ti now that they're basically half off at Best Buy and it's definitely fast enough for interactively playing with prompts (single digit number of seconds).

It's probably overkill, and you could get away with something like a 3060 or so, but the 24 GB of VRAM comes in handy if you want to generate larger images. I pushed it as high as 17 GB on some recent runs.


The nice thing about the M1 is that the GPU and CPU share RAM so even though I have a 14" MacBook Pro, I also have a GPU with 16GB of VRAM. I pushed as high as 11GB on images and the fan didn't even turn on.


It is slower than an NVIDIA GPU though. Maybe 30s per image on my M1 Max with 32GB.


Roughly 4-5 seconds for 512x512 at 50 samples on a 3090 Ti


I have a 3060 and it's fine for me; it takes ~8-9 secs to produce an image at default settings.


My 3080 can turn out a 16 sample Euler_a image @ 512^2 in about 1.5s (9.7 iterations/s). I've found you can get pretty good results in txt2img with these settings. And once you've found a good image you can iterate further in img2img with loopback at approximately the same rate.

It's worth noting that I'm on a 5800X as well, which I'm sure helps.
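For anyone comparing the numbers in this thread, here is a rough back-of-the-envelope sketch of how step count and iterations/s relate to time per image. The function names are made up for illustration, and it assumes sampling dominates the runtime (it ignores model loading, text encoding, and image decoding), so treat the results as estimates only.

```python
def seconds_per_image(steps: int, iterations_per_second: float) -> float:
    """Estimate sampling time for one image, ignoring fixed overhead."""
    if iterations_per_second <= 0:
        raise ValueError("iterations_per_second must be positive")
    return steps / iterations_per_second


def iterations_per_second(steps: int, seconds: float) -> float:
    """Back out the sampling rate from a measured time per image."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return steps / seconds


# 16 steps at 9.7 it/s, as quoted for the 3080 above.
print(round(seconds_per_image(16, 9.7), 2))

# 50 steps in 4 to 5 seconds, as quoted for the 3090 Ti.
print(iterations_per_second(50, 5.0), iterations_per_second(50, 4.0))
```

By this estimate, 16 steps at 9.7 it/s comes out a little over 1.6 s, which is in the same ballpark as the "about 1.5s" figure, and the 3090 Ti numbers work out to roughly 10 to 12.5 it/s.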


> iterate in img2img with loopback at approximately the same rate.

What's the advantage of using img2img as opposed to iterating on the seed value?


I guess it depends on how you do it. Depending on how I've set things up, I've found that more samples aren't necessarily better (usually just different). I suppose it's an optimization problem. I have found that I can pretty reliably look at a 1 sample image and kinda guess where the earliest iterations are going to go, and that might be the most appropriate workflow, actually. Beyond that, though, it seems a couple of samples can drastically alter outputs, and likewise with prompt editing. Whereas with img2img there's a lot more control: I pick an input, I can force it to strictly abide by the image and the parameters I want, and, as someone else said, in- and outpainting are nice as well.

I guess I'm just manipulating probabilities in my favor?
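To make "img2img with loopback" concrete, here is a minimal sketch of the idea as I understand it: each output image is fed back in as the next input. The `img2img` argument is a hypothetical stand-in for whatever your SD setup exposes (it is not a real library call), and the `strength` parameter and its default are assumptions too.

```python
from typing import Callable, List, TypeVar

Image = TypeVar("Image")


def loopback(
    img2img: Callable[[Image, str, float], Image],  # hypothetical: (image, prompt, strength) -> image
    init_image: Image,
    prompt: str,
    strength: float = 0.5,  # assumed knob: how far each round may drift from its input
    rounds: int = 4,
) -> List[Image]:
    """Run img2img repeatedly, feeding each output back in as the next input."""
    history = [init_image]
    current = init_image
    for _ in range(rounds):
        current = img2img(current, prompt, strength)
        history.append(current)
    return history
```

Keeping the whole history means you can go back to an earlier round if a later one drifts somewhere you don't like.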


You draw over the part of the image that is not ideal and get it to infill it


It takes a few seconds (haven't timed it), but I suggest doing it online at dreamstudio.ai. Paying about one cent per image isn't so bad.


I built a Svelte frontend for my SD instance and occasionally expose it publicly to friends. It runs on an old Intel 6700K with a 2080 Ti, and I've tuned it to generate images in about 5 seconds. The speed depends on various factors, but you can prototype with settings that generate images in as little as 3 seconds and work your way up to more complex images.


I was using the A4000s from https://www.coreweave.com/gpu-cloud-pricing, with 20ish steps at 512x512, and I think it was close to 1-2 s IIRC. There are some consumer cards that can get close, I'm sure, with some tweaking of image size, steps and other SD tuning.


Remember when downloading an MP3 took 20 minutes though?



