Agreed, and I've been wondering the same thing. DALL-E seems to do a better job of producing great-looking images from brief prompts, while Stable Diffusion seems to need more specific prompts. SD is free, though, so I'd love to use it more.
CLIP-guided Stable Diffusion and a DALL-E + SD combination are both doable with current open-source tools, and either should give much smarter prompting at the cost of even more memory use.