DALL-E Chess in Jungle and Dunes

fab1an · on July 31, 2022

Amazed we haven't seen more DALL-E / Midjourney on HN. Probably the most astonishing new tech I've used since booting up a computer in the 90s.

Just generating images is barely doing the tools justice though - you can create entire mini movies with it, like SALT (a 70s sci-fi adventure happening on Twitter): https://twitter.com/SALT_VERSE/status/1536799731774537733

DALL-E's inpainting feature is incredibly powerful to generate very large scenes: https://twitter.com/fabianstelzer/status/1545752145273802752

Hard to believe that we're only beginning to scratch the surface here...

minimaxir · on July 31, 2022

I made a post on trying to do absurd-but-controlled food photography with DALL-E 2 (https://minimaxir.com/2022/07/food-photography-ai/ ) which did get upvotes on HN but apparently did not make the front page (so may have been flagged?)

Marazan · on July 31, 2022

Because if you spend more than 5 minutes with it you can determine numerous things about the "astonishing" images you have seen

1) the amount of human curation is huge. For every 1 good image shared there are dozens of utter crap not shared

2) Dalle fails in some very systemic ways that make it completely unsuitable for vast swathes of image generation (for instance the "N kittens problem": Dalle is amazing at generating a picture of 1 kitten. Dalle is dreadful at generating a picture of 8 kittens and that is totally fundamental to how it works, not a bug that can be worked out with time.) Also basically anything that requires recognisable detail in the background, Dalle falls flat.

3) prompt parsing is simultaneously hit and miss as well as laughably primitive. This is the "without" problem. Ask for a picture without some feature and there is a good chance you will get that thing in the picture.

eprparadox · on July 31, 2022

here's one i did recently: https://vimeo.com/724055394

a reading of dreamtigers by jorge luis borges

images created using @openaidalle. sequencing and morphing in #python with credit to András Jankovics morphing library [github.com/jankovicsandras/autoimagemorph]. featuring borderlands granular synth (artist template: @kingbritt), other desert cities delay by audio.damage , #rymdigare reverb, mixed in #kymaticaaum.

headphones recommended, awards eligible

(text here: thefloatinglibrary.com/2008/09/02/dreamtigers/ )

desindol · on July 31, 2022

Because it’s a novelty abused for attention with clickbait popculture references. It’s not ready for primetime.

fab1an · on July 31, 2022

"photography is a novelty abused for attention, it's not ready for primetime and cannot replace a great painter"

mdp2021 · on July 31, 2022

Photography is an instrument whose results depend strongly on the ability of the artist and technologist adopting it.

To the best of my understanding, DALL-E offers limited control and cannot be compared to a brush with paint, a photocamera, a virtual canvas for curves for illustration, a coding console.

Why? Because the weight of the user, its "importance", its "impact", is limited with that tool. (A commissioner is not an artist. A photographer may be.)

As HN member Moe wrote, «Bad analogies are like Vietnam».

desindol · on July 31, 2022

And it was true until photography matured, just like it is now with art-generating AI. At least have a look at the early technology of your strawman argument. A better analogy would be the switch to digital photography: it was really exciting because of the ease of use, but years down the road nobody could use their early digital images for anything but stamps because of their atrocious quality.

sbergot · on July 31, 2022

Some people are already using generated images to replace stock art in low budget zines.

quadcore · on July 31, 2022

I don't know how artists feel about DALL-E but as an amateur I feel bad. "This should be forbidden" bad. I guess the root of this feeling is the same as the one Copilot gives OSS programmers, it feels like theft and copyright infringement. The pictures in this case uses techniques and colors scheme widely used by illustrator in the entertainment industries. Some of them are even above the average quality and that's scary too.

Do we know if regulators are looking into Copilot and DALL-E? To which extent do we want computer doing what human do? I mean.. Art? Feels like bad taste to me.

kuboble · on July 31, 2022

For what it's worth, dall-e is great for exploring but it's nowhere near being able to deliver a particular image you might have in your mind.

I wanted a very particular, well defined scene:

- A pig and a donkey play poker at the poker table.
- The pig is using a computer while playing and we can see the screen of the pig.
- The pig must look like a pig
- The donkey must look like a donkey
- The cards and chips must look like chips and cards

Dall-e simply can't deliver. Nothing is even remotely close to what I want. The best thing I came up with after dozens of attempts (I bought extra credits) is something like this: https://i.gyazo.com/4bec0651b78f29a45c291a7f48f468e4.jpg

Kinda there, but the pig doesn't look like a pig or a donkey doesn't look like a donkey, or it's not a pig that has a computer and the cards and chips never look like cards and chips.

So in short - nobody is losing their jobs yet I think.

jfoster · on July 31, 2022

Have you tried creating it in multiple steps, using the "Edit" button? You can erase the parts of the image you want to change, and you can even change the prompt at each step.

If the pig or donkey doesn't look right, you could erase just that part of the image using the same prompt to get a different look.

For example, to create the image you want, I would:

1. Start with the basic prompt: "a pig and donkey playing poker"

2. Generate random variations of my favourite image from that to see how far I can get from that.

3. Edit as necessary with the same prompt to get the right look for the pig/donkey.

4. Erase a section of the image next to the pig and use a prompt like "pig using a laptop" to get DALL-E to generate a laptop in that position.

kuboble · on July 31, 2022

Yes, I have tried a lot, and still haven't gotten close to the desired end-effect.

I maybe want to shift my claim. I am not sure that it's impossible to create this particular image but that it's almost certainly cheaper to hire someone to draw the exact image I have in mind.

I think there is also a new profession coming: a DALL-E prompter job.

misnome · on July 31, 2022

> I think there is also a new profession coming: a DALL-E prompter job.

Exactly, except we call this job "Artist" or "Programmer".

Whenever something like this comes along and people decry that it will "replace artists" or "replace programmers"... someone needs to generate the inputs to get what they want. Nothing helps solve the "But I know what I mean" problem. Either it's not good enough to do "general purpose" tasks, or it is, but it needs coaxing and someone who understands interacting with the systems well enough to get the desired output.

kuboble · on July 31, 2022

I agree with all you say with the exception that it is very distinct from being a programmer or an artist like a painter or graphical designer.

As a programmer I love that when I type [i*i for i in range(10)] I can predict the output and that the output will always be the same. I get frustrated if the same action produces unexpected and non-reproducible results.

A good Dall-e prompter is more like a guide who can navigate through the unknowns. He knows how to use seemingly meaningless words to manipulate the beast. I think it's some form of art and at the same time like being a technician of complex machinery or a wild animal trainer.
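To make the contrast above concrete, here is a minimal sketch. The list comprehension is the one from the comment. `sample_style` is a hypothetical stand-in for a generative model (it just picks a style at random), not anything DALL-E actually does.

```python
import random

def squares(n):
    # Deterministic: the same input always yields the same list.
    return [i * i for i in range(n)]

def sample_style(prompt, rng):
    # Hypothetical stand-in for a generative model: the output depends on
    # random draws as well as on the prompt.
    styles = ["oil painting", "3D render", "pencil sketch", "digital art"]
    return f"{prompt}, {rng.choice(styles)}"

print(squares(10))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Calling the sampler repeatedly with the same prompt gives varying results.
rng = random.Random()
outputs = {sample_style("a pig and a donkey playing poker", rng) for _ in range(50)}
print(len(outputs))  # almost always more than 1
```

The first function can be unit-tested against an exact expected value. The second can only be checked statistically, which is roughly the frustration being described.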

legutierr · on July 31, 2022

These AI created images may not be a replacement for bespoke illustration or photography, but if the choice is between stock images and DALL-E, many people would prefer a DALL-E image that fits closer to what they want than what they may find by searching a stock image website.

arecurrence · on July 31, 2022

I suspect this is where an API and additional cost reductions will move the needle even before we improve the models themselves (which seems to be coming at a rapid pace right now). I can see a scenario like this working well in the future:

1. Get close via prompt debugging to what you want (effectively where you are now)

2. Run an image generation pipeline that creates 10,000 images or an infinite stream

3. Run each image through an 'image to text' step for vector similarity filtering

4. Take images that have very similar 'image to text' similarity scores to the original prompt and present to the user.

Once we can run models of this quality locally, it can even be a job that runs overnight and you wake up in the morning to a set of results to look at.
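Steps 3 and 4 above could be sketched roughly like this. Everything here is an assumption for illustration: the captions stand in for the output of a hypothetical image-to-text step, and the similarity is plain cosine similarity over word counts, where a real pipeline would presumably use learned embeddings.

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    # Lowercased word counts; a crude stand-in for a learned text embedding.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    # Cosine of the angle between two word-count vectors.
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def filter_candidates(prompt, captions, threshold):
    # captions maps image id -> caption from the (hypothetical) image-to-text step.
    target = bag_of_words(prompt)
    scored = [(cosine_similarity(target, bag_of_words(caption)), image_id)
              for image_id, caption in captions.items()]
    # Best matches first; drop anything below the threshold.
    return [image_id for score, image_id in sorted(scored, reverse=True)
            if score >= threshold]

prompt = "a pig and a donkey playing poker"
captions = {
    "img_001": "a pig and a donkey playing poker at a table",
    "img_002": "two horses standing in a field",
    "img_003": "a donkey playing cards with a pig",
}
kept = filter_candidates(prompt, captions, threshold=0.5)
print(kept)  # ['img_001', 'img_003']
```

The threshold of 0.5 is arbitrary. In an overnight batch job, keeping the top N by score would probably be easier to tune than a fixed cutoff.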

lima · on July 31, 2022

It has a hard time with the computer, but without, the results are almost usable:

https://imgur.com/a/lVqmnz3

Chances are that someone with prompt engineering experience could get it to produce the desired output with some more poking and prodding.

It'll certainly raise the lower-end bar for custom illustrations/stock footage.

nonethewiser · on July 31, 2022

I see what you're getting at, yet the result is still amazing.

jstanley · on July 31, 2022

All that will happen is humans start operating another abstraction layer up -- same thing as happened every previous time the machines have "taken our jobs".

It's a good thing.

legutierr · on July 31, 2022

Consider, however, that the output of these systems may not be copyrightable.

So, when you move human involvement up to a higher layer of abstraction, it’s possible that the economics of the whole effort will be fundamentally transformed. Meaning, if these systems displace human artists, copyright itself may cease to be a motivator of economic activity—removing a significant incentive for the production of new art.

Also, keep in mind that:

(1) there are likely to be many fewer human custodians of systems like this who sustain themselves economically than there are artists who currently sustain themselves by producing new art; and

(2) these systems are only as good as the artistic inputs that are fed to them, and it is very unlikely that the contributing artists gave their consent or were compensated for their involvement in any way.

jstanley · on July 31, 2022

Sorry, I'm not seeing the downsides. That all sounds like a big improvement.

And regarding point 2: do you think human artists are as good as they are without already having seen lots of great artworks produced by others? Human artists don't create art from an empty vacuum of nothingness either.

legutierr · on July 31, 2022

You don’t see a downside to there being fewer artists creating art?

Art benefits humanity not only because we consume it, but also because we produce it.

Making art is part of what makes us human.

corysama · on July 31, 2022

I’ve used Midjourney for months now. Artists love it. It will lead to fewer people creating art the same way cars led to fewer people traveling. It’s like having a pre-concept artist for the concept artist. Instant style boards to run by your client.

legutierr · on Aug 6, 2022

Comparing artistic production to driving is a poor metaphor.

No doubt that AI-driven tools can be leveraged by artists to create interesting things, in the same way that visual artists have used tools like Photoshop.

But there is something much more profound happening with DALL-E, etc. As I mentioned above, these AI systems simultaneously depend on human artists to populate their training corpus, while making it much less likely that these artists will be able to make a living producing art.

Even if other artists working higher up in the value chain benefit from these systems, you are likely to see fewer professional illustrators and visual artists because these systems exist.

Something will be lost. We can hope that what we gain in return will be of equal value.

bastawhiz · on July 31, 2022

> The pictures in this case uses techniques and colors scheme widely used by illustrator in the entertainment industries.

"Widely used" seems to negate your point here, no? I would expect a machine to use widely used techniques, rather than ones specific to individual artists. I don't know about you, but I've never seen DALL-E replicate an art style that isn't popular enough to be common knowledge.

> Some of them are even above the average quality and that's scary too.

Is your suggestion to make systems like DALL-E worse? Or to forbid the creation of systems that exceed a certain measurable performance?

nonethewiser · on July 31, 2022

It's purely luddite reasoning. The real objection is that it makes artists less valuable.

Which is unfortunate, because they already aren't that valuable (save for the top ~1%). But it's not a good reason to oppose DALL-E.

quadcore · on July 31, 2022

> The real objection is that it makes artists less valuable.

Close but not exactly. How do they feel about it?

At some point if we all feel bad, well this is very bad.

Maybe we should ban DALL-E for the same reasons we ban hard drugs: for the health of the community.

bastawhiz · on Aug 3, 2022

How do the weavers, whalers, candlestick makers, lamp lighters, and everyone else made redundant in the last few centuries feel? Why do artists find themselves special? The only reason they have avoided automation this long is because we haven't made machines that can think with any sense of creativity until now.

Many of us will become redundant thanks to automation in the next few decades. That's just how it is.

wittycardio · on July 31, 2022

If you're a programmer working on non-trivial problems you should be happy about copilot. It's just a tool to be more productive. Same with dall-e for artists. They will eliminate unproductive jobs and create new, more interesting opportunities. In the long run technological progress is always good.

kranke155 · on July 31, 2022

I completely agree with you. If we’re just going to allow “AI” to eat into all human data and remix it in a way that only the 20 people involved in programming it make money (instead of the 2 million who were used as sources for the human data) then that is just the biggest stealth theft of wealth in recent human history.

It’s the equivalent of the technological enslavement of most humans who will be told that their inputs used in the AI “have no value” while the AI aggregates it all.

tracerbulletx · on July 31, 2022

I agree that we should at least be concerned, I think the best argument against this stuff is that we should be building a world where AI replaces dangerous, repetitive, tedious work. Using it to take away the economic value of work humans ENJOY doing is dangerous. I think detractors that are eager to dismiss it as not as good as humans are wrong though and it's shockingly close to going far beyond what humans can do artistically. It won't be long before these systems can not only dream up an image from language, but make that image an animated 3d scene with dynamic lighting and animation and behaviors. If this technology keeps progressing media and artistic creation are going to be changed completely.

nonethewiser · on July 31, 2022

Illustrators also use techniques and color schemes that are widely used in the entertainment industry. It's not intellectual property and it's already happening.

fooker · on July 31, 2022

You can not regulate away tech advancements.

Your competitors (researchers, companies, or countries - depending on granularity) certainly won't.

quadcore · on July 31, 2022

Unless it kills the souls of their people.

status200 · on July 31, 2022

During the beta, I must have done thousands of requests and was initially blown away, but now I can tell the "look" of a Dall-e generated image... it has these weird blurry spots that make it seem like a memory of a dream - the main schema is there but if you focus on any one point, the illusion is broken. Looking forward to the day that it is so polished that I cannot differentiate it from a human art piece.

mdp2021 · on July 31, 2022

> that I cannot differentiate

I suggest that you aim for the day in which critics «cannot differentiate it from a human art piece».

(We already have plenty of contexts in which the layman can be fooled. Think e.g. of "populism" in political discourse.)

gus_massa · on July 31, 2022

Did you use the same prompt for all the images? How much cherrypicking did you do? How many images did you generate to get this set?

emadehsan · on July 31, 2022

Most inputs were a combination from these:

"[Painting/Digital Art/3D Render] of [Animals/Foxes/Monkeys] playing Chess in [Jungle/Dune/Desert]"

Some inputs were specific: "Capaybara vs Groundhog Chess match" or "Llama vs Panda/Red Panda in chess match"

I almost used all the credits OpenAI gave me for DALL-E. This set consists of about 50-60% of all the images I generated.
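For anyone wanting to enumerate a bracketed template like the one above, a minimal sketch follows. The option lists are copied from the template; the variable names and the idea of generating every combination are assumptions, since the comment does not say which combinations were actually submitted.

```python
from itertools import product

# Options taken from the bracketed template in the comment above.
styles = ["Painting", "Digital Art", "3D Render"]
subjects = ["Animals", "Foxes", "Monkeys"]
settings = ["Jungle", "Dune", "Desert"]

# One prompt per (style, subject, setting) combination.
prompts = [f"{style} of {subject} playing Chess in {setting}"
           for style, subject, setting in product(styles, subjects, settings)]

print(len(prompts))  # 27
print(prompts[0])    # Painting of Animals playing Chess in Jungle
```

With three options per slot that is 27 prompts, so running every combination would use up a limited credit allowance quickly.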

Plutoberth · on July 31, 2022

Not OP, but I generated over 150 images using DALL-E 2. Results in the quality of the images in the gallery are very common. Usually, for prompts as simple as this most of the output images (there are 4) look as good or better.

sklargh · on July 31, 2022

Does anyone know if there is a company or team focused on outputting CAD using a tool like DALL-E?

seestem · on July 31, 2022

Is DALL-E deterministic, like if I type the same phrase it will always generate the same images?

corysama · on July 31, 2022

In general, most Deep Learning processes these days are non-deterministic because 1) they only care about statistical correctness, and 2) there are some speed advantages in ignoring the existence of race condition bugs if you don’t care about being deterministic.
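One commonly cited mechanism behind point 2 is that floating-point addition is not associative, so a parallel sum whose accumulation order varies from run to run can give slightly different totals. The sketch below only simulates that with two fixed orders in plain Python; `sum_in_order` is a hypothetical helper, and nothing here is specific to DALL-E.

```python
def sum_in_order(values, order):
    # Accumulate values in the given index order, the way a reduction would
    # if its threads happened to finish in that order.
    total = 0.0
    for index in order:
        total += values[index]
    return total

values = [0.1, 0.2, 0.3]
forward = sum_in_order(values, [0, 1, 2])   # (0.1 + 0.2) + 0.3
backward = sum_in_order(values, [2, 1, 0])  # (0.3 + 0.2) + 0.1

print(forward)              # 0.6000000000000001
print(backward)             # 0.6
print(forward == backward)  # False
```

The difference is tiny, but across millions of accumulations in a large model it can be enough to change individual outputs between runs.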

seestem · on July 31, 2022

Ah! That makes sense. Thank you for the info.

bemmu · on July 31, 2022

It starts with noise, so you get a different result every time.
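A toy illustration of that point, assuming nothing about DALL-E's internals: `toy_generate` is a made-up function that starts from random noise and then applies a fixed, deterministic "refinement". The result depends entirely on the starting noise, so fixing the seed makes it repeatable. Whether the hosted DALL-E service exposes a seed is not stated in this thread.

```python
import random

def toy_generate(seed=None, size=4, steps=5):
    # seed=None lets Python pick fresh entropy, so every call differs.
    rng = random.Random(seed)
    # Start from pure noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in range(size)]
    for _ in range(steps):
        # Toy refinement step: fully deterministic given the current image.
        image = [0.5 * value for value in image]
    return image

same_a = toy_generate(seed=7)
same_b = toy_generate(seed=7)
other = toy_generate(seed=8)

print(same_a == same_b)  # True: same starting noise, same result
print(same_a == other)   # False: different starting noise
```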

seestem · on July 31, 2022

That's very cool.

dimmuborgir · on July 31, 2022

Lots of weird artifacts which are very hard to fix.

The argument that users can now generate professional grade art by bypassing artists entirely feels so strange. I have access to Dall-E. To generate images without artifacts, you have to do one of these: a) Do a lot of cherry-picking which can be expensive. b) Prompt should be about an abstract concept which can "tolerate" any number of artifacts. c) Prompt should be about a common/generic concept that you have already seen a lot of times on the internet.

I think the biggest use case of Dall-E will be in removing creative block for artists.

aetherson · on July 31, 2022

I don't think that Dall-E as it currently exists is a big threat to professional artists.

It's not super hard to imagine a noticeably improved version of Dall-E being a serious threat to professional artists, though. It's a question of how hard it will be to make some linear improvements to Dall-E.

As a side note, I created an image in MidJourney from a prompt, which got a fairly pretty image that had some serious facial asymmetry problems, then uploaded the image to Dall-E and erased half the face, letting Dall-E fill it back in with a much more symmetrical look, and that kinda felt like the future. Using various AI models as tools for the things they do best.

mrtksn · on July 31, 2022

These feel like tools for creativity. Although the end result might not be masterful, it's super cool for quick iteration and exploration.

Suddenly, people who don't have the skills to do that can start doing it and once you pick an output I totally see how you can improve on it manually.

heliophobicdude · on July 31, 2022

I believe the next evolution in generative images is stringing them together!

If you can come up with the key frames with descriptions of the same style, a neat little program can interpolate them and produce a generative movie!
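A minimal sketch of such a "neat little program", under heavy assumptions: frames are small grayscale grids stored as nested lists, and the in-between frames are a plain linear cross-fade. That is much cruder than a real morphing library, and the function names are made up for illustration.

```python
def lerp_frame(frame_a, frame_b, t):
    # Blend two same-sized grayscale frames; t=0 gives frame_a, t=1 gives frame_b.
    return [[(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]

def interpolate(keyframes, steps_between):
    # Return every keyframe plus `steps_between` blended frames between
    # each consecutive pair.
    if not keyframes:
        return []
    frames = []
    for start, end in zip(keyframes, keyframes[1:]):
        for step in range(steps_between + 1):
            t = step / (steps_between + 1)
            frames.append(lerp_frame(start, end, t))
    frames.append(keyframes[-1])
    return frames

key_a = [[0.0, 0.0]]
key_b = [[1.0, 2.0]]
frames = interpolate([key_a, key_b], steps_between=3)

print(len(frames))  # 5
print(frames[2])    # [[0.5, 1.0]]
```

A cross-fade like this only dissolves one image into the next. Getting the shapes themselves to move would need some kind of correspondence between the keyframes, which is what morphing tools add.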

la64710 · on July 31, 2022

Somehow I find the Dall-E and other AI generated pictures revolting... is it the choice of colors or what, I don’t know. It’s like looking at an art piece without a soul.

fab1an · on July 31, 2022

DALL-E has that quality, yes - I did a project where I tried to recreate my own photographs with DALL-E, which shows both its potential and limitations right now: https://twitter.com/fabianstelzer/status/1551663900776595461

(obviously ignoring the potential to conjure things you cannot photograph...)

EwanG · on July 31, 2022

And I suspect this points out one of the more common future uses of the tool. Not replacing photographers per se, but certainly taking a large chunk out of the already shrinking Stock Image market. If instead of having to find a stock image that I can use as the base for my work (usually with at least SOME tweaking) I can just describe what I'm actually trying for as a final product...

I'm not going to be getting rid of my Sony A7riv anytime soon, but this certainly would discourage me from trying to increase my library at Getty

f0e4c2f7 · on July 31, 2022

Do you think you could pass a Pepsi challenge of random images from the internet vs Dalle?

heliophobicdude · on July 31, 2022

Generating images seems almost a solved problem. What’s the next big problem to solve in this space?