Adventure game graphics with DALL-E 2 (hpjansson.org)
428 points by marcodiego on Aug 16, 2022 | 158 comments


I started off feeling hopeful/excited by OpenAI, GPT-3, DALL-E, etc, as a creator/writer. As time has progressed and I've used these tools and other machine learning tools to create media, I've become disillusioned. It feels as if this is about to explode an existing problem, namely, there's too much content in the world already. Lowering the effort bar for content creation and generation is, in some ways, going to cause a deluge of low-quality/mid-quality content to be generated.

I wonder if we'll see a new generation of "editors" showing up to help curate and provide some guidance in terms of what to play if you're limited on time. I often go through the new releases on Steam, and I'm amazed at how many games are released with zero reviews, or fewer than 5, yet people are still constantly releasing more and more. I'm sure some of those games are masterpieces, but they will never see the light of day.


We're at the beginning of a rapidly evolving situation where, out of the blue, it is now possible to direct an AI to produce interesting imagery using just language. From there to moving animations is a relatively small step, and one already being worked on. From there to photorealistic scenes directed by just words is not that big of a leap either, and also something being worked on. Are there constraints, glitches, and limitations? Of course. Has that ever stopped artists producing interesting art? Of course not. People have been telling stories via art ever since they figured out that they could scratch things into rock, clay, and whatever else was at hand.

Perhaps the depressing thing for some artists is that they need to adapt to this and up their game. We've seen the same with photography. Every idiot can now have a decent phone camera and there are plenty of people on Instagram doing not so interesting things with those. Does that invalidate what really good photographers do? No. But it does make their work less special when an AI guided camera in the hands of an absolute amateur can produce photos that are decent. And now with this, you could produce realistic looking photos of things that don't exist by simply asking for them in the right way.

More content means the quality standards are raised. There are not a lot of artists left that make a living scratching things in rocks. But there are more artists than ever. The only real limitation is their imagination.


Speaking of upping their game, I have a friend who worked in photography, and photo and video editing. Over the years, the tools got better and better and more user friendly. The money started to disappear due to more and more people being able to just use phone apps to do it all. He adapted by… leaving the field completely for a tech job.

That’s what we will see: a far smaller percent of people who have the pleasure of paying their bills by working in the arts. It’s all already so commoditized, I don’t know a single artist any more who can do it full time except one woman with a gallery I met the other day on a road trip. Is she going to have to go back to her old career of environmental policy soon?


That is more an economic argument than the philosophical one about what art is. Yes, the vast majority of artists will lose their jobs, if they do art as a vocation, but people who make art, to make it, will continue to create. That is what the parent is implying, I believe.


I wonder how tweakable those generated images are. The most important requirement for 'game art creation' is that it happens in a tight feedback loop between an artist, an art director who needs to enforce an overall artistic vision, and the cold, hard requirements of the game design.

Can I tell the AI "that looks great, keep the trees and house, but I need the door on the right side, and the sky a bit less cloudy"? Will it be able to incorporate such instructions without "remixing" into a very different image? Can it "understand" the suggestions the same way a human can, without having to go into so much detail that it essentially becomes procedural generation? Is this even something that's possible with the current approach?

If that's not possible, then I can just as well google and hope to find a matching image.


I've been beta testing Stable Diffusion, and one thing they give you is the seed of the image from which the latent space is populated. This enables you to tweak the image: you can keep the 'base' image and iterate on the prompt.

It’s not as sophisticated as you described, yet, but close.

In addition I can imagine that it is a matter of the training set. As of now, the AIs seem like jack-of-all-trades, so they know a bit of a lot of different styles and topics. But with Stable Diffusion being able to run locally, you could specialize it by training it with high detail on, let’s say, landscape photography. So then one might be able to direct the AI more precisely.
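The seed-reuse workflow can be illustrated with a toy sketch. This is a pure-Python stand-in, not the real pipeline: actual implementations such as Stable Diffusion derive their starting latent noise from the seed in an analogous way (e.g. via `torch.Generator().manual_seed(seed)` in the `diffusers` library).

```python
import random

def latent_noise(seed, n=4):
    # Deterministic stand-in for the initial latent noise a diffusion
    # model samples from its seed; the prompt then steers the denoising
    # away from this shared starting point.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Same seed -> identical starting noise, so two runs that differ only
# in the prompt keep the same overall composition (the "base" image).
assert latent_noise(42) == latent_noise(42)

# A different seed starts from entirely different noise.
assert latent_noise(42) != latent_noise(7)
```

This is why keeping the seed and editing only the prompt tends to yield recognizable variations of the same image rather than a full remix.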


With Dall-E you can already do selective inpainting to replace areas of an existing image. It’s incredibly powerful.


You could reverse the input direction: feed the output back in, along with a "keep this part" bitmap mask, while modifying parts of the prompt.


It's going to dominate on short-form media platforms such as TikTok once there is a beta version, I think.


wait 2 more generations and yes

what you've asked for can be accomplished with existing techniques


The quality of the upper echelon of art may be raised, but there's still a discovery problem there. Having to sift through stuff to find the gems is already an issue imo. The OP makes a decent (pessimistic) point


My opinion is that the problem of "too much content to watch in several lifetimes" will not be solved on the supply side, but on the demand side.

All content is not created equal: we are social animals, and what the people around us do interests us much more, even if it is of lower quality. So, if anyone can generate professional-looking creative projects with relatively little effort, we'll gravitate towards people creating content on niche subjects that interest us, thus creating small communities with high engagement. Even if they have low watch counts, they'll matter to those participating in them. Fanfic communities already work that way.

There will always be a place for conventional mainstream media outlets creating run-of-the-mill high-production-value works, with themes averaged to appeal to the masses; it's just that they'll have a lot more competition from communities of the first type.


Consider the extremely low-quality visuals of South Park. They used their low-quality imagery as joke enhancement while presenting ideas far more sophisticated than the majority of animated media.


Exactly. And it only took a tool (Flash) that allowed a small team to do what used to require a whole company of trained experts (animated cartoons).


I think you missed my point: they put their energy into the writing, not the animation.


Maybe, but I think you missed mine: they were able to put all their energy into the writing because the animation had been made trivial by a new tool.

Had they had to draw the episodes the old-fashioned way, they would have had to put a lot more energy into the animation even to get the same result.


Yeah, agreed.


I might be wrong, but I could imagine that AI will push the absolute maximum out of human creativity. To beat AI, you'll truly need to make something outstanding, and I think there will be people achieving that, truly pushing the boundaries of creativity and art forward in ways we haven't seen before. And those people will be rewarded. Everyone will have to step up their game.


Probably, artists will use AI not to "beat" it, but as a base tool for exploring the space of possibility and expanding it into new territories. People will see AI as just one more tool in the toolbox.

People using Dall-E or Midjourney naively will be like those unremarkable classicist painters in the late 19th century doing realistic yet conventional paintings that nowadays you could create as studio photographs.

Meanwhile, brilliant artists will train new AI models, throwing in data collections that have never been used before as training input to generate completely new styles, just like the -ism movements threw all academic conventions away in pursuit of new art styles, bringing us modern and postmodern art.


I think AI will be pretty good at doing recommendations. Show you a bit of random, get your likes and dislikes, exploit what the algorithm learned. TikTok does this well already and, I expect, will continue to do well when the content is AI generated.


I work in the area of recommendations, and this is not a solved problem at all. You can only recommend what has been shown (without doing cold start). One major issue is that forms of content other than 30-second clips can't easily use TikTok's way of bootstrapping engagement when an item is fresh. Not everyone will understand or appreciate a "new Shakespeare", and it may fall by the wayside.

I too hope it gets better, but it's hard to replace a panel of experts that have sifted through their subject when it comes to quality recommendations in some fields.


Tiktok optimises for what people spend a long time looking at, but I don’t think that anyone would claim the metric it uses is what we would want to define as quality in the broader sense.


I seriously wonder if Tiktok uses eye gaze tracking in their interest assessments. If people's eyes follow the same gaze pattern on a clip repeatedly, that's a damn clear indicator of interest.


My personal hypothesis (based on nothing) is that TikTok just uses the very strong signal of watch time. If you watch a clip all the way, or multiple times, that's good. If you skip early - that's bad.

When I had Netflix I remember being frustrated that Netflix would recommend me shows "based on" content I had watched for a few minutes, decided I didn't like, and backed out of. Why would you recommend me content if you have a strong signal I dislike it?
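That watch-time hypothesis can be sketched as a tiny implicit-feedback score. All thresholds and weights below are made up for illustration; they are not anything TikTok or Netflix has published.

```python
def engagement_score(watch_seconds, clip_seconds, replays=0):
    # Score a view using only watch behavior: completion fraction,
    # a bonus per replay, and an early skip treated as a clear
    # negative signal rather than a weak positive one.
    fraction = min(watch_seconds / clip_seconds, 1.0)
    if fraction < 0.2 and replays == 0:
        return -1.0  # skipped early: strong "dislike" signal
    return fraction + 0.5 * replays

# A full watch plus one replay outranks a partial watch,
# and backing out after a few seconds counts against the item.
assert engagement_score(15, 15, replays=1) > engagement_score(8, 15)
assert engagement_score(2, 15) < 0
```

Under a scheme like this, sampling a show for a few minutes and abandoning it would be penalized rather than treated as interest.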


AI won't be designed to serve users the recommendations they want, but whatever is in the best interest of the person designing the AI. There was a pretty good article on HN about this, yesterday I think, that covered this well.

This is the thread: https://news.ycombinator.com/item?id=32482523

and it's well worth reading the article.

TL;DR: AI is good at some things, but don't expect it to be aligned with what the user wants.


> From there to moving animations is a relatively small step, and one already being worked on. From there to photorealistic scenes directed by just words is not that big of a leap either, and also something being worked on.

I don’t think DALL-E can even persist a single 2d object between scenes. For example, if you create an animated character you like, I don’t think it’s possible to ask DALL-E to render that same character in a different scene.

And you’d need 3d training data to create 3d rendered scenes, where a camera can pan/zoom realistically through a scene. There’s probably much less of that kind of data available. What we do have is artificially created 3d scenes in movies, but it’s not always very realistic, not to mention proprietary.

If VR kicks off and people can start easily recording 3d moveable scenes then we might start seeing AI creating similar scenes shortly after.


Competitors like Midjourney and Stablediffusion already allow you to re-use an image seed, which makes it much easier to persist style and character across images.

3D can often be inferred from 2D data, 3D training data could also be generated synthetically, etc. Think how fast the space has moved just in the last 3 years and extrapolate from that; don't focus too much on today's shortcomings.


But then there are predictions from the 50s that we’d have flying cars by the year 2000.

Sometimes tech hits a plateau.

Animated movies make tons of money, so there’s definitely motivation to make their production faster. I just think the complexities are so intricate that I wouldn’t be surprised if AI-generated animation still seems “not quite right” in the near future.


I think you can; you need to fine-tune it and make it learn about a character. I've seen prompts about Darth Vader and the beans from Among Us produce decent outputs, so Dall-E can totally learn how a specific character looks, as long as it's in its training set at least.


It's not a small step to move from image generation to animation. In fact, quite the contrary: the complexity grows with length, and even creating a short would be exponential in cost. GAN networks cannot, and will never be able to, do this.


GAN architectures are no longer used except for small projects: they are unstable to train, they usually don't cover the whole distribution, and they are hard to condition. Nowadays people use diffusion or transformer models.


GAN is in reference to the subfield this belongs to: Generative Adversarial Networks.


The question is, should everyone calling themselves "artist" be able to make a living off their creations? Based on your predictions there will be only a subset which can be professional artists... and I think it's okay this way. Everybody can produce art and call themselves artists, but only a few can also make a living from their art - because by definition art needs a consumer and the (paying) consumers pool is limited.


I don't want to "whatabout" this, but… is art really the thing we need to automate right now?

There's so much labor that we more directly need, and that we could all benefit a lot more from automating.

To give a mundane example off the top of my head: street cleaning. I live in a big city (Berlin) which is often filthy with litter. Obviously there are not enough people cleaning it (or it wouldn't be). Can't we automate that instead? How about automating construction? Part of the reason housing is expensive is lot prices, but surely a lot of it is also labor cost.

It just feels like such a frivolous and relatively useless thing to automate, and a misallocation of resources.


It's not about what we need. Scientific research in this case has been the result of looking around and seeing what can be done with the technology at hand, without caring whether it is what we need or not. If you feel like "we" should be automating other stuff, you are free to make your own contribution; it's not like OpenAI owns the keys to the field.


But the people at OpenAI aren’t doing it as a hobby, they’ve got a lot of money to conduct this research (from Wikipedia: “The organization was founded in San Francisco in late 2015 by Elon Musk, Sam Altman, and others, who collectively pledged US$1 billion.” [1])

This is the misallocation of resources I’m referring to.

[1] https://en.wikipedia.org/wiki/OpenAI


I don't understand where the misallocation of resources is. OpenAI is no longer a non-profit organization. Its goal is not to automate what is most needed now. But to advance the field of AI.


And that's all highly speculative, betting on money to somehow show up, for reasons impossible to predict, if only enough games are changed sufficiently hard. If it can be done it will be done, unless we fundamentally change the way we run the economy.

Conclusions:

(1) Perhaps we should, winning at net zero games is very much a thing in the current way

(2) Didn't know it when I started writing this reply (not at all!), but I guess I agree with you

(3) I really miss Old Google, and how we happily trusted them (deservedly or not)


(I meant negative sum games of course, net zero is certainly fine)


"This street but cleaner and without the graffiti and mess" would be an awesome prompt for an AI. And possibly an essential step for a cleaning bot to even understand the difference between clean and not clean.

Otherwise these things are not connected in any obvious way and humanity has been known to work on many problems at the same time.

The whole point about Berlin (I live there too) is that it isn't Muenich. Muenich has clean streets, a high cost of living, and is frankly a bit bland and boring. People are a bit uptight and conservative there. It's not a great place for creatives to express themselves. You find a lot more of those in Berlin, and it's a big part of why Berlin is so awesome. Some Berliners consider former citizens of Muenich gentrifying their neighborhoods a problem, especially when they start whining about how noisy, unclean and messy things are.


Munich (not "Muenich"), clean streets? Well ok, everything is relative, but even in Munich there is a lot of garbage and graffiti (although I concede that the graffiti in Berlin is better than in Munich, because in Munich most of it seems to be because of the rivalry of the two local football clubs). But, to also say something positive: Munich wouldn't be so expensive if it was as undesirable as you describe it. And yeah, Berlin is desirable too, that's why it now gets gentrified too. That's life!


It's a valid transliteration. Don't have any umlauts on my keyboard.


I beg to differ :) Even with transliterations, it's either München/Muenchen (in German) or Munich (in English). You can even call it "Monaco" if you come from Italy (because, same as the more commonly known Monaco, it was founded by monks), but Münich/Muenich is not a valid name for it in any language I am aware of...


I’m not from Munich (or any place richer/cleaner than Berlin). It was just an example off the top of my head, I don’t think litter in parks and streets is what makes Berlin’s charm/flair.


Automatically cleaning streets is a much harder problem than generating images. There might just be no dedicated work being done on it currently because it isn't a solvable problem given the current state of the art.


The direct application is instant illustration. It's called "art" because it has fewer requirements and can't be objectively judged. There is a long way to go before it becomes broadly useful, but it can have a very real-world impact on everyday jobs.

Lots of people losing their jobs over this also means more available workforce for those more important jobs of yours


I wasn’t seriously expecting investment to change, just kinda funny/sad that in this case the thing getting automated is something humans enjoy doing (illustrating) rather than many things they both don’t enjoy and need more (street cleaning, construction work, menial physical labor in general). So you think in your example illustrators taking up street cleaning work will result in an overall improvement of life for mankind?

Aren’t the machines supposed to make our lives better?


This is what I'm thinking as well. I would go so far to say that it's directly evil to automate away illustration and other forms of creativity. Sure, many people will think it's cool and useful, but so many people will see their reason to live taken away from them. This is not only about making a living, creative arts is something much deeper and meaningful for lots of people.

I wish we could spend our collective brainpower applying AI to fight disease, climate change and poverty instead. That would make life better.


How dare they automate weaving cloth, it's a creative endeavor with a long history! It's deeply meaningful for lots of people in some cultures! Time to smash the looms!


Or recycling - from a conveyor belt full of mixed garbage, pick out the items that are most likely to be a given type of plastic (PET, HDPE, PS, ABS etc. etc.)


> Lowering the effort bar for content creation and generation is, in some ways, going to cause a deluge of low-quality/mid-quality content to be generated.

Isn't this the case already?

For example, I have recently discovered science fiction poetry exists. The amount of stuff that gets produced in this niche of a niche is baffling already, and growing almost exponentially.

I feel human curation and editing will just have to be important again.


I'm still hopeful that this technology will enable writers who have a good idea for a story to build much more on their own without having to be in a position to hire voice actors and artists: whether that's pixel graphic backgrounds for adventure games with decent text-to-speech for characters, or 3D worlds built in Unreal with Metahuman-generated NPCs playing out a curated story within a procedurally generated city.

There being lots of low/mid quality games is, I think, a problem that already exists today on app stores -- on Steam especially between Greenlight & vast numbers of games 'releasing' in Early Access... and the more that a single writer with some dev skills can do on their own the more opportunity they have to develop their games as a brand, and get feedback to fire their creativity.


Yes, I also like the prospect that this tool would allow rapid prototyping and storyboarding for games, graphic novels, films etc. Being able to generate a rough idea of what you want or envisage for a scene and show that to an artist/director of photography/whatever would be very useful.


> writers who have a good idea for a story

Or programmers who are bad at art, so, programmers in general ;)


I'm a programmer that's bad at art, and DALL-E is amazing for me. It generated me a logo I love for my open source app within a couple of batches.

I would have gone logoless rather than slog through trying to find an artist on Fiverr.


> It feels as if this is about to explode an existing problem, namely, there's too much content in the world already

Imagine applying that same perspective to software, pre-github.


I'm not sure I see the relevance. Could you explain further, please?


In many ways, sites/tools like sourceforge and github made it much easier to publish and obtain open source software, leading to much more software being created and shared. I'm suggesting that "too much content" isn't a problem, and leads to good things.


Did people feel that way about software?


I understand your fear, especially because a very similar, if not the same, thing happened with Unity, which allowed the creation of just what you're talking about: a massive amount of low- to mid-quality games. I even remember reading articles about how the Unity splash screen had become recognised as a mark of low-quality games...


> It feels as if this is about to explode an existing problem, namely, there's too much content in the world already. Lowering the effort bar for content creation and generation is, in some ways, going to cause a deluge of low-quality/mid-quality content to be generated.

Like you say, this is an existing situation (I hesitate to call anything that lowers the barrier to entry a "problem"). The printing press opened up printing for more people. As did desktop publishing. Similarly, garage band and bandcamp make it so much easier to produce and publish music. Yeah, there's a ton of books and music out there, including a bunch of junk, but the cream still tends to rise to the top. I'd rather take my chances on having to find a masterpiece in a haystack than have it never get published at all, either because of lack of funding/lack of some specific skill that AI can handle/no willing publisher, etc.


> Lowering the effort bar for content creation and generation is, in some ways, going to cause a deluge of low-quality/mid-quality content to be generated.

This is not a problem; the issue is with Steam or whoever curates this content for us. With some good filters you can avoid the problem. My first program and game were garbage, but they were not on GitHub or Steam to bother anyone. So the issue is not that people have access to good tools (I was using some Visual Studio students edition); the issue might be that some services have bad filters.

For new indie devs: please use whatever tools make your vision real, publish your games, and ignore the haters. You will probably not make a living from Patreon or donations, but if you love what you do and it makes you happy, that is the important thing.


I remember back when I was in college the WWW was barely released and search was in its infancy. If you were doing a research paper, trying to use the WWW to search for papers on the topic was laughably useless. Instead you would go to the university library (which still used card catalogs!) and make an appointment with these people who had access to some search system (I can’t remember what it was called). You would sit down with them and explain what you were researching and they would craft search queries in the query language this system used and read over the results with you to see if they sounded promising. They would tweak the query based on your feedback until you got the results you wanted. Then they would print out the results and you could go elsewhere in the library to lookup the papers. It cost money per search, but the university covered the cost.

I wonder if we might see something similar with these ai generation tools. People who become experts at crafting the queries to get back the results you are looking for act as intermediaries.


> I wonder if we'll see a new generation of "editors" showing up to help curate and provide some guidance in terms of what to play if you're limited on time.

Iron Pineapple has an entertaining series he calls "Steam Dumpster Diving" which is similar to this, but it's very focused on Souls-Like games.

https://www.youtube.com/watch?v=9KARY5ocvKo&list=PLuY9odN8x9...

It does sometimes include games many of us have heard of but there are a lot of unknown games, and plain bad ones (he has a soft spot for student projects too). I think he sees the value of new ideas hidden in jank, as do I, so I like the series.

I think the shift will be towards matching personalities, you get to know a reviewer and see what they like and how compatible they are with your own opinions. It doesn't have to be a one-to-one match as long as you're aware how they diverge on certain genres. The curator section on Steam seems to be an attempt at this, but any social media that allows following/subscription could serve too. It's still a significant time investment to keep up to date and to even find that matching personality.

I wonder if I actually bothered leaving reviews on Steam, would it suggest people with similar tastes to me?


It doesn’t matter how many bad Steam games there are, because their algorithms prevent you from ever seeing the vast majority of games on the platform. The same is true for videos on youtube and songs on spotify, you won’t find the really low effort/quality content unless you actively seek it out.


There's also an interesting take that as these generative models become better, they will be used for search instead of regular search engines, and social apps will automatically generate content for you, trained on the vast dataset they already have. I think the hardest thing in this trend is going to be assessing 'truthfulness' automatically.


Why not feed all the descriptions from say, original Colossal Cave Adventure game into DALL-E 2?

"YOU ARE STANDING AT THE END OF A ROAD BEFORE A SMALL BRICK BUILDING. AROUND YOU IS A FOREST. A SMALL STREAM FLOWS OUT OF THE BUILDING AND DOWN A GULLY."
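A minimal loop for doing exactly that might look like the sketch below. The `to_prompt` helper and its style suffix are my own guesses at useful promptcraft, and the API call uses the 2022-era `openai` Python package, so it only runs with a valid `OPENAI_API_KEY` set.

```python
import os

def to_prompt(room_description,
              style="digital painting, adventure game background art"):
    # Tame the all-caps room text and bolt on a style suffix; both
    # choices are illustrative guesses, not anything the DALL-E
    # documentation prescribes.
    text = room_description.strip().capitalize()
    return f"{text} {style}."

rooms = [
    "YOU ARE STANDING AT THE END OF A ROAD BEFORE A SMALL BRICK "
    "BUILDING. AROUND YOU IS A FOREST. A SMALL STREAM FLOWS OUT "
    "OF THE BUILDING AND DOWN A GULLY.",
]

if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    import openai  # 2022-era package; image generation via Image.create
    for room in rooms:
        result = openai.Image.create(prompt=to_prompt(room),
                                     n=1, size="1024x1024")
        print(result["data"][0]["url"])
```

Walking every room description of a game through a loop like this would give you a full set of candidate backgrounds to curate by hand.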



It's interesting that in both of the examples people have shared here, the image produced was quite good but entirely missed the "flowing out of the building" part of the prompt (which I think is a fairly significant aspect of the scene).


I picked the one that most included that; a few other suggestions didn’t have water at all.


Wow, existing text adventures could be very interesting with this kind of semi-real-time illustration. Maybe we'll finally know what a Grue looks like. Also, tabletop RPGs with Dall-E illustrations generated semi-real-time by the game master.


That was surprisingly good.



The first thing I thought when reading this was "wow, someone must have fed Infocom games' location descriptions into an AI". Would still like to see that.


So frustrating that there's no way to access this tech without putting your name on a list and waiting an unspecified amount of time (months?). It's the sort of thing many people would pay well for, even at its current stage of development.


Midjourney is open to everyone now, no waiting, and is easily the 2nd best tool for this stuff


Better than Stable Diffusion? I don't think so, especially in coherence.


My wait was only 2-3 weeks FWIW.


I think it depends on what you tell them when you fill out the application. I was honest and didn't tell them I was a YouTube influencer, venture capitalist, underground artist, and/or credentialed researcher. I just wanted to play with it and see how it worked.

Most of the interesting things I've done in my life and career began with just wanting to play with something and see how it worked, but it was probably unrealistic to expect OpenAI to buy into that, given the large number of more worthy-sounding candidates in line ahead of me. Maybe I'll reapply with a different email address and make up a more mediagenic motivation.


I applied with the same criteria, and it took about a month to be accepted.


My invite actually came through just now, interestingly enough, after a little over two weeks. Either they're opening up for a wider audience, or whining about it on HN is irrationally effective.

Gotta give credit where it's due, in any case. I hereby withdraw my complaint!


I am still waiting since they opened the registration


Two replies are from Dall-E, copied here for convenience:

- https://labs.openai.com/s/nHmjLYmVPdDQzUU5DvGT2JgK

- https://labs.openai.com/s/DvQ49etAKKCU6Zpy32GFJlq0

For comparison, first two sets of 4 variations from MidJourney:

- https://i.imgur.com/iEvlYXE.jpg

- https://i.imgur.com/aR0R10p.jpg

The game with both is promptcraft. Here is Midjourney after changing “brick building” to “brick house” and changing “road” to “dirt road”. It did brick roads but at least a road showed up:

- https://i.imgur.com/bfia9zT.jpg

Then you can upsize to hallucinate additional detail:

- https://i.imgur.com/y1uPc6b.jpg

- https://i.imgur.com/LId4wy4.jpg


That actually sounds like a pretty interesting interactive art exhibition piece!


Cool idea, but those images are so... un-text-adventure? I guess they feel like they should be more low-res? Too "real" removes imagination from the text-adventure equation.


"we've been privileged to live in a truly global era with minimal blank spots on the map and a constant flow of reasonably accurate information, the implication being that the not too distant past had mostly blank spots and the not too distant future will be saturated with extremely plausible-looking gibberish."

What an excellent insight.


I've been exploring a related concept in a Telegram game. The player can basically create a choose your own adventure-type world by describing a set of interlocking scenes and the segues (buttons/commands) between them.

I'm using Disco Diffusion for the rendering instead of Dall-e 2 but the limitations are similar. It can be frustrating to get consistent results or impose the kind of order on them that is required to make something feel like a real game. Or maybe I'm not thinking hard enough..

If this kinda thing (AI-generated game worlds) interests you, try it here: https://t.me/xiadreamland_bot


One cool thing about Stable Diffusion (the current leading competitor to DALL-E) is that you can create images in a similar style by fixing the seed number. Different prompts with the same seed can be used to create different images with similar visual styles, character design, etc.

For example, this person used SD to generate portraits of the same woman at different ages, from an infant to an 80 year old:

https://reddit.com/r/StableDiffusion/comments/wq6t5z/portrai...


Why does it have so many problems with the eyes?


Maybe the uncanny valley for eyes is deeper. Our brains might have evolved to notice eyes more.

Just speculating though, there might be a technical reason too.


For consistency across generations you'll need something like https://arxiv.org/abs/2208.01618 (Textual Inversion) to carry the concepts across. Hopefully NVIDIA releases the source code soon, or someone who understands the internals of latent diffusion better can implement an open-source version.


As a (now casual) game developer, this is absolutely fantastic.

I've never touched the realm of adventure games simply because I lack the skills and I don't have any contacts who are interested in working for free...

Having these types of systems available on a dime to generate ideas, narrative and content is a game changer.

It won't, however, change the fact that releasing games is part of a big machinery of algorithms. Of the 25-35 Steam games released every day, mostly no one will see or play yours... Indie games are nowadays released like throwing a pebble in the river. See ya... Unless you push a ton of cash into advertising and get a publisher, but by that point you'll hate making games anyway...


These sort of complex 2D environments are a nightmare to work with in adventure games, since you have to manually detail where the player character can go, how the character should grow/shrink to simulate perspective, etc.

I checked a video of Milkmaid of the Milky Way (2017) and sure enough, the environments are 3D with heavy stylization. Characters are animated in 2D but positioned and rendered as billboards or similar. 3D environments are much simpler to work with, using an off-the-shelf engine.

There is a kind of adventure game you can easily make with static 2D graphics: graphical text adventures. Not particularly exciting, but it'll work. For more sophisticated games, you can probably get away with AI generation for stuff like character portraits.


GPT5 Prompt: The following is a detailed storyboard for a graphical text adventure set in the style of “Sam and Max Hit the Road” where the game concept is a space detective story akin to “The Little Prince” meets “Nancy Drew”


I don't think it'll be long before there's a 3D modeling version of DALL-E.

There's already a bunch of programs that can turn 2d images into 3d models - definitely a lot to improve - but they exist.


In Westworld season 4 there are scenes where Christina works with an AI editing tool that generates 3D visuals based on prompts:

https://www.youtube.com/watch?v=3uGLjmrPorg


Most of the AI art I've seen could really only be used as something like flavor art. For games like Magic: The Gathering or Slay the Spire, you might imagine certain generators making an interesting set of cards.


But those games are definitely better with deliberate and bespoke flavor art imo. But yeah, if you're on a budget, that's probably the best use-case.


You'll definitely need to map the walkable areas and, even worse, split the image into layers so you can walk behind/between things.
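In practice the walkable region is often just a polygon (or a bitmap mask) checked against the click position. Here's a minimal sketch of the polygon variant using the standard ray-casting test; the names and coordinates are illustrative, not from any particular engine:

```python
def point_in_polygon(x, y, poly):
    """Ray-casting test: cast a ray to the right from (x, y) and count
    how many polygon edges it crosses; an odd count means the point is
    inside. `poly` is a list of (x, y) vertices in order."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# A trapezoid-shaped walkable floor, as seen in a typical room scene.
floor = [(50, 300), (590, 300), (520, 220), (120, 220)]
assert point_in_polygon(320, 260, floor)       # mid-floor: walkable
assert not point_in_polygon(320, 100, floor)   # on the wall: not
```

For layering, the usual trick is the same idea in reverse: each foreground cutout gets a y threshold, and the character is drawn behind it whenever their feet are above that line.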


They're not difficult, mate. Nav meshes and scaling. What's there to get wrong? It's one of the simpler things you can do in terms of game dev. A few hours of work, tops.


After seeing all the DALL-E 2 posts it's pretty clear that interacting with it is an art and not a science. If black box AI continues to improve I predict an emerging profession of "AI Prompters" whom others will pay to get the desired outputs out of AI tools.


I've already seen GPT-3 used to talk to DALL-E.

Still need to see how to start with an image, get a description, then go back to image with these tools.


Go was an art too, and they built an AI to find the optimal move.

Train another AI on improving input prompts to another prompt that generates art humans find pleasing.


It looks like AI-generated art will be a huge boon for indie games.

Midjourney can definitely lower concept art costs by 5% to 20% as it stands. Nevertheless, AI generated art currently can’t fully eliminate real concept artists.


> Midjourney can definitely lower concept art costs by 5% to 20% as it stands.

I'm not sure that's even true either. After playing with Midjourney myself, it feels like the most aesthetically pleasing outputs regularly fall within a certain range of styles/prompts/hues ("hyper-realistic", "sci-fi", saturated neon orange/blue/red, etc). Getting consistency across prompts is also a big problem.

If everyone's using Midjourney to generate concept art, everyone's concept art ends up looking the same. At that point, you're presumably back to human artists (though maybe using generated art for inspiration).


Midjourney has a very strong house style they apply to all images - it always looks vaguely cyberpunk and more or less only has one face.

You can change it by turning down their style parameter and then forcing it off to another part of latent space with the right prompt words, and it kind of helps. The aspect ratio flexibility then makes it more interesting than other models currently out there.


The obvious disconnect to me has been that we compare DALL-E and Midjourney results to what human-generated art used to look like on earlier versions of Unreal Engine. This is what human-generated art looks like now:

https://youtu.be/JXrWPLNp9tw


Well, as TFA and the comment you replied to point out, the bar for AI-generated art isn't to compete with top-tier human talent. There are many other use-cases.

Seems like you meant to reply to a comment that claimed otherwise.


Ed Catmull made this animated hand in 1972:

https://www.youtube.com/watch?v=fAhyBfLFyNA

Would anyone in 1972, even Ed Catmull himself, have been able to imagine the YouTube video you linked, broadcast globally, for free?

I think these tools are going to be big, but I feel the whole AI thing actually detracts from it. I'd rather think of it like tracing paper or photography than as a replacement for artists.


DALL-E is next level. I'm still waiting for the day where I'd get to play a fully AI generated game with a generated story line; one that rapidly changes according to the decisions made by the player. That would be insane.


Maybe you are already playing it. Or am I?


Billions of years of evolution, countless lifeforms that came and went; how lucky must we be to have our one and only shot at life right now, right here. Born free of hunger, free of the primal struggles of survival, that alone is an inconceivable fortune. But not only that, we're also here in time to witness monumental revolutions: the creation of the Internet, and now the creation of AI. AI that will make artificial reality indistinguishable from reality, generate whatever we desire, live however we desire. The only limit being our thoughts.

And it's all happening now! What a great time to be alive. Imagine being incarnated in one of the infinite other possibilities and having to live a normal life; that would have sucked!


Here are a couple I did:

>looking up at a massive art deco archway with huge beautifully ornate door, tessellated like the ceilings of Iranian holy mosques with the dimensions symbolizing sacred geometry and celestial lighting 8k detail*

https://imgur.com/gallery/QulgbtM

-

https://imgur.com/gallery/GNgZ4hu

https://imgur.com/gallery/TIBALCT


These are subtly nightmarish to me, especially the first. The beauty of such ornate mosaics (in the real world) is the clean, symmetric geometry. Ironically, that's the sort of thing you might naively think a computer would be especially good at (compared to more "organic" subject matter). These are evocative of the tiles of Islamic mosques if you squint at them, but if you look closely they're a blurry, tangled mess. I would totally buy this as concept art for a demonic church or something like that.

Edit: I think the third link showed up while I was typing this comment. Those hit me entirely the other way - really cool!


DALL-E can't render fine detail at a distance of more than a meter or so. Possibly because OpenAI limits how many diffusion steps are used for each image.


I used a bunch of dall-e images to illustrate a talk recently, thinking that it's not _quite_ yet a totally hackneyed idea (but soon will be).

The approach in this article is actually pretty interesting, and some of the results do recall the better 90s adventure games. If only there were a similarly effective way to generate good character animations, I'd be tempted to resurrect long-dormant plans for a graphic adventure.


The ending paragraph is great imo.

> and the not too distant future will be saturated with extremely plausible-looking gibberish.

Humans are good at deceiving each other. How do we avoid ending up deceiving ourselves collectively?

HAL 9000 didn't have to explain itself.


Nicely written! Including also the pricing, licensing, and ethics perspectives made it even better. Thanks for sharing.


Holy Crap! I was just thinking about this as I was using DALLE and TripMind!

I also thought that children's book illustrations will be interesting, as well as illustrating books actually written by children.

I was thinking RPG avatar/monster icons...


>rustic mexican mansion with a grassy area sports car parked in front surrounded by small houses on a sunny day, high quality atmospheric high renaissance oil on canvas

The picture related to that one looks more like the Provence/Marseille area than Mexico.

Basically there is a small mountain near Marseille that looks exactly like that, and it was obviously used in many paintings.

The house and pine tree really look like a villa in Provence, and with that small mountain in the background... Yeah, I feel at home :o

While the painting is quite different, I have seen a much closer lookalike in the past, from a famous French painter: https://fr.m.wikipedia.org/wiki/Paul_C%C3%A9zanne#/media/Fic...


You could use this bookmarklet for generating images without watermark when you merge the images: https://www.designinspiration.info/save-dall-e-images-withou...


Waiting for DALL-E to become capable of producing 3d game assets (along with accurate physics models).


> A common defense claims that the model learns from the training set the same way a human student would, implying human rules (presumably with human exceptions) should apply to its output. This can seem like a reasonable argument in passing, but besides being plain wrong, it's too facile since DALL-E is not human-like.

This is nowhere near enough to refute the idea that human rules on fair use should also apply to works you create using DALL-E.

The fact that DALL-E can't own the output is irrelevant. Firstly, the human operator can quite happily claim ownership of the output. Secondly, I don't see how ownership of the output is even related to whether the training data was used fairly.


Really great example of using generation as a tool rather than a crutch. Inevitably techniques will be developed to use it to create new never before imagined styles that still speak to us at an instinctual level despite the inhuman steps along the way.


I played around with that a bit myself, though mostly 2D tiled side scrollers, but I found it pretty much impossible to create good 8/16-bit style graphics with DALL-E 2. It can create pixel art just fine and it can create pixel-art sprites, but it's all in the style of modern smartphone games or flash animation, not in the style of actual '80s/'90s games. And like most DALL-E 2 output, it's extremely zoomed in, so you never get something that looks like a screenshot; you get a closeup of a single sprite.

Even trying to infill 8/16bit games was a disaster, as it just ended up repeating the tiles 1:1, not creating actual level structure with those tiles.

DALLE-mini in contrast is pretty blurry and low-res, but it can produce images that at least look like actual video games. So I assume DALL-E2 just has a huge hole in the training data here, as nothing video game related produced good results. The article settles for prompts focusing on regular artists as well, instead of video games.

One area that DALL-E2 absolutely nails is RPG-style portrait images, it can crunch out amazing ones on the first try.

DALLE-mini:

https://matrix-client.matrix.org/_matrix/media/r0/download/m...

DALL-E2:

https://matrix-client.matrix.org/_matrix/media/r0/download/m...

https://matrix-client.matrix.org/_matrix/media/r0/download/m...

https://matrix-client.matrix.org/_matrix/media/r0/download/m...

https://matrix-client.matrix.org/_matrix/media/r0/download/m...

https://matrix-client.matrix.org/_matrix/media/r0/download/m...


Some of those make me wonder, are there any 2D/3D engines that render normal 1x1 retro-style pixels as randomly sized "pixels"? I wonder if that could become a surrealist retro style.


I love article titles and my imagination: I thought someone had made an actual point-and-click game with on-the-fly generated content through DALL-E :)


Makes me ponder what kind of video games an _advanced_ AI could create on the fly.

People talk about the impact of DALL E on art, but what if it goes _further_? How complex of an RPG world could an AI build around a player?

To the naysayers; decades ago it was impossible for a computer to display images like the ones DALL E makes. For a good chunk of computer history even holding these image bytes in memory would have been a feat.

Timelines aside, it’s fun to ponder about.



AI Dungeon is neat, but it isn't really like playing an RPG (pen-and-paper or computer). First off, the AI mostly tells the story you imply. Second, there's no mechanical gameplay; it is all story.

It is an interesting glimpse into the future, though.


There would be all kinds of repeated actions, obtaining counter-intuitive objects to get past obscure guards in a way that doesn't quite make sense.

I know this because our best (not sarcasm) humans have done the same thing for years.


I’m concerned about the precedent DALL-E 2 is setting. Learning skills, whether it be drawing, painting, or otherwise, should not be a “problem” that needs to be solved with AI. The joy of creating is in the journey, not the destination. I can’t imagine any fulfillment from glueing together auto-generated content. It’s the art equivalent of a CRUD app.


Is photography an art, or are you just using a machine and fiddling with settings?


If the joy is in the journey, no one can take it from you. Some people might want to get to the destination quicker and are not interested in the journey.

The consequences of this technology are going to be interesting though. The trend of people having to be absolutely the best and unique or GTFO seems to be unstoppable.


I'm a non-creative person. Sometimes I need to ask an artist for art. DALL-E would help me give them an idea.


Most businesses want CRUD apps, and people don’t particularly like coding them. Technology where you can pump out a CRUD app with little to no effort would be a blessing, freeing people to work on more interesting problems.


Now we just need a slick UI for the full workflow: DALLE concept, DALLE scene extensions, and AI up-sampler


Make some of these generators create graphics from Dwarf Fortress descriptions. Boom, best game ever.


i keep thinking these kind of tools are very dangerous and can lead to a lost generation of artists, who won't develop their skills because they can't compete with the garbage output of "ai" algorithms.


Yeah, just like calculators killed off all the mathematicians, I guess.


calculators do not produce inferior work the way dall-e and the like do


To stretch the math metaphor a bit further: the quality level itself isn't what's interesting. The first few derivatives of the quality level are what matter.


Exactly my thoughts as well. We're now starting to automate our own culture away: stories, art, music... Why would you spend the years building skills to create something in a sea absolutely flooded by automated algorithms? As these AIs get better, it will be more and more difficult to distinguish the output from human minds.

I keep wondering what the endgame here is. Once everything can be algorithmically created, what's the point of humanity anymore? Just consuming an endless flood of auto-generated, recommendation-optimized content?


Think outside the box. The human element just moves up in the abstraction stack. Don't have to spend time manually arranging words anymore, now you just direct settings, character motivations etc. Don't need to manually set pixels to certain colours, now you can quickly create complex images and combine them, remix them, create something bigger.


Yeah, but anybody can do those things and create interesting content without much training.


So, like a camera?


Movies cost millions to make. Using DL only requires some imagination.


it doesn't even have to be optimized. it just has to be interesting enough to captivate your interest so someone can show you ads.


By the way, does anyone have a sense of how long the waitlist for DALL-E 2 is? I got on the waitlist at the end of April and am still there... Midjourney is amazing (even subscribed) but I want more variety in my AI art :)


Got my access 2 days ago, after a ~3 week wait. They are probably prioritizing based on business case too, i.e. potential $ > developers > personal use.


I signed up last week and was given access in a day


I thought the waitlist was over. I didn't even join the waitlist and was given access a few weeks ago. (I already have/had an active paid OpenAI account.)


I guess the other current front-page submission is relevant here:

https://news.ycombinator.com/item?id=32486133


They'll soon be making films using simple AI instructions.

Imagine when our creative works are better when AI makes them ... that'll be an odd time to be alive.


I'm wondering what DALL-E would make of the descriptions in classic text adventure games from the 1970s-90s, like, for example, the Zork series.


But these images are not copyrightable - what's stopping someone from ripping off your artwork?


One problem I found is getting a side profile of a character; Midjourney seems to default to the typical front shot. Also, if it could do walking animations, that would be awesome.


soon we'll be able to replace all skills with arcane incantations, the future belongs to wizards



