So it sounds like this isn’t interactive. Just a wrapper around generating Sora videos with memory? Cool to see it as a real product, but not what folks consider “interactive” generally.
I wish ratings were a slider from Skip It - Good - Great - Life Changing or whatever. Sliders are great for stuff like this. Odd comment I know but sliders are instantly understandable, accessible and doable. So, great UI.
Was it? He concludes that LLMs will never write Shakespeare, or create original work of that caliber, or animate an actor the way Ben Guinness could have done it. I'm paraphrasing, but isn't this confirmation bias from a guy now heavily vested in making movies? Move 37 was so astonishing the world champion Go player, hardly known for theatrics, got up and left the table. It was considered an insanely "original" move. In every field we don't consider deeply "human," AI has jumped light years ahead, in astonishing ways. Affleck is entitled to his opinion and seems like he's trying to understand things, but I think poetry, acting, music, they're all on the table until they aren't, for better or worse.
>Move 37 was so astonishing the world champion Go player, hardly known for theatrics, got up and left the table.
This is an incredibly minor nitpick, but I recently went down the rabbit hole of move 37[1] after reading Richard Powers' latest novel, Playground, and Sedol didn't get up, he was already away from the table. He did, however, take far longer to make his next move (iirc something like 12-15 minutes instead of 2-4).
It has yet to be seen if this can truly be generalized and if such an analogy holds.
Go is a game constrained by an extremely narrow set of rules. Brute forcing potential solutions and arriving at something novel within such a constrained ruleset is an entirely different scenario than writing or film-making which occur in an almost incomprehensibly larger potential "solution" space.
Perhaps the same thing will eventually happen, but I don't think the success of AI in games like Go is particularly instructive for predicting what can happen in other fields.
Ahh, good point. I was thinking about the “machine plays itself repeatedly until it gets good” aspect of AlphaGo Zero and my brain jumped to brute forcing, but I agree that’s a misnomer.
But a game is something with an objective measure. Either a move is good or it's not. Can you say the same of parts in a movie, where it's more about taste?
I'm not making any statement about LLMs here, but the counterpoint to this is that you don't need to make what film critics would overwhelmingly call a "good" movie. You need to make things that make money.
I can imagine two options for that: utilize expertise from people that know how to make films that make money, or make so many movies, one or two can make enough money to pay for all the others and then some.
Really, I think it's more about what gets more attention and then you make deals with Roblox and Fortnite or whatever to sell digital goods.
The thing people love about Gen AI is not needing to understand the dozens, if not hundreds, of deliberate and unconscious artistic decisions an artist makes when creating a piece by hand. It's great to be able to think of a core idea, refine the overall aesthetic, and then work out some details. It's freeing, fun, and nearly useless for making high-end media.
Thousands of deliberate artistic decisions go into making a TV show, let alone a Hollywood movie. Think about everything from the subtlest cuts in the tailored costumes for every character, what each part of each hair style will look like in different scenarios, how all of that stuff is lit in the subtlest ways and what shade of almost black you want for the matte and whether the rim light needs to be a different color to make it work... all for each shot. That's the precision required to make even generic high-end media, and the need to manipulate those things with perfect accuracy doesn't go away when you're using Gen AI to make it. People will probably be more critical of Gen AI output than traditional media.
I know of a big, moneyed studio that tried to replace its concept artists with a group of prompt engineers, then fired the prompt engineers two months later after the art directors just couldn't take it anymore. They wanted someone who could make exactly what they wanted in two attempts over 3 hours, rather than someone who could produce 100 polished versions in 5 minutes that they then had to review, but who took 6 hours to get to one version that really met their needs, because each revision was imprecise and introduced other undesirable results even with ControlNets, inpainting, LoRAs, and all that. Beyond that, since it was in a flat raster format, it was literally useless for anything else. It's not like Gen AI has no role in that workflow-- a traditional artist might use something like that for ideation and reference-- but modifying the flat raster output of Stable Diffusion, et al. would in many cases be even more difficult than roughing it out from scratch, and yield an inferior product.
When it came down to it, what mattered was having people who knew how to precisely execute an artistic vision tuned to produce the output studios know will make them money. And that's concept art, not the stuff that gets put on screen, which has to be a whole lot more precise. For a 90 minute movie at 24 frames per second, you need 129,600 perfectly polished still images, and those will come from a pool of at least that many more which editors can compose into a piece. Not having the footage in LOG and not having separated AOVs for precise grading, color correction, compositing, etc. are huge impediments.
It's no different than giving a completed mp3 of a song to a talented music producer and expecting them to turn that audio into a hit song (not cut it up into samples using traditional techniques to make something new, not use it as inspiration to re-make the song, to take the audio in that mp3 and use that audio) in a fraction of the time it would take them to do it from scratch. They'd just laugh at the suggestion.
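For anyone checking the 129,600 figure above for a 90 minute movie: it comes out exactly if you assume the standard cinema rate of 24 frames per second. A minimal sketch of the arithmetic follows; `frames_needed` is a made-up helper name, and the 24 fps default is an assumption rather than something every production uses.

```python
def frames_needed(runtime_minutes: int, fps: int = 24) -> int:
    """Count the still frames in a film of the given runtime.

    fps defaults to 24, an assumed frame rate for theatrical film.
    """
    seconds = runtime_minutes * 60
    return seconds * fps


final_cut = frames_needed(90)
# The comment says the frames come from a pool of "at least that many more",
# so the lower bound on total generated frames is double the final cut.
minimum_pool = 2 * final_cut

print(final_cut)     # 129600
print(minimum_pool)  # 259200
```

At a higher frame rate the count scales linearly, so the same runtime at 48 fps would need twice as many frames.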
> isn't this confirmation bias from a guy now heavily vested in making movies?
LLM-created video is absolutely nowhere close to competing with Hollywood. As it stands, the closer you get to the top of the movie business, the more profitable LLM-generated shit is. He has every reason to embrace LLMs in movie production-- so much so that I expect most famous people speaking out about this are basically being spokespeople for SAG and secretly pressuring people that work for them to squeeze every bit of productivity out of these tools that they can.
Idk. LLMs may effectively emulate human creativity up to a point, but in the end they are writing literally the most predictable response.
They don’t start with an emotion, a vision, and then devise creative ways to bring the viewer into that world.
They don’t emotion-test hundreds of ideas to find the most effective way to give the viewer/reader a visceral sense of living that moment.
While they can read sentiment, they do not experience an internal emotional response. Fundamentally, they are generating the most probable string of words based on the data in their training set.
There is no way for them to come up with anything that is both improbable and not nonsensical. Their entire range of “understanding” is based on the statistical probability of the next conceptual fragment.
I’m not saying that it might not be possible for LLMs to come up with a story or script… they do that just fine. But it will literally be the most predictable, unremarkable, innovation-less and uninspiring drivel that can be predicted from a statistical walk through vectors starting at a random point.
There is a reason why AI output is abrasive to read. It is literally written with no consideration given to the reader.
No model of mind of the effect that it will have on the receiver, no interesting and unusual prosaic twists to capture your attention or spur your imagination…. Just a soulless, predictable stream of simulated thoughts that remarkably often turns out to be useful, if uninspiring.
LLMs are fantastic tools for navigating the commons of human culture and making a startling breadth of human knowledge easily accessible.
They are truly amazing tools that can release us from many language manipulation burdens in the capture and sorting of data from diverse and unorganized sources.
They will revolutionize many industries that currently are burdened by extensive human labor in the ingestion, manipulation, and interpretation of data. LLMs are like the washing machine to free our minds from the handwashing of information.
But they are not creative agents in the way that we admire creative genius.
>There is a reason why AI output is abrasive to read. It is literally written with no consideration given to the reader.
>No model of mind of the effect that it will have on the receiver, no interesting and unusual prosaic twists to capture your attention or spur your imagination…. Just a soulless, predictable stream of simulated thoughts that remarkably often turns out to be useful, if uninspiring.
An LLM can produce text about the humiliation and pain of being picked last on the recess kickball team. But it never had that experience. It can't. It has zero credibility about the subject. It's BSing in exactly the same way that sports fans pretend they can manage their favorite team better than the current staff.
Are you saying a psychiatrist would need to experience humiliation and pain to be able to help patients? Aren't some psychiatrists also BS'ing their patients, because they are either not well trained, are focused on money, or are just tired of hearing the same problems?
I’d not bet against AI psychology as a useful tool for some. The privacy that it offers could easily offset the lack of innovative thought. Psychology interviews are by design predictable, droll, and interviewee driven. It might work.
Logged in with f'g G, picked an "experience," waited 30 sec to load(!) then picked 8 sec run from 4 / 8 / 12 sec...
"Not enought credits." Hmm. Somehow I thought it was a demo but I guess, yeah.
TL;DR: Nowhere close to ready for prime time.