I know there are people acting like it's obvious that this is AI, but I get why people wouldn't catch it, even if they know that AI is capable of creating a video like this.
A) Most of the giveaways are pretty subtle and not what viewers are focused on. Sure, if you look closely the fur blends into the pavement in some places, but I'm not going to spend 5 minutes investigating every video I see for hints of AI.
B) Even if I did notice something like that, I'm much more likely to write it off as a video filter glitch, a weird video perspective, or just low quality video. For example, when they show the inside of the car, the vertical handrails seem to bend in a weird way as the train moves, but I've seen similar things from real videos with wide angle lenses. Similar thoughts on one of the bystander's faces going blurry.
I think we just have to get people comfortable with the idea that you shouldn't trust a single unknown entity as the source of truth on things, because everything can be faked. For insignificant things like this it doesn't matter, but for big things you need multiple independent sources. That's definitely an uphill battle and who knows if we can do it, but that's the only way we're going to get out the other side of this in one piece.
I agree. Also, tangentially related: I use a black and white filter on my phone, and it is way harder to distinguish fake and real media without the color channels to help. I couldn't immediately find anything in the subway clip which gave it away.
Much more likely they just flipped the video in an editor after it was generated. It's common enough to see flipped video with backwards text on social media that most people wouldn't give it a second thought.
Some do care, e.g. some camera manufacturers or some news agencies. Surprisingly some social media platforms[1] want clear labels for AI generated content.
That's the easiest position, imo: it's AI unless proven otherwise. No one has the time to pay this much detailed attention to a random video when its purpose is just entertainment. What this might lead to, though, is people losing (or never learning) the skills needed to separate real content from AI-generated content.
A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life. Compared to old-school analog movies or fairly raw photography, this looks as fake as the Coca-Cola Santa. There's a rather obvious lack of detail that real photography would have caught.
> A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life.
Indeed, a great (if counterintuitive) example of this is The Wolf of Wall Street. I bet a lot of people would be surprised at just how much CGI is used in that just for set/location.
The OG film for that was Forrest Gump. It is often lauded as one of the first movies to use CGI heavily but in completely, and intentionally, unnoticeable ways...
A) It's also true that many people don't put much thought into anything at all. They'd never consider actively thinking about whether a video is fake or not. These are the targets of short-form content.
B is / will be huge; the largest amount of "mindless" content is consumed on phones, with half attention, often with other distractions going on and in between doing other stuff, and can be watched on older / lower fidelity devices, slower internet connections, etc. AI content needs high resolution / big screens and focused attention to "discover".
The truth is... most people will simply not care. Raised eyebrow, hm, cute, next. Critical watching is reserved for critics like the crowd on HN and the like, but they represent only a small percentage of the target audience and revenue stream.
You can see the perspective/angle of the objects changing slightly as the camera moves in a way that makes it pretty obvious they're CG, AI or otherwise.
That's always been a problem with AI-generated imagery in video/animation; it changes too much frame to frame. If researchers figure out how to address that, yeah, we've got a problem. Until then, this looks worse than the real thing.
Then there are the usual giveaways for CG - sharpness, noise, lighting, color temperature, saturation - none of them match. There's also no diffuse reflection of the intense pink color.
Yes. The lack of diffuse reflection from the pink train is the clearest giveaway, and AI videos in general have problems with getting shadows and radiosity right. There's also the existence of the real-world Hello Kitty Shinkansen and the APM Cat Bus in Japan that makes this image more plausible.
That last point is also important; if it's not surprising, people will just accept it without being too critical about it. And since these AI tools are trained with real / existing content, creating realistic-enough content will be the norm. I think the first big AI generators - DALL-E and co - were trained on more fantastical / artistic sources and leaned primarily on that style, also because realistic generation (like humans) wasn't yet good enough, or was too uncanny. But uncanny and art work well together.
Also consider that one of the reasons AI-generated video has CG-like artifacts is that it is trained on CG video. Better CG generation and more real video for training will reduce these over time.
Most people have terrible eyes for distinguishing content.
I’ve worked in CG for many years and despite the online nerd fests that decry CG imagery in films, 99% of those people can’t tell what’s CG or not unless it’s incredibly obvious.
It’s the same for GenAI, though I think there are more tells. Still, most people cannot tell reality from fiction. If you just tell them it’s real, they’ll most likely believe it.
> I’ve worked in CG for many years and despite the online nerd fests that decry CG imagery in films, 99% of those people can’t tell what’s CG or not unless it’s incredibly obvious.
I've noticed people assume things are CG that turn out to be practical effects, or 90% practical with just a bit of CG to add detail.
Yep I’ve had that happen many times , where people assume my work is real and the practical is CG.
Worse, directors often lie about what’s practical and we’ll have replaced it with CG. So people online will cheer the “practicals” as being better visually, while not knowing what they’re even looking at.
I’ve even seen interviews where actors talk about how they look in a given shot, or about something they did, without realizing they’re not even really in the shot anymore.
People just have terrible eyes once you can convince them something is a certain way.
But films without CG are clearly superior and it’s not even in contention.
Lawrence of Arabia or Cleopatra alone have incredible fully live-shot special effects which cannot be easily replicated with CG and have aged like fine wine, unlike the trash early CG of the '80s and '90s, which ruined otherwise great films like The Last Starfighter.
You’re taking the best films of an era and comparing them to an arbitrary list of movies you don’t like? Adding to that, you’re comparing it to films in the infancy of a technology?
This is peak confusion of causality and correlation. There are tons of great films in that time frame with CG. Unless you’re going to argue that Jurassic Park is bad.
Jurassic Park isn't just a good example of CG, it also a good example of making the right choices on practical vs CG (in the context of technology of the time) and using a reasonable budget. You can have great CG and crappy CG by cutting corners. Plenty of people that decry CG don't actually know how much there is, even in non-sci-fi movies like romcoms, just for post-editing. But when it is done well nobody notices, the complaints only come when it looks like crap. Great use of technology to achieve the artistic vision will stand the test of time.
> Still, most people cannot tell reality from fiction. If you just tell them it’s real, they’ll most likely believe it.
This goes for conversation too! My neighbour recently told me about a mutual neighbour who walks 200 miles per day working on his farm. When I explained that this is impossible he said "I'll have to disagree with you there"
That's a cultural issue that seems to have developed over recent years (decades? idk), where people take their own opinion (or what they think is their own opinion) as unchallengeable gospel.
In my opinion anyway, I'm gonna have to disagree with any counterpoints in advance.
This is partially the result of being taught that every opinion is valid. What was taught as a nicety (the intention being: don’t dismiss other people’s opinions) has evolved into “all opinions are equal.”
If all opinions are equal, and we’ve reinforced that you can find anything to strengthen an opinion, then facts don’t actually matter.
But I don’t think it’s actually all that recent. History is full of people saying that facts or logic don’t matter. The Americas were “discovered” by such a phenomenon.
What's weird is the projection you get when you challenge someone's opinion in any way. All of a sudden, you're the arrogant one who thinks they're always right, no matter how diplomatic (or undeniably correct) about the issue you are. Or is that just me?
> Most people have terrible eyes for distinguishing content.
A related phenomenon is not being able to hear the difference between 128kbps and 320kbps. I find the notion astonishing, and yet lots of people cannot tell the difference.
> Most people have terrible eyes for distinguishing content.
But also in the case of the fluffy train there's nothing to compare it against. The reason CGI humans look the most fake is because we're trained from birth to read a human face. Someone that looks at trains on a regular basis will probably discern this as being fake quicker than most.
Looks dope though. But what impressed me recently was some crypto-scam video featuring "a clip" from the Lex Fridman Podcast where Elon Musk "reveals" his new crypto or whatever (sadly, the one I saw has since been deleted). It didn't really look good; they were talking with weird pauses and intonations, and as awkward as these two normally are, here they were even more unnatural. There was so much audacity to it that I laughed out loud.
But what I was thinking while enjoying the show was: people wouldn't do that, if it didn't work.
This is the point. There is no such thing as "completely fools commenters". I mean, it didn't fool you, apparently. (But don't be sad, I bet you were fooled by something else: you just don't know it, obviously.) But some of it always fools somebody.
I really liked how Thiel mentioned on some podcast that ChatGPT successfully passed the Turing test, which was implicitly assumed to be "the holy grail of AI", and nobody really noticed. This is completely true. We don't really think about ChatGPT as something that passes the Turing test; we think about how the fucking stupid useless thing misled you with some mistake in a calculation you decided to delegate to it. But realistically, if it doesn't pass, it's only because it is specifically trained to avoid passing it.
I wish you were right that there is no way to completely fool viewers, but I know you are not. I was fooled! Note that I call out "AIGC." If that wasn't there (I only noticed it on repeat views), I would have simply had no way to tell. These are early, primitive AI generated videos, and I'm already unable to differentiate. Many in this thread talk about movie CG; there are countless movie scenes that fool all viewers.
You can't assume that with scams. Quite often, scams are themselves sold as a get-rich-quick scheme, which like all GRQ schemes, they wouldn't be if they worked well.
People are smart enough to know that what you see in movies isn't real. It will just take a little time for people to realize that now applies to all videos and images.
The frequency is so high, and I am getting so burned out on checking comments to gauge how much everything is changing, that I've nearly given up subconsciously. Pretty close to just ignoring all images I see.
Videos like these were already achievable through VFX.
The only difference here is a reduction in costs. That does mean that more people will produce misinformation, but the problem is one that we have had time to tackle, and which gave rise to Snopes and many others.
I mean the only real tell for me is how expensive this stunt would be. I personally think this is a really cool use of genAI. But the consequences will be far reaching.
My intuition went for video compression artifact instead of AI modeling problem. There is even a moment directly before the cut that can be interpreted as the next key frame clearing up the face. To be honest, the whole video could have fooled me. There is definitely an aspect in discerning these videos that can be trained just by watching more of them with a critical eye, so try to be kind to those that did not concern themselves with generative AI as much as you have.
Yeah, it's unfortunate that video compression already introduces artifacts into real videos, so minor genAI artifacts don't stand out.
It also took me a while to find any truly unambiguous signs of AI generation. For example, the reflection on the inside of the windows is wonky, but in real life warped glass can also produce weird reflections.
I finally found a dark rectangle inside the door window, which at first stays fixed like a sign on the glass. However it then begins to move like part of the reflection, which really broke the illusion for me.
No one is looking at her face though, they're looking at the giant hello kitty train. And you were only looking at her face because you were told it's an AI-generated video. I agree with superfrank that extreme skepticism of everything seen online is going to have to be the default, unfortunately.
One thing that's not intuitive to spot but actually completely wrong, is that in the second clip we're apparently inside the train but the train is still rolling under us.
Or, y'know, the camera's moving smoothly backwards through the train? Would be a bit of an odd choice (and high-effort to make it that smooth versus someone just carrying it), but not impossible by any means.
This was not marked as AI-generated and commenters were in awe at this fuzzy train, missing the "AIGC" signs.
I'm quite nervous for the future.