Hacker News

I saw my first AI video that completely fooled commenters: https://imgur.com/a/cbjVKMU

This was not marked as AI-generated and commenters were in awe at this fuzzy train, missing the "AIGC" signs.

I'm quite nervous for the future.



I know there are people acting like it's obvious that this is AI, but I get why people wouldn't catch it, even if they know AI is capable of creating a video like this.

A) Most of the giveaways are pretty subtle and not what viewers are focused on. Sure, if you look closely the fur blends in with the pavement in some places, but I'm not going to spend 5 minutes investigating every video I see for hints of AI.

B) Even if I did notice something like that, I'm much more likely to write it off as a video filter glitch, a weird video perspective, or just low quality video. For example, when they show the inside of the car, the vertical handrails seem to bend in a weird way as the train moves, but I've seen similar things from real videos with wide angle lenses. Similar thoughts on one of the bystander's faces going blurry.

I think we just have to get people comfortable with the idea that you shouldn't trust a single unknown entity as the source of truth on things, because everything can be faked. For insignificant things like this it doesn't matter, but for big things you need multiple independent sources. That's definitely an uphill battle and who knows if we can do it, but that's the only way we're going to get out the other side of this in one piece.


I agree. Also, tangentially related: I use a black and white filter on my phone, and it is way harder to distinguish fake and real media without the color channels to help. I couldn't immediately find anything in the subway clip which gave it away.


I've definitely seen skin-blurring filters, which everyone already uses, make it really hard to know what's real


Hijacking this top comment to say that I found the AI video creator: https://www.instagram.com/bugugugugu_aigc/


I agree. Apart from the text appearing backwards it all looked pretty real to me.


My assumption was the uploader wanted to make the creator's "AIGC" less obvious. It definitely did that to me.


Yeah, that's a weird one. I doubt the video was generated that way. I assume someone flipped the video for "artistic" purposes.


Reversing text is a known loophole to getting around copyright guardrails in image-generation models.


How does that work? Would you prompt the model to write "hello Kitty but in reverse" on the train so the resulting image isn't flagged?


Much more likely they just flipped the video in an editor after it was generated. It's common enough to see flipped video with backwards text on social media; most people wouldn't give it a second thought.


I'm beginning to write off most images as AI. I actually think that's where this is all headed.


There are projects like https://contentcredentials.org/ . If we wanted to, with some effort we could distinguish between real and AI-generated content. If.
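To make the provenance idea concrete, here's a toy sketch of the underlying tamper-evidence mechanism. Note the big simplification: real content-credential schemes like C2PA use public-key signatures and certificate chains embedded in the media file, not a shared-secret HMAC; the key name and functions here are purely illustrative.

```python
import hashlib
import hmac

# Hypothetical stand-in for a camera's signing key; a real scheme
# would use an asymmetric key pair, with only the public half shared.
SECRET = b"camera-signing-key-stand-in"

def sign_media(data: bytes) -> str:
    """Compute a signature over the media bytes at capture time."""
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()

def verify_media(data: bytes, signature: str) -> bool:
    """Check that the media bytes still match the capture-time signature."""
    return hmac.compare_digest(sign_media(data), signature)

original = b"\x00raw-pixel-data\x01"
sig = sign_media(original)

print(verify_media(original, sig))         # untouched media verifies
print(verify_media(original + b"x", sig))  # any edit breaks verification
```

The point the comment makes still stands: the cryptography is the easy part; getting capture devices, editors, and platforms to carry the credential end to end is the hard part.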


No individual actor - human or corporate - stands to benefit enough because "trust in reality" is neither easily measured nor financialized.


Some do care, e.g. some camera manufacturers and some news agencies. Surprisingly, some social media platforms[1] want clear labels for AI-generated content.

[1]: e.g. tiktok https://newsroom.tiktok.com/en-us/partnering-with-our-indust...


That's the easiest position imo. It's AI unless proven otherwise. No one has the time to pay this much detailed attention to a random video when its purpose is just entertainment. What this might lead to, though, is people losing (or never learning) the skills needed to separate real content from AI-generated content.


And even if it isn't AI, it is quite possibly deceptively edited. Content provenance will be important in the future.


A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life. Compared to old-school analog movies or fairly raw photography, this looks as fake as the Coca-Cola Santa. There's a rather obvious lack of detail that real photography would have caught.


> A precondition is likely that one has mainly watched CGI-heavy movies for most of one's life.

Indeed, a great (if counterintuitive) example of this is The Wolf of Wall Street. I bet a lot of people would be surprised at just how much CGI is used in that just for set/location.



The OG film for that was Forrest Gump. It is often lauded as one of the first movies to use CGI heavily but in completely, and intentionally, unnoticeable ways...


True, but in that case you knew it had to be CGI because Kennedy didn't talk to Tom Hanks in any capacity.


Sure, it's like a weird dream where sometimes shadows don't come from the sun and the scenery has this absurd, acutely unreal polish.


A) It's also true that many people don't put much thought into anything at all. They'd never consider actively thinking about whether a video is fake or not. These are the targets of short-form content.


B) is / will be huge; most "mindless" content is consumed on phones with half attention, often with other distractions going on, in between doing other stuff, and on older / lower-fidelity devices, slower internet connections, etc. AI content needs high resolution / big screens and focused attention to "discover".

The truth is... most people will simply not care. Raised eyebrow, hm, cute, next. Critical watching is reserved for critics like the crowd on HN and the like, but they represent only a small percentage of the target audience and revenue stream.


You can see the perspective/angle of the objects changing slightly as the camera moves in a way that makes it pretty obvious they're CG, AI or otherwise. That's always been a problem with AI-generated imagery in video/animation; it changes too much frame to frame. If researchers figure out how to address that, yeah, we've got a problem. Until then, this looks worse than real footage to me.

Then there are the usual giveaways for CG - sharpness, noise, lighting, color temperature, saturation - none of them match. There's also no diffuse reflection of the intense pink color.


Yes. The lack of diffuse reflection from the pink train is the clearest giveaway, and AI videos in general have problems with getting shadows and radiosity right. There's also the existence of the real-world Hello Kitty Shinkansen and the APM Cat Bus in Japan that makes this image more plausible.


That last point is also important; if it's not surprising, people will just accept it without being too critical about it. And since these AI tools are trained with real / existing content, creating realistic-enough content will be the norm. I think the first big AI generators - dall-e and co - had their model trained on more fantastical / artistic sources, and used that primarily as their model, also because realistic generation (like humans) wasn't yet good enough, or too uncanny. But uncanny and art work well together.


Also consider that one of the reasons AI-generated video has CG-like artifacts is that it is trained on CG video. Better CG generation, and more real video for training, will reduce these over time.


Honestly, stuff like that could also be because of compression. We're all used to seeing low-quality videos online.


Most people have terrible eyes for distinguishing content.

I’ve worked in CG for many years and despite the online nerd fests that decry CG imagery in films, 99% of those people can’t tell what’s CG or not unless it’s incredibly obvious.

It’s the same for GenAI, though I think there are more tells. Still, most people cannot tell reality from fiction. If you just tell them it’s real, they’ll most likely believe it.


> I’ve worked in CG for many years and despite the online nerd fests that decry CG imagery in films, 99% of those people can’t tell what’s CG or not unless it’s incredibly obvious.

I've noticed people assume things are CG that turn out to be practical effects, or 90% practical with just a bit of CG to add detail.


Yep I’ve had that happen many times , where people assume my work is real and the practical is CG.

Worse, directors often lie about what’s practical and we’ll have replaced it with CG. So people online will cheer the “practicals” as being better visually, while not knowing what they’re even looking at.

I’ve even seen interviews where actors talk about how they look in a given shot, or something they did, without realizing they’re not even really in the shot anymore.

People just have terrible eyes once you can convince them something is a certain way.


But films without CG are clearly superior and it’s not even in contention.

Lawrence of Arabia or Cleopatra alone have incredible fully live-shot special effects which cannot be easily replicated with CG and have aged like fine wine, unlike the trash early CG of the 80s and 90s, which ruined otherwise great films like The Last Starfighter.


I’m sorry, but you make an absurd argument.

You’re taking the best films of an era and comparing them to an arbitrary list of movies you don’t like? Adding to that, you’re comparing it to films in the infancy of a technology?

This is peak confusion of causality and correlation. There are tons of great films in that time frame with CG. Unless you’re going to argue that Jurassic Park is bad.


Jurassic Park isn't just a good example of CG, it also a good example of making the right choices on practical vs CG (in the context of technology of the time) and using a reasonable budget. You can have great CG and crappy CG by cutting corners. Plenty of people that decry CG don't actually know how much there is, even in non-sci-fi movies like romcoms, just for post-editing. But when it is done well nobody notices, the complaints only come when it looks like crap. Great use of technology to achieve the artistic vision will stand the test of time.


It's also directed by one of the best directors in history.


The worst bit about working in CG, or film-making in general, is finding it harder to enjoy films because you are hypersensitized to bad work.


Yeah, totally. It’s not even just bad work, but I’m constantly breaking down shots as I’m watching them.

Especially because I’ve done both on set and virtual production, it’s hard to suspend disbelief in a lot of films.


> Still, most people cannot tell reality from fiction. If you just tell them it’s real, they’ll most likely believe it.

This goes for conversation too! My neighbour recently told me about a mutual neighbour who walks 200 miles per day working on his farm. When I explained that this is impossible he said "I'll have to disagree with you there"


Maybe not strictly impossible, just slightly better than an ultramarathon world record pace?

https://www.reddit.com/r/Ultramarathon/comments/xhbs4d/sorok...

https://en.wikipedia.org/wiki/Aleksandr_Sorokin

So, not very convenient for a non-world-champion runner to do (let alone while doing farm work) (let alone on more than one occasion).
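The arithmetic behind that skepticism is quick to check. This sketch compares the claim against the 24-hour world record cited in the links above (Aleksandr Sorokin's ~319.6 km run); the record figure is taken from the linked Wikipedia article.

```python
MILES_PER_KM = 0.621371

claim_miles = 200.0       # "200 miles per day" on the farm
record_km = 319.6         # Sorokin's 24-hour world record distance
record_miles = record_km * MILES_PER_KM

# Even walking every single hour of the day, the required pace:
pace_mph = claim_miles / 24.0

print(f"required pace: {pace_mph:.1f} mph, nonstop for 24 hours")
print(f"claim: {claim_miles:.0f} mi vs. world record: {record_miles:.1f} mi")
```

So the claimed daily mileage slightly exceeds the all-time 24-hour record (~198.6 miles) held by a professional ultrarunner, and would demand about 8.3 mph with zero rest, which is why "impossible" is a fair summary for a farmer who also has farm work to do.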


That's a cultural issue that seems to have developed in the past years (decades? idk), where people take their own opinion (or what they think is their own opinion) as unchallengeable gospel.

In my opinion anyway, I'm gonna have to disagree with any counterpoints in advance.


This is partially the result of being taught that every opinion is valid. What was taught as a nicety (don’t dismiss other people’s opinions was the intention) has evolved into all opinions are equal.

If all opinions are equal, and we’ve reinforced that you can find anything to strengthen an opinion, then facts don’t actually matter.

But I don’t think it’s actually all that recent. History is full of people saying that facts or logic don’t matter. The Americas were “discovered” by such a phenomenon.


What's weird is the projection you get when you challenge someone's opinion in any way. All of a sudden, you're the arrogant one who thinks they're always right, no matter how diplomatic (or undeniably correct) about the issue you are. Or is that just me?


>Most people have terrible eyes for distinguishing content

A related phenomenon is not being able to hear the difference between 128kbps and 320kbps. I find the notion astonishing, and yet lots of people cannot tell the difference.


> Most people have terrible eyes for distinguishing content.

But also in the case of the fluffy train there's nothing to compare it against. The reason CGI humans look the most fake is because we're trained from birth to read a human face. Someone that looks at trains on a regular basis will probably discern this as being fake quicker than most.


Looks dope though. But what impressed me recently was some crypto-scam video, featuring "a clip" from the Lex Fridman Podcast where Elon Musk "reveals" his new crypto or whatever (sadly, the one I saw is currently deleted). It didn't really look good: they were talking with weird pauses and intonations, and as awkward as these two normally are, here they were even more unnatural. There was so much audacity to it I laughed out loud.

But what I was thinking while enjoying the show was: people wouldn't do that, if it didn't work.

This is the point. There is no such thing as "completely fools commenters". I mean, it didn't fool you, apparently. (But don't be sad, I bet you were fooled by something else: you just don't know it, obviously.) But some of it always fools somebody.

I really liked how Thiel mentioned on some podcast that ChatGPT successfully passed the Turing test, which was implicitly assumed to be "the holy grail of AI", and nobody really noticed. This is completely true. We don't really think about ChatGPT as something that passes the Turing test; we think about how this fucking stupid, useless thing misled you with some mistake in the calculations you decided to delegate to it. But realistically, if it doesn't pass, it's only because it is specifically trained to try to avoid passing it.


I wish you were right that there is no way to completely fool viewers, but I know you are not. I was fooled! Note that I call out "AIGC." If that wasn't there (I only noticed it on repeat views), I would have simply had no way to tell. These are early, primitive AI generated videos, and I'm already unable to differentiate. Many in this thread talk about movie CG; there are countless movie scenes that fool all viewers.


If someone were to train a model on the whole run of Joe Rogan's podcast, I’m sure it would spit out extremely impressive fake results already


> people wouldn't do that, if it didn't work.

You can't assume that with scams. Quite often, scams are themselves sold as a get-rich-quick scheme, which like all GRQ schemes, they wouldn't be if they worked well.


Think about this: you very well may have already seen AI videos that fooled you - you wouldn't know if you did.


One of the clearest signs in the current generation is that the typography still looks bad.


People are smart enough to know that what you see in movies isn't real. It will just take a little time for people to realize that now applies to all videos and images.


The frequency is so high, and I am getting so burned out on checking comments to gauge how much everything is changing, that I've nearly given up subconsciously. Pretty close to just ignoring all images I see.


This is definitely something the Japanese would do, but it is not a real train unless a thousand salarymen are crammed into it.


The bigger problem is that people think something this ridiculous could happen.


Weirder things have been created. I could definitely see one being made for a movie.


> I'm quite nervous for the future.

Videos like these were already achievable through VFX.

The only difference here is a reduction in costs. That does mean that more people will produce misinformation, but the problem is one that we have had time to tackle, and which gave rise to Snopes and many others.


I mean the only real tell for me is how expensive this stunt would be. I personally think this is a really cool use of genAI. But the consequences will be far reaching.


Some of the comments were like, "come on guys, if this was real it would be way dirtier"


The face of the girl on the left at the start in the first second should have been a giveaway.


My intuition went for video compression artifact instead of AI modeling problem. There is even a moment directly before the cut that can be interpreted as the next key frame clearing up the face. To be honest, the whole video could have fooled me. There is definitely an aspect in discerning these videos that can be trained just by watching more of them with a critical eye, so try to be kind to those that did not concern themselves with generative AI as much as you have.


Yeah, it's unfortunate that video compression already introduces artifacts into real videos, so minor genAI artifacts don't stand out.

It also took me a while to find any truly unambiguous signs of AI generation. For example, the reflection on the inside of the windows is wonky, but in real life warped glass can also produce weird reflections. I finally found a dark rectangle inside the door window, which at first stays fixed like a sign on the glass. However it then begins to move like part of the reflection, which really broke the illusion for me.


No one is looking at her face though, they're looking at the giant hello kitty train. And you were only looking at her face because you were told it's an AI-generated video. I agree with superfrank that extreme skepticism of everything seen online is going to have to be the default, unfortunately.


Hard to not discount that as a compression artifact.


Just like all the obvious signs[1] the moon landings were faked.

[1]: https://web.archive.org/web/20120829004513/http://stuffucanu...


Just wanted to say I really enjoyed this!


One thing that's not intuitive to spot but is actually completely wrong: in the second clip we're apparently inside the train, but the train is still rolling under us.


Or, y'know, the camera's moving smoothly backwards through the train? It would be a bit of an odd choice (and high-effort to make it that smooth versus someone just carrying it), but not impossible by any means.


Also "HELLO KITTY" being backwards is odd - writting on trains doesn't normally come out like that eg https://www.groupe-sncf.com/medias-publics/styles/crop_1_1/p...


All the text is mirrored. It's not unusual to do this to avoid copyright filters. That only adds to the suspicion.


The whole video was probably mirrored before being posted. Doesn't seem to be related to being AI generated.



