And I guess you haven't actually been to Tokyo. The number of details that are subtly wrong is very high, and it isn't limited to text. Heck, spotting those flaws doesn't even require knowledge of Japan:
- Uncanny texture and shape for the manhole cover
- Weirdly protruding yellow line in the middle of the road, where it doesn't make sense
- Weird double side-curb on the right, which can't really be called steps.
- Very strange gait for the "protagonist", with the occasional leg swap.
- Not quite sensical geometry for the crosswalks, some of them leading nowhere (into the wet road, but not continuing further)
- Weird glowy inside behind the columns on the right.
- What was previously a crosswalk, becoming wet "streaks" on the road.
- No good reason for crosswalks being the thing visible in the reflection of the sunglasses.
- Absurd crosswalk orientation at the end. (90 degrees off)
- Massive difference in lighting between the beginning of the clip and the end, suggesting an impossible change in time of day.
Nothing suggests to me that these are easy artifacts to remove, given how the technology is described as "denoising" changes between frames.
This is probably disruptive to some forms of video production, but I suspect the high-end stuff will still rely on filming that is mostly grounded in reality. It could have a big impact on how VFX and post-production are done, maybe.
With everything we've seen in the last couple of years, do you sincerely believe that all of those points won't be solved pretty soon? There are many intermediary models that can be used to remove these kinds of artefacts. Human motion can be identified and run through a pose/control-net filter, for example. If these generations are effectively one-shot without subsequent domain-specific adjustments, then we should expect every single one of your identified flaws to be remedied pretty soon.