I wonder if one could extract a "surprisedness" value out of the AI, basically "the extent to which my current input is not modeled successfully by my internal models". Giving the model a metaphorical "WTF, human, come look at this" signal might be pretty powerful for those walking cardboard boxes and trees, on top of the cases where the model knows something is wrong. Or it might false-positive all the darned time. Hard to tell without trying.
English breaks down here, but the model probably does "know" something more like "if the tree is here in this frame, in the next frame it will be there, give or take some waving in the wind". It doesn't know that "trees don't walk", just as it doesn't know that "trees don't levitate", "trees don't spontaneously turn into clowns", or an effectively infinite number of other things that trees don't do. What it possibly can do is realize that in frame 1 there was a tree, and then in frame 2 there was something the model didn't predict as a high-probability output for the next frame.
It isn't about knowing that trees don't walk; it's about knowing that trees do behave in certain ways and noticing that it is "surprised" when they fail to behave in the predicted ways, where "surprise" is something like "this is a very low-probability output of my model of the next frame". It isn't necessary to enumerate all the ways the next frame was low-probability; it is enough to observe that it was, logically, not high-probability.
In a lot of cases this isn't necessarily that useful, but in a security context having a human take a look at a "very low probability series of video frames" will, if nothing else, teach the developers a lot about the real capability of the model. If it spits out a lot of false positives, that is itself very informative about what the model is "really" doing.
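To make that concrete, here's a rough sketch of what such a "surprise" flag could look like, assuming you already have some next-frame prediction model. The predict_next_frame method, the pixel-error metric, and the threshold are all hypothetical and would need tuning against real footage:

    import numpy as np

    # Fraction of badly mispredicted pixels that counts as "surprising"; made-up value.
    SURPRISE_THRESHOLD = 0.05

    def surprise_score(predicted, observed, pixel_tol=25):
        # Fraction of pixels where the observed frame differs a lot from the prediction.
        diff = np.abs(predicted.astype(int) - observed.astype(int))
        return float((diff > pixel_tol).mean())

    def flag_surprising_frames(frames, model):
        # frames: an iterator of grayscale numpy arrays.
        # Yields (frame, score) whenever the model's next-frame prediction was badly
        # wrong, i.e. the "WTF, human, come look at this" signal.
        prev = next(frames)
        for frame in frames:
            predicted = model.predict_next_frame(prev)  # hypothetical model API
            score = surprise_score(predicted, frame)
            if score > SURPRISE_THRESHOLD:
                yield frame, score
            prev = frame

Nothing in there enumerates what trees don't do; it only notices that the prediction missed.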
The parent comment spelt this out: because the training data likely included only a few instances of walking trees (depending on how much material from the Lord of the Rings movies was used).
There is no "knowing" in LLMs, and it doesn't matter for the proposed solution either. Detecting that a pattern is unusual relative to what has been seen before does not require understanding the pattern itself, if the only required action is reporting the event.
In simple terms: The AI doesn’t need to say, "something unusual is happening because I saw walking trees and trees usually cannot walk", but merely "something unusual is happening because what I saw was unusual, care to take a look?"
The challenge with these systems is that everything is unusual unless trained otherwise, so the false positive rate is exceptionally high. So the systems get tuned to ignore most untrained/unusual things.
I bet they’d have similar luck if they dressed up as bears. Or anything else non-human, like a triangle.
The nature of AI being a black box that fails in the face of "yeah, those are some guys hiding in boxes" scenarios is something I struggle with.
I'm working on some AI projects at work and there's no magic code I can see to know what it is going to do ... or even sometimes why it did it. Letting it loose in an organization like that seems unwise at best.
Sure, they could tell the AI to watch out for boxes, but now every time some poor guy moves some boxes, they're going to set something off.
You can get past a human sentry who is looking for humans, by hiding in a box, at a checkpoint in which boxes customarily pass through without being opened or X-rayed.
You don't need Marines to invent that workaround; you see it in Looney Tunes.
Don't security cameras have universal motion detection triggers you can use to make sure everything gets captured? Why pre-screen only for human silhouettes?
The number of false positives using only motion is tiring. You want smart detections otherwise you're stuck reviewing endless clips of spider webs and swaying tree branches.
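For reference, the "dumb" motion trigger is basically just frame differencing, roughly like the sketch below (OpenCV; the blur kernel and both thresholds are arbitrary guesses). Anything that changes enough pixels fires it, swaying branches and spider webs included, which is exactly where the false positives come from:

    import cv2

    # How many changed pixels count as "motion"; needs tuning per camera.
    MOTION_PIXELS = 5000

    def motion_triggered(prev_frame, frame):
        # Blur both frames to suppress noise, diff them, and count pixels that changed a lot.
        a = cv2.GaussianBlur(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)
        b = cv2.GaussianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), (21, 21), 0)
        diff = cv2.absdiff(a, b)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        return cv2.countNonZero(mask) > MOTION_PIXELS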
If your use case has such a high bar, why not pay some offshore workers to watch your cameras 24/7 and manually flag intruders? AGI for cameras is very far away: the number of false positives and the range of creative camouflage workarounds are more than current "smart" algorithms can catch.
>Because machines don't get bored or take smoke breaks.
Rotations? Like how the military holds perimeter security?
>And, really, how would you feel if that was YOUR job?
If I couldn't get a better job to pay my bills, then that would be amazing. Weird of you to assume that would somehow be the most dehumanizing job in existence.