My tl;dr understanding: Drones send video back to the operator. The video is typically compressed so that it only updates the parts of the picture that have changed. Even if the video is encrypted, the researchers are able to measure the bitrate. Thus, when the researchers make a significant change, like putting a board against a window, and see whether the traffic increases, they can determine whether the drone is looking at that window.
Couldn't this be bypassed by regulating a drone's bitrate to a constant steady flow? The sacrifice will be either video quality or freshness of data, but it would be a straightforward way to get past that protection.
Or an even easier way: just have the drone fill up the stream with random data so that the data stream never goes "quiet" when the image is mostly unchanged.
No sacrificing video quality or freshness required.
> Why compress at all? Wouldn't just having a pixel perfect stream solve the same issue and give better quality video?
Because compression reduces the amount of data that needs to be transmitted, reducing network and battery usage, especially if the radio channel is congested.
Instead of filling the stream and having a constant stream size, you could just code the drone to add random bursts of data, making any prediction from the observer useless.
> Because compression reduces the amount of data that needs to be transmitted, reducing network and battery usage
> Instead of filling the stream and having a constant stream size, you could just code the drone to add random bursts of data
You cannot simultaneously reduce network bandwidth and also increase it. Of course padding with noise increases bandwidth, but it's silly to do that when you could instead be padding it with more bits relevant to the video stream.
The vulnerability here isn't inherent to video compression; it's that the compression rate (and therefore network packet rate) is variable (so as to minimize global size of a video stream). It would not be that difficult to modify an encoder to guarantee fixed bit rate over whatever time chunk you want (depending on how many frames is tolerable to buffer before sending out), then send out those chunks at regular intervals.
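To make the fixed-chunk idea concrete, here is a minimal sketch. It shapes the encoder's output rather than modifying the encoder itself, which is a simplification on my part. The `shape` function, the per-tick byte `budget`, and the frame sizes are all made-up values for illustration.

```python
def shape(frame_sizes, budget):
    """Send exactly `budget` bytes every tick, whatever the encoder produced.

    frame_sizes: encoded size of each incoming frame, one frame per tick.
    budget: constant number of bytes put on the air every tick.
    Returns (per-tick list of (video_bytes, padding_bytes), leftover backlog).
    """
    backlog = 0  # encoded bytes waiting to be sent
    sent = []
    for size in frame_sizes:
        backlog += size
        video = min(backlog, budget)  # send as much real video as fits
        backlog -= video
        sent.append((video, budget - video))  # top up the rest with padding
    return sent, backlog


# A quiet scene, then one big change (900 bytes), then quiet again.
sent, leftover = shape([100, 100, 900, 100], budget=400)
# sent == [(100, 300), (100, 300), (400, 0), (400, 0)], leftover == 200
```

An observer sees 400 bytes on every tick. The price is the leftover backlog: the burst is still draining after the last tick, which is the "freshness" cost mentioned upthread.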
All that would do is increase the number of intervals that would need to be observed. Adding noise to a channel only reduces the bandwidth of the channel.
I could be wrong, but I don't know if that is true here. First, if the receiver is aware of exactly which parts to ignore, and the transmission itself is not interfered with or altered, then it's not really noise as far as the channel is concerned. Second, the assumption here is that spurious packets would be sent when data is not being transmitted, so this principle probably wouldn't hold anyway because the "noise" is inversely correlated to the signal.
You are forgetting that the user of the drone can set the transmission protocol beforehand, and thus is still an important part of the process.
Given that you have a channel between A and B and an eavesdropper C:
Transmission between A and B can be done by any of several obfuscatory protocols, where (for example) a certain flag sent by A tells B to ignore some component of the transmission, to do no error correction on that part, etc. That flag is sent when updates are sparse, and filler (which is to be ignored) is added by A. Without a way to distinguish between filler and data, C sees no variations in traffic volume.
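One simple way to do the "flag" part is a length header, sketched below. This is an assumption about how such a protocol might look, not a description of any real drone link. `FRAME_SIZE`, `pad_frame`, and `unpad_frame` are hypothetical names, and the whole frame would have to be encrypted after padding so that C cannot read the header.

```python
import os
import struct

FRAME_SIZE = 1024             # hypothetical fixed frame size in bytes
HEADER = struct.Struct(">H")  # 2-byte big-endian length of the real data


def pad_frame(data: bytes, frame_size: int = FRAME_SIZE) -> bytes:
    """A's side: length header, then real data, then random filler."""
    room = frame_size - HEADER.size
    if len(data) > room:
        raise ValueError("data does not fit in one frame")
    filler = os.urandom(room - len(data))
    return HEADER.pack(len(data)) + data + filler


def unpad_frame(frame: bytes) -> bytes:
    """B's side: read the header and throw the filler away."""
    (length,) = HEADER.unpack_from(frame)
    return frame[HEADER.size:HEADER.size + length]


quiet = pad_frame(b"")           # nothing changed in the scene
busy = pad_frame(b"x" * 500)     # a lot changed
assert len(quiet) == len(busy) == FRAME_SIZE
assert unpad_frame(busy) == b"x" * 500
```

Both frames are the same size on the wire, so traffic volume alone no longer says whether the scene changed.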
Overcoming real-time spikes is easy with this protocol. Just keep a buffer of a few seconds before you begin transmitting.
This approach is a common answer to any sort of traffic analysis. There's actually a Tor alternative that operates on this principle of constant traffic between nodes (traffic analysis is an effective attack versus Tor), but the name escapes me at the moment.
It takes more CPU and battery to compress the video than to leave it raw. So sending a lossless or semi lossless version of the video should theoretically work.
> Instead of filling the stream and having a constant stream size, you could just code the drone to add random bursts of data, making any prediction from the observer useless.
If it's actually random, wouldn't the average bitrate of "still scene + occasional random data" still be lower than "active scene + occasional random data"? Given enough samples, of course.
The average doesn't really matter. The researchers are looking for "steps" in the data streamed caused by changes in the video. If you're adding random levels of noise, it will drown out the signal that the researchers are looking for.
The random noise becomes the background and unless it is a constant addition of noise (which wouldn't save any bandwidth, might as well use uncompressed stream) you could still time changes in the signal.
The 'solution' of adding random data to hide these "steps" is a common misunderstanding of the theory behind preventing bandwidth side-channel attacks. It'll probably work in practice as long as the amount of randomness relative to the delta is sufficiently large and the attacker can only take a limited number of samples.
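A quick simulation shows why the sample count matters. This is a toy model with numbers I made up: a "still" scene at 100 units, an "active" scene at 150, and uniform random padding of up to 1000 on top of each sample.

```python
import random
import statistics


def observe(base_rate, n, rng, noise_max=1000.0):
    """n bitrate samples: a base rate plus random padding in [0, noise_max]."""
    return [base_rate + rng.uniform(0.0, noise_max) for _ in range(n)]


def mean_gap(n, seed=0, still=100.0, active=150.0):
    """Difference between the average 'active' and 'still' observations."""
    rng = random.Random(seed)
    quiet = observe(still, n, rng)
    busy = observe(active, n, rng)
    return statistics.fmean(busy) - statistics.fmean(quiet)


# The true gap is 50, buried under padding 20x larger.
for n in (10, 1_000, 100_000):
    print(n, round(mean_gap(n), 1))
```

With a handful of samples the estimate is all over the place, but with enough of them it settles near the real gap of 50, because independent random padding averages out. Padding to a constant rate does not have this problem.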
Indeed. However, it is a very good practical example of why encryption alone doesn't do much. (If the attacker can measure your data stream, they can make statistical assumptions and pull a lot of useful data out without breaking the encryption).
Well, it assumes that the amount of delta you can create in the transmitted image is sufficiently large that you can discern it from other sources of random variation in the image.
So if the drone is moving around dramatically and there are huge deltas in the image, making your windows opaque / translucent repeatedly may not be sufficient signal. But if the drone is observing a dark compound and you flash lights everywhere repeatedly, it might not matter that the drone is moving around a lot - there might still be a dramatic change in the data transmitted by the drone.
Ultimately you're putting some signal into a potentially noisy channel and trying to reconstruct whether or not your observations contain that signal. The example is simple enough that normal human eyeballs looking at the output can detect it. Using more sophisticated techniques might be able to tease out smaller signals from noisier channels.
Generally speaking, inter-frame video compression uses an amount of data directly related to the amount that things change from one frame to the next. So while noise like leaves moving shows up as small changes moment to moment, larger changes are still realistically visible.
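Here is a toy stand-in for that behaviour, using `zlib` on the byte-wise difference between two frames. Real codecs (motion compensation, transforms, quantization) are far more involved, so treat `delta_size` and the fake 10,000-byte "frames" as illustrative assumptions only.

```python
import random
import zlib


def delta_size(prev: bytes, cur: bytes) -> int:
    """Compressed size of the byte-wise difference between two frames."""
    delta = bytes((c - p) % 256 for p, c in zip(prev, cur))
    return len(zlib.compress(delta))


rng = random.Random(1)
frame = bytes(rng.randrange(256) for _ in range(10_000))

# Same scene, but a 2,000-byte region is replaced (the board on the window).
changed = bytearray(frame)
changed[2000:4000] = bytes(rng.randrange(256) for _ in range(2000))
changed = bytes(changed)

static_cost = delta_size(frame, frame)    # nothing changed: tiny
change_cost = delta_size(frame, changed)  # roughly the size of the changed region
print(static_cost, change_cost)
```

The unchanged frame costs a few dozen bytes, while the changed one costs about as much as the region that changed. That size difference is what survives encryption.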
My younger years spent screwing around with video for a hobby are coming back to me now.
If you do a regular sequence of major changes, you could pick that signal out. This would create a larger and more regular signal in the data than the random noise from leaves. I don't think you could accurately predict observation with just one set of environmental changes.
Yes. The compression works by analyzing changes in the screen. A simple example of this would be if you are displaying data. Say you are taking real time data and it is scrolling across the screen. Your display moves the data to the left, drops off what moved out of the frame, and adds what is new. Pictures are more complex, but the idea is the same. Only take what is new.
Not sure if I would call it clever. Crypto will always leak the amount of data if you don't pad or ensure a constant data rate. It's also far from the first time compression has led to a side channel. For instance https://en.m.wikipedia.org/wiki/CRIME
-edit-
Even during World War II, if you were listening in on enemy radio communications that had not been decrypted, you could tell the enemy was planning something by noticing a large increase in radio traffic. This is a long-known issue: if you don't pad your data and send it as a constant stream, you may leak some information.
What do you mean it's not clever? Sure, they applied a known weakness of non-constant data rate encoded messages to drone monitoring, but that's never been done before. What have you done that's so impressive that gives you the confidence to scoff at the accomplishment of the talented engineers and developers who created something brand new?
Perhaps how I said that was a bit ambiguous/unclear. However, the article is calling it clever. The research demonstrates a weakness in the communication layer. Just because it applies to drones (insert buzzword) does not make it a novel idea. I just find it funny that Wired is writing about it since it deals with drones, but I have read multiple other papers on the misapplication of compression and cryptography. Finding flaws is important; I just feel like the article was kinda overselling the concept.
What grabs the attention isn't so much the procedure or technology itself, but the narrative that we're in a future where this kind of thing might be genuinely useful to somebody someday.
Isn't this an expected outcome of variable-rate compression? Switch to a constant-rate compression or pad with random data to reach that constant rate.
What's really interesting to me here is the watermarking part of this. I imagine a future where everybody has little IR pins on their shoulder and phones scanning encrypted traffic for their watermark...
>In other words, they can see what the drone sees, pulling out their recognizable pattern from the radio signal, even without breaking the drone's encrypted video.
The most interesting part of this is how encrypted traffic leaks enough info to be reliably analyzed. Really cool stuff.
Seems like you could also hijack their signal to use for your own communications. By affecting the frequency and intensity of the variations you can create a binary or analog signal that rides on top.
From my limited knowledge of encryption, this is why you don't compress before you encrypt. So if I understand the following article correctly, this kind of attack is well known. https://blog.appcanary.com/2016/encrypt-or-compress.html
Compress-then-encrypt can be quite devastating if the attacker can insert data into the encrypted payload.
For instance, let's say I can insert any word into an encrypted message, and I insert the word "THE". I can compare the sizes of the two messages, one with the inserted "THE" and one without. If the size increases by less than three bytes instead of the full three, I can be fairly sure that "THE", or at least some of its characters, is already in the message. I can keep doing this until I have extracted useful information from the message.
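A minimal sketch of that size oracle with `zlib`. The secret, the guesses, and the `observed_size` helper are all made up for illustration, and I use a longer phrase than "THE" because a three-byte match is too small to show up reliably. Encryption is left out since it would hide the content but not the length.

```python
import zlib


def observed_size(secret_message: bytes, injected: bytes) -> int:
    """What an eavesdropper can measure: the length after compression."""
    return len(zlib.compress(injected + secret_message))


secret = b"meet at the north gate at midnight"

# Two guesses of the same length; only one overlaps with the secret.
hit = observed_size(secret, b"the north gate")
miss = observed_size(secret, b"qzx jkvw bpfy ")

print(hit, miss)
assert hit < miss  # the matching guess compresses away
```

The attacker never sees the plaintext, only that one guess made the message smaller than the other.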
This is just the same thing. Video that is static compresses well, but video with lots of change does not compress as well.
So the camera basically lets you insert extra information into the encrypted data stream. Based on the changing size, it leaks info. This one is quite simple. It just tells you where the camera is looking, by making the scene more dynamic. However, the issue of compression and encryption applies to much more than just drones.
The key insight is that the drone should store the video to local storage for later (1) recovery or (2) buffered encrypted transmission to the mother ship which would eliminate the leakage of clues about what subject is being spied upon.
The problem with that is the video feed is being actively used. If you want to know where someone is going, you can't just make your drone fly around randomly for hours recording video of the area you think they're in, you have to visually acquire your target, and then follow them.
Buffered transmission is unlikely to help; after all, the surveillance target could store the received stream and look for patterns - it would only delay detection for the length of the buffer (which often would need to be kept short due to the real-time nature of much surveillance).
Granted, you could obfuscate things by varying the buffer length and padding the data stream (as suggested near the end of the article) - but at some point, a high-value surveillance target will probably just assume that a nearby drone is probably not operated by an avid bird-watcher or something like that and just blast the drone out of the sky. (Preferably using a laser or better still lots and lots of microwave RF - tends to get the surroundings less aggravated than using a shotgun...)
I don't think that helps with metadata analysis? Assuming I'm following correctly, we're already encrypting, so it shouldn't matter what order we send... Unless you mean to shift around bandwidth spikes, which could help some
This defense is plausible for fixed-frequency broadcast video feeds. As drone usage grows and becomes ever more sophisticated (to say nothing of current 'nation state grade' systems), video feeds will likely be spread spectrum and highly directional (if nothing else, directional transmission makes the drone's video feed far more power efficient), making intercept and this type of analysis difficult.
Constant bit rate compression isn't a cure-all. It's generally inferior to VBR which is why the latter took off in the first place. CBR is easily degraded by just increasing the amount of visual complexity. Leaves moving on trees in a breeze is a typical problem case.
So if you're worried about being spied on, high-resolution mosaic patterns are your friend. Mirror or high-specularity pigments will enhance the effect. For fixed installations, a few disco balls and a laser will mess up someone's day.
Of course, if you want real security you'll triangulate on the drone signals and jam, because if you really need to do so getting busted by the FCC is probably the least of your worries.
The minimum bar for usable quality can be defined precisely, and an encoding algorithm designed around it.
One obvious extreme example of this would be no compression at all, but it's easy to imagine much better designs.
For example, imagine I define a spec where I require quality no worse than a lossless 500x500x24bit/frame video stream. If we budget a bit rate higher than is required for the naive raw transmission of those frames, we can first encode the naive frames, then encode the deltas between them (upscaled to full resolution) such that the reconstruction error is minimized while not exceeding any maximum tolerated error per pixel.
With this approach, you are not crippled when facing an adversarial pattern generator of some kind, yet when transmitting regular natural scenes, you can benefit from significant quality increase.
As you suggest though, it's true that generating the most complex patterns possible will reduce any drone's video quality as much as the codec's design permits -- however, accomplishing this seems much more expensive for relatively little benefit versus other counter-drone approaches.
> All of that may seem like an elaborate setup to catch a spy drone in the act, when it could far more easily be spotted with a decent pair of binoculars. But Nassi argues that the technique works at ranges where it's difficult to spot a drone in the sky at all, not to mention determine precisely where its camera is pointed.
This technique could also be used in legal proceedings to establish that someone or some org was spying illegally.
I expect image detection in a video stream is pretty advanced these days, but I'd imagine you'd need to rotate the image being recorded in 3D space to match the plane of the window so you can compare them both? If you know the position of the drone relative to the window, that's doable. Maybe the algorithms are clever enough to account for that these days.
Their technique doesn't actually do any image analysis at all, since the video stream from the drone is encrypted and they can't view it.
Rather, they monitor the bitrate of the video stream and exploit the fact that static/unchanging scenes compress well and have a low bitrate while scenes which are changing have a higher bitrate. They then deliberately change the scene (by blacking out a window) and see whether the bitrate of the video stream increases.
If there's a strong correlation between blacking out the window and changes in bitrate, they can conclude the drone is observing their window.
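As a rough sketch of that last step, here is a plain Pearson correlation between an on/off stimulus and simulated bitrate samples. The numbers (a base rate of 500, a jump of 50 when the window is covered, noise of plus or minus 10) are invented, and I don't know what statistic the researchers actually used.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(42)

# Stimulus: window covered for 10 samples, uncovered for 10, repeated.
stimulus = [(t // 10) % 2 for t in range(400)]

# Drone pointed at the window: bitrate follows the stimulus, plus noise.
watching = [500 + 50 * s + rng.uniform(-10, 10) for s in stimulus]
# Drone pointed elsewhere: bitrate is just noise around the base rate.
elsewhere = [500 + rng.uniform(-10, 10) for _ in stimulus]

print(round(pearson(stimulus, watching), 2))
print(round(pearson(stimulus, elsewhere), 2))
```

The first value comes out close to 1 and the second close to 0, which is the "strong correlation" test in its simplest form.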