Gemini Nano sounds like the most exciting part IMO.
IIRC, several people in the recent Pixel 8 thread were saying that offloading to web APIs for functions like Magic Eraser was only temporary and could be replaced by on-device models at some point. Looks like this is the beginning of that.
> "Using the power of Google Tensor G3, Video Boost on Pixel 8 Pro uploads your videos to the cloud where our computational photography models adjust color, lighting, stabilization and graininess."
I wonder why the power of Tensor G3 is needed to upload your video to the cloud...
It runs an on-device LLM to generate an HTTP POST every time. It took four interns half a week to reduce the hallucinations, but a PM got a promotion after that.
I think a lot of the motivation for running it in the cloud is so they can have a single point of control for enforcing editing policies (e.g. swapping faces).
Do you have evidence of that? Photoshop has blocked you from editing pictures of money for ages and that wasn't in the cloud. Moreover, how does a Google data center know whether you're allowed to swap a particular face versus your device? It's quite a reach to assume Google would go out of their way to prevent you from doing things on your device in their app when other AI-powered apps on your device already exist and don't have such policy restrictions.
I have no doubt Google could (and might) enforce a lot of these rules on the device, but they likely route it through the cloud if there's a new "exploit" that they want to block ASAP instead of waiting for the app to update.
This is an example of the reputational risk Google has to deal with that small startups don't. If some minor app lets you forge photos, it's not a headline. If an official Google app on billions of devices lets you do it, it's a hot topic.
It could also simply be that their inpainting model is quite bad at certain things, and replacing a person's head produces consistently bad results. Hiding the problem could be easier than fixing it.
The fact that it's multimodal is very interesting. They might not make it open source, but if they intend to run it on people's devices, even if they implement DRM, someone will figure out how to extract the weights and get it running outside.