Does this download models at runtime? I would have expected a different API for that. I understand that you don’t want to include a multi-gig model in your app. But the mobile flow is usually to block functionality with a progress bar on first run. Downloading inline doesn’t integrate well into that.
You’d want a single API that downloads the model or pulls it from a local cache, returns an identifier, and lets you plug that identifier into the inference API. Something like the hypothetical sketch below.
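To be concrete, here's a minimal sketch of that shape in Swift. Everything here is made up for illustration (the `resolveModel`, `ModelHandle`, and `runInference` names, the download URL, the cache layout); the point is just the separation: one async call that resolves a model (cache hit or download, with progress hooks for a first-run UI), and an inference call that only accepts the resulting handle.

```swift
import Foundation

// Hypothetical handle returned by the download/cache step.
// Inference takes this, never a raw name that might trigger a download.
struct ModelHandle {
    let id: String
    let localURL: URL
}

// Hypothetical API: resolve a model by name, hitting the network only on a
// cache miss. The progress callback is what lets the app drive a blocking
// progress bar on first run.
func resolveModel(
    named name: String,
    progress: @escaping (Double) -> Void
) async throws -> ModelHandle {
    let cacheDir = FileManager.default.urls(for: .cachesDirectory,
                                            in: .userDomainMask)[0]
    let localURL = cacheDir.appendingPathComponent("\(name).bin")

    // Cache hit: no network, report completion immediately.
    if FileManager.default.fileExists(atPath: localURL.path) {
        progress(1.0)
        return ModelHandle(id: name, localURL: localURL)
    }

    // Cache miss: download to the cache. Placeholder URL; real code would
    // stream via a URLSessionDownloadDelegate for granular progress and
    // verify a checksum before committing the file.
    progress(0.0)
    let remote = URL(string: "https://example.com/models/\(name).bin")!
    let (tmpURL, _) = try await URLSession.shared.download(from: remote)
    try FileManager.default.moveItem(at: tmpURL, to: localURL)
    progress(1.0)
    return ModelHandle(id: name, localURL: localURL)
}

// Inference is a separate step that can only be reached with a resolved handle.
func runInference(model: ModelHandle, prompt: String) async throws -> String {
    // ... load model.localURL and run ...
    return ""
}

// Usage: first run blocks on resolveModel behind a progress bar;
// subsequent runs return from cache instantly.
// let handle = try await resolveModel(named: "my-model") { p in /* update UI */ }
// let output = try await runInference(model: handle, prompt: "Hello")
```

That split also makes the cost explicit at the call site: anything that can take minutes and gigabytes of bandwidth is one clearly labeled call, and inference can never surprise you with a download.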