
Not really. The thing Vespa solved was taking existing ANN methods and fixing a disk/RAM tradeoff (and some other niceties). That's nowhere close to adequate when:

1. As softwaredoug mentioned, you might want to filter results, potentially with a high filtration rate.

2. ANN isn't good enough. Suppose you need bounded accuracy with meaningfully sublinear time on a high-dimensional dataset. You're hosed.

Point (1) is just a repeat of a general observation that composition of nice data structures doesn't usually give you a nice data structure, even if it technically works. Creating a thing that does what you want without costing both arms and the president's leg requires actually understanding DS&A and applying it in your solution from the ground up.
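To make the composition problem concrete, here's a minimal sketch of naive post-filtering over an ANN index. Everything in it is illustrative: `index.search` is a stand-in for whatever ANN library you're composing with, `predicate` is your filter, and the over-fetch loop is the workaround you end up writing when the filtration rate is high.

    import numpy as np

    def filtered_knn_post(index, query, predicate, k, max_fetch=100_000):
        """Post-filter an ANN index: over-fetch until k survivors remain.

        With a 99% filtration rate you expect to pull ~100*k candidates,
        and the index's recall guarantee applies to the *unfiltered*
        neighborhood, so the survivors can still miss the true filtered
        top-k. This is composition "technically working" while the cost
        and accuracy both quietly fall apart.
        """
        keep = []
        fetch = k
        while fetch <= max_fetch:
            ids, dists = index.search(query, fetch)   # hypothetical ANN call
            keep = [(d, i) for d, i in zip(dists, ids) if predicate(i)]
            if len(keep) >= k:
                return sorted(keep)[:k]
            fetch *= 10                               # over-fetch and retry
        return sorted(keep)[:k]                       # best effort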

Point (2) might seem irrelevant (after all, people are "building" stuff with RAG and whatnot nowadays, aren't they?), but it's crucial to a lot of applications. Imagine, e.g., that there exists exactly one correct result in your database. The guarantees provided by SOTA ANN solutions (on high-dimensional data) have a steep compute/correctness tradeoff, giving you an abysmal chance of finding your document without searching an eye-watering fraction of your database. I usually work around that by relaxing the requirement: the result still has to be the best one, but its information can be fuzzy (which admits solutions that merge a bunch of low-dimensional queries and correct them via some symmetry in a later step). If you actually need good KNN results, though, you're kind of hosed.
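For what it's worth, one way to read the low-dimensional-merge workaround is something like the sketch below. The subspace split, the candidate union, and the exact re-rank as the "correction" step are my guesses at the shape of the scheme, and `subspace_indexes` / `index.search` are hypothetical, not any particular library's API.

    import numpy as np

    def knn_via_subspace_merge(query, subspace_indexes, data, k, per_subspace_k=200):
        """Query several low-dimensional ANN indexes, union the candidates,
        then re-rank the union exactly in the full space.

        Each element of `subspace_indexes` is (dims, index), where `dims`
        selects a low-dimensional slice of the vectors and `index` answers
        ANN queries in that slice. The exact re-rank only recovers the true
        top-k if the real neighbors survive in at least one candidate list.
        """
        candidates = set()
        for dims, index in subspace_indexes:
            ids, _ = index.search(query[dims], per_subspace_k)  # hypothetical call
            candidates.update(ids)
        cand = np.fromiter(candidates, dtype=np.int64)
        dists = np.linalg.norm(data[cand] - query, axis=1)      # exact, full-dim
        order = np.argsort(dists)[:k]
        return cand[order], dists[order]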


