With all the great progress in large language models lately, and given that they're excellent text compressors, I've started to wonder if you couldn't just replace a search engine with a ~100 MB file of weights that lets you query essentially Google-scale results, except all locally.
Yeah, you picked the biggest SOTA model of them all, but there are smaller ones, like https://bellard.org/libnc/gpt2tc.html, that run well even on CPUs and might do fine when fine-tuned specifically on search results (or at least just code queries).
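For what it's worth, the querying side is trivial to sketch. Here's roughly what it looks like in Python with the Hugging Face transformers library and stock GPT-2 small, as a stand-in for gpt2tc; no search-specific fine-tuning, so treat it as an illustration of the interface rather than of the answer quality:

```python
# Rough sketch: use a small local LM as a "search engine" by prompting it
# with a query and sampling a completion. Note GPT-2 small is ~500 MB in
# fp32, so hitting a 100 MB target would take quantization or a smaller model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # CPU is fine

query = "how to reverse a linked list in C"
# The model "answers" by continuing the prompt: it produces plausible text
# from its weights, not retrieved documents.
out = generator(f"Q: {query}\nA:", max_new_tokens=60, do_sample=True)
print(out[0]["generated_text"])
```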
The only significant difference between those models is the amount of data they were trained on, and the main (or at least second most important) reason you use a search engine is how much data it has indexed.
If you want to search through an incredibly limited percentage of the web, then yeah, it can be a solution, but even the lamest search engine company out there would outperform a GPT-2-like model running on your laptop.