It wasn't long ago that a uni senior who worked for a decade+ on Google Search told me that it was hopeless for anyone to try to compete with Google, not because it sees a tonne of signals that help with IR, but because of its in-house AI/ML.
It turns out that the org that built the ultimate AI/ML that runs rings around anything that came before it for NLP (and thus IR) was a sister team at Google Translate.
It isn't inconceivable that a kid might be able to build a Google-quality web search, scalability aside, on Common Crawl data in a weekend. As someone who built re-ranking algorithms for a search engine built atop Yahoo! and Wikipedia (REST/SOAP) APIs back in the late 2000s as a side project (and experienced the launch and subsequent iterations of Echo/Alexa up close at Amazon), I find the current capabilities (of even the open-weight multi-modal models) almost too good to be true.
Google itself, though, is saved by the enormous distribution advantages afforded by Chrome (3B to 5B users) and Android (3B+), on top of its search deals with Apple and other browser vendors.