It's a win for Google that LLMs are getting cheaper to run. OpenAI's service is too expensive to be ad-funded, and Google needs the technology to become cheap enough to serve that it can sustain their ad-supported business model.
Google could make a bet like they did with YouTube.
At the time, operating YouTube was eye-wateringly expensive and lost billions. But Google could see where things were going: twin trends of falling storage costs and falling bandwidth/transmission costs (I'm trying to dig up a link I read years ago about this, but Google search has gotten so shit that I can't find it).
It was similar with Bitcoin ASIC miners: given enough demand, specialised, lower-cost hardware built specifically for LLMs will emerge too.
On the flip side, I found only one person (I'm sure there are more) attacking the software-efficiency side of things. You would be quite surprised how inefficient the current LLM software stack is, as I learned on a C++ podcast [0]. Ashot Vardanian has a great GitHub repo [1] that demonstrates many ways compute can come way down in complexity, and thus cost.
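To make the general point concrete (this is my own toy example, not taken from that repo): the same answer can often be computed with radically fewer operations, which is exactly where the cost savings hide.

```python
# Toy illustration: identical result, vastly different amounts of work.

def sum_of_squares_naive(n: int) -> int:
    # O(n): one multiply and one add per element.
    total = 0
    for i in range(1, n + 1):
        total += i * i
    return total

def sum_of_squares_closed_form(n: int) -> int:
    # O(1): the classic formula n(n+1)(2n+1)/6 — a handful of
    # operations no matter how large n is.
    return n * (n + 1) * (2 * n + 1) // 6

# Same answer either way; only the cost differs.
assert sum_of_squares_naive(10_000) == sum_of_squares_closed_form(10_000)
```

Real inference stacks obviously aren't reducible to closed-form formulas, but the same principle (better algorithms and data layouts doing less work for the same output) is what that repo is demonstrating.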