Scaling laws justified the capital and GPU R&D investment that delivered 10,000x faster training.
That took the world from autocomplete to Claude and GPT.
Another 10,000x would do it again, but who has that kind of money or R&D breakthrough?
The way scaling laws work, 5,000x and 10,000x give a pretty similar result. So why is it surprising that competitors land in the same range? It seems hard enough to beat your competitor by 2x, let alone 10,000x.
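To make that concrete, here's a toy sketch assuming a Chinchilla-style power law where loss falls off as compute^(-alpha). The constants (L_inf, A, alpha) are made-up illustrative values, not measured coefficients from any real training run.

```python
# Toy scaling-law sketch: loss = irreducible term + A * C^(-alpha).
# L_inf, A, and alpha are assumed values for illustration only.

def loss(compute_multiple: float, L_inf: float = 1.7, A: float = 1.0, alpha: float = 0.05) -> float:
    """Loss as a power law in the compute multiple over a 1x baseline run."""
    return L_inf + A * compute_multiple ** (-alpha)

baseline = loss(1)       # today's-scale run
x5k = loss(5_000)        # competitor spends 5,000x
x10k = loss(10_000)      # you spend 10,000x

print(f"1x:      {baseline:.3f}")
print(f"5,000x:  {x5k:.3f}")
print(f"10,000x: {x10k:.3f}")
# With these assumed constants, the 5,000x and 10,000x losses differ by only a
# small fraction of the total improvement over baseline -- doubling your
# rival's compute barely moves the needle on the curve.
```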