I dare say GPT-3 and GPT-4 are the only recent examples where pure compute produced a significant edge over algorithmic improvements, and that edge lasted a solid year before others caught up. The other recent improvements came from ideas rather than scale:
1. Gaussian splatting, a hand-crafted method, blew the entire field of NeRF models out of the water.
2. DeepSeek R1 learned to reason through reinforcement learning, without a reasoning dataset.
3. Inception Labs' 16x speedup comes from using a diffusion model instead of next-token prediction.
4. DeepSeek's distillation compresses a larger model into a smaller one.
And that sets aside the introduction of the Transformer and diffusion models themselves, which triggered the current wave in the first place.
AI is still a vastly immature field. We have not explored it systematically; we have mostly tried things at random. Good ideas are being dismissed in favor of whatever happened to work elsewhere. I suspect we are still missing a lot of fundamental understanding, even at the activation function level.
We need clever ideas more than compute. But the stock market seems to have mixed them up.