The CPU/GPU is to an LLM kinda like axons and dendrites are to the human brain: just a low-level implementation detail. The main crux of an LLM is what happens at a higher level.
The machine-level instructions being executed are just matrix multiplications. Billions of them. The complexity of LLM behavior is emergent from that.
The machine-level instructions being executed are just matrix multiplications. Billions of them. The complexity of LLM behavior is emergent from that.