Quantum states are, in principle, all one really needs, but simulating them directly is far too computationally expensive for the purposes of AI applications - so instead we have to work at higher levels of construction. Attention sits just about at the cusp of what is computationally reasonable: its cost grows quadratically with sequence length, which means it is not all we need. We need more efficient and richer constructions.
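To make that quadratic cusp concrete, here is a minimal sketch of scaled dot-product attention in plain NumPy (the sizes and names are illustrative choices of mine, not anything from the text above). The n-by-n score matrix is where the quadratic blow-up lives: doubling the sequence length quadruples its size.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head scaled dot-product attention.

    Q, K, V: (n, d) arrays, where n is the sequence length.
    The score matrix is (n, n), so compute scales roughly as
    O(n^2 * d) and memory as O(n^2); that quadratic term is why
    attention sits near the edge of what is computationally reasonable.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                      # (n, n): the quadratic part
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # (n, d)

# Illustrative (assumed) sizes: a 4096-token sequence with 64-dim heads.
n, d = 4096, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)                                  # (4096, 64)
print(f"score matrix holds {n * n:,} entries")    # 16,777,216
```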