It is a combination of things, of which cooling is one. A more fundamental and challenging physical limit is signal loss, which increases with frequency (this is one reason chip voltage needs to be increased). It gets to the point where too much signal loss occurs across "long" lines between parts of the chip. This is why using plasmonics in ICs is an active field of research; the hope is to be able to use optics for long-distance interconnects inside the processor.
The fact that we're at a point where millimeter distances from one side of a CPU die to the other are starting to be considered "long distance communication" is bemusing and awesome at the same time.
To get a transistor to switch faster, you need to increase the voltage. As a simple model, the power usage of an electronic device is P = V^2 / R, so if you raise the voltage to reach a higher clock frequency, your power usage increases with the square of the applied voltage. (For CMOS logic the usual approximation for dynamic power is P ≈ C * V^2 * f, so the higher frequency itself adds to the power on top of the V^2 term.) This means you'll need extreme amounts of cooling (e.g. LN2) to keep your CPU from dying at such high core voltages (around 2 V). Processors usually run at about 1.2 V at factory specs.
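To put a rough number on the V^2 scaling above, here is a minimal sketch that compares the two voltages mentioned (1.2 V stock, 2 V overclocked). It assumes the simplified P = V^2 / R model with R held fixed, and it ignores any extra power from the higher clock frequency itself, so treat the result as a lower bound rather than a measurement. The function name power_ratio is made up for this illustration.

```python
def power_ratio(v_new: float, v_ref: float) -> float:
    """Relative power draw when going from v_ref to v_new.

    Assumes P = V^2 / R with the same R at both voltages, so R cancels
    and only the squared voltage ratio remains.
    """
    if v_ref <= 0 or v_new < 0:
        raise ValueError("voltages must be positive")
    return (v_new / v_ref) ** 2


# Stock core voltage vs. the extreme-overclock voltage from the comment above.
ratio = power_ratio(2.0, 1.2)
print(f"Power at 2.0 V is about {ratio:.2f}x the power at 1.2 V")
```

So under this simple model, going from 1.2 V to 2 V already costs nearly three times the power, and therefore nearly three times the heat to remove, before the higher clock is even taken into account.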