This isn't a problem that silicon can fix. FHE requires orders of magnitude more operations, and/or more complex operations, to achieve the same result. General-purpose silicon already runs these operations as fast as possible; there are just too many of them, and they are not even parallelizable.
I was gonna say, but yeah, if you can't make it more parallel, then custom silicon would definitely bottleneck on that. TBF I know nothing about this, but I have done a lot of parallelization and speed-up of algorithms on FPGAs that are relatively easy to parallelize, and I've seen tremendous speed-ups over general-purpose CPUs.
Wait, how could the hardware possibly be leaky? The encryption/decryption can leak data, sure, but that's unrelated to FHE. If your hardware can leak info about the plaintext on the side processing the encrypted data, then you could do the same in software anyway, and the FHE scheme itself is clearly broken...
It's clear v4dok doesn't understand how FHE works, maybe due to confusion with enclaves like SGX. The hardware isn't able to leak the plaintext because it doesn't have the key; it executes on the ciphertext.
The hardware is not relevant if it's simply being used as an accelerator. FHE reveals no information about the underlying data, so the privacy leakage is the same, no matter if you're running the FHE code on ENIAC or on a modern supercomputer.
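To make the "it executes on the ciphertext" point concrete, here's a rough sketch. Big caveat: this is textbook Paillier, which is only additively homomorphic, not FHE, and the primes are toy-sized, so treat it as an illustration and nothing more. The names `server_add` and `server_scale` are made up for this example. What it shows is that the side doing the computation only ever receives the public modulus and ciphertexts; the private key never appears in its function arguments, so whatever hardware runs those functions has nothing secret to leak.

```python
import math
import secrets

# Toy parameters (assumption for illustration): two well-known primes,
# far too small to be secure.
P, Q = 65537, 2147483647

def keygen(p=P, q=Q):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)   # (public modulus, private key)

def encrypt(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # fresh randomness per ciphertext
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n.
    return (1 + m * n) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n

# "Server" side: sees only the public modulus and ciphertexts.
def server_add(n, c1, c2):
    return c1 * c2 % (n * n)   # encrypts (m1 + m2) mod n

def server_scale(n, c, k):
    return pow(c, k, n * n)    # encrypts (k * m) mod n

if __name__ == "__main__":
    n, priv = keygen()
    ca, cb = encrypt(n, 20), encrypt(n, 22)
    c_sum = server_add(n, ca, cb)      # computed without the private key
    print(decrypt(n, priv, c_sum))     # 42
```

Swapping `server_add` onto an FPGA, an ASIC, or ENIAC changes how fast it runs, not what it can learn, since its inputs and outputs are the same ciphertexts either way.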