IIUC, it's because it's easier to rigorously prove once that the VM prevents whole classes of bugs (e.g. memory safety issues) and then reuse that VM in many places than it is to rigorously prove that many separate embedded systems, none of them relying on the VM, have each independently avoided those bugs.
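To make "class of bugs" a bit more concrete, here's a rough sketch of my own (not something from that engineer, and the class and method names are made up). In Java, an out-of-bounds array write is caught by the VM on every access, so no individual program has to prove it never does one:

```java
// Hypothetical demo class; the names are invented for illustration only.
public class BoundsCheckDemo {
    // Attempts buf[index] = 42 and reports whether the VM allowed the write.
    static boolean tryWrite(int[] buf, int index) {
        try {
            buf[index] = 42;
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            // The VM rejects the access instead of letting it touch adjacent memory.
            return false;
        }
    }

    public static void main(String[] args) {
        int[] buf = new int[4];
        System.out.println(tryWrite(buf, 2)); // in-bounds write
        System.out.println(tryWrite(buf, 4)); // one past the end of the array
    }
}
```

The same write in C without a manual check would be undefined behavior, so each codebase would need its own argument that it never happens. With the VM, that argument only has to be made about the VM's bounds check.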
I don't know specific details about the correctness of any particular VM, sorry. That was just the explanation I got from another engineer who apparently had experience developing embedded Java things.