This doesn't appear to include any training facility, so for the casual user its misuse would seem to be limited to the pre-trained voices supplied (and ease of use seems to be the central issue in these comments).
In my experience, voice cloning typically doesn't require any training. You just embed a short audio clip of the voice you want to clone using the backing VAE, and the model does the rest.
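To make the "no training" point concrete, here is a rough toy sketch of the flow. Everything in it is hypothetical: `embed_reference` and `FrozenTTS` are made-up names, not this tool's API, and the crude spectral pooling only stands in for a real pretrained encoder such as a VAE. The point is just that the reference clip becomes a fixed-length vector that is passed in as conditioning, while the model's weights stay untouched.

```python
import math
import random


def embed_reference(samples, frame_size=160, dim=8):
    """Toy stand-in for a pretrained encoder (hypothetical, not a real VAE).

    Splits the clip into frames, takes a few crude spectral magnitudes
    per frame, and mean-pools them into one fixed-length vector.
    """
    frames = [samples[i:i + frame_size]
              for i in range(0, len(samples) - frame_size + 1, frame_size)]
    if not frames:
        raise ValueError("reference clip is shorter than one frame")
    embedding = [0.0] * dim
    for frame in frames:
        for k in range(dim):
            freq = 2 * math.pi * (k + 1) / frame_size
            re = sum(s * math.cos(freq * n) for n, s in enumerate(frame))
            im = sum(s * math.sin(freq * n) for n, s in enumerate(frame))
            embedding[k] += math.hypot(re, im) / len(frames)
    return embedding


class FrozenTTS:
    """Hypothetical pretrained model whose weights are fixed at load time."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.weights = tuple(rng.uniform(-1, 1) for _ in range(8))

    def synthesize(self, text, voice_embedding):
        # The voice enters only as a conditioning input; no weight changes.
        # The return value is a placeholder feature vector, not real audio.
        return [w * v * len(text) for w, v in zip(self.weights, voice_embedding)]


# A short synthetic "reference clip": 10 frames of a pure tone.
clip = [math.sin(2 * math.pi * 3 * n / 160) for n in range(1600)]
model = FrozenTTS()
weights_before = model.weights
voice = embed_reference(clip)
features = model.synthesize("hello", voice)
```

Swapping in a different clip changes `voice`, and therefore the output, without any training step, which is why the lack of a training facility may not limit misuse as much as it first appears.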