CaptainOfCoit, 23 days ago, on: A bug that taught me more about PyTorch than years...

That would make sense, I suppose, if I were using two different GPUs for the same thing and getting two different outcomes. But instead I have two implementations (one naive, one using tensor cores) running on the same GPU and producing different outcomes where they should be the same. But then this joke might be flying over my head as well.

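Tensor cores typically compute at lower precision than a plain FP32 path (for example, with TF32 or FP16 inputs), so small numerical differences between the two implementations should be expected, even on the same GPU.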
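
To make that concrete, here is a rough pure-Python sketch of the effect. It is not how tensor cores work internally: it only assumes a TF32-like format that keeps 10 of float32's 23 explicit mantissa bits, and it truncates rather than rounds for simplicity. The helper name `round_to_tf32_like` is made up for this example.

```python
import random
import struct


def round_to_tf32_like(x: float) -> float:
    """Emulate a TF32-like input: float32 with only 10 explicit mantissa bits.

    Assumption for illustration: the low 13 mantissa bits are simply
    truncated; real hardware rounding may differ.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of the 23 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def dot(a, b):
    """Plain dot product, accumulated in Python floats (double precision)."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
a = [random.uniform(-1.0, 1.0) for _ in range(1024)]
b = [random.uniform(-1.0, 1.0) for _ in range(1024)]

full = dot(a, b)
reduced = dot([round_to_tf32_like(x) for x in a],
              [round_to_tf32_like(y) for y in b])
abs_diff = abs(full - reduced)

print(f"full precision   : {full!r}")
print(f"reduced precision: {reduced!r}")
print(f"absolute diff    : {abs_diff!r}")
```

Same inputs, same machine, same math on paper, but the two results disagree after a few digits because the inputs were rounded before the multiply. A naive FP32 kernel and a tensor-core kernel can end up in the same situation, which is why such comparisons are usually done with a tolerance rather than exact equality.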