
What about reproducibility of the compiler binaries? I got the impression that model training itself isn't deterministic across training hardware. Since the model is embedded in the binaries, the binaries aren't really reproducible if the model isn't. How big is the dataset used to train the model? How costly is it to train?


From the post:

> The TensorFlow model is embedded with XLA AOT, which converts the model into executable code.

Taking the TF model to executable code should be deterministic.

Generating the TF model might not be (depends on implementation and hardware).

But it's unclear from this whether it's a problem in practice: it's not uncommon for non-deterministic models to end up producing the same output, because you apply thresholding/quantizing or some analogous process to convert the raw scores into a classification-style output.

I.e., here you are generating operations, which you can set up as a classification problem: given this input and this history, what is the next operation?

And of course you can always go back to your saved model and generate the code from that.

While the exact scores for the next operation might be non-deterministic, the highest-scoring one often ends up being the same.
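A minimal sketch of that argument (not from the post; the score values are hypothetical): tiny numeric drift between training runs perturbs the scores, but the argmax, i.e. the operation the compiler actually emits, can be unchanged.

```python
import numpy as np

# Hypothetical scores for three candidate "next operations" from run A,
# and the same scores with small hardware-dependent numeric drift (run B).
scores_run_a = np.array([0.10, 0.72, 0.18])
scores_run_b = scores_run_a + np.array([1e-6, -2e-6, 3e-6])

# The raw scores differ, but the classification-style output does not.
assert not np.array_equal(scores_run_a, scores_run_b)
assert np.argmax(scores_run_a) == np.argmax(scores_run_b)
```

This only holds while the drift stays smaller than the margin between the top two scores, which is why it is "not uncommon" rather than guaranteed.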


It’s pretty easy to train models deterministically. Just use the same batch size, float type, initial seed, and flip this flag: https://www.tensorflow.org/api_docs/python/tf/config/experim...


Caveat:

> on the same hardware
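For reference, the recipe above might look roughly like this in TensorFlow 2.x; the exact API name behind the truncated link is an assumption on my part, and the model/data are placeholders, not from the post:

```python
import tensorflow as tf

# Fix all relevant seeds (Python, NumPy, and TensorFlow) in one call.
tf.keras.utils.set_random_seed(42)

# Make TF ops deterministic (assumed to be the flag linked above);
# available in TF 2.8+ as tf.config.experimental.enable_op_determinism.
tf.config.experimental.enable_op_determinism()

# ...build and fit the model with a fixed batch size and float type.
# Per the caveat: repeated runs only reproduce bit-identical weights
# on the same hardware.
```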


Based on https://github.com/google/ml-compiler-opt#pretrained-models, it seems like the models are versioned, so I guess they'd update their models when updating LLVM or something like that.



