Hacker News

Just shows how inefficient some ML research code can be



As a former grad student, I can tell you that's true of all research code, not just ML, and even of "performance-oriented" research code.


Training tends to require much more precision, and hence memory, than inference. I bet many of the tricks here won't work well for training.
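To put rough numbers on that claim: here is a back-of-envelope sketch (the 7B parameter count, the fp16/fp32 choices, and the Adam optimizer are illustrative assumptions, not from the article; real training also needs activation memory, so treat these as lower bounds).

```python
# Hypothetical 7B-parameter model (illustrative assumption)
params = 7e9

# Inference: fp16 weights only, 2 bytes per parameter
inference_gb = params * 2 / 1e9

# Training with Adam in fp32: weights + gradients + two optimizer
# moment tensors, i.e. 4 tensors at 4 bytes per parameter
training_gb = params * 4 * 4 / 1e9

print(f"inference ~{inference_gb:.0f} GB, training ~{training_gb:.0f} GB")
# inference ~14 GB, training ~112 GB
```

An 8x gap before activations even enter the picture, which is why inference-only tricks often don't transfer to training.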


For now, we've just shown that measuring memory consumption can be tricky.


Exactly.

It also shows the number of impostors in this thread and the inflated titles of self-proclaimed 'seniors' who can't optimize ML code well enough to even be in the same league as Tunney (jart) and Gerganov (ggerganov).

Not even ChatGPT or Copilot could submit a change, let alone completely rewrite and optimize this code the way they have.


Remember this moment when you're about to criticise LLMs. People can act suboptimally too, even experts.



