
Sounds like we need some new training methods. If training could take place locally and asynchronously instead of globally through backpropagation, the amount of energy could probably be significantly reduced.
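One family of methods along these lines is greedy layer-local training: each block gets its own auxiliary loss and optimizer, and gradients are stopped between blocks, so no end-to-end backward pass is needed. A hypothetical toy sketch in PyTorch (the shapes, blocks, and local_step helper are all made up for illustration):

    import torch
    import torch.nn as nn

    # Two blocks, each with its own local head and optimizer; .detach()
    # stops gradients between blocks, so there is no global backward pass.
    blocks = nn.ModuleList([
        nn.Sequential(nn.Linear(784, 256), nn.ReLU()),
        nn.Sequential(nn.Linear(256, 128), nn.ReLU()),
    ])
    heads = nn.ModuleList([nn.Linear(256, 10), nn.Linear(128, 10)])
    opts = [torch.optim.SGD(list(b.parameters()) + list(h.parameters()), lr=0.01)
            for b, h in zip(blocks, heads)]
    loss_fn = nn.CrossEntropyLoss()

    def local_step(x, y):
        h = x
        for block, head, opt in zip(blocks, heads, opts):
            h = block(h.detach())       # no gradient flows to earlier blocks
            loss = loss_fn(head(h), y)  # purely local objective
            opt.zero_grad()
            loss.backward()
            opt.step()

    local_step(torch.randn(32, 784), torch.randint(0, 10, (32,)))

Since each block's update depends only on its own input activations, the blocks could in principle live on different devices and be updated asynchronously.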


Disclosure: I work at MosaicML

Yeah, I strongly agree. While Nvidia is working on better hardware (and doing a great job of it!), we believe better training methods can be a major source of efficiency gains. We've released a new PyTorch library for efficient training at http://github.com/mosaicml/composer.

Our combinations of methods can train CV models ~4x faster to the same accuracy, and NLP models ~2x faster to the same perplexity/GLUE score!
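For anyone curious what that looks like in practice: speedup methods are passed as algorithms to a Trainer, roughly like the sketch below (simplified, and exact class names/arguments may differ across Composer versions; see the repo for current signatures):

    import torch
    from torchvision import datasets, transforms
    from torchvision.models import resnet18
    from composer import Trainer
    from composer.algorithms import BlurPool, ChannelsLast, LabelSmoothing
    from composer.models import ComposerClassifier

    train_ds = datasets.CIFAR10('/tmp/data', train=True, download=True,
                                transform=transforms.ToTensor())
    train_dl = torch.utils.data.DataLoader(train_ds, batch_size=128)

    trainer = Trainer(
        model=ComposerClassifier(resnet18(num_classes=10)),
        train_dataloader=train_dl,
        max_duration='10ep',          # train for 10 epochs
        algorithms=[                  # speedup methods composed together
            BlurPool(),
            ChannelsLast(),
            LabelSmoothing(smoothing=0.1),
        ],
    )
    trainer.fit()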


I've been seeing a lot more about MosaicML on my Twitter feed. Just wanted to ask -- how are your priorities different from, say, Fastai's?


The principled way of doing this is via ensemble learning, combining the predictions of multiple separately-trained models. But perhaps there are ways of improving that by including "global" training as well, where the "separate" models are allowed to interact while limiting overall training costs.
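As a toy illustration of the "separately trained, jointly predicted" idea (hypothetical code; the model shapes are made up):

    import torch
    import torch.nn as nn

    def ensemble_predict(models, x):
        # Each model was trained independently; only inference combines them.
        probs = [torch.softmax(m(x), dim=-1) for m in models]
        return torch.stack(probs).mean(dim=0)

    models = [nn.Linear(784, 10) for _ in range(3)]      # stand-ins for trained nets
    avg = ensemble_predict(models, torch.randn(5, 784))  # (5, 10) probabilities

Letting the models interact during training (e.g. periodically distilling from the ensemble average) would be one way to add the "global" component while keeping most of the computation separate.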


Trying to reduce energy consumption for ML like this is so silly.


Training costs are growing exponentially.

The degree to which energy and capital costs can be optimized will determine how large models can get.


That's like a person driving the Model T in 1908 saying "trying to improve gas mileage is so silly".

Why are people so dumb when it comes to planning for the future? Does it require a 1973 oil crisis to make people concerned about potential issues? Why can't people be preventative instead of reactive? Isn't the entire point of an engineer to optimize what they're building for the good of humanity?


Reducing energy consumption for computation is not silly.

We're at a point where we're turning into a computation-driven society, and computation is becoming a globally significant consumer of power.

> global data centers likely consumed around 205 terawatt-hours (TWh) in 2018, or 1 percent of global electricity use

And that's just data centers; if you add all client devices, you probably double that.
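Taking that quote's own numbers at face value: 205 TWh ≈ 1%, so doubling it for client devices lands around 410 TWh, or roughly 2% of global electricity use (back-of-envelope, not from the quoted source).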

Plus that number will only continue to grow.


Why?



