
If you’re talking about physical hardware, Google pales in comparison to Amazon, and Gcloud is still smaller than Azure. It’s possible google’s private compute makes up for the Azure difference but it’s not like they’re in different leagues in terms of access to hardware.


Does it pale? How? AWS is very opaque about power demand, but you can use power figures to estimate machine counts, and they admitted to Greenpeace that in 2016 they used 500-650 MW, a range wide enough to look like deliberate obfuscation. See the last Clicking Clean report from 2017.
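As a rough sketch of that kind of estimate: divide facility power by an assumed per-server draw. The 400 W per server and the PUE of 1.2 below are illustrative assumptions on my part, not anything AWS has disclosed.

```python
# Rough server-count estimate from a disclosed facility power range.
# Per-server wall power and PUE are assumed values, not AWS figures.
def server_count(total_mw, watts_per_server=400, pue=1.2):
    """Facility power -> IT power (divide by PUE) -> server count."""
    it_watts = total_mw * 1e6 / pue
    return int(it_watts / watts_per_server)

low = server_count(500)   # lower end of the 500-650 MW range
high = server_count(650)  # upper end
print(f"roughly {low:,} to {high:,} servers")
```

Changing the assumed wattage or PUE moves the answer a lot, which is exactly why the 500-650 MW range tells you so little on its own.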

Google/Alphabet used 6,514 gigawatt-hours (about 6.5 TWh) in 2016, according to data collected from yearly reports at https://www.statista.com/statistics/788540/energy-consumptio...

If my math is right, dividing that by 8760 hours in a year, you get 743MW.
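The conversion is just annual energy divided by the hours in a year:

```python
# Average power implied by annual energy consumption.
ANNUAL_GWH = 6514        # Google/Alphabet, 2016
HOURS_PER_YEAR = 8760    # 365 * 24

avg_mw = ANNUAL_GWH * 1000 / HOURS_PER_YEAR  # GWh -> MWh, then divide by hours
print(f"{avg_mw:.1f} MW")  # -> 743.6 MW
```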

Of course, that also includes office space, etc. (did the AWS number, too?) but it should be clear that, cross checking with data center builds, optic fiber and energy purchases as well, for years Google+YT+Apps+GCP were larger than Amazon and all other AWS customers combined. I didn't even factor in efficiency, something that Amazon started focusing on quite a bit later.

Someone might be able to extrapolate both numbers to today based on infrastructure, other metrics or other spend in quarterly financial statements (or power procurement, which will be complicated by the non trivial Amazon vs AWS distinction).

All of the above to say that Amazon probably has more compute now, but it's a stretch to talk about "paling".


Yeah. I have seen internal numbers on YouTube's daily ingest a few times over the years and every time my jaw drops. Like, the number from 2020 is ridiculous compared to their own number from 2017, and that was ridiculous.


I have an idea what you're talking about, because I worked with the YT folks to reduce their Colossus costs, and on Google storage in general, until 2015. Another humbling and illuminating experience was comparing the Borg resources in the main crawling/indexing cluster to the Top500 list of the same period, something that always comes to mind and makes my eyes go wide when people compare DDG to Google. Or the day a colleague taped up a page saying only "1E" next to our cubicle, because we had just reached that much storage across the fleet.


Unlike Azure and Amazon, Google doesn't have to rely on Nvidia GPUs for training and inference; it has achieved significant performance gains with its custom TPUs.


Your comment is simply incorrect. Amazon has had their own TPU equivalents for training and inference for years:

https://aws.amazon.com/machine-learning/trainium/

https://aws.amazon.com/machine-learning/inferentia/

I really don't think this would be a limiting factor regardless, even if Amazon didn't already have multiple generations of these products. It's not as if an Amazon or Microsoft sized company is incapable of developing custom silicon to meet an objective, once an objective is identified. TPUs also aren't really that complicated to design, at least compared to GPUs.

I'm slightly surprised Microsoft hasn't bothered to release any custom ML chips for Azure yet, but I guess they've run the numbers and decided to focus on other priorities for now.


But you are confusing two different things. Just because AWS has more servers rented out to customers doesn't necessarily mean it has more compute available for its own use, e.g. to run AI models.

I also doubt computing power is the real bottleneck. Any of these companies (and most others, too) could build enough large server sites; they have the money. The costly and difficult part is the engineering talent: doing the right thing technically (AI-wise) and business-wise, not losing time, not betting on the wrong horse, etc.


I suspect that Google's own usage (Search, YouTube, CDN, etc.) is still bigger than GCP. Is that correct?


> google’s private compute



