This seems like an insane stance to take. It's like saying businesses should ship their own stock using their own drivers, in-house-made cars and planes, and in-house-trained pilots.
Heck, why stop at having servers on-site? Cast your own silicon wafers; after all, you don't want Spectre exploits.
Because you are worse at it. If a specialist is this bad, and the market is fully open, then it's because the problem is hard.
AWS has fewer outages in a single zone than the best self-hosted institutions, your Facebooks and Pentagons. In-house servers would lead to an insane number of outages.
And guess what? AWS (and every other IaaS provider) will beg you to use multiple regions because of this. The team/person that has millions of dollars a day staked on a single AWS region is an idiot and could not be entrusted to order a gaming PC from Newegg, let alone run an in-house datacenter.
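For what it's worth, the mechanics of multi-region failover aren't exotic. Here's a minimal sketch of DNS-level failover between two regions using boto3 and Route 53; the hosted zone ID, health check ID, domain, and IPs below are hypothetical placeholders, not anything from a real account:

    import boto3

    route53 = boto3.client("route53")

    # Hypothetical IDs -- substitute your own hosted zone and health check.
    HOSTED_ZONE_ID = "Z0000000000000EXAMPLE"
    PRIMARY_HEALTH_CHECK_ID = "11111111-2222-3333-4444-555555555555"

    def failover_record(set_id, role, ip):
        # One half of a PRIMARY/SECONDARY failover pair for app.example.com.
        rrset = {
            "Name": "app.example.com",
            "Type": "A",
            "SetIdentifier": set_id,
            "Failover": role,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip}],
        }
        if role == "PRIMARY":
            # Route 53 only fails over when the primary's health check fails.
            rrset["HealthCheckId"] = PRIMARY_HEALTH_CHECK_ID
        return {"Action": "UPSERT", "ResourceRecordSet": rrset}

    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={
            "Changes": [
                failover_record("primary-us-east-1", "PRIMARY", "203.0.113.10"),
                failover_record("secondary-us-west-2", "SECONDARY", "203.0.113.20"),
            ]
        },
    )

The DNS record is the easy part; the real work is making sure the secondary region actually has your data and capacity when the primary disappears.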
edit: I will add that AWS specifically is meh and I wouldn't use it myself; there are better IaaS providers. But it's insanity to even imagine self-hosted is more reliable than using even the shittiest of IaaS providers.
> This seems like an insane stance to take. It's like saying businesses should ship their own stock using their own drivers, in-house-made cars and planes, and in-house-trained pilots.
> Heck, why stop at having servers on-site? Cast your own silicon wafers; after all, you don't want Spectre exploits.
That's an overblown argument. Nobody is saying that, but it's clear that businesses that maintain their own infrastructure would have avoided today's AWS outage. Just avoiding a single level of abstraction would have kept your company running today.
> Because you are worse at it. If a specialist is this bad, and the market is fully open, then it's because the problem is hard.
The problem is hard mostly because of scale. If you're a small business running a few websites with a few million hits per month, it might be cheaper and easier to colocate a few servers and hire a few DevOps engineers or old-school sysadmins to administer the infrastructure. The tooling is there, and it's not much harder to manage than a hundred different AWS products. I'm actually more worried about the DevOps trend where engineers are trained purely on cloud infrastructure and don't understand the low-level tooling these systems are built on.
> AWS has fewer outages in a single zone than the best self-hosted institutions, your Facebooks and Pentagons. In-house servers would lead to an insane number of outages.
That's anecdotal and would depend on the capability of your DevOps team and your in-house / colocation facility.
> And guess what? AWS (and every other IaaS provider) will beg you to use multiple regions because of this. The team/person that has millions of dollars a day staked on a single AWS region is an idiot and could not be entrusted to order a gaming PC from Newegg, let alone run an in-house datacenter.
Oh great, so the solution is to put even more of our eggs in a single provider's basket? The real solution would be having failover to a different cloud provider, and the infrastructure changes needed for that are _far_ from trivial. Even with that, there are only three major cloud providers you can pick from. Again, colocation in a trusted datacenter would have avoided all of this.
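To be clear about where the difficulty lives: the client-side half of cross-provider failover can be as small as the sketch below (both base URLs are made up for illustration). The hard part is everything behind those URLs, i.e. keeping data and deployments in sync across providers.

    import urllib.request

    # Hypothetical base URLs: the same service deployed at two unrelated providers.
    ENDPOINTS = [
        "https://api.primary-cloud.example.com",
        "https://api.backup-cloud.example.com",
    ]

    def fetch(path="/v1/status", timeout=3):
        last_error = None
        for base in ENDPOINTS:
            try:
                # Take the first provider that answers; list order encodes preference.
                with urllib.request.urlopen(base + path, timeout=timeout) as resp:
                    return resp.read()
            except OSError as exc:  # covers URLError, connection and timeout errors
                last_error = exc    # remember the failure, try the next provider
        raise RuntimeError(f"all providers failed, last error: {last_error}")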
> but it's clear that businesses that maintain their own infrastructure would have avoided today's AWS outage.
When Netflix was running its own datacenters in 2008, they had a three-day outage from database corruption and couldn't ship DVDs to customers. That was the disaster that pushed CEO Reed Hastings to get out of managing his own datacenters and migrate to AWS.
The flaw in the reasoning that running your own hardware would have avoided today's outage is that it doesn't also consider the extra unplanned outages on other days, because your homegrown IT team (especially at non-tech companies) isn't as skilled as the engineers working at AWS/GCP/Azure.
> it's clear that businesses that maintain their own infrastructure would have avoided today's AWS outage.
Sure, that's trivially obvious. But how many other outages would they have had instead because they aren't as experienced at running this sort of infrastructure as AWS is?
You seem to be arguing from the a priori assumption that rolling your own is inherently more stable than renting infra from AWS, without actually providing any justification for that assumption.
You also seem to be under the assumption that any amount of downtime is always unacceptable and worth spending large amounts of time and effort to avoid. For a lot of businesses, systems going down for a few hours every once in a while just isn't a big deal, and is much preferable to spending thousands more on cloud bills or hiring more full-time staff to ensure X 9s of uptime.
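To put rough numbers on "X 9s" (plain arithmetic, nothing assumed beyond the hours in a year):

    HOURS_PER_YEAR = 24 * 365  # 8760

    for nines in (2, 3, 4, 5):
        availability = 1 - 10 ** -nines              # e.g. 3 nines = 99.9%
        downtime_hours = HOURS_PER_YEAR * (1 - availability)
        print(f"{nines} nines: ~{downtime_hours:.1f} hours of downtime per year")
    # 2 nines: ~87.6 h, 3 nines: ~8.8 h, 4 nines: ~0.9 h, 5 nines: ~0.1 h

A few hours a year is roughly the gap between three and four nines, which is exactly the trade-off being described.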
You and GP are making the same assumption: that my DevOps engineers _aren't_ as experienced as AWS's. There are plenty of engineers capable of maintaining in-house infrastructure at X 9s because, again, the complexity comes from the scale AWS operates at. So we're both arguing from the a priori assumption that the grass is greener on our side.
To be fair, I'm not saying never use cloud providers. If your systems require the complexity cloud providers simplify, and you operate at a scale where it would be prohibitively expensive to maintain yourself, by all means go with a cloud provider. But it's clear that not many companies are prepared for this type of failure, and protecting against it is not trivial to accomplish. Not to mention the conceptual overhead and knowledge required to deal with each provider's specific products, APIs, etc., whereas the skills for maintaining these systems yourself are transferable across any datacenter.
This feels like a discussion that could sorely use some numbers.
What are good examples of
> a small business running a few websites with a few million hits per month, it might be cheaper and easier to colocate a few servers and hire a few DevOps engineers or old-school sysadmins to administer the infrastructure.
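A crude starting point, taking the quoted scenario at face value (the 5 million hits/month figure and the 10x peak factor are just assumed examples):

    hits_per_month = 5_000_000            # assumed "a few million hits per month"
    seconds_per_month = 30 * 24 * 3600    # ~2.6 million seconds

    average_rps = hits_per_month / seconds_per_month
    peak_rps = average_rps * 10           # crude 10x peak-to-average assumption

    print(f"average: ~{average_rps:.1f} req/s, assumed peak: ~{peak_rps:.0f} req/s")
    # average: ~1.9 req/s, assumed peak: ~19 req/s

At that scale the load itself is modest; the real cost question is the people and redundancy around the boxes, not the hardware.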
Depends, I guess. I'm running an on-prem workstation for our DWH. So far, in two years, it has only gone down for minutes at a time, and only when I decided to take it down for hardware updates.
I have no idea where this narrative came from, but the hardware you own is usually very reliable and doesn't turn off every 15 minutes.
Heck, I use an old T430 for my home server and even it doesn't go down at random (but that's a very simplified example, I know).
The one at work, yes, but only on the internal network, as we are not exposed to the internet. To be honest, though, we are probably one of the few companies that makes it a priority to always have electricity and internet in the office (with a UPS, a generator, and multiple internet providers).
No idea what the standards are at other companies.
There are at least six cloud providers I can name and have used that run their own data centers with capabilities similar to AWS's core products (EC2, Route 53, S3, CloudWatch, RDS):
OVH, Scaleway, Online.net, Azure, GCP, AWS.
Those are the ones I've used in production; I've heard of a dozen more, including big names like HP and IBM, and I assume they can match AWS for the most part.
...
That being said, I agree multi-tenant is the way to go for reliability. But I was pointing out that in this case even the simple solution of multi-region on one provider was not implemented by those affected.
...
As for running your own data center as a small company: I have done it, buying components, building servers, and all.
Expenses and ISP issues aside, I can't imagine running in-house without at least a few outages a year, for anywhere near the price of hiring a DevOps person to build a MT solution for you.
If you think you can, you've either never tried doing it or you are being severely underpaid for your job.
Competent teams that build and run reliable in-house infrastructure exist, and they can get you an SLA similar to multi-region AWS or GCP (i.e. 100% over the last 5 years)... but the price tag has 7 to 8 figures in it.
This is the right answer. I recall studying for the Solutions Architect Professional certification and reading this countless times: outages will happen, and you should plan for them by using multi-region if you care about downtime.
It's not AWS's fault here; it's the companies', for assuming it will never be down. In-house servers also have outages; it's a very naive assumption to think it'd all be better if all of those services were running on their own servers.
Facebook doesn't use AWS, and that's because they have way better engineers than the average company working exclusively on their infrastructure; even so, they were down for several hours a couple of weeks ago.
If all you wanted to do was vacuum the floor, you would not have gotten that particular vacuum cleaner.
Clearly you wanted to do more than just vacuum the floor, and something like this happening should have been weighed when purchasing the vacuum.
> AWS (and every other IaaS provider) will beg you to use multiple regions
Will they? Because AWS still puts new stuff in us-east-1 before anywhere else, and there is often a LONG delay before those things reach other regions. There are many other examples of why people use us-east-1 so often, but it all boils down to this: AWS encourages everyone to use us-east-1 and discourages the use of other regions for the same reasons.
If they want to change how and where people deploy, they should change how they encourage their customers to deploy.
My employer uses multi-region deployments where possible, and we can't do that anywhere near as much as we'd like because of limitations that AWS has chosen to have.
So if cloud providers want to encourage multi-region adoption, they need to stop discouraging and outright preventing it first.
> AWS still puts new stuff in us-east-1 before anywhere else, and there is often a LONG delay before those things reach other regions.
Come to think of it (far down the second page of comments): Why east?
Amazon is still mainly in Seattle, right? And Silicon Valley is in California. So one would have thought the high-tech hub both of Amazon and of the USA in general is still in the west, not east. So why us-east-1 before anywhere else, and not us-west-1?
Most features roll out to IAD second, third, or fourth. PDX and CMH are good candidates for earlier feature rollout, and usually it's tested in a small region first. I use PDX (us-west-2) for almost everything these days.
I also think that they've been making a lot of the default region dropdowns and such point to CMH (us-east-2) to get folks to migrate away from IAD. Your contention that they're encouraging people to use that region just doesn't ring true to me.
It works really well, IMO. All the people who want to use new stuff at the expense of stability choose us-east-1; those who want stability at the expense of new stuff run multi-region (usually not in us-east-1).
This argument seems rather contrived. Which feature available in only one region for a very long time has specifically impacted you? And what was the solution?
Quick follow-up: I once used an IaaS provider (hyperstreet) that was terrible. Long story short, the provider ended up closing shop and the owner of the company now sells real estate in California.
It was a nightmare recovering data, and even when the service was operational it was subpar.
Just saying, perhaps the “shittiest” providers may not be more reliable.
> In-house servers would lead to an insane number of outages.
That might be true, but the effects of any given outage would be felt much less widely. If Disney has an outage, I can just find a movie on Netflix to watch instead. But now if one provider goes down, it can take down everything. To me, the problem isn't the cloud per se, it's one player's dominance in the space. We've taken the inherently distributed structure of the internet and re-centralized it, losing some robustness along the way.
> That might be true, but the effects of any given outage would be felt much less widely.
If my system has an hour of downtime every year and the dozen other systems it interacts with and depends on each have an hour of downtime every year, it can be better that those tend to be correlated rather than independent.
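A rough way to see why correlation helps here, assuming (for the sake of the arithmetic) thirteen systems that are each down one hour per year:

    systems = 13
    hours_down_each = 1.0
    hours_per_year = 24 * 365

    p_down = hours_down_each / hours_per_year  # chance any one system is down at a given moment

    # Perfectly correlated: everything fails during the same hour.
    correlated_hours = hours_down_each

    # Independent: the chance that *all* systems are up shrinks with each dependency.
    p_all_up = (1 - p_down) ** systems
    independent_hours = hours_per_year * (1 - p_all_up)

    print(f"correlated:  ~{correlated_hours:.1f} h/yr with some dependency down")
    print(f"independent: ~{independent_hours:.1f} h/yr with some dependency down")
    # correlated:  ~1.0 h/yr, independent: ~13.0 h/yr

If your system needs all of its dependencies up to function, correlated failures cost you roughly one bad hour a year instead of about thirteen.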
I think you're missing the point of the comment. It's not "don't use cloud". It's "be prepared for when cloud goes down". Because it will, despite many companies either thinking it won't, or not planning for it.
> AWS has fewer outages in a single zone than the best self-hosted institutions, your Facebooks and Pentagons. In-house servers would lead to an insane number of outages.
> they usually beg you to use multiple availability zones though
Doesn't help you if what goes down is an AWS global service that you depend on directly, or that other AWS services depend on (and those tend to be tied to us-east-1).