jeffbarr · on Nov 18, 2022

Talk about testing in production. You have to die first to see if this triggers, and if not -- your automation becomes the ultimate in legacy code.

Seriously, this is a very interesting idea. However, after losing both of my parents in the last 4 years, I've learned that there's a sad but legitimate intermediate state between alive & well and deceased. People lose their ability to function and make decisions on their own, and might not be able to renew your service.

mohamedattahri · on Nov 18, 2022

I’m sorry for your loss.

You raise a very good point. Two things.

(1) Incapacitation is something on our radar, and we really care about getting it right;

(2) We put a lot of effort into making sharing and keeping people up to date as simple as possible, so they can reach out to us if there’s a problem.

jeffbarr · on Jan 17, 2022

Correct - you know your AWS launch history!

jeffbarr · on Jan 16, 2022

Perhaps they were thinking of Rochester, New York: https://rocwiki.org/abandoned_subway

jeffbarr · on Nov 30, 2021

I wrote the AWS post and did my best to share lots of technical details; are there any specific things that you want to know more about?

dmw_ng · on Nov 30, 2021

Generally a fan of your posts, but this one was very heavy on marketing buzzology ("cloud scale"). I can't tell if there was a genuine use case for designing a proprietary SSD, or if it were some pet project. Is "75% lower latency variability" because the first gen SSD was a CS101 project, or because AWS have developed some material edge over what others (with much wider scope) in the industry have been doing for years? I can't tell.

I can't see a reason to buy or use this product.

jeffbee · on Dec 1, 2021

I doubt that other companies' supposedly "wider scope" actually exists or gives them advantages. Both Amazon and Google make their own SSDs and have the largest computer installations in the known universe. The fact that Samsung makes a lot of SSDs for laptops may not give them wider scope at all.

simonebrunozzi · on Nov 30, 2021

I actually think that these posts have gotten much better over the past 2-3 years, at least based on my taste; the level of technical details is just right. On specific topics, I wouldn't mind James Hamilton-level specifics, but you can't be too deep on everything all the time.

(hi Jeff! Hope you're well :D)

jeffbarr · on Nov 30, 2021

Hi Simone, doing well and we are trying to add more info while still being frugal with words and with the time of our readers.

kaliszad · on Nov 30, 2021

Oh I would love some more deep dives or presentations by James Hamilton into various aspects of AWS. They combine the high level overview and the deep technical details in a very informative and entertaining way.

rektide · on Nov 30, 2021

Hi Jeff! Eeeeeek! I'd love to know so much more about the Nitro acceleration. All these accelerated fabrics are so interesting.

* What does the Nitro accelerator look like to the host? Does the Nitro accelerator present as NVMe devices to the OS host, or is there a more custom thing it presents as? Does the Nitro accelerator use SR-IOV or something else to present as many different PCIe adapters, per-drive PCIe, or a single PCIe device, or no PCIe devices at all, something else entirely (and if so what)? Are there custom virt-io drivers powering the VMs? How much change has gone into these interfaces in the newest iterations, or have these interface channels remained stable?

* What is the over the wire communication? Related to the above; ultimately the VMs see NVMe, & how far down the stack/across the network does that go? Is what's on the wire NVMe based, or something else; is it custom? What trade-offs were there, what protocols inspired the teams? Originally at launch it seemed like there was a custom remote protocol[1]; has that stayed? What drove the protocol evolution/change over time? What's new & changed?

* What do the storage arrays look like; are they also PC-based? Or do the flash arrays connect via accelerators too? Are these FPGA-based or hard silicon? Are there standard flash controllers in use, or is this custom? How many channels of flash will one accelerator have connected to it? How much has the storage array architecture changed since Nitro was first introduced? Do latest gen Nitro & older EBS storages have the same implementation or are newer EBS storages evolving more freely now?

* On a PC, an SSD is really an abstraction hiding dozens of flash channels. There have been efforts like Open Channel SSDs and now zoned namespaces to give the PCs more direct access to the individual channels. Does the Nitro accelerator connect to a single "endpoint" per EBS, or is the accelerator fanning out, connecting to multiple endpoints or multiple channels, doing some interleaving itself?

* What are some of the flash-translation optimizations & wins that the team/teams have found?

And simply: * How on earth can hosts have so much networking/nitro throughput available to them?! It feels like there's got to be multiple 400Gbit connections going to hosts today. And all connected via Nitro accelerators?

It's just incredibly exciting stuff, there's so much super interesting work going on, & I am so full of questions! I was a huge fan of the SeaMicro accelerators of yore, an early integrated network-attached device accelerator. Getting to work at such scale, build such high performance well integrated systems seems like it has so so many interesting fascinating subproblems to it.

[1] https://www.youtube.com/watch?v=e8DVmwj3OEs#t=11m58s

Andys · on Nov 30, 2021

> * How on earth can hosts have so much networking/nitro throughput available to them?!

I feel this is something overlooked when people complain about the egress fees

lend000 · on Nov 30, 2021

If you have an existing EC2 instance with EBS storage and want to convert it to the new Nitro SSD, what will be the process for migration? E.g. a live swapping of attached storage devices, a quick reboot, or spinning up a new instance?

jeffbarr · on Nov 30, 2021

The Nitro SSDs are currently used as instance storage, directly attached to particular EC2 instances.

lend000 · on Nov 30, 2021

Thanks for the response. To clarify, does this mean that only some EC2 instances will be eligible (i.e. if I have an older EC2 instance I will have to re-create it)?

Androider · on Nov 30, 2021

Nitro SSDs appear to only be available on specific new instance types, like the just announced Im4gn and Is4gen.

posnet · on Nov 30, 2021

Are there plans to provide Metal instances with these new SSDs?

jeffbarr · on Nov 30, 2021

I don't know one way or the other, but great question. I prefer launching stuff to hinting about it :-)

posnet · on Nov 30, 2021

Fair enough, and good luck with the rest of re:Invent

sitkack · on Nov 30, 2021

I'd like to see P99.9 and MAX latency for certain read and write patterns. More concretely a before and after wrt a specific workload would be even better.
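For anyone wanting to produce that kind of before/after comparison themselves, here is a minimal sketch of summarizing tail latency from raw per-request timings. It assumes you already have a list of latencies (in microseconds here) from your own benchmark run; the function names `percentile` and `tail_summary` are made up for illustration, and the nearest-rank definition of a percentile is only one of several in common use.

```python
from decimal import Decimal, ROUND_CEILING


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Decimal keeps the rank exact for inputs like 99.9, where plain
    # float arithmetic could land just above an integer boundary.
    exact_rank = Decimal(str(pct)) * len(ordered) / 100
    rank = int(exact_rank.to_integral_value(rounding=ROUND_CEILING))
    return ordered[rank - 1]


def tail_summary(latencies_us):
    """Summarize a run: median, P99, P99.9 and the single worst sample."""
    return {
        "p50": percentile(latencies_us, 50),
        "p99": percentile(latencies_us, 99),
        "p99.9": percentile(latencies_us, 99.9),
        "max": max(latencies_us),
    }


if __name__ == "__main__":
    # Hypothetical data: 1000 requests taking 1..1000 microseconds.
    print(tail_summary(list(range(1, 1001))))
```

Running the same summary on the old and new storage with an identical workload (same block size, queue depth and read/write mix) gives the before/after view. Note that P99.9 only means something with well over a thousand samples.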

jeffbarr · on Nov 6, 2021

Hello old friend!

kragen · on Nov 7, 2021

Hello Jeff! What's a nice guy like you doing in a degraded place like this?

jeffbarr · on Aug 10, 2021

Amazon S3 on Outposts (more info at https://aws.amazon.com/s3/outposts/ ) runs on-premises, offers durable storage, high throughput, the S3 API, currently scales to 380 TB, and doesn't require you to watch for and deal with failing disks.

I believe that it addresses many of the OP's reasons for deciding against S3.

Nullabillity · on Aug 10, 2021

At least disclose your conflicts of interest when writing spam like this.

GiorgioG · on Aug 10, 2021

And only costs $169,000 to start

gunapologist99 · on Aug 10, 2021

Hey, let's be fair. The storage-optimized instances start at only $425,931. ;)

oneplane · on Aug 10, 2021

It's not like buying hardware, support personnel hours and write-off administration is that much cheaper, unless you're willing to discard some features, but at that point you're no longer comparing things equally.

rodgerd · on Aug 10, 2021

Have Outposts fixed their "always-on, lose connectivity, lose your outpost" problem that they had when I first asked about them?

Can they scale down to "I need to spin up an S3 thing for local testing" for the cost of the storage and CPU?

Am I locked into a multi-year agreement, or can I just go and throw it away in a month and stop paying?

speedgoose · on Aug 10, 2021

I'm not going to contact AWS sales when I can easily use minio on Docker or Kubernetes.

jeffbarr · on July 28, 2021

Yes indeed!

kevinslin · on July 28, 2021

you beat me to submitting the ec2 classic post. but only fitting that it come from you :)

jeffbarr · on July 28, 2021

You will be happy to know that I used Dendron to manage my TODO list that included an entry for writing this blog post!

jeffbarr · on July 5, 2021

> That's hilarious. I think Jeff has an account here.

I am 100% sure that I did not write that, and I am 100% sure that this comes across as far too Jeff-like for comfort!

sillysaurusx · on July 5, 2021

Hahaha. I bet it was a shock to see your name pop up in GPT-3.

That feeling when you don't know whether you're famous enough that you were included in the training set and successfully influenced the bot, or the bot simply used a pretty common American name.

jeffbarr · on April 28, 2021

There's more info in the Twitter thread (https://twitter.com/wrathofgnon/status/1250287741247426565). Patience and pruning every two years are key!

samatman · on April 28, 2021

Also (this is easy to miss) the technique is performed only on clones of one mutant Sugi tree (Japanese "cedar", actually more closely related to redwoods but it's its own genus).

jeffbarr · on April 28, 2021

Check out https://cacm.acm.org/magazines/2015/4/184701-how-amazon-web-... ("How Amazon Web Services Uses Formal Methods") and https://d1.awsstatic.com/Security/pdfs/One_Click_Formal_Meth... ("One-Click Formal Methods") for more info.