What's the best way to provide a new EC2 instance with a fast ~256GB dataset dir...

MaBu · on Aug 22, 2024

EFS supports 30 GiB/s throughput now. https://aws.amazon.com/about-aws/whats-new/2024/08/amazon-ef...

Otherwise instance drive and sync over S3.
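
A minimal sketch of the "instance drive and sync over S3" route, assuming the AWS CLI is installed on the instance and the instance store is already formatted and mounted. The bucket URI and mount path below are hypothetical placeholders, not anything from this thread. By default the helper only builds the `aws s3 sync` command, so it can be inspected before anything is run.

```python
import subprocess
from typing import List


def build_sync_command(bucket_uri: str, local_dir: str) -> List[str]:
    """Build an `aws s3 sync` command that copies a dataset to local disk."""
    if not bucket_uri.startswith("s3://"):
        raise ValueError("bucket_uri must start with 's3://'")
    # --only-show-errors keeps the output quiet for large datasets.
    return ["aws", "s3", "sync", bucket_uri, local_dir, "--only-show-errors"]


def sync_dataset(bucket_uri: str, local_dir: str, dry_run: bool = True) -> List[str]:
    """Return the sync command; execute it only when dry_run is False."""
    cmd = build_sync_command(bucket_uri, local_dir)
    if not dry_run:
        # Raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical bucket and mount point; replace with your own.
    print(sync_dataset("s3://example-dataset-bucket/dataset/", "/mnt/nvme/dataset"))
```

Since instance storage is ephemeral, something like this would have to run again on every fresh instance, for example from a boot script.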

ayewo · on Aug 22, 2024

Instance storage can be incredibly fast for certain workloads but it's a shame AWS doesn't offer instance storage on Windows EC2 instances.

Instance storage seems to only be available for (large) Linux EC2 instances.

msolson · on Aug 22, 2024

Instance storage can be a good option depending on your workload, but definitely has limitations. There's huge value in separating the lifecycle of storage from compute, and EBS provides higher durability than instance storage as well.

There are no operating system limitations that I'm aware of, however. I was just able to launch a Windows m6idn.2xlarge to verify.

ayewo · on Aug 23, 2024

Thanks for checking. I realize now that I wasn’t clear in my original comment.

My use case was to bring up a Windows instance using instance storage as the root device instead of using EBS, which is the default root device.

I wanted to run some benchmarks directly on drive C:\ — backed by an NVMe SSD-based instance store — because of an app that will only install to drive C:\, but it seems there’s no way to do this.

The EC2 docs definitely gave me the impression that instance storage is not supported on Windows as a root volume.

Here’s one such note from the docs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/RootDevi...

“Windows instances do not support instance-store backed root volumes.”

Really nice that you are engaging with the comments here on HN as the article’s author.

(For others who may not be aware, msolson = Marc Olson)

msolson · on Aug 23, 2024

Ah yes. Instance store root volumes are what we originally launched EC2 with--EBS came along 2 years later, but as data volumes only at first, with boot from EBS I think a year after that. There's a lot less fragility, and really it's just easier for our customers to create an EBS snapshot based AMI.

Before we launched the c4 instance family the vast majority of instance launches were from EBS backed AMIs, so we decided to remove a pile of complexity, and beginning with the c4 instance family, we stopped supporting instance storage root volumes on new instance families.

apitman · on Aug 22, 2024

Impressive, but I don't think we ever determined conclusively whether our EFS problems were caused by throughput or latency.

Also, throughput is going to be limited by your instance type, right? Though that might also be the case for EBS. I can't remember. Part of the problem is AWS performance is so confusing.

flybarrel · on Aug 22, 2024

Yes, throughput for EBS also depends on the instance type. It is all published here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-opti...

EBS connects to EC2 via a separate pipeline, different from the EC2 instance networking bandwidth. This is true for all Nitro instances. EFS / FSx connects to EC2 via networking bandwidth, so refer to the instance's network bandwidth figures if you are looking for bandwidth information for those.
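
To make the instance-type point concrete, here is a rough back-of-the-envelope sketch. It assumes the effective rate is simply the lower of the storage service's limit and the instance's bandwidth limit, and it ignores latency, burst credits, and protocol overhead. The function names and the example numbers are illustrative assumptions, not published AWS figures.

```python
def gbit_to_gib_s(gbit_s: float) -> float:
    """Convert a link speed in Gbit/s to GiB/s (10^9 bits vs 2^30 bytes)."""
    return gbit_s * 1e9 / 8 / 2**30


def effective_throughput_gib_s(storage_limit_gib_s: float,
                               instance_limit_gib_s: float) -> float:
    """The slower of the two limits is the bottleneck."""
    return min(storage_limit_gib_s, instance_limit_gib_s)


def load_time_seconds(dataset_gib: float,
                      storage_limit_gib_s: float,
                      instance_limit_gib_s: float) -> float:
    """Best-case time to read the whole dataset once at the bottleneck rate."""
    rate = effective_throughput_gib_s(storage_limit_gib_s, instance_limit_gib_s)
    if rate <= 0:
        raise ValueError("throughput limits must be positive")
    return dataset_gib / rate


if __name__ == "__main__":
    # Hypothetical example: a 256 GiB dataset, a file system capable of
    # 30 GiB/s, and an instance with a 25 Gbit/s network link (assumed).
    instance_limit = gbit_to_gib_s(25)
    seconds = load_time_seconds(256, 30.0, instance_limit)
    print(f"instance limit: {instance_limit:.2f} GiB/s, load time: {seconds:.0f} s")
```

Under these assumptions the file system's headline number barely matters; the instance's link sets the floor on load time.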

vegardx · on Aug 23, 2024

I think that depends a lot on how the data is going to be used. It sounds like you're not really using EBS volumes for what they're great at: durability.

While instance storage is ephemeral, nothing really stops you from using it as a local cache in a clustered filesystem. If you have a somewhat read-intensive workload, then you might see performance close to that of using instance storage directly.

There are some fundamental limits to how fast a clustered filesystem can be, based on things like network latency and block size. Things like locking are an order of magnitude slower on a clustered filesystem compared to locally attached storage.

benlivengood · on Aug 23, 2024

Does https://docs.aws.amazon.com/ebs/latest/userguide/ebs-volumes... not work for you? Is it the updates that need to be available quickly to all the readers?

dekhn · on Aug 22, 2024

EFS has tunable performance knobs. You can turn them way, way up to get an insanely fast file server, but it will cost $$$

lustre-fan · on Aug 23, 2024

I think FSx/Lustre would be the simplest, assuming you're using Linux instances. You can definitely scale to 256GB/s or more. Plus, you can automatically hydrate the dataset from S3. Lustre is common for HPC and ML/AI, so if you're doing anything along those lines it's a good fit.