There are quite a few significant things you've missed that should have been included, maybe one for part two:
* Network ACLs, which describe the ruleset (think of them as a stateless firewall) for subnets and their respective routes. Whilst they are optional, having a default set straightens out a lot of duplication that may otherwise end up in Security Groups (which are stateful, unlike ACLs) - there's a sketch of a stateless rule pair right after this list.
* Elastic (public) IPs. NAT instances/gateways require their use, and there is a dance to be done around allocating them in the account and attaching them to instance interfaces (also sketched below).
* IPv6 components. Egress-only Internet Gateways operate differently to IGWs: as there is no NAT, they need a route applied across all subnets, both public and private. An IPv6 CIDR association allocates the VPC a /56 (and thus each subnet gets a /64, and each instance's interface gets a /128, which is bananas, but IPv6 is a second-class citizen on AWS). Finally, the subnets need updating so automatic IPv6 address assignment happens. The whole sequence is sketched below.
* VPC endpoints - these are broken into two types: the older gateway type, which supports S3/DynamoDB and effectively allows traffic in a public/private subnet to bypass NAT. Enabling these can bring significant advantages in access and throughput (gateway endpoint sketch below). The newer "PrivateLink" interface endpoints are different and have pricing costs associated with them.
* DNS and DHCP: it's a rule in the VPC that the delegated resolver lives at ".2" of the VPC's CIDR, and it operates split-horizon - EC2 hostnames resolved by instances inside the VPC get the private VPC CIDR address, not any Elastic IP (the ".2" arithmetic is in the last sketch below).
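To make the stateless point concrete, here's a rough boto3 sketch (the ACL ID and CIDRs are made up): because an ACL doesn't track connections, an inbound allow needs a matching outbound allow for the ephemeral return ports, which is exactly the duplication a stateful Security Group spares you.

```python
import boto3

ec2 = boto3.client("ec2")
nacl_id = "acl-0123456789abcdef0"  # hypothetical ACL ID

# Inbound: allow HTTPS into the subnet.
ec2.create_network_acl_entry(
    NetworkAclId=nacl_id, RuleNumber=100, Protocol="6",  # 6 = TCP
    RuleAction="allow", Egress=False,
    CidrBlock="0.0.0.0/0", PortRange={"From": 443, "To": 443},
)

# Outbound: the stateless half. Return traffic on ephemeral ports
# must be allowed explicitly; a stateful Security Group wouldn't need this.
ec2.create_network_acl_entry(
    NetworkAclId=nacl_id, RuleNumber=100, Protocol="6",
    RuleAction="allow", Egress=True,
    CidrBlock="0.0.0.0/0", PortRange={"From": 1024, "To": 65535},
)
```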
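The EIP dance is two steps: allocate the address into the account, then associate it with an interface. A minimal boto3 sketch, with a hypothetical ENI ID:

```python
import boto3

ec2 = boto3.client("ec2")

# Step 1: allocate an Elastic IP into the account, scoped for VPC use.
alloc = ec2.allocate_address(Domain="vpc")

# Step 2: attach it to a specific instance network interface.
ec2.associate_address(
    AllocationId=alloc["AllocationId"],
    NetworkInterfaceId="eni-0123456789abcdef0",  # hypothetical ENI
)
```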
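The IPv6 sequence, in the order described above, as a hedged boto3 sketch (all IDs hypothetical):

```python
import boto3

ec2 = boto3.client("ec2")
vpc_id = "vpc-0123456789abcdef0"        # hypothetical
rtb_id = "rtb-0123456789abcdef0"        # hypothetical private route table
subnet_id = "subnet-0123456789abcdef0"  # hypothetical

# 1. Associate an Amazon-provided /56 with the VPC
#    (each subnet then needs a /64 via associate_subnet_cidr_block).
ec2.associate_vpc_cidr_block(VpcId=vpc_id, AmazonProvidedIpv6CidrBlock=True)

# 2. Egress-only IGW: there's no NAT for IPv6, so even private
#    subnets route ::/0 through it.
eigw = ec2.create_egress_only_internet_gateway(VpcId=vpc_id)
ec2.create_route(
    RouteTableId=rtb_id,
    DestinationIpv6CidrBlock="::/0",
    EgressOnlyInternetGatewayId=eigw["EgressOnlyInternetGateway"][
        "EgressOnlyInternetGatewayId"
    ],
)

# 3. Turn on automatic IPv6 address assignment for the subnet.
ec2.modify_subnet_attribute(
    SubnetId=subnet_id, AssignIpv6AddressOnCreation={"Value": True},
)
```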
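A gateway endpoint for S3 is a single call plus route-table associations; traffic from those subnets then reaches S3 without touching the NAT path. Region and IDs below are assumptions:

```python
import boto3

ec2 = boto3.client("ec2")

# Gateway endpoint: injects a prefix-list route into the given route
# tables, so S3 traffic bypasses the NAT gateway (and its data charges).
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",             # hypothetical
    VpcEndpointType="Gateway",
    ServiceName="com.amazonaws.eu-west-1.s3",  # region assumed
    RouteTableIds=["rtb-0123456789abcdef0"],   # hypothetical
)
```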
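And the ".2" resolver is literally the VPC CIDR's base address plus two, which the standard library can show (no AWS calls involved, the CIDR is an example):

```python
import ipaddress

vpc_cidr = ipaddress.ip_network("10.0.0.0/16")  # example VPC CIDR

# The VPC-provided resolver sits at the network address + 2.
resolver = vpc_cidr.network_address + 2
print(resolver)  # 10.0.0.2
```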
Network ACLs have been pointed out as missing from this before, but quite a few people said they were right not to be included. I didn't put them in because I've never used them, so they didn't fall under 'need to know' from my perspective.
IPv6 is another point of contention but again it's not something I've ever used and so, apart from any other controversies with it ("...IPv6 which is only marginally better than IPv4 and which offers no tangible benefit...", https://varnish-cache.org/docs/trunk/phk/http20.html), I'm not qualified to write about it.
EIPs and ENIs should probably have been in there, but I don't tend to use those that often either, so they didn't occur to me.
I'm not sure that VPC Gateways, DNS or DHCP are necessarily need-to-know things either. VPC Gateways are for a specific routing optimisation which not everyone is going to need. I didn't know the details of the DNS setup for a VPC, so thank you for that.
Thank you for the feedback - I really appreciate you taking the time.
I would also add VPC PrivateLinks to the list, which let you establish private connections between systems in different VPCs without having to either peer them or connect them in other ways. PrivateLinks allow you to relieve the pressure that you might otherwise feel to build a lot of systems in the same VPC.
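For concreteness, a hedged boto3 sketch of both halves of a PrivateLink connection; every ARN, ID and name here is hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# Provider side: expose a service fronted by a Network Load Balancer.
svc = ec2.create_vpc_endpoint_service_configuration(
    NetworkLoadBalancerArns=[
        # hypothetical NLB ARN
        "arn:aws:elasticloadbalancing:eu-west-1:123456789012:loadbalancer/net/app/abc123",
    ],
    AcceptanceRequired=False,
)
service_name = svc["ServiceConfiguration"]["ServiceName"]

# Consumer side, in a different VPC: an interface endpoint to that
# service - no peering, no shared route tables.
ec2.create_vpc_endpoint(
    VpcId="vpc-0fedcba9876543210",              # hypothetical consumer VPC
    VpcEndpointType="Interface",
    ServiceName=service_name,
    SubnetIds=["subnet-0123456789abcdef0"],     # hypothetical
    SecurityGroupIds=["sg-0123456789abcdef0"],  # hypothetical
)
```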
Another useful concept (not VPC-specific) is using the Infrastructure-as-Code paradigm (e.g., CloudFormation, Terraform) to capture all of your networking configuration in source control, along with who made any changes and the reasons or design documentation for them.
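As a taste of what that looks like, a minimal AWS CDK v2 sketch in Python; the stack name and CIDR are illustrative, not prescriptive:

```python
from aws_cdk import App, Stack
from aws_cdk import aws_ec2 as ec2
from constructs import Construct

class NetworkStack(Stack):
    """All VPC, subnet and routing decisions live here, reviewable in git."""

    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        # Two-AZ VPC with public and private subnets; CDK synthesises
        # the IGW, NAT gateways and route tables behind the scenes.
        ec2.Vpc(
            self, "AppVpc",
            ip_addresses=ec2.IpAddresses.cidr("10.0.0.0/16"),
            max_azs=2,
        )

app = App()
NetworkStack(app, "network")
app.synth()
```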
> Network ACLs [...] Whilst they are optional, having a default set straightens out a lot of duplication that may otherwise end up in Security Groups (which are stateful, unlike ACLs).
I inherited an infrastructure that had NACLs and security groups with duplicate entries and policies, years of accumulated cruft, because it was poorly designed and the documentation was even worse (read: nonexistent), security groups all the way down. That one threw me for a hard and annoying mental loop for a couple of hours, until picking through it with a fine-toothed comb revealed what was going on.
The fun part is going to be rebuilding our routing in a new VPC such that it doesn't make the next guy want to put his head in a black hole.
I'd be lying if I said it wasn't a fun challenge in a sordid kind of way, though.
I guess it's a matter of preference, but I strongly prefer security groups over ACLs, which I don't use at all. Even if only from a compliance perspective, a security group is equivalent to a host firewall (which personally helps me with PCI - no need for iptables and Windows Firewall), whereas an ACL is a bit harder to make that case for. I also find security groups easier to audit.
I like using ACLs for my coarse-grained "this subnet is allowed to talk to this subnet" rules, and security groups for everything finer-grained. Maybe I'm over-cautious, but I don't want one rogue security group opening up a tunnel to sensitive subnets.
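To illustrate that coarse layer, a boto3 sketch that admits only one trusted subnet's CIDR into a sensitive subnet's ACL; everything else falls through to the ACL's implicit deny regardless of what any security group says (IDs and CIDRs are made up):

```python
import boto3

ec2 = boto3.client("ec2")

# Coarse rule: only the app subnet may talk to the sensitive subnet.
ec2.create_network_acl_entry(
    NetworkAclId="acl-0aaaaaaaaaaaaaaa0",  # hypothetical, sensitive subnet's ACL
    RuleNumber=100,
    Protocol="-1",            # all protocols
    RuleAction="allow",
    Egress=False,
    CidrBlock="10.0.1.0/24",  # the trusted app subnet
)
# Anything else hits the default deny (rule *), even if a rogue
# security group somewhere allows it.
```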
Yes, this is one of the best reasons to use network ACLs. (You can also achieve this with routes)
I think the idea is that separate teams with different responsibilities can manage the two different layers. Your app team may manage the security groups but the security team manages network ACLs which limit what can go into or come out of a subnet.
I'm slightly inclined to agree; it's one of those YMMV scenarios. What happened to me was some unholy combination of both going on, duplicating each other, in some cases weaving in and out of each other, with some bastard Frankenstein topology of route tables to nowhere...
Those were frightening times. Entire services would fall over, dogs and cats living together...
How do folks use Network ACLs? I haven't used them personally, relying more on security groups and segmenting subnets to specific tasks (e.g. attached to a public network via an IGW, or a private network only).
Network ACLs are quite tricky to debug. For one of my connections, network calls were failing because ESP was blocked at the ACL layer; the ACL blocked all non-TCP traffic by default. Funnily, network calls within the same data center were working but failed when calling another data center. I had to look at VPC flow logs to figure out that non-TCP protocols were being blocked.
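If anyone hits the same thing: the fix is an explicit ACL entry for IP protocol 50 (ESP), and flow logs on REJECTs are what make the dropped protocol visible. A boto3 sketch, IDs and names hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# ESP (IP protocol 50) isn't TCP, so it needs its own allow entries,
# in both directions since ACLs are stateless.
for egress in (False, True):
    ec2.create_network_acl_entry(
        NetworkAclId="acl-0123456789abcdef0",  # hypothetical
        RuleNumber=90, Protocol="50",          # 50 = ESP
        RuleAction="allow", Egress=egress,
        CidrBlock="0.0.0.0/0",
    )

# Flow logs of rejected traffic reveal which protocol is being dropped.
ec2.create_flow_logs(
    ResourceType="VPC",
    ResourceIds=["vpc-0123456789abcdef0"],  # hypothetical
    TrafficType="REJECT",
    LogDestinationType="cloud-watch-logs",
    LogGroupName="vpc-rejects",             # hypothetical log group
    DeliverLogsPermissionArn=(
        "arn:aws:iam::123456789012:role/flow-logs"  # hypothetical role
    ),
)
```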
Let me summarize further.
If you come from 20 years of application development and network design/administration in 'real' LAN and IGRP networks with 'real' hardware, you are going to be learning everything again.
These cloud end user environments are fake eggs and saccharine sweetener.