Design Considerations for VPCs on AWS
Experiences of building VPCs
Few areas of cloud infrastructure are more important to get right from the start than the IP address layout of one’s Virtual Private Cloud (VPC). VPC design has far-reaching implications for scaling, fault-tolerance and security. It also directly affects the flexibility of your infrastructure: paint yourself into a corner, and you’ll spend ungodly amounts of time migrating instances across subnets to free up address space.
Fortunately, it’s easier to lay out a VPC the right way than the wrong way. You just have to keep a few principles in mind.
Proper subnet layout is the key to a well-functioning VPC. Subnets determine routing, Availability Zone (AZ) distribution, and Network Access Control Lists (NACLs).
The most common mistake I’ve observed around VPC subnetting is treating a VPC like a data centre network. VPCs are not data centres. They are not switches. They are not routers. (Although they perform the jobs of all three.) A VPC is a software-defined network (SDN) optimised for moving massive amounts of packets into, out of and across AWS regions. Your packet is picked up at the front door and dropped off at its destination. It’s as simple as that.
Because of that simplicity, a number of data centre and networking-gear issues are eliminated at the outset.
A bit of history: when I first started building data centres in the ’90s, we had 10 Mb/s Ethernet switches. Ethernet uses Address Resolution Protocol (ARP) broadcasts to determine who’s where in the switch fabric, so network segments are chatty in direct proportion to the number of hosts in the broadcast domain. Anything beyond a couple of hundred hosts would start to degrade performance. That, combined with the counter-intuitive nature of IPv4 subnet math, had the practical effect that everyone used 24-bit subnets for their network segments. Three-octet addresses seemed to sit right in the sweet spot of all the constraints.
That thinking is no longer valid in a cloud environment. VPCs support neither broadcast nor multicast. What looks like ARP to the OS is actually the elegant function of the SDN. With that in mind, there is absolutely no reason to hack a VPC into 24-bit subnets. In fact, you have an important reason not to: waste. When you have a “middle-tier” subnet with 254 addresses available (or 128 or 64 or 32 or 16) and you only have 4 middle-tier hosts, the rest of those addresses are unavailable for the remainder of your workloads.
If instead you have a mixed-use subnet with 4,094 addresses, you can squeeze every last IP for autoscaling groups and more. Thus it behooves you to make your subnets as large as possible. Doing so gives you the freedom to dynamically allocate from an enormous pool of addresses.
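To put concrete numbers on that waste, here is a quick sketch using Python’s standard-library `ipaddress` module. The minus-two convention (network and broadcast addresses excluded) matches the host counts quoted above; note that AWS actually reserves a handful of additional addresses in each subnet, but the ratio is the point.

```python
import ipaddress

def usable_hosts(cidr: str) -> int:
    """Classic usable-host count: all addresses minus network and broadcast."""
    return ipaddress.ip_network(cidr).num_addresses - 2

# A /24 "middle-tier" subnet with only 4 hosts strands 250 addresses.
print(usable_hosts("10.0.0.0/24"))  # 254

# A /20 mixed-use subnet is one big pool every workload can draw from.
print(usable_hosts("10.0.0.0/20"))  # 4094
```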
Generally speaking, there are three primary reasons to create a new subnet:
- You need different hosts to route in different ways (for example, internal-only vs. public-facing hosts)
- You are distributing your workload across multiple AZs to achieve fault-tolerance. Always, ALWAYS do this.
- You have a security requirement that mandates NACLs on a specific address space (for example, the one in which the database with your customers’ personally identifiable information resides)
Let’s look at each of these factors in turn.
All hosts within a VPC can route to all other hosts within a VPC. Period. The only real question is what packets can route into and out of the VPC.
In fact, you could easily have a VPC that doesn’t allow packets to enter or leave at all. Just create a VPC without an Internet Gateway or Virtual Private Gateway. You’ve effectively black-holed it.
A VPC that can’t serve any network traffic would be of dubious value, so let’s just assume that you have an app that you’re making available to the Internet. You add an Internet Gateway and assign some Elastic IP addresses to your hosts. Does this mean they’re publicly accessible? No, it does not. You need to create a route table in which the Internet Gateway is the default route. You then need to apply that table to one or more subnets. After that, all hosts within those subnets will inherit the routing table. Anything destined for an IP block outside the VPC will go through the Internet Gateway, thus giving your hosts the ability to respond to external traffic.
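That resolution behaviour can be sketched in a few lines with Python’s standard-library `ipaddress` module. This is an illustration of how a route table picks a target via longest-prefix match, not AWS API code; the entry names are hypothetical.

```python
import ipaddress

# A route table like the public one described above: a local route for the
# VPC's own CIDR, plus a default route pointing at the Internet Gateway.
PUBLIC_ROUTES = [
    (ipaddress.ip_network("10.0.0.0/16"), "local"),           # in-VPC traffic
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),  # everything else
]

def next_hop(dest: str) -> str:
    """Resolve a destination the way a route table does: longest prefix wins."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, target) for net, target in PUBLIC_ROUTES if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("10.0.5.9"))      # local: stays inside the VPC
print(next_hop("198.51.100.7"))  # internet-gateway: out the front door
```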
That said, almost no app wants all its hosts to be publicly accessible. In fact, good security dictates the principle of least privilege. So any host that doesn’t absolutely need to be reachable directly from the outside world shouldn’t be able to send traffic directly out the front door. These hosts will need a different route table from the ones above.
Subnets can have only one route table (though route tables can be applied to more than one subnet). If you want one set of hosts to route differently from another, you need to create a new subnet and apply a new route table to it.
AWS provides geographic distribution out of the box in the form of Availability Zones (AZs). Every region has at least two.
Subnets cannot span multiple AZs. So to achieve fault tolerance, you need to divide your address space among the AZs evenly and create subnets in each. The more AZs, the better: if you have three AZs available, split your address space into four parts and keep the fourth segment as spare capacity.
In case it’s not obvious, the reason you need to divide your address space up evenly is so the layout of each AZ is the same as the others. When you create resources like autoscaling groups, you want them to be evenly distributed. If you create disjointed address blocks, you’re creating a maintenance nightmare for yourself and you will regret it later.
The first layer of defence in a VPC is the tight control you have over what packets can enter and leave.
Above the routing layer are two levels of complementary controls: Security Groups and NACLs. Security Groups are dynamic, stateful and capable of spanning the entire VPC. NACLs are stateless (meaning you need to define inbound and outbound ports), static and subnet-specific.
Generally, you only need both if you want to distribute change control authority over multiple groups of admins. For instance, you might want your sys admin team to control the security groups and your networking team to control the NACLs. That way, no one party can single-handedly defeat your network restrictions.
In practice, NACLs should be used sparingly and, once created, left alone. Given that they’re subnet-specific and punched down by IP addresses, the complexity of trying to manage traffic at this layer increases geometrically with each additional rule.
Security Groups are where the majority of work gets done. Unless you have a specific use-case like the ones described earlier, you’ll be better served by keeping your security as simple and straightforward as possible. That’s what Security Groups do best.
The above was meant as a set of abstract guidelines. I’d like to provide a concrete example to show how all this works together in practice.
The simplest way to lay out a VPC is to follow these steps:
- Evenly divide your address space across as many AZs as possible.
- Determine the different kinds of routing you’ll need and the relative number of hosts for each kind.
- Create identically-sized subnets in each AZ for each routing need. Give them the same route table.
- Leave yourself unallocated space in case you missed something. (Trust me on this one.)
So for our example, let’s create a standard n-tier app with web hosts that are addressable externally. We’ll use 10.0.0.0/16 as our address space.
The easiest way to lay out a VPC’s address space is to forget about IP ranges and think in terms of subnet masks.
For example, take the 10.0.0.0/16 address space above. Let’s assume you want to run across all three AZs available to you in us-west-2 so your Mongo cluster can achieve a reliable quorum. Doing this by address ranges would be obnoxious. Instead, you can simply say “I need four blocks—one for each of the three AZs and one spare.” Since subnet masks are binary, every bit you add to the mask divides your space in two. So if you need four blocks, you need two more bits. Your 16-bit becomes four 18-bits.
```
10.0.0.0/16:
    10.0.0.0/18   — AZ A
    10.0.64.0/18  — AZ B
    10.0.128.0/18 — AZ C
    10.0.192.0/18 — Spare
```
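You don’t even have to do the binary arithmetic by hand. A sketch with Python’s standard-library `ipaddress` module (the labels are just for illustration):

```python
import ipaddress

# Adding two bits to the /16 mask yields four /18 blocks:
# one per AZ, plus a spare.
vpc = ipaddress.ip_network("10.0.0.0/16")
plan = dict(zip(["AZ A", "AZ B", "AZ C", "Spare"],
                vpc.subnets(prefixlen_diff=2)))
for label, block in plan.items():
    print(label, block)
```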
Now within each AZ, you determine you want a public subnet, a private subnet and some spare capacity. Your publicly-accessible hosts will be far fewer in number than your internal-only ones, so you decide to give the public subnets half the space of the private ones. To create the separate address spaces, you just keep adding bits. To wit:
```
10.0.0.0/18 — AZ A
    10.0.0.0/19  — Private
    10.0.32.0/19
        10.0.32.0/20 — Public
        10.0.48.0/20 — Spare
```
Later on, if you want to add a “Protected” subnet with NACLs, you just subdivide your Spare space:
```
10.0.0.0/18 — AZ A
    10.0.0.0/19 — Private
    10.0.32.0/19
        10.0.32.0/20 — Public
        10.0.48.0/20
            10.0.48.0/21 — Protected
            10.0.56.0/21 — Spare
```
Just make sure whatever you do in one AZ, you duplicate in all the others:
```
10.0.0.0/16:
    10.0.0.0/18 — AZ A
        10.0.0.0/19 — Private
        10.0.32.0/19
            10.0.32.0/20 — Public
            10.0.48.0/20
                10.0.48.0/21 — Protected
                10.0.56.0/21 — Spare
    10.0.64.0/18 — AZ B
        10.0.64.0/19 — Private
        10.0.96.0/19
            10.0.96.0/20 — Public
            10.0.112.0/20
                10.0.112.0/21 — Protected
                10.0.120.0/21 — Spare
    10.0.128.0/18 — AZ C
        10.0.128.0/19 — Private
        10.0.160.0/19
            10.0.160.0/20 — Public
            10.0.176.0/20
                10.0.176.0/21 — Protected
                10.0.184.0/21 — Spare
    10.0.192.0/18 — Spare
```
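The per-AZ carve-up is mechanical enough to generate rather than hand-calculate. A sketch using Python’s standard-library `ipaddress` module, with the tier names taken from the layout above:

```python
import ipaddress

def az_layout(az_block: str) -> dict:
    """Split an AZ's /18 into Private /19, Public /20, Protected /21, Spare /21."""
    block = ipaddress.ip_network(az_block)
    private, rest = block.subnets(new_prefix=19)    # halve: /19 + /19
    public, rest = rest.subnets(new_prefix=20)      # halve the rest: /20 + /20
    protected, spare = rest.subnets(new_prefix=21)  # halve again: /21 + /21
    return {"Private": private, "Public": public,
            "Protected": protected, "Spare": spare}

# AZ B starts at 10.0.64.0/18 and comes out the same shape as AZ A.
for tier, net in az_layout("10.0.64.0/18").items():
    print(tier, net)
```

Running the same function over each AZ’s /18 guarantees the “duplicate in all the others” rule by construction.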
Your routing tables would look like this:
```
“Public”
    10.0.0.0/16 — Local
    0.0.0.0/0   — Internet Gateway

“Internal-only” (i.e., Protected and Private)
    10.0.0.0/16 — Local
```
Create those two route-tables and then apply them to the correct subnets in each AZ. You’re done.
And in case anyone on your team gets worried about running out of space, show them this table:
```
16-bit: 65,534 addresses
18-bit: 16,382 addresses
19-bit:  8,190 addresses
20-bit:  4,094 addresses
```
Obviously, you’re not going to need 4,000 IP addresses for your web servers. That’s not the point. The point is that this VPC has only those routing requirements. There’s no reason to create new subnets in this VPC that don’t need to route differently within the same AZ.
Done properly, this method of planning goes a long way to ensuring you won’t get boxed in by an early decision. Everything that you’ll get into from here — Security Groups, Auto Scaling, Elastic Load Balancing, Amazon Relational Database Service, AWS Direct Connect, and more — will fit neatly into this model.