VPC Lattice: Full NAT for Third-Party Connectivity

·3103 words·15 mins
Sergio Cambelo · Cloud Architect

When working in enterprise environments with AWS, one of the recurring challenges is third-party connectivity: vendors, partners, customers, or even internal legacy systems operating outside our control. In an ideal world, every party would have a unique, well-planned IP address space. In practice, this rarely happens.

The problem is well-known: we cannot guarantee that third-party CIDR ranges won’t overlap with ours. And when there’s overlap, direct IP connectivity simply doesn’t work — routing tables can’t distinguish between a local destination and a remote one if both share the same range.

In on-premises networks, this problem is commonly solved with SNAT and DNAT on a border firewall or router. The device translates source and destination addresses, creating a bridge between two otherwise incompatible address spaces. It’s a mature, well-understood pattern.

How do we bring this pattern to AWS? Historically, the options haven’t been particularly elegant:

  • NAT instances: EC2 instances performing NAT manually. Functional, but hard to scale, maintain, and make highly available.
  • NLB + PrivateLink: Allows exposing services without overlap, but requires a Network Load Balancer per service, doesn’t easily solve the bidirectional case, and scales poorly when the number of services to expose is high — each service needs its own NLB and endpoint, multiplying costs and operational complexity.
  • Complex routing tables with secondary CIDRs: Adding non-overlapping ranges to VPCs and routing through them. Works, but operational complexity grows quickly.

None of these solutions offer a truly cloud-native experience. This is where Amazon VPC Lattice with its VPC Resources capability comes in: a way to expose and consume resources across VPCs that, by design, performs NAT transparently in the service’s data plane.

In this article, we’ll see how to leverage VPC Lattice and VPC Resources to build a full bidirectional NAT pattern that solves the CIDR overlap problem with third parties in a scalable, managed way.

VPC Lattice and VPC Resources
#

Before diving into the architecture, let’s review the VPC Lattice components we’ll be using. If you’re already familiar with the service, feel free to skip ahead to the architecture section.

Service Network
#

A Service Network is the central logical grouping in VPC Lattice. It acts as a shared connectivity plane to which services, resources, and VPCs are associated. Everything associated with the same Service Network can communicate with each other, subject to the authentication and authorization policies configured.

VPCs can connect to a Service Network in two ways:

  • VPC Association: direct association of the VPC to the Service Network. Simple, but only traffic originating within the VPC itself can reach the Service Network — it’s not transitive.
  • Service Network Endpoint: a VPC Endpoint of type ServiceNetwork created in the consumer VPC. Unlike a VPC Association, it allows traffic that arrives at the VPC from outside — through VPC Peering, Transit Gateway, Direct Connect, or VPN — to reach the Service Network resources.

The difference is analogous to VPC Peering versus Transit Gateway: with VPC Peering connectivity is not transitive, while with Transit Gateway traffic from other networks can flow through it. In our third-party use case, where traffic arrives from external networks, the Service Network Endpoint is the necessary option.

Resource Configuration
#

A Resource Configuration defines a specific resource we want to make accessible through the Service Network. Unlike a VPC Lattice Service (which requires target groups, listeners, and rules), a Resource Configuration allows directly exposing:

  • An IP address (ip_resource)
  • A domain name (dns_resource)
  • An ARN of an AWS resource (arn_resource)

Each Resource Configuration specifies the protocol (currently only TCP) and the port ranges on which the resource is accessible.
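As a rough Terraform sketch of an IP-based Resource Configuration (assuming a recent AWS provider with VPC Lattice VPC Resources support; the names and the IP address are illustrative, and argument names may differ slightly between provider versions):

```hcl
# Illustrative sketch: expose a single private IP on TCP/5432 through
# VPC Lattice. Resource and argument names assume a recent AWS provider.
resource "aws_vpclattice_resource_configuration" "db" {
  name                        = "example-db" # hypothetical name
  resource_gateway_identifier = aws_vpclattice_resource_gateway.this.id
  type                        = "SINGLE"
  protocol                    = "TCP"     # only TCP is supported
  port_ranges                 = ["5432"]  # a port, or a range like "1024-2048"

  resource_configuration_definition {
    ip_resource {
      ip_address = "10.0.1.10" # the actual resource's private IP
    }
  }
}
```

A `dns_resource` or `arn_resource` block would take the place of `ip_resource` for the other two resource types.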

Important limitation: VPC Lattice only supports TCP traffic. This means protocols based on UDP such as DNS (port 53/UDP), SNMP, syslog, or any protocol requiring UDP are not compatible with this pattern. If your use case requires UDP, you’ll need to resort to alternatives like NAT instances or NLB with PrivateLink.

Resource Gateway
#

A Resource Gateway is the entry point to the resource within the VPC where it resides. It’s deployed in the VPC’s subnets and acts as an intermediary: it receives traffic arriving from the Service Network and directs it to the actual resource.

It’s important to understand that the Resource Gateway is created in the resource provider’s VPC — that is, in the VPC where the service we want to expose is located.
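A minimal Terraform sketch of a Resource Gateway in the provider VPC (identifiers are hypothetical; check your AWS provider version for the exact arguments):

```hcl
# The Resource Gateway lives in the VPC that hosts the resource being
# exposed, spanning one subnet per AZ with its own security group.
resource "aws_vpclattice_resource_gateway" "this" {
  name               = "example-rgw"             # hypothetical name
  vpc_id             = aws_vpc.provider.id       # resource provider's VPC
  subnet_ids         = aws_subnet.provider[*].id # one subnet per AZ
  security_group_ids = [aws_security_group.rgw.id]
}
```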

Service Network Endpoint
#

The Service Network Endpoint is a VPC Endpoint of type ServiceNetwork created in the consumer VPC. Through this endpoint, VPC clients can access all resources and services associated with the Service Network.

The key difference from a VPC Association is that the Service Network Endpoint allows traffic arriving at the VPC from outside (via Transit Gateway, VPN, Direct Connect, or VPC Peering) to reach the Service Network. With a simple VPC Association, only traffic originating within the VPC itself has access.
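In Terraform, the Service Network Endpoint is a regular `aws_vpc_endpoint` with the ServiceNetwork type. A sketch with hypothetical names, assuming a provider version that supports `service_network_arn`:

```hcl
# A VPC Endpoint of type ServiceNetwork in the consumer VPC. Unlike a
# plain VPC Association, it also serves traffic that enters the VPC via
# Transit Gateway, VPN, Direct Connect, or VPC Peering.
resource "aws_vpc_endpoint" "sne" {
  vpc_id              = aws_vpc.consumer.id
  vpc_endpoint_type   = "ServiceNetwork"
  service_network_arn = aws_vpclattice_service_network.this.arn
  subnet_ids          = aws_subnet.consumer[*].id
  security_group_ids  = [aws_security_group.sne.id]
}
```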

The implicit NAT
#

And here’s the key to the entire pattern: when traffic flows through VPC Lattice — whether through a Service or a Resource Configuration — the service performs SNAT transparently. The destination resource sees a VPC Lattice address as the source IP, not the client’s real IP.

This means it doesn’t matter if the VPC CIDRs overlap: traffic is never routed directly between them. VPC Lattice acts as an intermediary and translates addresses, exactly as a border firewall would in an on-premises network.

Architecture: the dual VPC pattern
#

Now that we know the components, let’s see how they combine to solve third-party connectivity.

The architecture is based on two VPCs connected through a VPC Lattice Service Network:

Architecture diagram of the dual VPC pattern with VPC Lattice. Shows the 3P Hub VPC (left) with third-party connectivity via VPN and TGW Peering, connected through a VPC Lattice Service Network (center) with Resource Configurations, to the 3P Inspection VPC (right) with Network Firewall and connection to the Central Transit Gateway. Each VPC has a Resource Gateway and a Service Network Endpoint for bidirectional communication.

3P Hub VPC
#

The 3P Hub VPC is the third-party landing zone. This is where partner, vendor, or legacy system connections terminate:

  • Connectivity: Site-to-site VPN, Transit Gateway Peering, Transit Gateway Edge (for Direct Connect) — depending on each third party’s needs.
  • CIDR: This is a critical point. This VPC’s CIDR must be compatible with all third parties connecting to it. Several strategies exist:
    • Use a CGNAT range (100.64.0.0/10), which is rarely used in corporate networks.
    • Use an uncommon private range that doesn’t overlap with known third parties.
    • In extreme cases, use a privately-used public IP range (if one is available).
  • Size: The VPC can be relatively small, since it doesn’t host workloads — only connectivity components. That said, don’t undersize it: as discussed in the subnet sizing section below, a /23 across two AZs is a safer minimum, because Resource Gateways and Service Network Endpoints reserve address space in /28 blocks.

This VPC deploys:

  • A Service Network Endpoint to consume internal services exposed through the Service Network.
  • A Resource Gateway to expose third-party resources to the Service Network.

3P Inspection VPC
#

The 3P Inspection VPC is the internal side of the architecture. It’s connected to the corporate network through the Central Transit Gateway (or Cloud WAN, depending on the network topology).

In a production environment, this VPC would include an AWS Network Firewall to inspect all traffic between the internal network and third parties. In this article’s demo, we’ll simplify the architecture by omitting the firewall to keep the focus on the VPC Lattice pattern. If you want to dive deeper into traffic inspection with Network Firewall, check out my article on Global Security Inspection with AWS Cloud WAN.

This VPC deploys:

  • A Resource Gateway to expose internal services to the Service Network (so third parties can consume them).
  • A Service Network Endpoint to consume third-party resources exposed through the Service Network.

VPC Lattice Service Network
#

The Service Network acts as a bridge between the two VPCs. Resource Configurations define what’s exposed in each direction:

  • Third-party resources → exposed from the 3P Hub VPC via its Resource Gateway, consumed from the 3P Inspection VPC via its Service Network Endpoint.
  • Internal services → exposed from the 3P Inspection VPC via its Resource Gateway, consumed from the 3P Hub VPC via its Service Network Endpoint.
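Wiring a Resource Configuration into the Service Network can be sketched like this in Terraform (hypothetical names; the `aws_vpclattice_service_network_resource_association` resource appears in recent AWS provider versions):

```hcl
resource "aws_vpclattice_service_network" "this" {
  name      = "third-party-sn" # hypothetical name
  auth_type = "NONE"           # or "AWS_IAM" to enforce auth policies
}

# Once associated, the Resource Configuration becomes reachable through
# the Service Network Endpoints in the opposite VPC.
resource "aws_vpclattice_service_network_resource_association" "internal" {
  resource_configuration_identifier = aws_vpclattice_resource_configuration.internal.id
  service_network_identifier        = aws_vpclattice_service_network.this.id
}
```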

The fundamental point is that there is no direct IP connectivity between the two VPCs. All traffic passes through the VPC Lattice data plane, which performs SNAT transparently. This is what allows both VPCs — and even the third parties behind the Hub VPC — to have overlapping CIDRs without any issues.

Traffic flows
#

Let’s look in detail at how traffic flows in each direction.

Consuming a third-party service from the internal network
#

When an internal resource needs to access a service exposed by a third party, the flow is:

  1. Traffic leaves the internal resource and reaches the 3P Inspection VPC through the Central Transit Gateway.
  2. In the 3P Inspection VPC, traffic is directed to the Service Network Endpoint, which sends it to the VPC Lattice Service Network.
  3. VPC Lattice identifies the corresponding Resource Configuration and routes traffic to the Resource Gateway in the 3P Hub VPC.
  4. The Resource Gateway delivers traffic to the third-party resource. The source IP the third party sees is a VPC Lattice address, not the internal resource’s real IP.
  5. The response follows the reverse path.

Exposing an internal service to the third party
#

When a third party needs to access a service we expose, the flow is reversed:

  1. Third-party traffic arrives at the 3P Hub VPC through VPN, TGW Peering, or Direct Connect.
  2. In the 3P Hub VPC, traffic is directed to the Service Network Endpoint, which sends it to the VPC Lattice Service Network.
  3. VPC Lattice identifies the corresponding Resource Configuration and routes traffic to the Resource Gateway in the 3P Inspection VPC.
  4. The Resource Gateway delivers traffic to the internal service. The source IP our service sees is a VPC Lattice address, not the third party’s real IP.
  5. The response follows the reverse path.

In both cases, VPC Lattice acts as a transparent proxy with NAT, eliminating any dependency on each party’s IP address space.

Considerations
#

  • Latency: VPC Lattice adds an extra hop in the traffic path. In most cases, the added latency is minimal (in the order of milliseconds), but it’s important to consider for latency-sensitive applications.
  • Throughput: VPC Lattice scales automatically, but it’s worth reviewing the service quotas to ensure they meet bandwidth requirements.
  • Health checks: Resource Configurations don’t have native health checks like Target Groups in a VPC Lattice Service. If you need failure detection, you’ll need to implement it at the application level or through external monitoring.

Demo
#

Let’s deploy a demo that illustrates the complete pattern.

Note: If you plan to follow the demo, be aware that it generates AWS costs. The main cost components are VPC Endpoints (ServiceNetwork type), Resource Gateways, and VPC Lattice data processing. Don’t forget to delete resources when you’re done.

Demo diagram: two VPCs (VPC A 10.0.0.0/23 and VPC B 10.0.2.0/23) connected through a VPC Lattice Service Network. Each VPC has an EC2 instance with a web server, a Resource Gateway, a Service Network Endpoint, and a VPC Association to the Service Network. Resource Configurations expose each instance on TCP port 80.

The demo deploys:

  • VPC A (10.0.0.0/23): simulates the 3P Hub VPC. Contains an EC2 instance with a web server (Python HTTP server).
  • VPC B (10.0.2.0/23): simulates the 3P Inspection VPC (without firewall). Contains another EC2 instance with a web server.
  • VPC Lattice Service Network: connects both VPCs.
  • Resource Configurations: one per resource (each VPC’s instance), exposing port 80.
  • Resource Gateways: one per VPC, to expose the local resource.
  • Service Network Endpoints: one per VPC, to consume the remote resource. This allows transitive traffic (from VPN, Direct Connect, or Transit Gateway) to reach the Service Network.
  • VPC Associations: one per VPC, required for Resource Gateways to serve traffic to the Service Network.

Each VPC therefore needs both attachments: the VPC Association lets the Resource Gateway serve traffic to the Service Network (provider role), while the Service Network Endpoint lets traffic arriving from VPN, Direct Connect, or Transit Gateway reach the Service Network (transitive consumer role).
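The provider-side attachment is a plain VPC Association; sketched in Terraform for one of the two VPCs (names are hypothetical):

```hcl
# Provider role: the VPC Association allows the Resource Gateway in this
# VPC to serve traffic to the Service Network. The consumer role is
# covered separately by a ServiceNetwork-type VPC Endpoint.
resource "aws_vpclattice_service_network_vpc_association" "vpc_a" {
  vpc_identifier             = aws_vpc.a.id
  service_network_identifier = aws_vpclattice_service_network.this.id
  security_group_ids         = [aws_security_group.vpc_a_assoc.id]
}
```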

The result is that the instance in VPC A can access the web server in VPC B and vice versa, without any direct IP connectivity between the two VPCs. All traffic passes through VPC Lattice, which performs SNAT transparently.

You can access the demo code here:

scambelo/aws-vpc-lattice-full-nat-demo


Deployment
#

terraform init
terraform apply

Note: This deployment can take between 5 and 10 minutes. VPC Endpoints and Resource Gateways need time to provision.

Verification
#

Once the infrastructure is deployed, let’s verify that bidirectional connectivity works and that NAT is applied correctly.

With Service Network Endpoints, each resource’s DNS has an endpoint-specific format: vpce-xxx-snra-xxx.rcfg-xxx...vpc-lattice-rsc.region.on.aws. This DNS is publicly resolvable and points to the secondary IPs of the endpoint’s ENIs within your VPC. You can obtain it with aws ec2 describe-vpc-endpoint-associations.

  1. Connect to Instance A through SSM Session Manager. This instance is in VPC A (10.0.0.0/23).

    First, check the instance’s local IP:

    sh-5.2$ hostname -I
    10.0.0.137
    

    Now, access Instance B’s web server through the DNS that the Service Network Endpoint assigns to the Resource Configuration:

    sh-5.2$ curl http://vpce-0835440315e44ed1a-snra-06828d84e5edce152.rcfg-02071b3b63c375b09.4232ccc.vpc-lattice-rsc.eu-west-1.on.aws
    Hello from Instance B (10.0.2.45)
    

    The connection works despite there being no direct IP connectivity between the two VPCs.

  2. Repeat the test in the opposite direction: from Instance B, access Instance A’s web server:

    sh-5.2$ hostname -I
    10.0.2.45
    
    sh-5.2$ curl http://vpce-0739555d6f9d529a9-snra-03a882c66a1033b4b.rcfg-08c054c34657731be.4232ccc.vpc-lattice-rsc.eu-west-1.on.aws
    Hello from Instance A (10.0.0.137)
    

    Bidirectional connectivity works correctly.

  3. Now let’s verify the NAT. From Instance B, run tcpdump while Instance A makes a request:

    sh-5.2$ tcpdump -i any -c 5 port 80 -nn
    12:59:31.408712 ens5  In  IP 10.0.2.166.58602 > 10.0.2.45.80: Flags [S], seq 3388593410, win 62727, length 0
    12:59:31.408789 ens5  Out IP 10.0.2.45.80 > 10.0.2.166.58602: Flags [S.], seq 4158802459, ack 3388593411, win 62643, length 0
    12:59:31.409942 ens5  In  IP 10.0.2.166.58602 > 10.0.2.45.80: Flags [.], ack 1, win 491, length 0
    12:59:31.409986 ens5  In  IP 10.0.2.166.58602 > 10.0.2.45.80: Flags [P.], seq 1:174, ack 1, win 491, length 173: HTTP: GET / HTTP/1.1
    12:59:31.409994 ens5  Out IP 10.0.2.45.80 > 10.0.2.166.58602: Flags [.], ack 174, win 489, length 0
    

    The source IP that Instance B sees is 10.0.2.166 — the Resource Gateway’s IP in VPC B, not Instance A’s real IP (10.0.0.137). This confirms that VPC Lattice performs SNAT through the Resource Gateway, hiding the real source IP.

Feel free to modify the provided Terraform code to experiment with different configurations: change the CIDRs, add more Resource Configurations, or try DNS-type resources instead of IP.

Remember to destroy the Terraform resources when you’re done using them.

terraform destroy

Considerations and limitations
#

Before adopting this pattern, it’s important to consider the following:

TCP only
#

As mentioned, VPC Lattice only supports TCP traffic. If your third-party use case requires UDP (e.g., DNS over UDP, SNMP, syslog, or streaming protocols), this pattern is not applicable. In those cases, the alternatives remain:

  • NAT instances with iptables/nftables rules for SNAT/DNAT.
  • NLB with PrivateLink for specific services (supports TCP and UDP, but provides only DNAT, without the full SNAT + DNAT transparency of Lattice).

Service quotas
#

VPC Lattice has quotas worth reviewing before scaling the pattern:

  • Resource Configurations per Service Network.
  • Resource Gateways per VPC.
  • Service Network Endpoints per VPC.

Check the current quotas in the official documentation.

Subnet sizing
#

Resource Gateways and Service Network Endpoints consume IP addresses from the subnets where they’re deployed. According to the AWS documentation:

“We assign IP addresses to each elastic network interface from its subnet in multiples of /28. The number of IP addresses assigned in each subnet depends on the number of resource configurations and we add additional IPs in /28 blocks as needed.”

In our empirical testing we observed that:

  • The Resource Gateway consumes 16 IPs (one /28 block) per subnet, regardless of the number of Resource Configurations it serves.
  • The Service Network Endpoint consumes few visible IPs per ENI, but AWS internally reserves /28 blocks of address space for each Resource Configuration associated with the Service Network.

This means that although the visible assigned IPs are few, AWS needs contiguous available space in the subnet for its internal infrastructure. In our tests, a /24 VPC with two /25 subnets (123 usable IPs per subnet) failed with InsufficientFreeAddressesInSubnet with only 2 Resource Configurations across 2 AZs, while a /23 (two /24 subnets) worked correctly even with 3 Resource Configurations. We therefore recommend /23 as a minimum for this pattern with 2 AZs, and planning subnets with headroom, since AWS reserves address space internally in /28 blocks that are not visible as assigned IPs on the ENIs.

As of today, AWS does not document the exact formula for how much address space each Resource Configuration consumes in the Service Network Endpoint. If your use case involves exposing dozens of services through this pattern, we recommend validating subnet sizing in a test environment before going to production.
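For the sizing itself, carving a /23 VPC into two /24 subnets (one per AZ) can be expressed with Terraform’s cidrsubnet function — a sketch to adapt to your own AZ count:

```hcl
locals {
  vpc_cidr = "10.0.0.0/23"

  # cidrsubnet(prefix, newbits, netnum): adding 1 bit to a /23 yields /24s.
  subnet_cidrs = [for i in range(2) : cidrsubnet(local.vpc_cidr, 1, i)]
  # → ["10.0.0.0/24", "10.0.1.0/24"]
}
```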

Comparison with PrivateLink + NLB
#

                          PrivateLink + NLB           VPC Lattice SNE
NLB required              Yes (one per service)       No
Endpoints required        1 per service               1 for all services
Transparent NAT           No (DNAT only)              Yes (full SNAT + DNAT)
Transitive (VPN/TGW)      No                          Yes
Subnet sizing             1 IP per endpoint per AZ    Larger subnets (see note above)

VPC Lattice simplifies operations and offers transparent NAT with transitivity, but consumes more IP address space than classic PrivateLink. It’s a trade-off to consider in environments with IP addressing constraints.

Costs
#

VPC Lattice costs consist of:

  • Hourly cost for VPC Endpoints (ServiceNetwork type) and Resource Gateways.
  • Per-GB cost for processed data.

For high-traffic scenarios, it’s worth comparing VPC Lattice costs with alternatives like NAT instances (EC2 cost) or NLB + PrivateLink (NLB + PrivateLink hourly and per-GB cost).

Multiple third parties
#

The pattern scales well for multiple third parties. There are two strategies:

  • A single Hub VPC for all third parties: simpler to manage, but requires the Hub VPC’s CIDR to be compatible with all of them. Additionally, segmentation between third parties depends on VPC Lattice policies and security groups.
  • One Hub VPC per third party: greater isolation, but more components to manage. Each Hub VPC would have its own Resource Gateway and Service Network Endpoint.

The choice depends on the required isolation level and the number of third parties.

Wrapping up
#

In this article, we’ve seen how VPC Lattice and VPC Resources allow us to solve the classic CIDR overlap problem in third-party connectivity in a cloud-native way. The dual VPC pattern with Resource Gateways and Service Network Endpoints provides full bidirectional NAT without needing to manage NAT instances, complex routing tables, or additional network appliances.

The key to the pattern is that VPC Lattice acts as a transparent intermediary, performing SNAT on all traffic passing through its data plane. This eliminates the dependency on IP address space and allows connecting networks with overlapping CIDRs in a fully managed way.

Did you enjoy this article? Subscribe to the newsletter and be the first to know when a new one is published!

