IPsec VPN Explained: IKE Phases, Security Associations, and Route vs Policy-Based Tunnels
A Cloud Architect’s Deep Dive: AWS Solutions Architect Perspective
The Real Problem: Two Sites, One Untrusted Network
Picture this: you have two office locations, Business Site 1 in London and Business Site 2 in Frankfurt, and you need them to communicate securely. Between them sits the public internet: shared, observable, and completely outside your control. Anyone on-path can potentially intercept, modify, or replay the traffic.
The naive solution (just send the data) is obviously unacceptable. The slightly-less-naive solution (encrypt it somehow) runs straight into a fundamental question: how do two parties that have never met before agree on an encryption key without that key being interceptable in transit?
That is the problem IPsec solves. Not just encryption, but the entire process of establishing a secure, authenticated, integrity-protected channel between two peers who start with no shared session keys.
IPsec is not a single protocol. It is a suite of protocols and mechanisms that work together to provide:
Authentication — confirming each peer is who it claims to be
Integrity — ensuring packets have not been tampered with in transit
Confidentiality — encrypting the payload so intercepted traffic is unreadable
Anti-replay protection — preventing recorded packets from being reinjected
At its core, an IPsec VPN establishes a secure tunnel between two gateways (a local peer and a remote peer). Everything that follows in this article is the machinery behind how that tunnel gets built and maintained.
Hi, this is Pushpit from CloudOdyssey. Each week, I write deep dives on Cloud, DevOps, and Systems Design, along with community updates around them.
Before We Touch IKE: Encryption Fundamentals You Need to Understand
If you skip this section, the IKE phases will feel like magic steps rather than logical engineering decisions. Don’t skip it.
Symmetric Encryption
Symmetric encryption uses the same key to both encrypt and decrypt. AES is the canonical example. It is fast, computationally cheap, and well-suited to encrypting large volumes of data, which is exactly what a VPN tunnel does once it’s established.
The problem is obvious: if both sides need the same key, how does one side get the key to the other side without someone intercepting it?
Asymmetric Encryption
Asymmetric encryption uses a key pair: a public key and a private key. Data encrypted with the public key can only be decrypted by the corresponding private key. You can hand your public key to anyone in the world and it doesn’t compromise your security.
Asymmetric encryption solves the key distribution problem, but it comes with a cost: it is orders of magnitude slower than symmetric encryption. Using RSA to bulk-encrypt a 1 Gbps tunnel is not viable.
The Practical Solution: Use Asymmetric to Bootstrap Symmetric
The engineering answer is to use asymmetric cryptography to securely establish a shared symmetric key, then use that symmetric key for all actual data encryption. This is the foundation of almost every secure protocol in use today: TLS, SSH, and yes, IPsec.
Diffie-Hellman: How Two Parties Independently Arrive at the Same Key
Diffie-Hellman (DH) is the specific mechanism IPsec uses to solve the key exchange problem. Here’s the high-level concept without the maths:
Both peers agree publicly on a DH group (essentially a set of mathematical parameters)
Each peer independently generates a private key, which never leaves the device
Each peer derives a public key from their private key using the agreed parameters
The public keys are exchanged over the network (they can be intercepted; that’s fine)
Each peer then independently combines their own private key with the remote peer’s public key and both arrive at the same shared value
The critical property: an eavesdropper who intercepts both public keys cannot derive the shared value without knowing at least one private key. This is the Diffie-Hellman problem, and it is computationally hard to break with properly sized groups.
That shared DH value is then used as the seed material for generating symmetric keys. Neither side ever transmitted the symmetric key; they both computed it independently.
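The exchange described above can be sketched in a few lines of Python. This is a toy illustration only: the modulus P and generator G are assumptions chosen for readability, not a real IKE DH group, and they are far too small for production use.

```python
import secrets

# Toy public parameters (assumed for illustration; NOT a real IKE DH group).
P = 2**127 - 1   # a known Mersenne prime, used here as the public modulus
G = 3            # public generator


def generate_keypair():
    """Return (private, public). The private value never leaves the device."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)              # safe to send over the network
    return private, public


def shared_secret(own_private, peer_public):
    """Combine our private key with the peer's public key."""
    return pow(peer_public, own_private, P)


# Each site generates its own key pair and only the public halves are exchanged.
site1_private, site1_public = generate_keypair()
site2_private, site2_public = generate_keypair()

site1_shared = shared_secret(site1_private, site2_public)
site2_shared = shared_secret(site2_private, site1_public)
```

Both sites end up holding the same value even though it was never transmitted, which is the property the rest of IKE builds on.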
With that in place, we can now talk about IKE.
IKE Phase 1: Building the Foundation (IKE Security Association)
IKE stands for Internet Key Exchange. It is the protocol that manages the negotiation and establishment of IPsec tunnels. Phase 1 has one job: establish a secure, authenticated channel between the two VPN peers. Everything transmitted in Phase 2 will be protected by this channel.
This is a deliberate design choice. Phase 1 is the heavy lifting. It is slower and more computationally intensive because it has to solve the hardest problems: authentication and key exchange from a cold start.
Step 1: Authentication
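Each peer needs to prove its identity. Conceptually this comes first; on the wire, IKEv1 main mode and IKEv2 send the identity proof after the DH exchange so that it travels encrypted. There are two main methods: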
Pre-Shared Key (PSK): Both sides have the same passphrase configured out-of-band. It’s simpler to set up but less scalable. If the PSK is weak or leaked, the whole tunnel is compromised.
Certificates (PKI): Each peer has a certificate issued by a Certificate Authority. This is more operationally complex but significantly more scalable and auditable, and it is preferred in enterprise environments.
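To make the PSK option less abstract, here is a minimal sketch of proving knowledge of a pre-shared key without sending it. It is a deliberate simplification: real IKE mixes the PSK into a PRF over much more of the exchange, and the function names and message layout below are hypothetical.

```python
import hashlib
import hmac
import secrets


def auth_proof(psk: bytes, own_nonce: bytes, peer_nonce: bytes, identity: bytes) -> bytes:
    """Build a keyed hash that only a holder of the PSK could produce."""
    return hmac.new(psk, own_nonce + peer_nonce + identity, hashlib.sha256).digest()


def verify_proof(psk: bytes, proof: bytes, sender_nonce: bytes,
                 receiver_nonce: bytes, sender_identity: bytes) -> bool:
    """Recompute the sender's proof locally and compare in constant time."""
    expected = auth_proof(psk, sender_nonce, receiver_nonce, sender_identity)
    return hmac.compare_digest(expected, proof)


# Both gateways were configured with the same passphrase out-of-band.
psk = b"example-pre-shared-key"
london_nonce = secrets.token_bytes(16)
frankfurt_nonce = secrets.token_bytes(16)

# London proves its identity to Frankfurt; the PSK itself is never transmitted.
london_proof = auth_proof(psk, london_nonce, frankfurt_nonce, b"london-gw")
accepted = verify_proof(psk, london_proof, london_nonce, frankfurt_nonce, b"london-gw")
rejected = verify_proof(b"wrong-key", london_proof, london_nonce, frankfurt_nonce, b"london-gw")
```

Binding the proof to fresh nonces means a recorded proof from an earlier negotiation cannot simply be replayed.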
Steps 2–5: The Diffie-Hellman Exchange
As illustrated in the diagram below, once identity is established, the DH key exchange proceeds:
Concretely:
Each side generates a DH private key locally
Each side derives a corresponding DH public key
Public keys are exchanged across the public internet (unencrypted at this point, which is fine by design)
Each side independently uses its own private key and the peer’s public key to compute the shared DH value
Both sides arrive at the same DH key without ever transmitting it
Steps 6–7: Key Material Exchange and Symmetrical Key Derivation
Using the shared DH key, both peers now encrypt further exchanges, passing additional key material, nonces, and algorithm proposals. From this combined material, both sides independently derive the same symmetrical encryption key.
This symmetrical key is what protects the Phase 1 tunnel itself. All further communication between the peers, including the entire Phase 2 negotiation, happens inside this encrypted channel.
The result is the IKE Security Association (IKE SA): a bidirectional, authenticated, encrypted control channel between the two peers. It is sometimes called the Phase 1 tunnel, though technically it’s the management channel through which Phase 2 is negotiated.
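As a rough sketch of how a DH value and two nonces become symmetric keys, the snippet below follows the general shape of IKEv2's prf+ expansion (RFC 7296) using HMAC-SHA256. It is simplified on purpose: real IKEv2 also mixes in the SPIs and splits the output into seven separate keys, and the function derive_phase1_keys and its two-key output are assumptions made for illustration.

```python
import hashlib
import hmac


def prf(key: bytes, data: bytes) -> bytes:
    """Pseudo-random function; HMAC-SHA256 is assumed here."""
    return hmac.new(key, data, hashlib.sha256).digest()


def prf_plus(key: bytes, seed: bytes, length: int) -> bytes:
    """Expand key material: T1 = prf(K, S|0x01), T2 = prf(K, T1|S|0x02), ..."""
    output = b""
    block = b""
    counter = 1  # single-byte counter, plenty for the short outputs used here
    while len(output) < length:
        block = prf(key, block + seed + bytes([counter]))
        output += block
        counter += 1
    return output[:length]


def derive_phase1_keys(dh_shared: bytes, nonce_i: bytes, nonce_r: bytes) -> dict:
    """Derive an encryption key and an integrity key for the IKE SA (simplified)."""
    skeyseed = prf(nonce_i + nonce_r, dh_shared)
    keymat = prf_plus(skeyseed, nonce_i + nonce_r, 64)
    return {"encryption": keymat[:32], "integrity": keymat[32:]}


# Both peers feed in the same DH value and nonces, so they derive identical keys.
dh_value = (123456789).to_bytes(16, "big")
initiator_keys = derive_phase1_keys(dh_value, b"nonce-from-site1", b"nonce-from-site2")
responder_keys = derive_phase1_keys(dh_value, b"nonce-from-site1", b"nonce-from-site2")
```

The nonces matter: even if the same DH value were reused, fresh nonces would produce different keys for each negotiation.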
Why is Phase 1 slow? Because asymmetric cryptography and DH computation are expensive. Phase 1 exists to pay that cost once. Phase 2 piggybacks on that investment.
IKE Phase 2: The IPsec Security Association (Operational Tunnel)
Phase 2 runs entirely inside the protection of Phase 1. Its purpose is to negotiate the parameters for the actual data-carrying tunnel and generate the keys that will encrypt your traffic.
What Happens in Phase 2
Step 1: Negotiate encryption and integrity algorithms. Both sides communicate their supported algorithms (e.g., AES-256, SHA-256) and agree on the strongest set they share. This negotiation is encrypted by the Phase 1 symmetrical key.
Step 2: Pass additional key material. Using the existing DH key and newly exchanged nonces, both sides derive fresh keying material. Importantly, the IPsec keys used to encrypt actual traffic are derived separately from (though seeded by) the Phase 1 keys. This provides key separation: compromising the data plane key does not directly expose the control plane key.
Step 3: Generate the IPsec key. The DH key material plus the exchanged nonces are combined to produce a symmetrical IPsec key. Both sides derive this independently and arrive at the same value.
Step 4: Bulk encryption of interesting traffic. The IPsec key is now used to encrypt all traffic matching the VPN’s traffic selectors. This is the operational tunnel, the one your application data actually flows through.
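The algorithm negotiation in Step 1 can be pictured as a simple intersection of two preference lists. The sketch below assumes the initiator orders its proposals from strongest to weakest and that each proposal is an (encryption, integrity) pair; the negotiate function and the sample proposals are hypothetical, not taken from any vendor configuration.

```python
def negotiate(initiator_proposals, responder_supported):
    """Return the first initiator proposal the responder also supports, or None."""
    for proposal in initiator_proposals:
        if proposal in responder_supported:
            return proposal
    return None  # no overlap: Phase 2 fails with a "no proposal chosen" style error


# Initiator preferences, strongest first (assumed ordering).
site1_proposals = [
    ("AES-256", "SHA-256"),
    ("AES-128", "SHA-256"),
    ("AES-128", "SHA-1"),
]

# What the responder is configured to accept.
site2_supported = {
    ("AES-128", "SHA-256"),
    ("AES-128", "SHA-1"),
}

agreed = negotiate(site1_proposals, site2_supported)
no_overlap = negotiate([("3DES", "MD5")], site2_supported)
```

When there is no overlap at all, the negotiation fails outright, which is why mismatched proposals are such a common reason for a tunnel that never comes up.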
Why Phase 2 is Fast
Phase 2 operates entirely within an already-secure channel. It doesn’t need to redo authentication or full asymmetric key exchange. The heavy work was done in Phase 1. Phase 2 is lightweight negotiation followed by symmetric key derivation, which is computationally cheap relative to Phase 1.
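The dependency is absolute: Phase 2 cannot be negotiated without Phase 1. In IKEv2, deleting the IKE SA also tears down the IPsec SAs negotiated under it; some IKEv1 implementations let existing IPsec SAs run on until their own lifetime expires, but they cannot be rekeyed without a working IKE SA. This is why monitoring your IKE SA health matters in production environments.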
Security Associations: What They Actually Are
The term “Security Association” (SA) appears constantly in IPsec documentation, often without a precise definition. Here’s the exact picture.
A Security Association is a one-way logical connection that defines the parameters used to secure traffic in a single direction. It specifies:
The encryption algorithm and key in use
The integrity/authentication algorithm and key
The lifetime of the association (time or byte-based)
The Security Parameter Index (SPI), a 32-bit identifier included in each packet header that tells the receiving device which SA to use to process that packet
SAs are unidirectional by design. To have two-way encrypted communication, you need two SAs, one for each direction. This pair is called an SA pair. The IKE SA is bidirectional at the management level (IKEv2 specifically), but IPsec SAs for data traffic are always paired.
The SPI is critical to understand for exam purposes. When Site 1 sends an ESP (Encapsulating Security Payload) packet to Site 2, the packet contains the SPI. Site 2 looks up the SPI in its local SA database (SAD) and retrieves the corresponding decryption key and algorithm. The SPI is not a secret; it’s a lookup index. The security comes from the keys the SA contains, not from the SPI itself.
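A small sketch makes the lookup concrete. It models the SAD as a dictionary keyed by SPI and reads the SPI from the first four bytes of an ESP packet, which is where the ESP header places it. The class names and field choices are hypothetical, and real implementations may also match on destination address and protocol.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SecurityAssociation:
    spi: int                 # 32-bit lookup index carried in every ESP packet
    encryption: str
    integrity: str
    key: bytes
    lifetime_seconds: int


class SADatabase:
    """Minimal inbound SA database keyed by SPI."""

    def __init__(self):
        self._by_spi = {}

    def install(self, sa: SecurityAssociation) -> None:
        self._by_spi[sa.spi] = sa

    def lookup(self, spi: int) -> Optional[SecurityAssociation]:
        return self._by_spi.get(spi)


def spi_from_esp_packet(packet: bytes) -> int:
    """The ESP header starts with a 4-byte SPI followed by a 4-byte sequence number."""
    return int.from_bytes(packet[:4], "big")


# Site 2 installs the inbound SA that Phase 2 negotiated.
sad = SADatabase()
sad.install(SecurityAssociation(0xC0FFEE01, "AES-256", "SHA-256", b"\x00" * 32, 3600))

# An arriving ESP packet: SPI, sequence number 1, then the encrypted payload.
packet = (0xC0FFEE01).to_bytes(4, "big") + (1).to_bytes(4, "big") + b"ciphertext"
matched_sa = sad.lookup(spi_from_esp_packet(packet))
unknown_sa = sad.lookup(0xDEADBEEF)
```

A packet whose SPI matches nothing in the database has no keys to be processed with, so the receiver can only drop it.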
Rekeying
SAs have lifetimes. Before an SA expires, IKE negotiates new SAs — a process called rekeying. This is essential for:
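Forward secrecy: Each key protects only a bounded amount of traffic, and when the rekey runs a fresh DH exchange (Perfect Forward Secrecy), old traffic cannot be decrypted even if current keys are later compromised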
Operational continuity: Traffic doesn’t drop when keys rotate
In IKEv2, rekeying is handled more gracefully than in IKEv1. In AWS Site-to-Site VPN, SA lifetimes and rekeying are managed automatically.
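To illustrate how a gateway might decide when to rekey, here is a minimal sketch that checks both the time-based and byte-based lifetimes mentioned earlier. The 90% rekey threshold and the function name are assumptions for illustration; actual margins vary by implementation.

```python
def sa_status(age_seconds, bytes_protected, lifetime_seconds, lifetime_bytes,
              rekey_fraction=0.9):
    """Classify an SA as 'active', 'rekey' (negotiate a replacement), or 'expired'."""
    # Hard limit: either lifetime reached means the SA must no longer be used.
    if age_seconds >= lifetime_seconds or bytes_protected >= lifetime_bytes:
        return "expired"
    # Soft limit: start negotiating the replacement before the hard limit hits,
    # so traffic can move to the new SA without a gap.
    if (age_seconds >= rekey_fraction * lifetime_seconds
            or bytes_protected >= rekey_fraction * lifetime_bytes):
        return "rekey"
    return "active"


fresh = sa_status(100, 0, 3600, 10**9)
due_by_time = sa_status(3300, 0, 3600, 10**9)
due_by_bytes = sa_status(10, 950_000_000, 3600, 10**9)
too_old = sa_status(3600, 0, 3600, 10**9)
```

Starting the rekey ahead of the hard limit is what gives the operational continuity described above.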
Policy-Based vs Route-Based VPN: A Critical Architectural Distinction
This is one of the most practically important concepts in IPsec VPN design, and it directly affects how you architect solutions in AWS.
Policy-Based VPN
In a policy-based VPN, traffic is matched by access control policies, essentially ACL rules that define “traffic from source X to destination Y using protocol Z should be encrypted and sent through this tunnel.”
Key characteristics:
Each policy (traffic selector pair) creates its own IPsec SA pair
Three subnets on each side potentially means nine separate SA pairs, one per source/destination combination
Different policies can use different encryption parameters
Tightly coupled to the specific traffic flows defined at configuration time
Adding new subnets that need VPN protection requires adding new policies and triggers new SA negotiation
Policy-based VPNs give you fine-grained control but they don’t scale well. They also create operational complexity: each new encryption domain adds SAs to manage, and mismatched policies between peers are a common cause of tunnel failures.
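The scaling problem is easy to see by counting traffic selectors. This sketch assumes every local subnet needs to reach every remote subnet and that each source/destination combination gets its own SA pair; the subnet values are made up for illustration.

```python
from itertools import product


def policy_based_selectors(local_subnets, remote_subnets):
    """Each (local, remote) combination is a traffic selector needing its own SA pair."""
    return list(product(local_subnets, remote_subnets))


london = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24"]
frankfurt = ["10.2.0.0/24", "10.2.1.0/24", "10.2.2.0/24"]

selectors = policy_based_selectors(london, frankfurt)
policy_based_sa_pairs = len(selectors)   # grows as local x remote

# Adding a single subnet in London adds one selector per Frankfurt subnet.
after_growth = len(policy_based_selectors(london + ["10.1.3.0/24"], frankfurt))
```

With three subnets on each side, a full mesh already needs nine SA pairs, and each new subnet adds another row of them.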
A practical implication: AWS does not support policy-based VPN as the primary mode for Site-to-Site VPN. AWS VGW and Transit Gateway use route-based VPN. If a customer’s on-premises firewall only supports policy-based VPN, this creates compatibility constraints that require careful architectural consideration.
Route-Based VPN
In a route-based VPN, traffic is matched by the routing table, not by explicit policy rules. The VPN creates a virtual tunnel interface (VTI), and traffic is sent through that interface based on normal IP routing decisions.
Key characteristics:
Single SA pair per tunnel regardless of how many subnets are in play
Any traffic routed to the tunnel interface is encrypted with the same IPsec SA
Dynamic routing protocols (like BGP) can run over the tunnel, which is significant
Simpler SA management: one tunnel = one SA pair
Adding a new subnet simply requires a routing entry pointing to the tunnel interface
Route-based VPNs are the preferred architecture in cloud environments and modern SD-WAN designs. The simplicity and scalability advantages are substantial, and BGP support is critical for environments with complex or changing route tables.
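In a route-based design, the decision to encrypt is just a routing lookup. The sketch below does a longest-prefix match with Python's ipaddress module; the interface names (vti0, internet-uplink) and the route table contents are assumptions for illustration.

```python
import ipaddress


def select_next_hop(routes, destination):
    """Longest-prefix match: the most specific route containing the address wins."""
    addr = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in routes if addr in network]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# London gateway's route table: anything for Frankfurt goes via the tunnel interface.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "internet-uplink"),
    (ipaddress.ip_network("10.2.0.0/16"), "vti0"),
]

to_frankfurt = select_next_hop(routes, "10.2.5.9")   # routed into the tunnel
to_internet = select_next_hop(routes, "8.8.8.8")     # leaves unencrypted

# Protecting a new remote subnet is one more route, not a new policy and SA pair.
routes.append((ipaddress.ip_network("10.3.0.0/16"), "vti0"))
to_new_subnet = select_next_hop(routes, "10.3.0.20")
```

Anything that routes to the tunnel interface is encrypted with the same SA pair, so the IPsec configuration itself never has to change as subnets come and go.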
AWS Site-to-Site VPN: IPsec in Cloud Architecture
AWS Site-to-Site VPN is a managed IPsec VPN service. Understanding how it maps to what we’ve discussed above is directly relevant to the Solutions Architect Associate exam.
AWS VPN is Route-Based
AWS Site-to-Site VPN creates route-based tunnels. Each VPN connection consists of two tunnels, active/passive by default, though active/active is possible with certain routing configurations. The redundancy is architectural: if one tunnel’s underlying path fails, traffic routes through the other.
The two-tunnel design maps directly to AWS’s availability model. Each tunnel terminates on a separate endpoint in a different Availability Zone on the AWS side.
Virtual Private Gateway vs Transit Gateway
Virtual Private Gateway (VGW): The older construct. Attached to a single VPC. Supports BGP and static routing. Suitable for simple hub-and-spoke connectivity between an on-premises network and a single VPC.
Transit Gateway (TGW): The scalable alternative. Acts as a regional routing hub. Multiple VPCs and multiple VPN connections attach to the Transit Gateway. Route tables on the TGW control traffic flow between attachments. This is the recommended architecture for multi-VPC environments or when you need to connect multiple on-premises locations to AWS.
For the exam: understand that TGW supports Equal-Cost Multi-Path (ECMP) routing across multiple VPN tunnels when dynamic (BGP) routing is used. This allows both tunnels of a VPN connection (or tunnels across multiple VPN connections) to be active simultaneously, effectively aggregating bandwidth.
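Conceptually, ECMP spreads flows across equal-cost paths by hashing some of each flow's header fields. The sketch below is a generic illustration of that idea, not a description of how AWS implements it; the tunnel names and the choice of SHA-256 over the 5-tuple are assumptions.

```python
import hashlib


def pick_tunnel(flow, tunnels):
    """Hash the flow's 5-tuple so every packet of that flow uses the same tunnel."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(tunnels)
    return tunnels[index]


tunnels = ["vpn1-tunnel1", "vpn1-tunnel2", "vpn2-tunnel1", "vpn2-tunnel2"]

# (source IP, destination IP, protocol, source port, destination port)
flow_a = ("10.1.0.5", "10.2.0.9", "tcp", 49152, 443)

first_choice = pick_tunnel(flow_a, tunnels)
second_choice = pick_tunnel(flow_a, tunnels)   # same flow, same tunnel
```

Because the choice is per flow, bandwidth aggregates across many flows, while any single flow is still limited to the throughput of one tunnel.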
BGP Integration
AWS Site-to-Site VPN supports Border Gateway Protocol (BGP) for dynamic routing. When BGP is configured:
AWS advertises its VPC CIDR ranges over the tunnel
The on-premises gateway advertises its prefix ranges to AWS
Route propagation is dynamic, with no manual static route maintenance required
Routing decisions can be influenced using BGP attributes (AS path, MED)
For exam purposes: BGP over VPN is the recommended approach for complex or large-scale environments. Static routing is available but operationally heavier to maintain.
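To show how AS path and MED can steer traffic between two tunnels, here is a heavily simplified route comparison. Real BGP best-path selection has more steps (local preference, origin, and others) and only compares MED under specific conditions; this sketch keeps just the two attributes mentioned above, and the class and ASN values are hypothetical.

```python
from typing import NamedTuple


class BgpRoute(NamedTuple):
    prefix: str
    next_hop: str
    as_path: tuple   # sequence of AS numbers the advertisement traversed
    med: int         # Multi-Exit Discriminator; lower is preferred


def best_route(candidates):
    """Prefer the shortest AS path, then the lowest MED (simplified)."""
    return min(candidates, key=lambda route: (len(route.as_path), route.med))


# The on-premises router prepends its own ASN on tunnel 2 to make it less attractive.
primary = BgpRoute("10.2.0.0/16", "tunnel1", (65000,), 100)
backup = BgpRoute("10.2.0.0/16", "tunnel2", (65000, 65000, 65000), 100)
chosen = best_route([backup, primary])

# With equal AS path lengths, the lower MED wins.
low_med = BgpRoute("10.2.0.0/16", "tunnel1", (65000,), 100)
high_med = BgpRoute("10.2.0.0/16", "tunnel2", (65000,), 200)
chosen_by_med = best_route([high_med, low_med])
```

AS path prepending and MED are the usual levers for turning two tunnels into a deliberate primary and backup pair.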
What the Exam Tests
The SAA-C03 exam tends to test IPsec VPN at the following conceptual levels:
Understanding that Site-to-Site VPN uses IPsec and creates two redundant tunnels
Knowing the difference between VGW and TGW and when to use each
Recognising that route-based VPN with BGP is the scalable default
Understanding that VPN runs over the public internet; if private, dedicated connectivity is required, Direct Connect is the answer
Knowing that accelerated Site-to-Site VPN uses AWS Global Accelerator endpoints to reduce latency for geographically distant peers
The exam does not ask you to configure IKE parameters. But understanding Phase 1 and Phase 2 at the conceptual level helps you correctly diagnose scenario questions about VPN failures, connectivity issues, and architectural trade-offs.
Putting It All Together: The Architectural Summary
IPsec VPN is not a single feature; it is a layered architecture that solves the key exchange and secure tunnel establishment problem in a deliberate sequence.
Phase 1 (IKE SA) builds trust. It authenticates both peers, executes the Diffie-Hellman key exchange, and establishes an encrypted management channel. This is the slow, expensive, one-time foundation. Pre-shared keys or certificates provide the authentication. DH provides the key exchange without ever transmitting the shared secret. The resulting IKE SA is a secure tunnel for negotiating what comes next.
Phase 2 (IPsec SA) protects data. Running inside the Phase 1 channel, Phase 2 negotiates encryption and integrity algorithms, derives symmetric IPsec keys from the Phase 1 key material, and establishes the actual data-carrying tunnel. This is the operational pipe your application traffic flows through.
Security Associations are the unit of state. Each SA is unidirectional, identified by an SPI, and contains the keying material and algorithm parameters for a single direction of traffic. A working tunnel is always an SA pair. Understanding SAs explains why VPN debugging so often involves inspecting SA databases and why mismatched lifetime or algorithm configurations cause tunnel failures.
Policy-based vs route-based is an architectural choice, not just a configuration option. Policy-based VPNs give fine-grained per-flow control at the cost of scalability and operational complexity. Route-based VPNs are simpler, scale with routing rather than with explicit policy rules, and support dynamic routing protocols like BGP. For cloud architects, and specifically for AWS, route-based is the default and the recommended pattern.
For anyone building multi-site connectivity on AWS, whether connecting a corporate data centre, a branch office, or a partner network, IPsec VPN over Transit Gateway with BGP is the standard playbook. Understanding why that architecture works the way it does requires understanding everything above.
This article is part of an ongoing series on AWS networking fundamentals for cloud architects and engineers preparing for the AWS Solutions Architect Associate and Professional examinations.





