SSL Termination Models and Session Stickiness in Modern Cloud Load Balancing
A Practical AWS ELB (ALB & NLB) Architecture Guide
At some point in your cloud journey, you stop asking “How do I enable HTTPS?” and start asking a far more consequential question:
Where exactly should TLS terminate in my system?
That question looks innocuous. It isn’t.
Where SSL/TLS terminates reshapes your system’s performance characteristics, security boundaries, routing capabilities, and operational complexity. Add session stickiness into the mix, and the load balancer is no longer just distributing traffic; it is influencing application correctness under failure.
This article walks through SSL termination models and connection stickiness as they exist in real AWS production architectures today, with a primary focus on Application Load Balancers (ALB) and Network Load Balancers (NLB).
The goal is not to list features, but to build a mental model you can reuse in production design, security reviews, and system design interviews.
1. Why SSL/TLS Placement Matters
TLS exists for a single reason: to ensure that data cannot be read or modified by anything between the client and the server. That includes compromised routers, hostile networks, and intermediaries you do not control.
But encryption has a cost.
TLS introduces cryptographic computation, handshake latency, certificate management, and state that must be maintained across connections. In small systems this overhead is invisible. At scale, it becomes architectural.
The moment you introduce a load balancer, you introduce a decision point:
Does encryption terminate at the load balancer, or does it continue all the way to the application?
That decision determines:
Whether the load balancer can see HTTP-level data
Where certificates live and who manages them
Whether AWS infrastructure ever handles decrypted traffic
How much cryptographic work backend services must perform
In modern cloud systems, this decision usually manifests as one of three patterns:
SSL bridging, SSL passthrough, or SSL offloading.
Each exists because a different constraint dominates.
2. SSL Bridging: The ALB Default (and Why It Dominates)
If you configure an Application Load Balancer with an HTTPS listener, you are almost certainly using SSL bridging, whether you consciously chose it or not.
Conceptually, the flow looks like this:
A client establishes a TLS connection to the ALB. The ALB terminates TLS, decrypts the traffic, inspects the HTTP request, makes routing decisions, and then initiates a new TLS connection to the selected backend target.
You now have two encrypted hops:
Client → ALB
ALB → Backend instance
Because the ALB terminates TLS, it must have access to the certificate. In AWS, this is typically handled via AWS Certificate Manager (ACM) or uploaded certificates.
Once traffic is decrypted, the ALB gains full Layer 7 visibility. It can see host headers, paths, cookies, HTTP methods, and custom headers. This visibility is what enables ALB’s defining capability: application-aware routing.
A single ALB can serve multiple domains, route traffic to different target groups based on URLs or headers, and consolidate what previously required many separate load balancers.
This is why SSL bridging is the default and most common ALB architecture. It balances security with application intelligence.
The cost is real. Encryption happens twice. Backend instances still manage certificates and perform cryptographic work. That overhead is accepted because the routing flexibility it enables is often worth far more than the CPU it consumes.
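As a concrete sketch, the bridging shape can be captured in the parameters you would typically hand to the ELBv2 API (for example via boto3's `elbv2.create_target_group` and `elbv2.create_listener`). All ARNs and names below are placeholders, not real resources. The key detail is that both the listener and the target group use HTTPS, which is exactly what creates the two encrypted hops.

```python
# Hypothetical ELBv2 parameters illustrating SSL bridging on an ALB.
# ARNs, names, and IDs are placeholders, not real resources.

# Hop 2: ALB -> backend. The target group protocol is HTTPS, so the ALB
# opens a *new* TLS connection to each selected target.
target_group = {
    "Name": "app-tg",
    "Protocol": "HTTPS",  # backend hop is re-encrypted
    "Port": 443,
    "VpcId": "vpc-PLACEHOLDER",
    "TargetType": "instance",
}

# Hop 1: client -> ALB. The listener terminates TLS with an ACM certificate,
# which is what gives the ALB Layer 7 visibility for routing decisions.
listener = {
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/app/PLACEHOLDER",
    "Protocol": "HTTPS",
    "Port": 443,
    "Certificates": [{"CertificateArn": "arn:aws:acm:REGION:ACCOUNT:certificate/PLACEHOLDER"}],
    "DefaultActions": [
        {"Type": "forward", "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/app-tg/PLACEHOLDER"}
    ],
}

# Bridging = encrypted on both sides of the load balancer.
assert listener["Protocol"] == "HTTPS" and target_group["Protocol"] == "HTTPS"
```

Note the certificate appears only on the listener: the ALB holds it to terminate the client-facing hop, while backends manage their own certificates for the second hop.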
3. SSL Passthrough: When the Load Balancer Knows Nothing
At the opposite extreme is SSL passthrough.
In this model, the load balancer never terminates TLS. It forwards encrypted traffic as opaque bytes to backend instances.
This typically occurs when:
An NLB listener is configured for TCP, or
TLS is used end-to-end but not terminated at the load balancer
Here, the client establishes TLS directly with the backend application. Each backend instance terminates TLS itself, manages certificates, and performs all cryptographic operations.
From the load balancer’s perspective, this is not HTTP traffic. It is just TCP.
That means:
No header inspection
No path-based routing
No cookies
No application awareness
Why would anyone choose this?
Because security boundaries sometimes outweigh convenience.
In regulated or security-sensitive environments (financial systems, healthcare platforms, government workloads), there are often explicit requirements that only the application process may ever see decrypted data. Even managed services are considered too much exposure.
SSL passthrough ensures:
No certificates exist at the load balancer layer
No plaintext traffic outside the application
True end-to-end encryption in the strictest sense
This is why Network Load Balancer is usually the tool of choice here. NLB operates at Layer 4, scales extremely well, preserves client IPs, and introduces minimal latency, but it does not understand applications.
You gain isolation. You lose intelligence. That trade-off is deliberate.
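A minimal sketch of the passthrough shape, using placeholder ARNs rather than real resources: the pattern is visible mostly by what is absent. The listener is plain TCP and carries no certificate, so the load-balancer layer never holds key material.

```python
# Hypothetical ELBv2 parameters illustrating SSL passthrough on an NLB.
# The load balancer forwards encrypted bytes; only backends hold certificates.

target_group = {
    "Name": "secure-tg",
    "Protocol": "TCP",  # opaque bytes; TLS terminates on the instance itself
    "Port": 443,
    "VpcId": "vpc-PLACEHOLDER",
    "TargetType": "instance",
}

listener = {
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/net/PLACEHOLDER",
    "Protocol": "TCP",  # not TLS: no termination, hence no Certificates field at all
    "Port": 443,
    "DefaultActions": [
        {"Type": "forward", "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/secure-tg/PLACEHOLDER"}
    ],
}

# Passthrough = no certificate anywhere at the load-balancer layer.
assert "Certificates" not in listener
```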
4. NLB TLS Termination: Similar Words, Different Intent
At this point, a careful reader will notice an apparent contradiction:
Network Load Balancer can terminate TLS. So where does that fit?
NLB does support TLS listeners and certificate-based termination. But this mode is often misunderstood.
When an NLB terminates TLS:
TLS is decrypted at the load balancer
Traffic is forwarded to backends over raw TCP
There is still no HTTP parsing
No visibility into headers, paths, or cookies
In other words, TLS termination at NLB does not make it an ALB.
This mode exists primarily for three reasons.
First, client IP preservation. NLB preserves the original source IP even when terminating TLS, which matters for auditing, rate limiting, and protocol-level logic.
Second, extreme throughput and low latency. NLB’s architecture is optimized for massive scale. TLS termination here is about offloading cryptography without introducing application awareness.
Third, basic SNI-based certificate selection. NLB can choose certificates during the TLS handshake, but that is the extent of its intelligence.
The distinction is subtle but critical:
ALB terminates TLS to understand the application.
NLB terminates TLS to scale encrypted transport.
Same terminology. Completely different purpose.
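A minimal sketch of this mode, with placeholder ARNs standing in for real resources: the listener protocol is TLS, so the NLB holds a certificate and can perform SNI-based selection during the handshake, while the target group stays TCP.

```python
# Hypothetical NLB TLS-termination setup. TLS ends at the load balancer,
# but what it forwards is a raw TCP stream: no headers, paths, or cookies
# are ever parsed.

listener = {
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/net/PLACEHOLDER",
    "Protocol": "TLS",  # NLB decrypts here; additional certificates can be
    "Port": 443,        # attached to the listener and selected via SNI
    "Certificates": [{"CertificateArn": "arn:aws:acm:REGION:ACCOUNT:certificate/PLACEHOLDER"}],
    "DefaultActions": [
        {"Type": "forward", "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/tcp-tg/PLACEHOLDER"}
    ],
}

target_group = {
    "Name": "tcp-tg",
    "Protocol": "TCP",  # decrypted bytes forwarded as plain TCP, still no HTTP parsing
    "Port": 8080,
    "VpcId": "vpc-PLACEHOLDER",
    "TargetType": "instance",
}

# TLS termination without application awareness: TLS in front, TCP behind.
assert listener["Protocol"] == "TLS" and target_group["Protocol"] == "TCP"
```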
5. SSL Offloading: Performance at the Cost of a Wider Trust Boundary
SSL offloading is the final major pattern.
In this model:
TLS is terminated at the load balancer
Traffic is decrypted
Backend connections use plain HTTP
Backend instances:
Do not require SSL certificates
Perform no cryptographic operations
Process only HTTP traffic inside the VPC
This approach exists for performance and operational simplicity. Removing TLS from backend tiers reduces CPU usage, lowers latency, and centralizes certificate management.
But this model relies on an assumption that must be stated carefully:
A VPC is isolated, not inherently trusted.
Once traffic is plaintext internally, east-west communication becomes readable if controls fail. Compromised workloads, misconfigured security groups, or overly broad shared VPC models can dramatically increase blast radius.
This is why Zero Trust architectures often reject SSL offloading entirely. They assume the internal network is hostile by default.
SSL offloading can be reasonable but only when paired with strong segmentation, tight security group boundaries, minimal lateral movement, and disciplined workload isolation. Without those controls, it stops being an optimization and becomes a liability.
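A minimal sketch of the offloading shape, with placeholder ARNs rather than real resources: the client-facing listener is HTTPS, but the target group drops to plain HTTP, so the certificate and all cryptographic work live only at the load balancer.

```python
# Hypothetical ELBv2 parameters illustrating SSL offloading on an ALB:
# TLS ends at the listener; the backend hop is plain HTTP inside the VPC.

target_group = {
    "Name": "plain-http-tg",
    "Protocol": "HTTP",  # backend hop is unencrypted: the trust boundary widens
    "Port": 80,
    "VpcId": "vpc-PLACEHOLDER",
    "TargetType": "instance",
}

listener = {
    "LoadBalancerArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:loadbalancer/app/PLACEHOLDER",
    "Protocol": "HTTPS",  # client hop is encrypted; the certificate lives only here
    "Port": 443,
    "Certificates": [{"CertificateArn": "arn:aws:acm:REGION:ACCOUNT:certificate/PLACEHOLDER"}],
    "DefaultActions": [
        {"Type": "forward", "TargetGroupArn": "arn:aws:elasticloadbalancing:REGION:ACCOUNT:targetgroup/plain-http-tg/PLACEHOLDER"}
    ],
}

# Offloading = encrypted in front, plaintext behind.
assert listener["Protocol"] == "HTTPS" and target_group["Protocol"] == "HTTP"
```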
6. Session Stickiness: Why “Stateless” Is Often Aspirational
In theory, modern applications are stateless. Any request can land on any backend instance and succeed.
In practice, many systems only approximate this ideal.
Authentication state, shopping carts, multi-step workflows, and in-memory caches often introduce hidden coupling between users and specific backends. When a load balancer distributes requests blindly, those assumptions break — sometimes subtly, sometimes catastrophically.
Users get logged out mid-session. Forms fail halfway through submission. Transactions behave inconsistently. These failures rarely look like infrastructure problems at first, which makes them especially hard to diagnose.
Session stickiness exists because real systems often lag behind architectural ideals. It is not an endorsement of stateful design; it is a pragmatic response to imperfect reality.
The risk is not in using stickiness. The risk is in forgetting why it was introduced.
7. How Stickiness Behaves in ALB and NLB
Application Load Balancers implement stickiness at Layer 7, using cookies.
When a client’s first request hits an ALB, the load balancer selects a backend target and issues a cookie in the response. Subsequent requests carrying that cookie are routed back to the same target, for as long as the configured duration remains valid.
Because ALB understands HTTP semantics, this behavior is explicit, controlled, and time-bound. If a backend fails, the ALB can break the association and reroute traffic predictably.
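Concretely, ALB stickiness is switched on per target group through target-group attributes. A sketch of the attribute keys involved (the values below are illustrative, and would typically be passed to `elbv2.modify_target_group_attributes` as Key/Value pairs):

```python
# Hypothetical target-group attributes enabling ALB cookie-based stickiness.
# Key names follow the ELBv2 target-group attribute scheme; values are examples.

stickiness_attributes = [
    {"Key": "stickiness.enabled", "Value": "true"},
    {"Key": "stickiness.type", "Value": "lb_cookie"},  # ALB-generated cookie
    {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "3600"},  # 1-hour affinity window
]
```

The duration is the point of control: affinity is explicit and time-bound, not an open-ended commitment to one backend.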
Network Load Balancers behave very differently.
NLB has no concept of sessions or cookies. Any perceived affinity comes from Layer 4 behavior, typically flow hashing over source IP and port combinations. This is not a session mechanism; it is a side effect of transport routing.
At real-world scale, this becomes dangerous. Thousands of users may appear to originate from a single IP due to NAT gateways, proxies, or mobile carrier networks. Traffic unintentionally concentrates on a single backend. When network paths change, affinity breaks abruptly.
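To see why NAT makes this dangerous, here is a deliberately simplified toy model (not NLB's actual flow-hash algorithm) in which affinity is derived from the apparent source IP alone:

```python
import hashlib

BACKENDS = ["backend-a", "backend-b", "backend-c"]

def affinity_by_source_ip(src_ip: str) -> str:
    """Toy model of source-IP affinity: the client's apparent IP decides the target."""
    digest = int(hashlib.sha256(src_ip.encode()).hexdigest(), 16)
    return BACKENDS[digest % len(BACKENDS)]

# Thousands of distinct users behind one NAT gateway share a single public IP,
# so source-IP affinity funnels every one of them to the same backend.
users_behind_nat = [f"user-{i}" for i in range(5000)]
chosen = {affinity_by_source_ip("203.0.113.7") for _user in users_behind_nat}
assert len(chosen) == 1  # all 5000 users land on one target

# And when the network path changes (a new NAT egress IP, a carrier handoff),
# the hash input changes and affinity moves abruptly to a different backend.
```

The failure mode is the combination: unintended concentration while the path is stable, and silent rebalancing the moment it is not.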
The practical conclusion is simple:
ALB stickiness is a deliberate architectural tool.
NLB “stickiness” is an emergent property you should not rely on.
If user-level continuity matters, Layer 7 control is not optional.
8. A Reusable Mental Model
At this point, the services matter less than the boundaries they impose.
Every load-balanced system quietly answers three questions:
Where does encryption end?
Who is allowed to see decrypted data?
How much state does the infrastructure tolerate?
ALB exists to understand applications.
NLB exists to move packets at scale.
TLS termination and stickiness are not configuration checkboxes; they are architectural commitments.
Closing Thought
SSL termination models and session stickiness define where trust lives, where complexity accumulates, and how failures surface.
If you can explain why TLS terminates where it does, and why requests stay pinned or do not, you are no longer memorizing cloud services. You are designing systems.
That is the difference production environments care about.