How AWS CloudFront Actually Works: A Deep Dive into CDN Architecture

For engineers who want to understand the system, not just pass a certification.

Feb 24, 2026

There is a tendency in cloud documentation to explain what a service does without explaining why it works the way it does. CloudFront is a prime example. Most introductory material describes it as a “content delivery network that caches content closer to users,” which is technically accurate but operationally useless if you are trying to design a resilient, performant architecture around it.

This article breaks down the CloudFront architecture in full: how requests flow through the infrastructure, how caching layers interact, what behaviors actually control, and why certain design decisions exist. It also covers the things that trip engineers up, particularly around what CloudFront does not do.

Hi, this is Pushpit from CloudOdyssey. Each week, I write deep dives on Cloud, DevOps, and Systems Design, along with community updates around them.

What CloudFront Is, and What It Is Not

AWS CloudFront is a globally distributed content delivery network. Its fundamental job is to serve content to end users from a point in the network that is geographically and topologically closer to them than the origin server is. This reduces latency, reduces the load on origin infrastructure, and improves throughput for read-heavy workloads.

The critical word in that last sentence is “read-heavy.” CloudFront is purpose-built for content delivery: download operations, GET requests, and static or semi-static content distribution. It is not a write-caching layer. PUT, POST, PATCH, and DELETE requests pass through CloudFront to the origin; they are not cached, not buffered for batching, and not processed at the edge in any meaningful way. If your application uploads files, submits form data, or writes to a database, every one of those requests is forwarded to the origin. CloudFront’s value is in the return path: serving content back to users at scale.
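The read/write boundary can be stated as a small predicate. This is only a sketch of the rule, not CloudFront code: the function name and the cache_options flag are hypothetical, and it relies on the documented behavior that CloudFront caches GET and HEAD responses and can optionally cache OPTIONS responses when a behavior enables it.

```python
# Methods whose responses can be cached at the edge. GET and HEAD are always
# eligible; OPTIONS only when the behavior opts in. Names here are illustrative.
ALWAYS_CACHEABLE = frozenset({"GET", "HEAD"})


def is_cache_eligible(method: str, cache_options: bool = False) -> bool:
    """Return True if a response to this HTTP method may be served from cache."""
    normalized = method.upper()
    if normalized in ALWAYS_CACHEABLE:
        return True
    # Everything else (PUT, POST, PATCH, DELETE) is forwarded to the origin.
    return normalized == "OPTIONS" and cache_options


for verb in ("GET", "HEAD", "OPTIONS", "POST", "PUT", "PATCH", "DELETE"):
    print(verb, is_cache_eligible(verb))
```

Anything for which this returns False should be designed and load-tested as origin traffic, whatever sits in front of it.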

Understanding this boundary between read and write paths is fundamental to designing systems that use CloudFront correctly.

The Core Components of CloudFront Architecture

Before examining the request flow, it helps to understand the four structural components that make up a CloudFront deployment: Origins, Distributions, Edge Locations, and the Regional Edge Cache.

Origins

An origin is the source of truth for your content. It is the server or service that CloudFront retrieves content from when it does not have a cached copy. CloudFront supports several origin types: Amazon S3 buckets, Application Load Balancers, EC2 instances, API Gateway endpoints, and any publicly accessible HTTP server, including servers outside of AWS entirely.

An important architectural clarification: content is published by writing to the origin, not to CloudFront. CloudFront stores nothing on the origin’s behalf, and its caches fill only when a request misses and the object is fetched from the origin. If you are deploying a new version of a static website, you upload the files to S3. If you are serving video, the source files live on your media server. CloudFront is downstream of the origin, not upstream.

A single CloudFront distribution can have multiple origins. This is a common pattern: one S3 bucket for static assets, an ALB for dynamic API traffic, and perhaps another S3 bucket for user-uploaded media. How requests get routed to different origins is managed by behaviors, which are discussed in detail below.

Distributions

A distribution is the primary unit of configuration in CloudFront. When you deploy CloudFront for your application, you create a distribution. The distribution defines your origins, your cache behaviors, your SSL/TLS settings, and your access policies. It is the configuration object that CloudFront replicates out to its global network.

When you create a distribution, AWS assigns it a domain name of the form d1234abcd.cloudfront.net. You can map your own domain (e.g., cdn.example.com or assets.example.com) to it by adding the name to the distribution as an alternate domain name and pointing a CNAME record (or a Route 53 alias record) at the CloudFront domain. Crucially, CloudFront integrates with AWS Certificate Manager (ACM) for HTTPS, which means you can attach a managed TLS certificate (issued in the us-east-1 region) to your distribution and serve content over TLS without managing certificate renewal yourself.

The configuration you define in a distribution (origins, behaviors, TTLs, protocol policies) is deployed globally to the CloudFront edge network. When you make a configuration change, CloudFront propagates that change to all of its edge locations, which takes on the order of minutes.

Edge Locations

Edge locations are where end users actually connect to CloudFront. AWS operates hundreds of edge locations across dozens of countries. These are the Points of Presence (PoPs) that form the front line of the CloudFront network. When a user in Frankfurt makes a request to your CloudFront distribution, DNS resolution routes them to an edge location in or near Frankfurt rather than to your origin in us-east-1.

Each edge location maintains its own cache. If the requested content is in that cache and has not exceeded its TTL, the edge location serves it directly without contacting any upstream infrastructure. This is a cache hit, and it is what you are optimizing for when you design a CloudFront deployment.

Edge locations are intentionally lean. They have limited storage capacity relative to your total content catalog. A large media library or a highly diverse URL space may result in content regularly falling out of the edge cache due to eviction. This is where the Regional Edge Cache becomes important.

Regional Edge Cache

The Regional Edge Cache (REC) sits between the edge locations and your origin in the CloudFront request chain. It is a significantly larger cache layer; AWS describes it as having more cache capacity than an individual edge location. RECs are not deployed per country or city. They are positioned regionally, with multiple edge locations sharing a single Regional Edge Cache.

The practical purpose of the REC is to handle content that is popular enough to justify caching, but not popular enough to remain continuously in the edge location caches. When content ages out of an edge location, the next request for that content goes to the REC rather than directly to the origin. If the REC has it cached, it can serve it back to the edge location, which then serves the user without any traffic reaching your origin. This two-tier cache architecture dramatically reduces origin traffic for large content catalogs.
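A back-of-the-envelope calculation shows why the second tier matters. The sketch below assumes the two hit ratios are independent and that the REC hit ratio is measured only over requests that already missed at the edge; the function name and the example ratios are hypothetical, not AWS figures.

```python
def origin_fetch_fraction(edge_hit_ratio: float, rec_hit_ratio: float) -> float:
    """Fraction of viewer requests that end up reaching the origin.

    edge_hit_ratio: share of all requests served from the edge cache.
    rec_hit_ratio:  share of edge misses served from the Regional Edge Cache.
    """
    for ratio in (edge_hit_ratio, rec_hit_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("hit ratios must be between 0 and 1")
    return (1.0 - edge_hit_ratio) * (1.0 - rec_hit_ratio)


# Hypothetical numbers: 80% edge hits, and half of the remaining misses
# absorbed by the REC.
without_rec = origin_fetch_fraction(0.80, 0.0)
with_rec = origin_fetch_fraction(0.80, 0.50)
print(f"origin share without REC: {without_rec:.0%}")  # prints 20%
print(f"origin share with REC:    {with_rec:.0%}")     # prints 10%
```

Under those assumed ratios the REC halves origin traffic without any change to the edge hit rate.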

How a Request Actually Flows Through CloudFront

Understanding the request flow is where theory becomes operationally useful. The path a request takes depends on the state of the cache at each layer.

Scenario One: Cache Hit at the Edge

A user requests a file, say assets.example.com/css/main.css. DNS resolves to a nearby edge location. The edge location checks its local cache for the object matching that URL and cache key. The object is present and the TTL has not expired.

The edge location serves the response directly to the user. Neither the Regional Edge Cache nor the origin is contacted. Total latency is the network round trip between the user and the edge location plus the time to read from the edge cache, which for geographically proximate users is typically on the order of milliseconds.

This is the ideal state. Maximizing cache hit rate at the edge is the primary goal of CloudFront configuration.

Scenario Two: Cache Miss at the Edge, Hit at the Regional Edge Cache

The user requests a file. The edge location does not have it cached, either because it was never requested at this edge location, or because it was evicted. The edge location forwards the request upstream to the Regional Edge Cache.

The REC checks its cache for the same object. It finds a valid cached copy. It returns the object to the edge location, which caches it locally and serves it to the user. The origin is not contacted.

The user experiences slightly higher latency than a direct edge cache hit, since the request traveled one additional hop. But origin infrastructure is still not touched, which is the important outcome from a scalability standpoint.

Scenario Three: Full Cache Miss — Origin Fetch

Neither the edge location nor the Regional Edge Cache has the requested object. The REC forwards the request to the origin; this is called an origin fetch. The origin processes the request and returns the content. The REC caches the response according to the cache headers and TTL configuration. The edge location caches the response as well, then serves it to the user.

All subsequent requests for the same content at this edge location will hit the local cache until the TTL expires or the object is evicted.

The origin fetch path is the highest-latency path and the one that generates actual load on your origin infrastructure. Minimizing the frequency of origin fetches (by tuning TTLs appropriately, structuring cache keys carefully, and designing content for cacheability) is where most CloudFront performance work actually happens.
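The three scenarios can be reproduced with a toy model. This is a simplified simulation, not how CloudFront is implemented: TtlCache, fetch, and the fake origin are hypothetical names, there is no size-based eviction, and an object copied from the REC to an edge gets a fresh TTL, which a real cache would not grant.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TtlCache:
    """A tiny in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: treat as a miss
            return None
        return body

    def put(self, key: str, body: bytes) -> None:
        self._store[key] = (self._clock() + self._ttl, body)


def fetch(key: str, edge: TtlCache, rec: TtlCache,
          origin: Callable[[str], bytes]) -> Tuple[bytes, str]:
    """Resolve a request through edge -> REC -> origin, filling caches on the way back."""
    body = edge.get(key)
    if body is not None:
        return body, "edge-hit"
    body = rec.get(key)
    if body is not None:
        edge.put(key, body)
        return body, "rec-hit"
    body = origin(key)  # origin fetch: the only path that loads the origin
    rec.put(key, body)
    edge.put(key, body)
    return body, "origin-fetch"


origin_calls: List[str] = []


def fake_origin(key: str) -> bytes:
    origin_calls.append(key)
    return b"body of " + key.encode()


frankfurt, paris, regional = TtlCache(60), TtlCache(60), TtlCache(60)
print(fetch("/css/main.css", frankfurt, regional, fake_origin)[1])  # origin-fetch
print(fetch("/css/main.css", frankfurt, regional, fake_origin)[1])  # edge-hit
print(fetch("/css/main.css", paris, regional, fake_origin)[1])      # rec-hit
print(len(origin_calls))                                            # 1
```

The third call is the interesting one: a second edge location that has never seen the object is served from the shared REC, so the origin is still contacted only once.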

CloudFront Behaviors: The Real Control Plane

Behaviors are arguably the most misunderstood aspect of CloudFront configuration. The AWS console does not always make their function obvious, and the word “behavior” is vague enough to obscure what they actually do.

A behavior is a sub-configuration inside a distribution that defines how CloudFront handles requests matching a specific URL path pattern. Every distribution has at least one behavior, the default behavior, which matches all requests using the * wildcard. Additional behaviors are matched against request paths using patterns like /img/*, /api/*, or /static/css/*.css, and they take precedence over the default when a request path matches them.

When a request arrives at a CloudFront edge location, CloudFront compares the request path against the path patterns of the behaviors in the order in which they are listed in the distribution, and the first pattern that matches wins. CloudFront does not rank patterns by specificity on its own, so more specific patterns need to be listed ahead of more general ones. If no pattern matches, the default behavior applies.
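Behavior selection can be sketched in a few lines. This assumes behaviors are checked in their listed order with the first match winning, which is how the CloudFront documentation describes precedence. The origin names are hypothetical, and Python's fnmatchcase is only an approximation of CloudFront's matcher: both treat * and ? as wildcards and are case-sensitive, but fnmatchcase also interprets [...] as a character class, which CloudFront does not.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple

# (path pattern, origin name) pairs in precedence order. All names are
# illustrative; the patterns are the ones used as examples in the text.
BEHAVIORS: List[Tuple[str, str]] = [
    ("/static/css/*.css", "assets-bucket"),
    ("/img/*", "image-bucket"),
    ("/api/*", "api-alb"),
]
DEFAULT_ORIGIN = "site-bucket"  # the default (*) behavior


def select_origin(path: str,
                  behaviors: List[Tuple[str, str]] = BEHAVIORS,
                  default: str = DEFAULT_ORIGIN) -> str:
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in behaviors:
        if fnmatchcase(path, pattern):
            return origin
    return default


for request_path in ("/img/logo.png", "/api/v1/users", "/static/css/main.css",
                     "/index.html", "/IMG/logo.png"):
    print(request_path, "->", select_origin(request_path))
```

Order matters in this model: if a broad pattern such as /img/* is listed before /img/icons/*, the narrower behavior is never reached.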

What a Behavior Configures

Each behavior is an independent configuration context. Within a behavior, you define:

Origin or Origin Group: Which backend origin does CloudFront forward cache misses to? A behavior mapped to /img/* might point to an S3 bucket containing images, while a behavior mapped to /api/* routes cache misses to an Application Load Balancer. This is how a single CloudFront distribution can serve content from multiple distinct backends without the user being aware of the topology.

TTL Settings: Behaviors control the minimum, maximum, and default TTL values for objects matched by that path pattern. You can set long TTLs (hours or days) for immutable versioned assets like main.v4.js and short TTLs for content that changes more frequently. The TTL you configure in CloudFront interacts with the Cache-Control and Expires headers that your origin sends. If your origin sends a Cache-Control: max-age=3600 header, CloudFront will respect that unless your behavior’s minimum TTL is higher or maximum TTL is lower, in which case the behavior’s bound wins. The default TTL applies only when the origin sends no caching headers at all.

Protocol Policies: Each behavior specifies whether HTTPS is required, redirected to, or optional. You can enforce HTTPS at the edge, meaning users who request content over HTTP get redirected to HTTPS automatically, without any origin involvement.

Cache Key and Origin Request Policies: The cache key determines what CloudFront considers a unique object. By default, only the URL path is used. If your application varies responses based on query strings, cookies, or headers, you need to configure CloudFront to include those in the cache key. Misconfiguring this is a common source of cache pollution (serving the wrong response to the wrong user) or unnecessary cache bypass. An origin request policy separately controls which query strings, headers, and cookies are forwarded to the origin without adding them to the cache key.

Viewer Request and Response Policies: Behaviors can attach Lambda@Edge functions or CloudFront Functions that execute at the edge, either before the cache is consulted (viewer request) or after the cache responds (viewer response). These enable edge-side logic like request rewriting, A/B testing, and authentication header injection without round-tripping to the origin.

Restricted Access (Signed URLs and Signed Cookies): Behaviors control whether content requires a valid CloudFront signed URL or signed cookie to be served. This is the mechanism for protecting private content — a private S3 bucket can be configured as an origin with Origin Access Control (OAC), and the behavior requires signed access, meaning only users with a valid token can retrieve the content. This is how media platforms, SaaS products, and content subscription services deliver private assets through a CDN without exposing the underlying storage.
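Of the settings above, cache key composition is the one where a small model helps most. The sketch below is an illustration under simplifying assumptions, not CloudFront's actual key format: the function and parameter names are hypothetical, and the real key also includes the distribution's domain name. It shows that only the values you explicitly list take part in the key.

```python
from typing import Iterable, Mapping, Tuple
from urllib.parse import parse_qsl, urlsplit


def cache_key(url: str,
              headers: Mapping[str, str],
              cookies: Mapping[str, str],
              *,
              key_query: Iterable[str] = (),
              key_headers: Iterable[str] = (),
              key_cookies: Iterable[str] = ()) -> Tuple:
    """Build a cache key from the path plus only the whitelisted request values."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    lowered = {name.lower(): value for name, value in headers.items()}
    return (
        parts.path,
        tuple((q, query[q]) for q in sorted(key_query) if q in query),
        tuple((h.lower(), lowered[h.lower()])
              for h in sorted(key_headers) if h.lower() in lowered),
        tuple((c, cookies[c]) for c in sorted(key_cookies) if c in cookies),
    )


url = "/products?id=42&utm_source=mail"
alice = {"session": "abc"}
bob = {"session": "xyz"}

# Keyed on the id parameter only: both users share one cache entry.
shared = cache_key(url, {}, alice, key_query=["id"]) == \
    cache_key(url, {}, bob, key_query=["id"])
print(shared)  # True

# Adding the session cookie to the key gives every user a private entry.
split = cache_key(url, {}, alice, key_query=["id"], key_cookies=["session"]) != \
    cache_key(url, {}, bob, key_query=["id"], key_cookies=["session"])
print(split)  # True
```

The same request values can still be forwarded to the origin; what matters for the hit rate is which of them are allowed into the key.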

Multiple Origins and Behaviors in One Distribution

A practical CloudFront architecture almost always involves multiple origins and multiple behaviors. Here is a concrete example to illustrate how this comes together.

Consider a SaaS web application with the following content profile: a React single-page application served from S3, a REST API running on an ECS-backed ALB, and a private document storage system in a separate S3 bucket that requires per-user access control.

You configure a single CloudFront distribution with three origins: the public S3 bucket, the ALB, and the private S3 bucket with Origin Access Control enabled. Then you define three behaviors:

The /api/* behavior points to the ALB origin. TTL is set to zero (or a very low value), since API responses are dynamic. Query strings and relevant headers are forwarded to the origin. This behavior does not cache at all in practice, but routing API traffic through CloudFront still provides the benefit of AWS’s network backbone and persistent TCP connections between the CloudFront edge and the origin.

The /docs/* behavior points to the private S3 bucket. This behavior requires signed URLs. TTL can be moderate, since the documents themselves are not changing on every request, but access is still validated on every request: CloudFront checks the signature and expiry of the signed URL at the edge, against the public keys trusted by the behavior, before serving even a cached object. No custom Lambda@Edge function is needed for that check.

The default * behavior points to the public S3 bucket serving the React application. Assets are versioned (e.g., main.abc123.js), so TTLs are set to a very high value: 86400 seconds (one day) or more. Cache invalidations are rarely needed because new deployments produce new filenames.
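Versioned filenames like main.abc123.js are normally produced by the build tool, but the idea is simple enough to sketch. The helper below is a hypothetical illustration, assuming a truncated SHA-256 of the file contents is used as the version tag; real bundlers choose their own hash and length.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old_build = versioned_name("js/main.js", b"console.log('v1');")
new_build = versioned_name("js/main.js", b"console.log('v2');")
print(old_build)
print(new_build)
print(old_build != new_build)  # True: changed content yields a new URL
```

Because a changed file gets a new URL, a long TTL on the old object is harmless: nothing references it any more once the new HTML is deployed.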

This single distribution handles three distinct content types, three distinct caching strategies, and two distinct access control regimes. From the user’s perspective, everything is served from the same domain. From the infrastructure perspective, the traffic is cleanly separated and each segment is independently tunable.

What Engineers Get Wrong About CloudFront

A few recurring misconfigurations are worth addressing directly.

Treating CloudFront as a write-through cache. If a user submits a form or uploads a file, that request goes to the origin. CloudFront passes it through. There is no write caching, no write buffering, and no write optimization at the edge. Design your write paths as if CloudFront does not exist in that direction.

Not understanding cache key composition. If you include a high-cardinality value such as a user session cookie in the cache key, every session gets its own cache entry and the content is effectively uncacheable. If you leave out a value that genuinely changes the response, users can be served each other’s content. Include in the cache key only the cookies and headers that actually vary the response, and forward the rest to the origin without keying on them.

Over-relying on cache invalidation. CloudFront supports explicit cache invalidations, where you instruct the edge network to purge a specific path. This is useful, but it is not a caching strategy. Building a deployment pipeline that requires cache invalidation to work correctly means you are dependent on a global propagation event completing before users see correct content. Asset versioning (embedding a content hash or version number in filenames) is far more reliable and makes invalidation unnecessary for static assets.

Not setting appropriate TTLs per content type. A single TTL applied to all content is almost always wrong. HTML files that reference versioned assets should have short TTLs (minutes to an hour). Versioned asset files themselves should have very long TTLs. API responses should have short or zero TTLs. The behavior system exists precisely to allow this granularity.
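How a behavior's TTL bounds combine with the origin's Cache-Control: max-age can be modeled as a clamp. This is a deliberately simplified sketch: the function name is hypothetical, it assumes the three TTLs are configured with minimum ≤ default ≤ maximum, and it ignores Expires, s-maxage, and directives such as no-cache, no-store, and private.

```python
from typing import Optional


def effective_ttl(origin_max_age: Optional[int],
                  min_ttl: int, default_ttl: int, max_ttl: int) -> int:
    """Seconds an object stays cached under a simplified clamp model.

    origin_max_age is the max-age sent by the origin, or None when the origin
    sends no caching headers (in which case the default TTL is used).
    """
    if not 0 <= min_ttl <= default_ttl <= max_ttl:
        raise ValueError("expected 0 <= min_ttl <= default_ttl <= max_ttl")
    requested = default_ttl if origin_max_age is None else origin_max_age
    return max(min_ttl, min(requested, max_ttl))


ONE_DAY, ONE_YEAR = 86_400, 31_536_000
print(effective_ttl(3600, 0, ONE_DAY, ONE_YEAR))     # 3600: origin header respected
print(effective_ttl(3600, 7200, ONE_DAY, ONE_YEAR))  # 7200: minimum TTL is higher
print(effective_ttl(3600, 0, 300, 600))              # 600: maximum TTL is lower
print(effective_ttl(None, 0, ONE_DAY, ONE_YEAR))     # 86400: default TTL applies
```

In this model, a per-behavior set of bounds is what lets HTML, versioned assets, and API responses each get a different effective lifetime from the same distribution.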

CloudFront in the Context of a Production System

CloudFront does not operate in isolation. It interacts with Route 53 for DNS, ACM for certificate management, S3 for static storage, Lambda@Edge and CloudFront Functions for edge compute, WAF for request filtering, and Shield for DDoS protection. Understanding CloudFront architecture means understanding it as a component in a broader system, not as a standalone product.

The distribution is the artifact that ties these integrations together. It is the configuration deployed to the edge network, and it is the object you version, audit, and manage through infrastructure-as-code. Treating distributions as first-class infrastructure (defining them in Terraform or CDK, tagging them, attaching logging to them) is the practice that separates a production-grade CloudFront deployment from an ad-hoc one.

At its core, what CloudFront offers is a controlled separation between where content lives and where content is served from. The origin is the system of record. The edge network is the delivery layer. Behaviors are the rules that govern how those two sides interact. When you understand that model clearly, designing CloudFront architectures correctly becomes significantly more straightforward.

If you found this useful, stay tuned for the next piece in this series.
