Event-Driven Architecture: What Finally Clicked for Me After Years of Building Systems That Were “Fine”… Until They Weren’t
Why your architecture diagram looks like spaghetti, and why that might actually be a good thing.
Lately I’ve been trying to understand why modern systems keep drifting toward event-driven architecture, not from blog diagrams or cloud vendor slides, but from the perspective of someone who’s actually watched systems creak under load, cost more than expected, and fail in oddly non-obvious ways.
Event-driven architecture isn’t a buzzword you arrive at on day one. It’s usually where you end up after trying simpler things, breaking them, fixing them, and slowly realizing that the way you think about “requests” and “services” might be the real constraint.
What finally clicked for me was this: most large systems don’t spend their time doing work. They spend their time waiting. And event-driven design is fundamentally about eliminating unnecessary waiting.
To explain that, it helps to walk through how systems actually evolve. I’ll use YouTube as a recurring mental model, not because we know its internals, but because the problem it solves is intuitive: users upload videos, those videos get processed, stored, and served at massive scale.
The Monolith: When Everything Lives Together
Almost every system starts as a monolith, whether intentionally or not.
Imagine early YouTube as a single application. Users upload videos, the same application handles authentication, accepts the upload, transcodes the video, stores it, updates metadata, and serves playback requests. Everything runs together, usually on a few big machines. Scaling is mostly vertical: bigger servers, more CPU, more RAM.
At small scale, this is not a mistake. It’s actually efficient. Fewer moving parts, fewer deployment pipelines, fewer failure modes to reason about.
But monoliths have a brutal property: everything fails together and scales together.
If video processing spikes CPU usage, uploads slow down. If uploads spike, playback latency increases. If a memory leak exists in one part of the codebase, the entire system suffers. You don’t get to isolate pain.
Billing pressure shows up early. You’re paying for peak capacity even when the system is mostly idle. You can’t selectively scale the expensive parts. You’re forced into provisioning for worst-case load because there’s no clean boundary between responsibilities.
Operationally, it becomes tense. A deploy touches everything. A rollback affects everyone. At some point, the cost of “simple” becomes very real.
This is usually when teams say: we need structure.
Tiered Architecture: Structure Without Freedom
The next evolutionary step is almost always tiering.
Uploads move to one tier. Processing moves to another. Storage gets abstracted behind its own layer. Now YouTube has an upload service, a processing service, and a storage backend. Things feel cleaner. Teams can think in boxes instead of a blob.
And this does help.
But tiered architectures are still tightly coupled at runtime. Requests flow synchronously through tiers. Upload calls processing. Processing calls storage. Latency compounds, and failures cascade.
If the processing tier slows down, uploads start timing out. If storage has higher latency, processing backs up. You’ve separated codebases, but not fate.
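The shared fate of synchronous tiers can be sketched in a few lines. This is a toy illustration, not anything from a real codebase: the function names and latency figures are invented, and in practice each call would be a network hop with its own timeout.

```python
# A sketch of why synchronous tiers share fate: each tier's latency
# (and each tier's failures) is inherited by every caller above it.
# All names and timings here are illustrative assumptions.

def store(video: str) -> str:
    return f"stored:{video}"            # e.g. ~50 ms in a real system

def process(video: str) -> str:
    # process() blocks on store(); a slow storage tier backs up processing.
    return store(f"processed:{video}")  # e.g. +400 ms

def upload(video: str) -> str:
    # The upload request can't complete until the entire chain does,
    # so total latency is the sum of every tier below it.
    return process(video)

print(upload("cat.mp4"))  # -> stored:processed:cat.mp4
```

If `store` times out, `process` times out, and so does `upload`: separated codebases, shared fate.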
What I’ve seen in systems like this is a fragile equilibrium. Everything works fine until one tier experiences stress. Then queues form implicitly: connection pools fill up, threads block, retries multiply. Instead of absorbing load, the system amplifies it.
You get better organization, but not resilience. And you’re still paying for always-on infrastructure even when nothing interesting is happening.
At some point, someone asks the dangerous question: why does upload need to wait for processing at all?
Queues and Worker Fleets: The First Real Decoupling
This is where systems start to behave differently.
Instead of the upload tier calling the processing tier directly, it drops a message into a queue. “A new video has been uploaded.” That’s it. The upload path is now fast and predictable.
Processing happens asynchronously. A fleet of workers pulls jobs from the queue, transcodes videos, stores outputs, and moves on. If uploads spike, the queue grows. If traffic drops, workers scale down or even to zero.
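The shape of this decoupling fits in a short sketch. Here Python’s standard-library `queue.Queue` stands in for a real broker (SQS, RabbitMQ, and so on), and the handler and worker names are hypothetical:

```python
import queue
import threading

# In-process stand-in for a message broker. In production this would
# be a durable queue; the structure of the code is the point here.
upload_events: queue.Queue = queue.Queue()

def handle_upload(video_id: str) -> str:
    # The upload path only enqueues a fact and returns immediately;
    # it never waits for transcoding to finish.
    upload_events.put({"event": "video_uploaded", "video_id": video_id})
    return "accepted"  # fast, predictable response

def transcode(video_id: str) -> None:
    print(f"transcoding {video_id}")  # stand-in for the expensive work

def worker() -> None:
    # Workers drain the queue at their own pace; an upload spike
    # just grows the queue instead of timing out requests.
    while True:
        job = upload_events.get()
        if job is None:       # sentinel to shut the worker down
            break
        transcode(job["video_id"])
        upload_events.task_done()

t = threading.Thread(target=worker)
t.start()
handle_upload("abc123")
upload_events.put(None)  # signal shutdown for this demo
t.join()
```

The key property: `handle_upload` is equally fast whether zero workers or a hundred are running.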
This single change rewires the system’s behavior.
Spikes stop being emergencies and start being buffers. Scaling becomes elastic rather than reactive. Upload no longer fails just because processing is slow. You’ve introduced time as a design tool.
This is usually the first moment when engineers feel real relief. The system breathes. Costs align better with usage. Failures become localized.
But queues alone don’t solve everything. Processing workers still need to be running, monitored, deployed, and scaled. As systems grow, the number of distinct processing paths multiplies: transcoding, thumbnail generation, metadata extraction, notifications. Each wants its own logic.
Which leads us to microservices.
Microservices: Ownership and Isolation, at a Cost
Microservices are often explained poorly, but their real value isn’t technical; it’s organizational.
Instead of one processing system, YouTube might have a transcoding service, a thumbnail service, a recommendation signal service, a notification service. Each is owned by a team, deployed independently, and scaled independently.
Fault isolation improves dramatically. A bad deploy in thumbnail generation shouldn’t break uploads. Teams move faster because blast radius is smaller.
But there’s a subtle cost that becomes visible only at scale: microservices are still always waiting.
Each service runs continuously, listening for requests or polling queues. Even when nothing is happening, infrastructure is alive. Memory is allocated. Containers are running. Operational overhead accumulates quietly.
Microservices fix coupling and ownership, but they don’t fundamentally change the mental model. The system is still built around request/response and long-lived processes.
Eventually, especially in systems with highly uneven workloads, someone notices that most services spend most of their time idle.
That’s when the shift happens from services that wait, to systems that react.
Event-Driven Architecture: Treating Events as Facts
What event-driven architecture changes is not deployment units, but philosophy.
Instead of thinking “service A calls service B,” you start thinking “something happened.”
A video was uploaded. A transcoding job completed. A thumbnail was generated. A policy check failed. These are not requests; they’re facts. They happened whether anyone was listening or not.
In an event-driven system, producers emit events without knowing who consumes them. Consumers subscribe to events they care about and react when and only when those events occur. An event router or event bus sits in the middle, acting as a neutral exchange.
This decoupling is deeper than queues. It’s not just asynchronous execution; it’s causal independence.
Upload doesn’t know about processing. Processing doesn’t know about notifications. New consumers can be added without touching producers. Old consumers can disappear without breaking the system.
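A toy in-process event bus makes the shape of this concrete. This is a sketch, not a real broker (production systems would use something like Kafka or EventBridge), and every name in it is an assumption:

```python
from collections import defaultdict
from typing import Callable

class EventBus:
    """Minimal pub/sub: producers publish facts by type; consumers
    subscribe to the types they care about. Neither side knows the other."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        # The producer emits without knowing who (if anyone) is listening.
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = EventBus()
received = []

# Adding a new consumer never touches the producer.
bus.subscribe("video_uploaded", lambda e: received.append(("transcoder", e)))
bus.subscribe("video_uploaded", lambda e: received.append(("notifier", e)))

bus.publish("video_uploaded", {"video_id": "abc123"})
```

Notice that the upload path is one `publish` call regardless of whether zero, two, or twenty consumers exist; that’s the causal independence the prose describes.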
What finally clicked for me is how this eliminates constant waiting.
Instead of fleets of services sitting idle, resources spin up in response to events. Work happens only when there’s something to do. Cost aligns naturally with usage. Scaling becomes proportional rather than predictive.
For something like YouTube, this means the system grows not by holding capacity, but by reacting to reality. A surge of uploads creates a surge of events. The system responds, processes, and then goes quiet again.
That quiet matters. At scale, quiet is money.
The Trade-Offs Nobody Should Hand-Wave Away
Event-driven architecture is not free.
Debugging becomes harder because cause and effect are separated in time and space. Tracing a user’s journey means stitching together events across systems. Observability is no longer optional; it’s existential.
Event contracts require discipline. Once you publish an event, it becomes an API. Breaking it can silently break downstream consumers you don’t even know exist.
There’s also cognitive overhead. Engineers have to think in flows, not call stacks. Failure modes are different. Idempotency stops being theoretical and becomes mandatory.
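Idempotency in a consumer usually comes down to deduplicating by event id before doing any work, because most brokers deliver at-least-once. A minimal sketch (in production the seen-set would live in a durable store, not process memory):

```python
# Events may be delivered more than once; the handler must make
# a redelivery harmless. All names here are illustrative.
processed_ids: set[str] = set()
thumbnails_generated = 0

def on_transcode_completed(event: dict) -> None:
    global thumbnails_generated
    if event["event_id"] in processed_ids:
        return  # duplicate delivery: already handled, safely ignored
    processed_ids.add(event["event_id"])
    thumbnails_generated += 1  # stand-in for the real side effect

evt = {"event_id": "evt-1", "video_id": "abc123"}
on_transcode_completed(evt)
on_transcode_completed(evt)  # redelivered; no double work happens
```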
I’ve seen teams underestimate this and pay for it later.
But when the discipline is there, the payoff is real. Systems become more adaptable than predictable. Change becomes additive rather than invasive.
Why Systems Keep Evolving This Way
What I’ve come to believe is that event-driven architecture isn’t about tools, clouds, or vendors. It’s about aligning software with how the real world works.
The real world doesn’t poll. It doesn’t wait. It reacts to events.
As systems grow, pretending everything is a synchronous conversation becomes increasingly artificial. Events acknowledge that most of what we do in distributed systems is responding to facts after they’ve already occurred.
For engineers who already know monoliths, queues, and microservices, event-driven architecture isn’t the next buzzword to adopt. It’s the natural conclusion of a long learning journey about coupling, cost, failure, and scale.
I didn’t arrive here because someone told me it was “modern.” I arrived here because, over time, it felt like the most honest way to design systems that don’t spend their lives waiting for something to happen.
Do share your thoughts below ツ


