#SystemsComplexity · May 29, 2026 · 6 min read

Why "Just Add a Flag" Eventually Fails

A feature flag is a great way to defer a decision and a terrible way to avoid one forever. Flag sprawl is usually a missing model in disguise.

Every engineer knows the moment: "let's just add one more flag." The first flag solves a real problem. The second is harmless. By the tenth, nobody can say which combinations are valid, which are merely untested, and which are simply never hit in production.

What makes this worth dwelling on is that each flag, on its own, is a good decision. Nobody adds a bad flag. The damage is in the aggregate, and the aggregate is something no one is looking at.

Flags multiply states, not features

A single boolean is two states. Ten independent booleans describe 2^10 = 1,024 combinations, and you only ever designed a handful of them on purpose. The rest exist by accident: combinations nobody intended, nobody tested, and nobody can reason about.

That's the quiet trap. Adding the tenth flag feels like adding one more feature. It actually doubles the state space. A flag for the new checkout, a flag for legacy tax handling, and a flag for a beta shipping flow, each reasonable alone, silently create a combination QA never ran, on the one account that happens to have all three on.
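To make the arithmetic concrete, here is a minimal sketch that enumerates the state space for the three flags mentioned above and lists the combinations nobody tested. The flag names and the TESTED list are assumptions made up for illustration, not taken from any real system.

```python
from itertools import product

# Hypothetical flag names, loosely based on the examples in the text.
FLAGS = ["new_checkout", "legacy_tax", "beta_shipping"]


def all_combinations(flags):
    """Every on/off assignment for the given independent boolean flags."""
    return [
        dict(zip(flags, values))
        for values in product([False, True], repeat=len(flags))
    ]


def untested(flags, tested):
    """Combinations that exist in the state space but were never exercised."""
    seen = {tuple(combo[f] for f in flags) for combo in tested}
    return [
        combo
        for combo in all_combinations(flags)
        if tuple(combo[f] for f in flags) not in seen
    ]


# Assumed for illustration: QA ran "all off" plus each flag on its own.
TESTED = [
    {"new_checkout": False, "legacy_tax": False, "beta_shipping": False},
    {"new_checkout": True, "legacy_tax": False, "beta_shipping": False},
    {"new_checkout": False, "legacy_tax": True, "beta_shipping": False},
    {"new_checkout": False, "legacy_tax": False, "beta_shipping": True},
]

if __name__ == "__main__":
    print(len(all_combinations(FLAGS)))   # 2 ** 3 = 8
    print(len(untested(FLAGS, TESTED)))   # 8 - 4 = 4
    print(2 ** 10)                        # ten flags: 1024
```

Even with every flag tested individually, half of this tiny state space is untested, including the all-three-on case. With ten flags the same habit covers 11 of 1,024 combinations.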

Why it keeps happening

A flag is the cheapest way to ship a change without committing to a design. It's local, reversible, and fast — which makes it exactly the right tool for a temporary rollout, an experiment, or a kill switch. The problem isn't the tool. It's that "temporary" flags are almost never removed, because removing one means proving a combination is dead, and nobody has time to prove a negative.

So they accumulate. Each is justified; the cost is collective; and like most accumulating complexity, it's paid by every future change rather than by the change that introduced it.

What the flags were trying to say

Most flag sprawl is a model you haven't built yet, expressed as scattered booleans. Three flags that are only ever set together aren't three flags — they're one mode. A flag that really answers "which plan is this account on?" isn't a boolean — it's an enum, or a small state machine. The flags are a symptom: the system needed a concept, and got a pile of switches instead.
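One rough way to spot flags that are "only ever set together" is to compare their values across real accounts. The sketch below assumes flag settings are available as one dictionary per account; the flag names and data are hypothetical.

```python
from itertools import combinations


def always_together(accounts, flags):
    """Return pairs of flags that have the same value on every account.

    A missing flag is treated as off (an assumption of this sketch).
    """
    pairs = []
    for a, b in combinations(flags, 2):
        if all(acc.get(a, False) == acc.get(b, False) for acc in accounts):
            pairs.append((a, b))
    return pairs


# Hypothetical per-account flag settings.
ACCOUNTS = [
    {"new_ui": True, "new_api": True, "new_billing": True, "dark_mode": False},
    {"new_ui": False, "new_api": False, "new_billing": False, "dark_mode": True},
    {"new_ui": True, "new_api": True, "new_billing": True, "dark_mode": True},
]
FLAG_NAMES = ["new_ui", "new_api", "new_billing", "dark_mode"]

if __name__ == "__main__":
    for pair in always_together(ACCOUNTS, FLAG_NAMES):
        print(pair)
    # ('new_ui', 'new_api')
    # ('new_ui', 'new_billing')
    # ('new_api', 'new_billing')
```

Agreement in observed data is a hint, not proof: it shows which flags have travelled together so far, not that the other combinations are impossible. It is still a cheap signal that three switches may really be one mode.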

The better move: name the model

The fix usually isn't fewer flags by willpower. It's noticing, earlier, that the flags are describing something, and giving that something a name — a set of modes, a state machine, a configurable rule. Once the model exists, most of the flags fold back into it, and a new variation becomes a value in a known space instead of one more independent boolean multiplying the rest.
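As a sketch of what "naming the model" might look like, the three checkout-related booleans from earlier could fold into a single mode. The CheckoutMode values and the behaviour attached to each are assumptions for illustration; the real set of modes depends on what the system actually supports.

```python
from dataclasses import dataclass
from enum import Enum


class CheckoutMode(Enum):
    """The designed states, named explicitly (hypothetical)."""
    LEGACY = "legacy"
    STANDARD = "standard"
    BETA = "beta"


@dataclass(frozen=True)
class CheckoutBehaviour:
    """What used to be three independent flags, now derived from the mode."""
    new_checkout: bool
    legacy_tax: bool
    beta_shipping: bool


# Behaviour described as data: one row per supported mode.
BEHAVIOUR = {
    CheckoutMode.LEGACY: CheckoutBehaviour(
        new_checkout=False, legacy_tax=True, beta_shipping=False
    ),
    CheckoutMode.STANDARD: CheckoutBehaviour(
        new_checkout=True, legacy_tax=False, beta_shipping=False
    ),
    CheckoutMode.BETA: CheckoutBehaviour(
        new_checkout=True, legacy_tax=False, beta_shipping=True
    ),
}


def behaviour_for(mode: CheckoutMode) -> CheckoutBehaviour:
    """Look up the behaviour for a mode; unsupported states cannot be expressed."""
    return BEHAVIOUR[mode]


if __name__ == "__main__":
    print(len(BEHAVIOUR))                                  # 3
    print(behaviour_for(CheckoutMode.BETA).beta_shipping)  # True
```

Three booleans allowed eight states, most of them accidental. Here only three exist, and a new variation means adding one enum value and one row rather than doubling the space again.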

This is the same move as turning scattered conditionals into configuration you can reason about: you stop adding branches and start describing behaviour as data. A flag is fine as a switch. It fails as a permanent substitute for a model.

A feature flag is a great way to defer a decision. It's a terrible way to avoid one forever.

The lesson

Flags aren't the enemy; unowned, accumulating flags are. Use them for what they're good at — rollouts, experiments, kill switches — and give them an expiry in your head the moment you add them. But when the same flags keep travelling together, read it as the system telling you a concept is missing. Name the model, fold the flags into it, and delete the ones that were only ever describing it badly.
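If an expiry held only in your head proves too easy to forget, one option is to write it down next to the flag. This is a minimal sketch under assumed conventions: the Flag record, its owner and expires_on fields, and the example entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Flag:
    """A flag declared together with who owns it and when it should be gone."""
    name: str
    owner: str
    expires_on: date


def overdue(flags, today):
    """Names of flags whose expiry date has already passed."""
    return [f.name for f in flags if f.expires_on < today]


# Hypothetical registry entries.
REGISTRY = [
    Flag("new_checkout_rollout", "payments-team", date(2026, 3, 1)),
    Flag("beta_shipping_experiment", "logistics-team", date(2026, 9, 1)),
]

if __name__ == "__main__":
    print(overdue(REGISTRY, date(2026, 5, 29)))  # ['new_checkout_rollout']
```

A check like this could run in CI or a scheduled job. It does not remove a flag for you, but it turns "nobody is looking at the aggregate" into a list somebody has to answer for.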
