Sanity Check

February 18, 2021 · 1 min read · 244 words

In Defense of Black Boxes

Black box models are not the problem. Unexpected results -- or variance in the results -- are the real problem.

[Image: transparency vs. a black box]

Core Argument


At some point everything is a black box. We are made of atoms. Atoms are made of protons, neutrons, and electrons, and protons and neutrons are in turn made of quarks. Beyond that, we hit the current boundary of physics: a black box. Yet we get on with our daily lives without needing to understand quarks. Why? Because matter behaves consistently (enough) that we don't need to worry about it.

How about a more tame example? A vending machine. You put money in, you select your item, and the item comes out. You don’t need to understand the internal mechanism. It’s a black box — but it works reliably.
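The vending-machine idea maps directly onto contract testing: you establish trust by checking input/output behavior, never by reading the implementation. Here is a minimal, purely illustrative sketch; the `VendingMachine` class, its price catalog, and `contract_test` are hypothetical names, not anything from the original post.

```python
# A hypothetical vending machine treated as a black box: callers never
# inspect its internals, only verify that inputs map to expected outputs.
class VendingMachine:
    PRICES = {"A1": 150, "B2": 200}  # assumed catalog, prices in cents

    def buy(self, slot: str, cents: int) -> dict:
        price = self.PRICES.get(slot)
        if price is None:
            raise KeyError(f"unknown slot {slot}")
        if cents < price:
            raise ValueError("insufficient payment")
        return {"item": slot, "change": cents - price}


def contract_test(machine) -> bool:
    """Trust is established by behavior: pay 200 cents for a 150-cent
    item, expect that item back plus 50 cents in change."""
    result = machine.buy("A1", 200)
    return result["item"] == "A1" and result["change"] == 50


print(contract_test(VendingMachine()))  # → True
```

The test would pass just as well against any other implementation with the same behavior, which is exactly the point: the contract, not the mechanism, is what earns trust.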

Why It Matters

The founder’s argument here is contrarian: the analytics engineering community often emphasizes transparency and explainability (rightfully so), but the real issue isn’t opacity — it’s reliability. A model you don’t fully understand but that produces consistent, predictable results is more trustworthy than a “transparent” model with high variance. The standard for trust should be behavioral consistency, not internal visibility.

This has practical implications for how teams evaluate ML models, vendor tools, and even transformation logic — focus on output stability and predictability rather than demanding full internal transparency.
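One way to operationalize "output stability over internal transparency" is to run the same inputs through a model repeatedly and measure the variance of its outputs. The sketch below is an assumed illustration of that idea, not a method from the post; `stability_report` and the stand-in `model` are hypothetical.

```python
import statistics


def stability_report(predict, inputs, runs: int = 5) -> dict:
    """Run the same inputs through a black-box model several times and
    report per-input output variance; low variance means behavioral
    consistency, the trust standard argued for above."""
    report = {}
    for x in inputs:
        outputs = [predict(x) for _ in range(runs)]
        report[x] = statistics.pvariance(outputs)
    return report


# A deterministic stand-in model: zero output variance on repeated calls,
# hence trustworthy by the consistency standard, even though callers
# never see how it works internally.
model = lambda x: 2 * x + 1

print(stability_report(model, [1, 2, 3]))  # → {1: 0.0, 2: 0.0, 3: 0.0}
```

A stochastic model would show nonzero variance here, flagging exactly the kind of unpredictability the post identifies as the real problem.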