AI Product Quality Isn't a Metric

How can AI products deliver on their promise? The answer lies not in the tech, but in how we measure their behavior. We look to timeless systems-thinking principles for a better approach.

"All the uncertainties we have raised must confront and correct each other, there must be dialogue." - Edgar Morin

In our previous post, we unpacked why AI product development, which is probabilistic, context-bound, and ever-evolving, demands a different playbook. In this article, we explore why AI quality has been elusive for developers and how we can think differently about it.

Product development has long been guided by a simple principle: you can't improve what you can't measure. That principle still holds for AI, but our methods of measurement are obsolete. We are still using the tools of a static world to measure a technology that is fluid and dynamic.

The measurement gap becomes obvious the moment we work with large language models. An AI product is a system that connects users, experts, models, and infrastructure, so its quality depends on the whole system rather than on any single part.

The Domain Expert as the Architect of Meaning

To make quality a first-class design principle, the domain expert's role must shift from downstream inspector of outputs to upstream architect of the system's values. Reverse-engineering the expert's intuitive judgment from outputs is limited and fragile; explicit expert principles must be the blueprint.
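One way to picture "explicit expert principles as the blueprint" is to write each principle down as a named, reviewable criterion before any output is generated. The sketch below is only an illustration under stated assumptions: the `Principle` class, the two example principles, and the keyword checks are all hypothetical stand-ins, and in a real product each check would more likely be an expert review or a carefully validated grader.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Principle:
    """A single expert-authored quality principle (hypothetical structure)."""
    name: str
    rationale: str                 # why the expert cares, in their own words
    check: Callable[[str], bool]   # placeholder check; assumed, not a real grader


# Hypothetical principles, written by the expert up front rather than
# inferred later from a pile of graded outputs.
PRINCIPLES = [
    Principle(
        name="states_uncertainty",
        rationale="Answers should flag what is not known.",
        check=lambda text: any(w in text.lower() for w in ("may", "might", "uncertain")),
    ),
    Principle(
        name="no_absolute_guarantees",
        rationale="The product must never promise an outcome.",
        check=lambda text: "guaranteed" not in text.lower(),
    ),
]


def review(output: str, principles: list[Principle] = PRINCIPLES) -> dict[str, bool]:
    """Report, per principle, whether the output satisfies it."""
    return {p.name: p.check(output) for p in principles}


def failed_principles(output: str) -> list[str]:
    """Names of the principles an output violates, in blueprint order."""
    return [name for name, ok in review(output).items() if not ok]


if __name__ == "__main__":
    print(review("Results may vary by context."))
    print(failed_principles("This is guaranteed to work."))
```

The point of the sketch is the direction of flow: the principles and their rationales exist first, and outputs are read against them, instead of the team guessing at the expert's values from pass/fail labels.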

Why Numbers No Longer Tell the Whole Story

AI product teams are building systems that generate human language, yet they still rely largely on quantitative tools to measure them. Numbers can track accuracy, but they rarely capture context, tone, or trust. True quality lies in the qualitative, and building for it demands a new culture of measurement.

The Ownership Vacuum

When product executives are unsure who is responsible for AI quality among product managers, designers, engineers, or AI research teams, it signals that the old, siloed roles no longer match today's cross-functional reality. A systemic property cannot be assigned to a single function or role.

Metrics as a Guidance System, Not Just a Ruler

A systems-theory lens allows us to stop looking at isolated metrics and instead see the entire network of incentives we are creating. AI product metrics are the signals that guide the behavior of that system.

The Product is the Connector

Within this systems-centric framework, the AI product links the key participants:

the expert, who provides the objective,
the user, who provides the real-world feedback,
and the product design and development teams, who build and maintain the structure for information flow.

Quality is not the output of any single component, but an emergent property of the overall system. A good AI product is not just accurate. It is coherent.

Coherence is achieved when shared ownership across an AI organization creates a loop of shared meaning-making. It connects the domain expert's mental map to the real-world needs of the user, replacing linear, disconnected tools with a holistic and adaptive process.

Hamel Husain, "A Field Guide to Rapidly Improving AI Products", March 24, 2025.
Thomas & Uminsky, "Reliance on Metrics Is a Fundamental Challenge for AI", arXiv:2002.08512, 2020.
Edgar Morin, "Coherence and Epistemological Opening", in On Complexity, Hampton Press, 2008.