When AI Fails, What Actually Failed? The Distinction AI Governance Keeps Missing
Michael A. Santoro / Jun 10, 2026

In February, a US Tomahawk missile struck the Shajareh Tayyebeh girls’ school in Minab, Iran, killing 165 people, most of them girls between the ages of 7 and 12. A preliminary military investigation found that the strike resulted from outdated targeting data in a Defense Intelligence Agency database. The school had been mislabeled as a military facility, a classification that had been wrong for at least a decade.
In the aftermath, two strands of commentary emerged, each capturing part of what went wrong. One focused on AI itself, questioning whether such systems should be accorded weight in lethal targeting decisions and pointing to ways the technology may corrupt the human judgment meant to supervise it. The other traced the failure to intelligence verification procedures — verification that has nothing intrinsically to do with AI — and argued for reform in those procedures rather than in the AI architecture.
Each strand has substantial merit. Each is also incomplete. The public conversation about AI failures lacks a vocabulary for distinguishing what is actually being debated, and until that vocabulary is built, reforms will continue to misfire — tightening the wrong controls in one direction and leaving the right ones unaddressed in the other.
As I have argued in several recent pieces, meaningful accountability for AI systems deployed by governments and militaries must be built upstream, into design choices, into authorization processes, into the architecture of the systems themselves, rather than into last-minute human overrides at the point of action. I stand by that argument.
But conversations with practitioners over the past several months have sharpened my view in an important way. The upstream-accountability argument assumes that the systems are reliable enough to bear the responsibility assigned to them. They are not — at least not yet. And the public debate about AI governance keeps collapsing two fundamentally different categories of failure into one.
Disentangling them is the first step toward a more honest conversation about what AI accountability can and cannot do given the current state of the technology.
Imperfect information
The first problem is as old as governance itself. Decisions are only as good as the information on which they are based. When data is faulty, outdated, or incomplete, any decision-maker, human or algorithmic, is likely to get it wrong.
The Minab strike is a devastating illustration. The school had been clearly separated from an adjacent military compound for at least a decade. The compound's security posts had been removed years earlier. The school was the only operational facility at the site. Yet the targeting database carried a classification from a different era. A human operator consulting that database in real time would have reached the same catastrophic conclusion. The failure was upstream, in the intelligence gathering, verification, and database maintenance processes that preceded any targeting decision, whether made by a human or an algorithm.
The same pattern recurs in civilian AI deployments. Epic Systems' sepsis prediction model, a machine-learning tool implemented in hundreds of US hospitals, was shown in a 2021 JAMA Internal Medicine study to have missed roughly two-thirds of sepsis cases at Michigan Medicine while generating clinically unmanageable rates of false alerts. The breakdown occurred upstream of the model: the electronic health record data the model relied on at the deployment site was incomplete and inconsistently coded, with key lab values, vital signs, and chart updates not entered in time or entered in fields the model could not parse. A clinician working from the same EHR would have been working from the same compromised information. The failure was not the inference. It was the data that the inference was asked to read.
A clarifying note from the engineering side is worth making here. "Data" in AI systems means two distinct things: the curated datasets used to train and validate a model, and the runtime inputs that feed into a deployed system. Minab was a failure of the second kind. Bias embedded during training, which I address below, is a failure of the first. Both are, in a literal sense, data problems, but they call for very different remedies.
The lesson here is about data, not about autonomy. Whether the targeting decision is made by a human analyst, by an AI system, or by some combination of both, the quality of the underlying information bounds the quality of the decision. Improving the quality, timeliness, and verification of intelligence data is an urgent priority in its own right — a question logically prior to, and separable from, the debate over whether and how AI should be involved in such decisions at all. Better methods of information gathering — improved satellite imagery analysis, pattern recognition across large datasets, faster cross-referencing of intelligence sources — may improve data quality. But no analyst and no algorithm can compensate for fundamentally flawed inputs.
Imperfect systems
The second problem is different in kind. AI systems themselves — the models, algorithms, and architectures that power them — are not yet perfected. They are subject to limitations inherent in the current state of the technology, not in the quality of the data they receive. This is the part of the conversation that public-policy writing tends to gloss over, often treating AI systems as more capable and more stable than they actually are.
These limitations take several forms. Models trained predominantly on data from specific operational environments can fail catastrophically when confronted with genuinely novel situations. This brittleness, an inability to adapt reliably when presented with inputs outside the training distribution, can produce outcomes that are not merely wrong but systematically wrong in ways that are difficult to anticipate.
In military contexts, researchers have warned that a single unlawful use of a civilian vehicle by combatants could lead a system to classify all similar vehicles as legitimate targets. This is the mirror image of Minab. There, the data was wrong and the system processed it faithfully — a human consulting the same database would have reached the same conclusion. Here, the underlying observation is accurate, but the system draws a categorical inference no human analyst would make. Minab is a failure of information. This is a failure of the system itself.
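For readers who want the engineering intuition behind brittleness, one crude way to notice an input that falls outside the training distribution is to compare each runtime value against the spread of values seen in training. The sketch below is illustrative only: the function names, the feature layout, and the three-standard-deviation threshold are all assumptions, and fielded systems would likely rely on far richer checks.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def fit_feature_stats(training_rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Summarize each feature column of the training data as (mean, std dev).

    Needs at least two training rows, because stdev() is undefined for one point.
    """
    columns = list(zip(*training_rows))
    return [(mean(col), stdev(col)) for col in columns]


def is_out_of_distribution(
    row: Sequence[float],
    stats: Sequence[Tuple[float, float]],
    max_z: float = 3.0,  # assumed threshold, not an established standard
) -> bool:
    """Flag a runtime input if any feature sits far from what training covered."""
    for value, (mu, sigma) in zip(row, stats):
        if sigma == 0:
            # Training never varied on this feature, so any change is novel.
            if value != mu:
                return True
            continue
        if abs(value - mu) / sigma > max_z:
            return True
    return False


# Hypothetical numbers, invented purely for illustration.
training = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 9.0]]
stats = fit_feature_stats(training)
familiar = is_out_of_distribution([2.0, 10.5], stats)
novel = is_out_of_distribution([50.0, 10.5], stats)
```

A check of this kind can only flag novelty. It says nothing about how the system should behave once novelty is detected, which is the harder design question.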
Training data can also entrench bias, even when the data appears neutral on its face, through feedback loops, proxy variables, and flawed logical reasoning. Systems trained on surveillance footage and behavioral patterns may encode profiling based on race, religion, or geography. In government contexts more broadly, predictive systems used in child welfare, criminal justice, and resource allocation have been shown to reproduce and, in some cases, amplify biases present in the institutional data on which they were trained.
Beyond bias and brittleness lies the deeper challenge of opacity. Many high-performing AI systems operate in ways that are insufficiently transparent — even to their designers — to support reliable attribution of outcomes or prediction of behavior under operational stress. When a system produces an unexpected outcome, it may not be possible to reconstruct the chain of reasoning that led to it. That problem becomes especially acute when the outcome is a lethal strike or a denial of critical public services.
Structurally, this resembles the attribution problem that has long preoccupied nuclear strategists: accountability depends on the ability to identify the source of a particular outcome. Opacity undermines AI accountability for much the same reason that imperfect attribution defeats nuclear deterrence. And some degree of this unpredictability is not a bug to be patched but an inherent property of statistical inference itself — the trade-off for capabilities that exceed what rule-based systems can deliver.
These are not fringe concerns. They are the everyday operating conditions of contemporary AI. To the engineers who build and deploy these systems, the distinction between "the data was wrong" and "the system itself behaved unpredictably" is so basic as to be almost trivial. In the public policy debate, it has somehow remained invisible. Part of the difficulty is that the underlying technology is moving fast enough that policy debate struggles to characterize a moving target, but that is an argument to sharpen the categories, not to blur them.
Why the distinction matters
Conflating these two categories of failure produces three distinct governance errors.
First, misdiagnosed accountability. When a catastrophe results from faulty information, as in the Minab school attack, accountability properly falls on those responsible for maintaining the integrity of the intelligence chain: the analysts, database managers, and verification processes that failed to update a decade-old classification. When a catastrophe results from a system performing in ways its designers did not anticipate, accountability falls on those who designed, validated, and authorized the deployment of the system itself. The actors are different. The failures are different. The remedies are different.
Second, misdirected reform. Imperfect information calls for better intelligence processes, more rigorous verification protocols, and more frequent data updates. These reforms are largely organizational and procedural. Imperfect systems call for continued research, more rigorous pre-deployment testing, greater transparency in model design, and more realistic assessments of whether the technology is mature enough for specific applications. Treating all AI failures as information problems leads to underinvestment in system validation. Treating all failures as system problems leads to unnecessary resistance to technologies that may perform effectively with better data.
Third, the distinction demands intellectual honesty about the current state of AI. The principle that accountability belongs upstream is sound, but its practical implementation depends on continued progress in making AI systems more robust, more transparent, and more predictable. It also depends on two diagnostic capacities that the field has not yet built.
The first is the proper classification of failure types. The second is a solution to what I have called the denominator problem: the field's near-absence of basic measurement infrastructure for counting AI harms against the total number of opportunities that produce them. Without these capacities, even our most ambitious accountability frameworks cannot tell us whether they are working.
Until they are developed, frameworks must explicitly account for both categories of shortcomings, demanding better information and better systems, rather than treating either alone as sufficient. And the distinction this piece draws — between bad data and brittle systems — is itself only the first cut. A mature governance vocabulary will eventually need to separate distrust rooted in bad data, distrust rooted in system brittleness, distrust rooted in opacity, and distrust rooted in the irreducible uncertainty of statistical inference itself.
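The denominator problem described above can be made concrete with a small calculation. The sketch below is a hedged illustration, not a proposed standard: the `HarmTally` record, its field names, and the example counts are all invented, and the Wilson score interval is simply one common way to express uncertainty around a rate.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Tuple


@dataclass
class HarmTally:
    """Hypothetical record for one deployed system over one reporting period."""
    harms: int                    # documented harmful outcomes (the numerator)
    opportunities: Optional[int]  # total decisions made (the denominator), if counted


def harm_rate(tally: HarmTally) -> Optional[float]:
    """Return harms per decision, or None when the denominator was never measured."""
    if tally.opportunities is None or tally.opportunities <= 0:
        return None
    return tally.harms / tally.opportunities


def wilson_interval(harms: int, opportunities: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for the underlying harm rate."""
    n = opportunities
    p = harms / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Invented counts: the same five harms, with and without a known denominator.
uncounted = harm_rate(HarmTally(harms=5, opportunities=None))
counted = harm_rate(HarmTally(harms=5, opportunities=1000))
low, high = wilson_interval(5, 1000)
```

With no denominator, the five harms cannot be turned into a rate at all, so there is no way to tell whether a reform moved it. With one, the rate and its uncertainty can at least be tracked over time.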
And there is increasing urgency to this work, because the federal scaffolding that might once have shared it is being withdrawn. A recent Presidential Executive Order has taken federal preclearance explicitly off the table and adopted an innovation-first posture, with the practical effect that the burden of distinguishing data failures from system failures — and of building the measurement infrastructure that would credibly demonstrate the distinction — now falls on deployers themselves rather than on the agencies that might once have regulated them.
Policy implications: Two tracks, not one
Three concrete implications follow.
First, AI governance frameworks should establish separate accountability tracks for data integrity and system integrity. A failure investigation should not be permitted to conclude with a finding on one dimension alone without also evaluating the other.
Second, when the data is bad, the algorithm makes it worse. This may sound like common sense, and it is — but policy debates continue to treat AI capability and data quality as separable investments. They are not. A flawed input into a manual decision may produce a single flawed outcome. The same flawed input into an algorithmic system can produce many flawed outcomes, faster and at scale. Algorithms amplify whatever the data gives them. Improving systems without improving data does not produce better decisions; it produces worse decisions, faster.
Third, data-integrity reform and system-integrity research compete for the same budgets and the same political attention, and treating them as substitutable is a false economy. An agency can fund a data-modernization initiative, point to it, and leave the underlying system failures wholly unaddressed — or pour money into model research while the data pipeline quietly decays. Neither investment discharges the need for the other. Both must be funded, and the public conversation must learn to ask for both.
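The first of these implications, that an investigation should not close on one dimension alone, could be enforced in something as simple as the record an investigator fills out. The sketch below assumes a hypothetical `FailureInvestigation` record with two required findings. The class, its fields, and the wording of its output are illustrative assumptions, not a description of any existing review process.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Finding(Enum):
    NO_FAULT = "no fault found"
    FAULT = "fault found"


@dataclass
class FailureInvestigation:
    """Hypothetical two-track record: both tracks must be evaluated before closing."""
    incident: str
    data_integrity: Optional[Finding] = None    # was the information right?
    system_integrity: Optional[Finding] = None  # did the system behave as validated?

    def can_close(self) -> bool:
        """An investigation is complete only when both tracks have a finding."""
        return self.data_integrity is not None and self.system_integrity is not None

    def close(self) -> str:
        """Return a summary, or refuse if either track was left unevaluated."""
        if not self.can_close():
            missing = [
                name
                for name, value in (
                    ("data integrity", self.data_integrity),
                    ("system integrity", self.system_integrity),
                )
                if value is None
            ]
            raise ValueError("cannot close; no finding on: " + ", ".join(missing))
        return (
            f"{self.incident}: data integrity, {self.data_integrity.value}; "
            f"system integrity, {self.system_integrity.value}"
        )
```

The design choice is that a finding of "no fault" on one track must be recorded explicitly. Leaving a track blank is treated as an unfinished investigation, not as a clean bill of health.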
The public debate about AI in government would benefit enormously from this distinction. It is the difference between asking “was the data right?” and asking “is the system ready?” Both questions matter. They are not the same question. Until we can ask each one separately, we cannot answer either one well.
Thanks to I-Jeng Wang of the Johns Hopkins Applied Physics Laboratory for helpful comments.