In January 2026, Klarna reported that its AI customer service agent now performs the work of 853 full-time employees and has saved the company $60 million. In the same earnings cycle, CEO Sebastian Siemiatkowski admitted publicly that the strategy had cost the company something far more valuable than the money it saved. He was still trying to buy it back.
This is not another story about AI being overhyped. It is the opposite. The AI worked brilliantly. And that was the problem.
The distinction between AI that fails and AI that succeeds at the wrong thing is rapidly becoming the most important unsolved problem in enterprise technology. It has a name now, one that the industry will be hearing a great deal more in the months ahead: intent engineering.
The Problem That Prompt Engineering Never Solved
For the past three years, the AI industry has moved through two distinct phases of instruction craft. First came prompt engineering — the art of writing precise instructions that guide model outputs. It taught a generation of practitioners that how you communicate with AI matters. But as models grew more capable, prompt engineering became less of a bottleneck. Today’s frontier models handle instructions robustly without elaborate prompt scaffolding.
Then came context engineering, popularized by Andrej Karpathy and others. This was the insight that what you feed into a model’s context window — the documents, examples, retrieval results, constraints — matters more than the prompt itself. Context engineering was a genuine advance. It moved the conversation from “how do I talk to AI” to “what does AI need to know.”
But both prompt engineering and context engineering share a fundamental limitation: they operate at the task level. They are about getting the right output for a single interaction.
Intent engineering operates at the organizational level. It asks a different question entirely: when an AI agent makes a decision autonomously, does that decision reflect what the organization actually values? Not what was most recently in the context window. Not what was specified in the prompt. What the organization needs.
Prompt engineering tells agents what to do. Context engineering tells agents what to know. Intent engineering tells agents what to want.
The Klarna Catastrophe: A Case Study in Misaligned Success
In early 2024, Klarna rolled out an AI-powered customer service agent across 23 markets in 35 languages. The numbers were staggering. It handled 2.3 million conversations in the first month. Average resolution time dropped from 11 minutes to two. The CEO projected $40 million in annual savings.
Then customers started complaining. Generic answers. Robotic tone. Zero ability to handle anything requiring judgment. By mid-2025, Siemiatkowski told Bloomberg that while cost had been the predominant evaluation factor, the result was lower quality across the board. Klarna began frantically rehiring the human agents it had laid off months earlier.
The comforting interpretation — the one that circulated widely in 2025 — was that AI simply cannot handle nuance. A more accurate reading, now visible with the benefit of hindsight, is that the AI was extraordinarily good at resolving tickets fast, and resolving tickets fast was the wrong objective.
Klarna’s actual organizational intent was not “resolve tickets fast.” It was “build lasting customer relationships that drive lifetime value.” But nobody formalized that distinction. Nobody translated it into a format the AI system could use. And so the agent did exactly what it was optimized to do, with devastating efficiency.
When a customer who had been loyal for three years encountered a frustrating experience, the AI agent treated them identically to a brand-new user with a simple question. The agent did not know that retention matters more than resolution speed. It did not know that tone mismatch is a leading indicator of churn. It did not know that some customers should be routed to humans — not because AI is incapable, but because preserving the relationship outweighs the efficiency gain.
These are not AI limitations. They are intent gaps. The 700 human agents who were laid off took with them institutional knowledge that was never captured, never formalized, and never made available to the system that replaced them: the understanding of which customers need patience, which situations require judgment, and which interactions build the relationship equity that drives long-term revenue.
The $200 Billion Intent Gap
Klarna is not an outlier. It is a preview.
McKinsey’s latest global AI survey found that 74% of companies report no tangible value from AI investment — up from 70% the prior year, despite average investment doubling. More money flowing in, returns not materializing.
Deloitte’s 2026 State of AI in the Enterprise report, surveying over 3,000 leaders across 24 countries, found that 84% of companies have not redesigned jobs around AI capabilities and only 21% have a mature model for agent governance. Meanwhile, 70% are planning to increase AI investment further. More spending, same structural blindness.
Salesforce surveyed 2,000 senior IT leaders and found that while 86% believe agentic AI is coming in the next one to three years, 74% say their organization is not ready. Leaders believe autonomous agents are inevitable. They have started paying for them. But almost none have solved the foundational problem of how those agents will know what the organization actually wants.
Perhaps no product illustrates the intent gap more clearly than Microsoft Copilot. When Microsoft launched Copilot in late 2023, it was the most aggressive enterprise AI sales campaign in history. 85% of Fortune 500 companies adopted it. And then adoption stalled hard. Gartner found that only 5% of organizations moved from pilot to larger-scale deployment. Bloomberg reported Microsoft slashing internal sales targets after the majority of salespeople missed their numbers. Even inside companies that signed six-figure Copilot deals, employees resisted — preferring ChatGPT or Claude for actual work.
The standard explanation centers on UX problems and model quality. Those are real issues, but they are not the fundamental one. The fundamental issue is that Copilot was deployed into organizations that had not done the groundwork to make it useful. Most companies threw Copilot at knowledge workers without defining how it should integrate into workflows, what organizational data it should access, what decisions it should influence, or what quality standards should apply. The result: technically functional AI producing organizationally useless outputs. Documents that did not match the company voice. Summaries that missed what mattered in meetings. Suggestions that were technically correct but strategically wrong.
Another instance of the intent gap. AI deployed without organizational intent infrastructure.
The Three Layers of Intent Engineering
When you break down what intent engineering actually requires, three distinct layers emerge. Most companies are stuck on the first one.
Layer 1: Unified Context Infrastructure
This is the foundational layer — how data, processes, and knowledge flow to AI systems. Deloitte found that only about 20% of executives are fully confident their data is AI-ready. Only 14% have implemented a fully unified data strategy. The rest are running AI off fragmented, inconsistent, partially accessible data.
The practical impact is devastating. An AI agent trying to answer a customer question might not know that the customer has an open support ticket in one system, a pending order change in another, and a history of high-value purchases in a third. The data exists. It is not connected in a way the agent can see.
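What "connected in a way the agent can see" means can be sketched concretely. The following is a minimal illustration, assuming three hypothetical record stores (support tickets, order changes, purchase history) keyed by customer ID; all names and the high-value threshold are invented for the example, not a real schema.

```python
# Sketch of a unified context layer: joining fragments of one customer's
# state from three hypothetical systems so an agent sees them together.

def unified_customer_view(customer_id: str,
                          support_db: dict,
                          orders_db: dict,
                          purchases_db: dict) -> dict:
    """Merge per-system fragments into a single view the agent can query."""
    purchase_total = sum(purchases_db.get(customer_id, []))
    return {
        "customer_id": customer_id,
        "open_tickets": support_db.get(customer_id, []),
        "pending_order_changes": orders_db.get(customer_id, []),
        "high_value": purchase_total > 1000,  # illustrative threshold
    }

# The data exists in three places; only the joined view is usable.
view = unified_customer_view(
    "cust-42",
    support_db={"cust-42": ["ticket-9001"]},
    orders_db={"cust-42": ["order-77: address change pending"]},
    purchases_db={"cust-42": [450, 900, 1200]},
)
```

The point is not the three dictionaries; in practice this layer is a data-integration effort spanning CRM, order management, and billing systems. The sketch only shows what the agent needs to receive: one coherent view instead of three silos.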
Organizations have invested heavily in deploying AI models — 72% have started, according to Deloitte — but data architecture investment has lagged by 40% in a single year. Companies are buying tools faster than they are building the organizational systems that make those tools useful.
Layer 2: Coherent AI Worker Toolkit
This layer concerns how humans and agents collaborate. What tools exist? What workflows are designed for human-agent interaction? Where should AI automation replace human effort, where should it augment it, and where should human judgment be non-negotiable?
Most organizations have no framework for these questions. The result is that AI tools are deployed in isolation, without coherent integration into the way work actually happens. Each department, sometimes each team, uses AI differently, with different expectations, different quality standards, and different understanding of what the AI is supposed to accomplish.
Layer 3: Intent Engineering Proper
This is the layer that almost certainly does not exist in your organization. It is the discipline of encoding organizational purpose into machine-readable, agent-actionable formats.
Traditional OKRs were designed for people. They encode human-readable goals and assume human judgment about prioritization, trade-offs, and ambiguity. When you give someone an OKR, they interpret it through layers of professional experience, team dynamics, cultural norms, and institutional memory.
An agent does not have ten years of professional experience. It does not have the social intelligence to read a room or the institutional memory to know that the last time someone tried a particular approach, it created a mess.
What intent engineering requires is a new kind of infrastructure — what you might call goal translation architecture. At the top, you need goal structures that agents can interpret and act on. Not “increase customer satisfaction” — that is a human-readable aspiration. An agent needs to know: what signals indicate customer satisfaction in our context? What data sources contain those signals? What actions am I authorized to take? What trade-offs am I empowered to make — speed versus thoroughness, cost versus quality? Where are the hard boundaries I may not cross?
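The questions above can be made concrete as a data structure. The sketch below is one hypothetical shape for an agent-interpretable goal, assuming a customer-service context; every field name, signal, and weight is illustrative, not a standard or an existing product's format.

```python
# A minimal sketch of a machine-readable goal structure. All names and
# values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class GoalSpec:
    objective: str              # the human-readable aspiration
    signals: list[str]          # measurable proxies for the objective
    data_sources: list[str]     # where those signals live
    authorized_actions: list[str]  # what the agent may do on its own
    tradeoffs: dict[str, float]    # relative weights for competing values
    hard_boundaries: list[str]     # lines the agent may never cross

satisfaction_goal = GoalSpec(
    objective="Build lasting customer relationships that drive lifetime value",
    signals=["repeat_purchase_rate", "csat_score", "churn_risk_delta"],
    data_sources=["crm.interactions", "orders.history", "support.tickets"],
    authorized_actions=["answer_question", "issue_refund_under_50",
                        "escalate_to_human"],
    tradeoffs={"resolution_speed": 0.3, "relationship_quality": 0.7},
    hard_boundaries=["never_auto_close_flagged_high_value_customer"],
)

# Sanity check: trade-off weights should form a complete allocation.
assert abs(sum(satisfaction_goal.tradeoffs.values()) - 1.0) < 1e-9
```

Notice what this forces: writing `tradeoffs={"resolution_speed": 0.3, "relationship_quality": 0.7}` requires the organization to decide, explicitly, that relationship quality outweighs speed. That decision is exactly what Klarna never encoded.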
Below that, you need delegation frameworks: organizational values translated into decision boundaries. Amazon’s leadership principles work for humans because humans can interpret “customer obsession” through judgment. For agents, you need specific decision trees. When two legitimate goals conflict, which wins? Under what circumstances can the agent deviate from standard procedure?
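One simple way to make such a delegation framework executable is an explicit precedence order plus explicit deviation bounds. The goal names, ordering, and refund threshold below are illustrative assumptions, chosen only to show the mechanism.

```python
# A minimal sketch of a delegation framework: precedence made explicit so
# an agent resolves goal conflicts deterministically. Names are illustrative.

PRECEDENCE = [              # earlier entries win conflicts
    "regulatory_compliance",
    "customer_retention",
    "resolution_speed",
    "cost_reduction",
]

def resolve_conflict(goal_a: str, goal_b: str) -> str:
    """When two legitimate goals conflict, the higher-precedence goal wins."""
    return min(goal_a, goal_b, key=PRECEDENCE.index)

def may_deviate(situation: dict) -> bool:
    """The agent may skip standard procedure only inside explicit bounds."""
    return (situation.get("customer_tier") == "high_value"
            and situation.get("refund_amount", 0) <= 50)
```

A flat precedence list is the simplest possible encoding; real organizations would likely need context-dependent rules. But even this crude version answers a question — "when speed and retention conflict, which wins?" — that most deployments leave to whatever the model happens to infer.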
And then the measurement layer: organizational KPIs re-engineered for agent accountability. Not just whether the agent completed its task, but whether it completed its task in a way that serves the organization’s broader objectives. Did the agent resolve the ticket? Yes. Did it build a lasting customer relationship? No. That second measurement is what intent engineering demands.
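That dual measurement can be sketched as a scoring function with two axes: task completion and intent alignment. The ticket fields and sentiment proxy below are hypothetical assumptions, not a real ticketing schema.

```python
# A minimal sketch of agent accountability on two axes: did the task
# complete, and did it serve the broader objective. Fields are illustrative.

def score_interaction(ticket: dict) -> dict:
    task_complete = ticket["resolved"]
    # Relationship proxy: sentiment held steady and the customer stayed.
    relationship_ok = (ticket["post_sentiment"] >= ticket["pre_sentiment"]
                       and not ticket["churned_within_90d"])
    return {
        "task_complete": task_complete,
        "intent_aligned": task_complete and relationship_ok,
    }

# A fast resolution that damaged the relationship: task done, intent missed.
fast_but_cold = {
    "resolved": True,
    "pre_sentiment": 0.6,
    "post_sentiment": 0.2,
    "churned_within_90d": True,
}
score = score_interaction(fast_but_cold)
```

An agent evaluated only on `task_complete` would call this interaction a success; the second axis is what surfaces the Klarna failure mode before it compounds.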
Why This Has Not Been Built Yet
Part of the answer is pure speed. The race to deploy AI has wildly outpaced the infrastructure needed to direct it; deployment budgets keep growing while the organizational groundwork falls further behind.
But the deeper reason is that nobody’s job description says “translate organizational intent into machine-readable agent parameters.” The closest roles — chief AI officer, head of strategy, VP of operations — are consumed by the deployment race itself. The genuinely new discipline of formalizing organizational purpose into structured, queryable, agent-actionable formats does not yet have a home in the corporate org chart.
Google’s Agent Development Kit is one of the earliest technical attempts to formalize pieces of this, separating agent context into working context, session memory, long-term memory, and artifacts, each with specific governance. Academic research from Google DeepMind has proposed five levels of AI agent autonomy — operator, collaborator, consultant, approver, and observer — each implying different governance requirements.
But these are technical scaffolding. The organizational practice of intent engineering — the hard work of making institutional values, priorities, and trade-off hierarchies machine-readable — remains largely unbuilt.
The Intent Race
The race in enterprise AI is no longer about model capability. Models are converging. They are all competent. What differs dramatically is whether organizations have the infrastructure to direct those capabilities toward what actually matters.
Klarna had access to world-class AI. The AI worked. But it was not aligned with what the company actually needed, because the company had never formalized what it actually needed in a way an AI system could use. That is not a technology problem. It is an organizational infrastructure problem.
The next wave of enterprise AI value will not come from better models or another Copilot license. It will come from organizational intent architecture: making your company’s goals, values, decision frameworks, and trade-off hierarchies discoverable, structured, and agent-actionable. It will come from building the alignment infrastructure that lets agents make decisions that are not just technically correct, but strategically coherent.
The companies that solve this first will have a genuine and durable competitive advantage. Their agents will work in the way the organization actually needs. Their competitors will still be debugging expensive chatbots.
🧭 Decision Radar
| Dimension | Assessment |
|---|---|
| Relevance for Algeria | High — Algerian enterprises deploying AI customer service and automation face the same intent gap |
| Infrastructure Ready? | Partial — AI tools available but organizational intent infrastructure largely absent |
| Skills Available? | No — intent engineering as a discipline doesn’t exist in Algerian enterprises yet |
| Action Timeline | 6-12 months |
| Key Stakeholders | CTOs, COOs, AI project leads, digital transformation teams |
| Decision Type | Strategic |
Quick Take: Before rushing to deploy AI agents, Algerian enterprises must first formalize what they actually want those agents to optimize for — not just the easy-to-measure metrics but the organizational values that drive long-term success.
Sources & Further Reading
- Deloitte 2026 State of AI in the Enterprise Report
- McKinsey Global AI Survey 2025-2026
- Salesforce 2025 State of IT Survey: Agentic AI Readiness
- Bloomberg: Klarna CEO Admits AI Strategy Cost Quality
- Gartner: Microsoft 365 Copilot Adoption Stalls at 5% Scale
- Google DeepMind: Levels of AI Agent Autonomy (2025)