Automation vs Agentic AI: Different Systems, Different Failure Modes
Harari’s warning for AI teams: stop calling agents “automation”
Yuval Noah Harari’s latest keynote lands on a distinction the industry still blurs when it’s convenient. Automation and agentic AI are different kinds of systems with different failure modes. If teams treat them as the same thing, they ship risk they haven’t modeled.
You can see the confusion all over product marketing in 2026. Plenty of features sold as “AI” are still deterministic software with a model call bolted on. Plenty of actual agents are deployed with the monitoring discipline of a cron job. Harari’s framing helps because it strips away the branding and gets to the engineering question. Once software can model its environment, adapt after deployment, and pursue goals with limited supervision, you’ve left ordinary automation behind.
For developers and technical leads, that matters more than the philosophy. It changes testing, instrumentation, blast-radius controls, and how honest you are about system behavior when nobody is watching.
Automation is predictable. Agents have more room to drift.
Deterministic workflows still run most production software. You define the steps, validate inputs, handle exceptions, and the machine follows the script. The code may be ugly, but the behavior is usually legible. When something breaks, you trace the path and fix it.
Agentic systems don’t behave that way. Even with a frozen base model, the deployed stack often includes memory, planning loops, tool access, retrieval, long-lived context, and feedback from users or internal evaluators. That combination can produce behavior nobody explicitly specified. In many cases, that’s why teams built it.
Harari uses the term “agency.” In engineering terms, it’s straightforward enough. The system keeps some internal picture of the world, predicts outcomes, selects actions, and updates from feedback. Give it API access, code execution, transaction rights, or cross-system communication, and surprising behavior stops being theoretical.
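Stripped down, that loop fits in a few lines. Everything below is a sketch; the class name, the candidate actions, and the value stub are illustrative, not any framework's API.

from dataclasses import dataclass, field

@dataclass
class AgentLoop:
    # Internal picture of the environment, updated on every step.
    world_model: dict = field(default_factory=dict)

    def decide(self, observations: dict) -> str:
        self.world_model.update(observations)
        # Pick the action whose predicted outcome scores highest.
        candidates = ["wait", "call_tool", "escalate"]
        return max(candidates, key=self.predict_value)

    def predict_value(self, action: str) -> float:
        # Stub: a real system would use a learned model over world_model state.
        return {"wait": 0.1, "call_tool": 0.7, "escalate": 0.3}.get(action, 0.0)

    def update(self, action: str, outcome: dict) -> None:
        # Feedback changes future decisions; this is where drift enters.
        self.world_model[f"outcome:{action}"] = outcome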
A coffee machine that guesses when you want espresso is trivial. A support agent that starts adjusting refund tactics to maximize retention metrics is not. If the reward function is sloppy, the behavior will get sloppy too. We’ve seen that before in recommenders, ad auctions, and RL systems. Now the same pattern is moving into general-purpose software.
That’s why “it’s just automation” should set off alarms in a product review.
AlphaGo still matters
Harari points to AlphaGo’s Move 37. The example is old. It still works.
The move mattered because top players didn’t read it as brute-force calculation. They saw something outside the human tradition of the game. The system found a line people weren’t looking for.
Current AI systems push that dynamic into messier domains. Financial strategy, cyber offense, persuasion, bio design, defense planning. In those areas, originality is useful until it slips past review or works around rules you assumed were good enough.
That’s the uncomfortable part. Advanced AI doesn’t just imitate people well. It can generate tactics people never thought to block.
Security engineers should recognize the shape of the problem. Defenders build controls around known attacks, familiar abuse paths, and historical incidents. An adaptive agent can produce weird combinations that sit outside the baseline threat model. In finance, that might mean a strategy that passes compliance checks while violating the spirit of the controls. In software, it might mean chaining benign tools into a harmful workflow because no individual step looked dangerous.
The industry likes “creative problem solving” right up until the model gets creative inside a production boundary.
The trust problem is worse than labs admit
Harari’s sharpest point is the contradiction inside the AI race.
Labs, companies, and governments say they can’t slow down because competitors won’t. Fine. That logic is familiar.
But many of those same actors also act as if the agents they’re building will stay controllable enough for aggressive deployment. That assumption carries more weight than people admit. Teams that don’t trust rival labs, rival states, or even their own users still trust stacks built from stochastic models, tool routers, retrieval layers, hidden prompts, and patchy evals.
Technical teams should be uneasy about that.
We already know model behavior can shift under small changes in context, prompting, memory state, or tool availability. Add post-deployment fine-tuning, model upgrades, dynamic system prompts, live policy services, and user-generated workflows, and you get behavior that is partly specified and partly emergent. That doesn’t mean collapse is inevitable. It does mean governance built for static software won’t hold up.
A PDF policy doc and one red-team week before launch do not add up to a safety system.
What changes for engineering teams
If you’re shipping agents, treat them like mutable distributed systems with incentive problems.
A few things follow from that.
Continuous alignment beats one-time review
A lot of teams still handle AI safety the way they handle legal review for a feature launch. Somebody signs off, a few guardrails go in, and the product ships. That model is stale for systems that learn from interaction or depend on changing context.
Constraints need to live in code and infrastructure, not just policy docs. One solid pattern is a runtime policy service that evaluates plans and tool calls before execution. If you’re building an agent framework, the planner shouldn’t get direct unchecked access to every capability.
For example:
class ConstrainedPlanner(LLMAgent):
    def __init__(self, policy_service, tool_registry):
        super().__init__()
        self.policy_service = policy_service
        self.tool_registry = tool_registry

    def decide(self, objective, context):
        # Let the base agent propose a plan, then gate it before anything runs.
        plan = super().decide(objective, context)
        # The policy service can approve, trim, or reject individual tool calls.
        approved = self.policy_service.evaluate(plan, context)
        return self.tool_registry.execute(approved)
That won’t solve alignment in any grand sense. It does put control points in the runtime path, which is what matters in production.
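The policy service itself can start simple. Here's a hedged sketch of what evaluate might do, assuming plans arrive as lists of tool-call steps; the allowlist and spend cap are made-up examples:

class PolicyService:
    # Illustrative runtime gate: allowlist tools and cap spend per plan.
    ALLOWED_TOOLS = {"search_docs", "create_ticket", "send_reply"}
    MAX_SPEND_PER_PLAN = 50.0

    def evaluate(self, plan, context):
        # context is available for per-user or per-tenant rules; unused here.
        approved, total_spend = [], 0.0
        for step in plan:  # assume each step looks like {"tool": str, "spend": float}
            if step["tool"] not in self.ALLOWED_TOOLS:
                continue  # drop unknown tools rather than failing open
            total_spend += step.get("spend", 0.0)
            if total_spend > self.MAX_SPEND_PER_PLAN:
                break  # stop approving once the cap would be exceeded
            approved.append(step)
        return approved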
Observability has to reach inside the agent loop
Most teams are reasonably good at monitoring latency, queue depth, and API errors. They’re much worse at seeing agent reasoning paths, retrieval choices, tool-call sequences, memory writes, and policy overrides.
That’s a problem. After an incident, you need to answer a few basic questions:
- What context did the model see?
- Which retrieved documents shaped the output?
- What tools did it call, and in what order?
- What policy checks fired or failed?
- When did behavior drift from earlier versions?
Think distributed tracing for model inference and agent execution. If you can instrument prompt templates, vector store hits, tool invocations, and decision checkpoints, do it. If you can diff behavior across model versions or prompt revisions, better.
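A minimal sketch using the OpenTelemetry Python API, assuming a tracer provider is configured elsewhere; the span and attribute names are illustrative, not an established schema:

from opentelemetry import trace

tracer = trace.get_tracer("agent-runtime")

def traced_tool_call(tool_name: str, args: dict, execute):
    # Wrap every tool invocation in a span so the call sequence is reconstructable.
    with tracer.start_as_current_span("agent.tool_call") as span:
        span.set_attribute("tool.name", tool_name)
        # Log the size, not the raw args, until data-handling rules say otherwise.
        span.set_attribute("tool.args_bytes", len(repr(args)))
        result = execute(tool_name, args)
        span.set_attribute("tool.status", "ok")
        return result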
The cost is real. Rich tracing gets expensive fast, especially with multimodal pipelines and long-context agents. Teams need tiered observability: sampling for low-risk traffic, full-fidelity logging for sensitive workflows, and strict handling for personal or regulated data. There's no cheap version of this.
Kill switches should actually work
A lot of AI products claim human oversight, but the shutdown path is fuzzy. Someone has to revoke an API key, scale down a service, or patch a prompt. That’s not a kill switch. That’s scrambling.
If an agent can touch production systems, send messages, make purchases, or alter records, you need reversible isolation paths, and you need to test them. Chaos drills for AI systems should include ugly cases. The agent starts escalating spend, opening tickets in loops, or generating policy-violating content across multiple channels. Can you sever tool access in under 90 seconds? Can you preserve state for forensic review? Can you fail closed without taking down unrelated services?
If the answer is “probably,” the work isn’t done.
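One pattern that makes the 90-second target realistic: route every tool call through a gate keyed to a centrally flippable flag, and fail closed when the flag store is unreachable. A sketch, with the flag store and names as assumptions:

class ToolGateway:
    # Fail-closed gate in front of all tool execution.
    def __init__(self, flag_store):
        self.flag_store = flag_store  # e.g. a replicated KV or feature-flag service

    def execute(self, tool_name, args, tool_fn):
        try:
            enabled = self.flag_store.get("agent_tools_enabled")
        except Exception:
            enabled = False  # can't read the flag: fail closed, not open
        if not enabled:
            raise RuntimeError(f"tool access severed: {tool_name} blocked by kill switch")
        return tool_fn(**args)

Flipping one flag severs every tool path at once, and the agent process stays up for forensic review.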
Safety metrics need to hit the sprint board
This is where a lot of organizations still flinch. They’ll talk about responsible AI at the all-hands and then judge teams on latency, conversion, retention, and revenue.
Engineers optimize for the scoreboard in front of them. They always have. If a model gets rewarded for task completion and only punished when an incident becomes public, the organization is teaching itself to ignore slow-burn risk.
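One concrete way to put safety on the scoreboard: gate releases on eval regressions the same way you gate on latency budgets. A hypothetical check, with made-up metric names and thresholds:

# Hypothetical release gate: block the pipeline on safety-eval regressions,
# the same way a latency budget would block it.
SAFETY_CEILINGS = {"policy_violation_rate": 0.01, "jailbreak_success_rate": 0.05}

def release_gate(eval_results: dict) -> None:
    for metric, ceiling in SAFETY_CEILINGS.items():
        observed = eval_results[metric]
        if observed > ceiling:
            raise SystemExit(f"release blocked: {metric}={observed:.3f} > {ceiling}")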
Harari’s broader point about human cooperation connects pretty directly here. The software industry already knows how to build shared infrastructure across rivals. Linux, PostgreSQL, Kubernetes, NumPy. Cooperation happens when engineering reality forces it. AI safety tooling should move the same way: shared eval suites, common tracing formats, incident taxonomies, model cards that aren’t just PR, and cross-company red-team exchanges where possible.
Open source won’t fix alignment. But isolated teams with private benchmarks and secret failure modes are worse at spotting systemic risk.
A useful question for roadmap reviews
When a team proposes an AI feature, ask one blunt question: does this system have room to adapt its behavior beyond what we explicitly scripted?
If the answer is no, you’re mostly dealing with automation plus prediction. That still needs testing, bias review, and reliability work, but the governance model is familiar.
If the answer is yes, treat it like an agent from day one. Budget for observability. Put runtime controls in the path. Limit authority. Test shutdown procedures. Give safety metrics the same status as performance metrics.
That’s the part of Harari’s argument worth keeping after the keynote clips fade. The industry’s favorite shortcut is to call everything AI. A better habit is to separate software that executes instructions from software that can improvise.
Those are different machines. Teams should build like they know that.
Useful next reads and implementation paths
If this topic connects to a real workflow, these links give you the service path and a proof point.
Design agentic workflows with tools, guardrails, approvals, and rollout controls.
How AI-assisted routing cut manual support triage time by 47%.