Generative AI April 10, 2025

What Counts as an AI Agent? A Practical Definition for Developers

AI agents are being oversold. The useful distinction is simpler than the hype suggests.

“Agent” has become one of the sloppiest words in AI marketing.

A chatbot gets called an agent. A scheduled automation with an LLM attached gets called an agent. A retrieval system gets called an agent because it sounds better on a slide. For developers, the distinction that matters is simpler: who decides the next step?

Strip the jargon out and you mostly get three kinds of systems:

  1. a language model that answers prompts
  2. a workflow that follows steps a human defined
  3. an agent that can choose actions, use tools, and retry on its own

That should be straightforward. It stops being straightforward once every vendor starts saying “agentic.”

Plain LLMs

A standard chat app built on GPT, Claude, Gemini, or an open model is the easy case:

user input -> model -> text output

Ask for an email draft, you get an email draft. Ask for a regex, a SQL query, or a code review note, same pattern.
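The whole architecture fits in a few lines. In this sketch, `call_model` is a stand-in for any real chat API (OpenAI, Anthropic, a local model); the stub just echoes for illustration:

```python
# Minimal sketch of the plain-LLM pattern: one prompt in, one completion out.

def call_model(system_prompt: str, user_input: str) -> str:
    """Placeholder for a real chat-completion call."""
    # A real app would hit an API here; this stub just echoes.
    return f"[model reply to: {user_input!r}]"

def answer(user_input: str) -> str:
    # The entire "architecture": user input -> model -> text output.
    return call_model("You are a helpful assistant.", user_input)

print(answer("Draft a short email declining a meeting."))
```

There is no loop, no tool, no decision point. Everything interesting happens inside one model call.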

The model may feel smart because it’s strong at language and decent at inference. But by itself it’s still limited by the context window, the training data, and whatever system prompt you wrapped around it. It doesn’t know your calendar. It doesn’t know today’s weather. It doesn’t know what’s in your internal wiki unless you give it that information somehow.

A lot of “agent” demos are just LLMs with better plumbing.

Most production systems are workflows

Once a model gets access to external data or tools, people often jump straight to “agent.” Usually that’s wrong.

A workflow is still human-authored logic. The system can call APIs, query a vector store, read a spreadsheet, summarize documents, and write results somewhere else, but the path is fixed in advance.

For example:

  • Check Google Calendar for an event
  • Call a weather API for that date and location
  • Generate a summary
  • Optionally send it by email or text-to-speech
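The steps above can be sketched directly. Every call below is a stub standing in for a real service (Google Calendar, a weather API, an LLM for the summary); the point is that a human fixed the order, not the model:

```python
# A fixed workflow: calendar -> weather -> summary. The sequence is
# hardcoded; the model (if used at all) only fills in the summary text.

def get_next_event() -> dict:
    # Stub for a calendar API call.
    return {"title": "Team offsite", "date": "2025-04-12", "city": "Berlin"}

def get_weather(date: str, city: str) -> dict:
    # Stub for a weather API call.
    return {"forecast": "partly cloudy", "high_c": 14}

def summarize(event: dict, weather: dict) -> str:
    # In a real pipeline this string would be an LLM prompt and response.
    return (f"{event['title']} on {event['date']} in {event['city']}: "
            f"{weather['forecast']}, high {weather['high_c']}°C.")

def run_workflow() -> str:
    event = get_next_event()                             # step 1: fixed
    weather = get_weather(event["date"], event["city"])  # step 2: fixed
    return summarize(event, weather)                     # step 3: fixed

print(run_workflow())
```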

Useful? Sure. Autonomous? Not really.

The model may generate part of the output, or choose among tightly constrained options, but you still defined the sequence. If the result is bad, a human edits the prompt, changes a branch condition, or adds another API call. Humans still own the reasoning loop.

That’s also where RAG, or retrieval-augmented generation, usually fits. The idea sounds fancier than it is: fetch relevant information first, then ask the model to answer with that context. A docs chatbot backed by embeddings and a vector database is generally a workflow. Same for a support assistant that looks up order status before replying.
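A minimal RAG sketch makes the workflow shape obvious: retrieve first, then generate with that context. The "vector store" here is a toy keyword match standing in for real embeddings, and `call_model` stands in for any chat API:

```python
# RAG as a workflow: the retrieval step and the generation step are fixed.
# The model never chooses what happens next.

DOCS = [
    "Refunds are processed within 5 business days.",
    "Support hours are 9am-5pm CET, Monday to Friday.",
]

def retrieve(query: str) -> list[str]:
    # Toy retrieval: keep docs that share a word with the query.
    # A real system would use embeddings and a vector database.
    words = set(query.lower().split())
    return [d for d in DOCS if words & set(d.lower().split())]

def call_model(prompt: str) -> str:
    # Stub for a real chat-completion call.
    return f"[answer grounded in: {prompt!r}]"

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    return call_model(f"Context:\n{context}\n\nQuestion: {query}")

print(rag_answer("When are refunds processed?"))
```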

That’s good news. Workflows are easier to audit, easier to test, and much easier to stop from doing something dumb.

What actually makes something an agent

You have an agent when the model gets room to decide what to do next.

Usually that means three capabilities:

  • Reasoning about the task and available options
  • Acting through tools like APIs, browsers, databases, or other models
  • Iterating when the first result fails a check

The usual label here is ReAct, short for reason-and-act. The idea isn’t new. What’s changed is that current tooling makes it easier for non-research teams to build around it.
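A toy version of that loop shows the structural difference from a workflow: the runtime executes whatever tool the model picks, feeds the observation back, and repeats until the model declares it is done or a hard step cap fires. The tool names and the `decide` stub are illustrative, not any specific framework's API:

```python
# ReAct-style loop sketch: reason (decide), act (call tool), observe,
# repeat. The control flow is owned by the model, bounded by max_steps.

TOOLS = {
    "search": lambda q: f"3 clips matched '{q}'",
    "refine": lambda q: f"1 clip matched '{q} close-up'",
}

def decide(goal: str, history: list) -> tuple[str, str]:
    # Stand-in for the model's reasoning step. A real agent would prompt
    # an LLM with the goal and history, then parse its chosen action.
    if not history:
        return ("search", goal)
    if "3 clips" in history[-1]:
        return ("refine", goal)   # too many results: narrow the query
    return ("done", "")

def run_agent(goal: str, max_steps: int = 5) -> list:
    history = []
    for _ in range(max_steps):        # hard cap: autonomy with a budget
        tool, arg = decide(goal, history)
        if tool == "done":
            break
        history.append(TOOLS[tool](arg))  # act, then observe
    return history

print(run_agent("skier clips"))
```

Swap the `decide` stub for a real model call and you have the core of most agent frameworks.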

An agent can take a goal like “find the clips in this video corpus that contain skiers,” choose which detectors or search methods to use, inspect results, refine the query, and return candidates without a human wiring every step by hand. Andrew Ng has used examples like this for a reason. They make the boundary clear. The human gives the goal, not the full control flow.

That autonomy is the appeal. It’s also where a lot of the trouble starts.

Why the distinction matters

If you’re building internal tools, customer support systems, research assistants, or code automation, the workflow-versus-agent line changes a lot of downstream engineering decisions.

Reliability

A fixed workflow is predictable. Given the same inputs, it usually behaves the same way. You can test it with fixtures, build assertions around outputs, and see regressions clearly.

Agents are looser. They decide which tool to call and when to stop, so two runs may diverge. That can be fine for exploratory work. It’s rough for production operations, money movement, or customer communication.

Cost and latency

Agents tend to run multi-step loops. Plan, call tool, inspect result, reflect, retry, summarize. Token usage climbs quickly. So does latency.

That’s manageable in async back-office work. It’s a problem in user-facing apps where people expect a response in two seconds, not twenty.

A lot of “agentic” products quietly add hard caps on tool calls or retries for this exact reason. Otherwise the helpful autonomous assistant turns into a budget leak.

Security

Once a model can use tools on its own, the threat model changes.

A workflow that queries one internal database has a fairly contained blast radius. An agent with email access, repo access, issue-tracker access, and a browser is a different problem. Prompt injection stops being a chatbot nuisance and starts looking like a permission-escalation path.

Serious implementations need:

  • scoped credentials
  • allowlisted tools and domains
  • step logging
  • approval gates for sensitive actions
  • output validation before side effects happen
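Two of those controls, tool allowlisting and approval gates, can be sketched in a few lines. The names here are illustrative; a real system would wire this into its orchestration layer:

```python
# Guardrail sketch: an allowlist on tools plus an approval gate before
# side effects, with step logging on every execution.

ALLOWED_TOOLS = {"read_ticket", "search_docs"}   # read-only, safe by default
SENSITIVE = {"send_email", "write_db"}           # requires human sign-off

class ToolBlocked(Exception):
    pass

def execute(tool: str, arg: str, approved: bool = False) -> str:
    if tool in SENSITIVE and not approved:
        raise ToolBlocked(f"{tool} requires human approval")
    if tool not in ALLOWED_TOOLS | SENSITIVE:
        raise ToolBlocked(f"{tool} is not on the allowlist")
    print(f"LOG: {tool}({arg!r})")               # step logging
    return f"{tool} ok"

print(execute("search_docs", "refund policy"))
try:
    execute("send_email", "customer@example.com")
except ToolBlocked as e:
    print("blocked:", e)
```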

If your “agent” can write to production systems, treat it with the same paranoia you’d apply to any junior automation with shell access. Probably more. This one is easy to manipulate with adversarial text.

Evaluation

Text-quality benchmarks won’t save you here.

You need task-level evals: did it choose the right tool, retrieve the right data, avoid hallucinating, stop at the right time, and finish within acceptable cost? For agents, observability matters almost as much as model quality. Traces, tool-call logs, failure taxonomies, and replayable sessions are table stakes.
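A task-level eval runs against the recorded trace of a run, not the prose. The trace shape below is made up for illustration; the point is that each check targets a behavior (tool choice, grounding, budget, termination) rather than text quality:

```python
# Sketch of a task-level eval over one agent run's trace.

def eval_run(trace: dict) -> dict:
    return {
        "right_tool": trace["tool_calls"][0] == "search_orders",
        "grounded":   trace["order_id"] in trace["final_answer"],
        "in_budget":  trace["total_tokens"] <= 8000,
        "terminated": trace["steps"] <= 6,
    }

trace = {
    "tool_calls": ["search_orders", "get_status"],
    "order_id": "A-1042",
    "final_answer": "Order A-1042 shipped on Tuesday.",
    "total_tokens": 5400,
    "steps": 3,
}
print(eval_run(trace))
```

Run this over thousands of traces and the failure taxonomy starts writing itself.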

That’s why the agent tooling market keeps circling the same unglamorous features: orchestration, tracing, guardrails, evals, approvals. The demo is easy. Keeping the system legible after 10,000 runs is harder.

The practical pattern in 2025

Most useful systems still follow a simple rule: use the least autonomy that gets the job done.

For a lot of teams, that means:

  • plain LLM for drafting, summarization, and coding assistance
  • workflow for retrieval, document processing, lead enrichment, support triage, and content ops
  • agent only where the task is open-ended enough that hardcoding the path gets brittle

That last bucket is real. Research, multi-step investigation, software maintenance across large repos, and operations tasks with branching paths can benefit from agents. But plenty of companies are jumping straight to agents because the term sounds advanced, then rediscovering why deterministic systems are useful.

If your process looks like “first do A, then B, then C,” start there. Don’t pay the autonomy tax unless the problem actually needs it.

A concrete example: the social content pipeline

One tutorial example uses Make.com, Google Sheets, Perplexity, and Claude to assemble article links, summarize them, generate social posts, and run on a schedule.

That’s a workflow. A decent one.

The inputs are known. The order is fixed. If the LinkedIn copy is weak, a human adjusts the prompt or edits the post template.

To turn that into an agent, you’d give the system a broader objective such as “publish a strong social summary of today’s relevant AI news,” then let it decide which sources to trust, whether to fetch more context, how to judge quality, whether to rewrite weak drafts, and when to stop iterating.

That may produce better output some of the time. It will also be harder to control, harder to evaluate, and easier to derail with bad source material.

The term still matters, if you use it properly

“Agent” is still a useful term. It points to a real architectural shift.

The mistake is using it as shorthand for any LLM application. That blurs meaningful differences in system design, testing, cost, and risk. A retrieval app that answers questions over your company docs has very different engineering needs from an autonomous assistant that can browse the web, open Jira tickets, modify a CRM record, and decide when it’s done.

Senior teams should be precise about this, especially with leadership. Call a workflow an agent and expectations drift toward autonomy you may not support. Call an agent a workflow and you understate the operational risk.

The cleanest question is still the best one: who decides the next step?

If the engineer designed the path, you have a workflow. If the model decides within tool and policy constraints, you have an agent.

That clears up most of the noise. It also keeps teams from building the wrong thing for the wrong problem.
