Artificial Intelligence December 5, 2025

Simular launches a macOS agent that can operate your computer directly


Simular wants desktop AI agents to stop improvising and start shipping code

Simular has released a 1.0 macOS agent that can operate a computer directly, and says Windows support is coming through Microsoft’s Windows 365 for Agents program. It also raised a $21.5 million Series A led by Felicis, with NVentures and South Park Commons participating.

The funding is notable. The product design is the more interesting part. Simular is going after a problem most AI agent demos blur or ignore: desktop work is messy, long-running, and fragile. Browser agents already struggle once a workflow leaves the tab. Add Excel, PDFs, a file picker, an old line-of-business app, or a terminal window, and reliability drops fast.

Simular’s pitch is straightforward. Let the model explore. Then turn the successful path into code.

Why this stands out from the usual computer-use demo

A lot of companies now say their models can “use a computer.” Usually that means a vision model watches the screen and guesses where to click next. It can look good in a short demo and unravel 30 steps later.

Simular is taking aim at that exact problem. Its system watches a human-supervised run, finds a working path through a task, and converts that path into deterministic code that can be replayed. The company calls it a neuro-symbolic computer-use agent. The label is a little academic. The underlying idea makes sense.

Use the model for exploration. Use code for repetition.

That’s a far saner way to automate desktop workflows than leaving an LLM to wing every click indefinitely.

If you’ve spent time around RPA, none of this is mysterious. UiPath and Automation Anywhere learned the hard way that enterprise automation lives or dies on repeatability, selectors, retries, state checks, and logs. Simular is applying the same lesson with the LLM at the authoring layer instead of the execution layer.

That difference matters.

The technical bet: compile successful behavior into workflows

Simular hasn’t published a full architecture, but the shape of it is clear. The agent observes desktop state, proposes actions, gets corrected by the user when needed, and once a run succeeds, turns the action trace into something like a script or workflow.

That changes the second run completely.

Instead of asking a model again and again what to do next, the system can execute a fixed sequence with checks like:

  • is the expected window open?
  • does the “Export” button exist?
  • did the file save to the right path?
  • did the app drift into a different state?

This is where the hallucination story starts to sound credible. Hallucinations on desktop agents aren’t just a language problem. They’re a control problem that compounds over time. If each step has a small chance of failure, thousands of steps make failure the default outcome. Deterministic replay cuts that risk hard.
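The compounding effect is easy to quantify. A rough sketch, assuming each step fails independently with the same small probability (real workflows only approximate this):

```python
# Probability a multi-step workflow succeeds end to end, assuming
# independent per-step failures with a fixed success rate.
def end_to_end_success(per_step_success: float, steps: int) -> float:
    return per_step_success ** steps

# A 99%-reliable step looks great in a demo...
print(end_to_end_success(0.99, 30))    # ~0.74 — a 30-step demo usually works
print(end_to_end_success(0.99, 1000))  # ~0.00004 — a 1000-step run almost never does
```

The same arithmetic is why deterministic replay helps: a replayed script removes per-step model uncertainty from the exponent entirely.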

A solid implementation probably needs some internal DSL for UI actions, state assertions, and fallback logic. Even if users never see it, the workflow engine needs a typed vocabulary for things like:

  click(element="Export")
  wait(window="Save As", timeout=5000)
  assert(file.exists("/tmp/report.csv"))

Without that layer, desktop automation degrades into screenshot spaghetti.
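A minimal sketch of what a typed action vocabulary could look like. Every name here is hypothetical, since Simular hasn't published its internal format; the point is that each step is data, so a compiled workflow can be serialized, diffed, and replayed:

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical typed vocabulary for a compiled desktop workflow.

@dataclass
class Click:
    element: str                 # accessibility label to click

@dataclass
class Wait:
    window: str                  # window title to wait for
    timeout_ms: int = 5000

@dataclass
class AssertFileExists:
    path: str                    # state check after the action ran

Step = Union[Click, Wait, AssertFileExists]

# The "compiled" form of a successful run: a plain list of typed steps.
export_report: list[Step] = [
    Click(element="Export"),
    Wait(window="Save As", timeout_ms=5000),
    AssertFileExists(path="/tmp/report.csv"),
]

for step in export_report:
    print(type(step).__name__, step)
```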

Why OS-level control matters

Browser agents are useful until real business software shows up. Then you run into native apps, file dialogs, local folders, Outlook attachments, Excel macros, VPN-gated tools, and old Windows software that half the enterprise still depends on.

That’s the territory Simular is targeting.

On macOS, desktop control usually means some mix of the Accessibility API, AppleScript, Shortcuts, screen parsing, and OCR. On Windows, you’re likely dealing with UI Automation, Win32, PowerShell, and maybe remote orchestration in a managed environment. None of this is clean.

Accessibility trees are better than pure vision, but many apps expose them badly. OCR helps, but canvas-heavy UIs and odd scaling can wreck detection. Pixel coordinates are easy to record and brittle in production. Timing issues never really disappear.

So the hard part here isn’t proving that a model can click around a desktop. It’s building a runtime that survives latency, pop-ups, permission prompts, partial renders, dynamic layouts, and the general hostility of desktop software.

That’s where a lot of agent startups run into a dull but important fact: computer use is mostly systems engineering.

The Microsoft angle matters

Simular says it’s been accepted into Microsoft’s Windows 365 for Agents program, alongside Manus AI, Fellou, Genspark, and TinyFish. That matters because unmanaged endpoint automation is a security and governance headache.

If an AI agent can control a desktop, it can exfiltrate data, trigger actions under a user’s identity, and wander into systems it shouldn’t touch. Running those workflows inside cloud-hosted Windows sessions gives enterprises a cleaner answer on isolation, policy enforcement, audit logs, and revocation.

That doesn’t fix everything. A bad workflow in a hosted session is still a bad workflow. But it does narrow the blast radius.

For technical buyers, this may be the strongest enterprise signal in the announcement. Desktop agents are easier to tolerate when they run in a fenced-off environment instead of on someone’s actual laptop with Slack, SSH keys, and personal files sitting next to the work.

The use cases are mundane. That’s a good sign.

Simular cites things like VIN searches at a dealership and extracting contract data from PDFs for homeowners associations. That may sound small. It’s also exactly the sort of work desktop automation tends to handle well.

These systems do best on tasks that are narrow, repetitive, and expensive to do by hand. Nobody needs a desktop genius. They need something that can open these files, extract these fields, update this sheet, save the result, and stop breaking.

That kind of workflow has three advantages:

  • clear start and stop conditions
  • measurable success criteria
  • enough volume to justify automation

It also fits Simular’s compile-and-replay approach. A bounded task is much easier to freeze into deterministic code than an open-ended assistant session.

Simular also has an open source macOS project called simular-pro. That could help with adoption among engineering teams that want to inspect how much of the stack is real and how much is demo gloss. In this category, black-box magic is a hard sell. People want to know what happens when the UI changes, what gets logged, and how recoverable failures really are.

What developers and AI leads should watch

If you’re evaluating Simular or anything similar, don’t focus on whether the first demo works. Focus on whether the 200th run still works after the app updates, the network slows down, and the screen layout shifts slightly.

A few things matter right away.

Treat generated workflows like production code

If the system outputs deterministic scripts or structured workflows, put them in Git. Review them. Diff them. Run tests in a VM before they touch a live process. Teams already know how to govern code. That’s a much better control surface than trying to review free-form prompts.

Favor selectors over screenshots

If an agent can anchor to accessibility attributes, window titles, menu names, and control identifiers, use those. Vision should be the fallback. A screen is a bad API when a real one exists.
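The fallback ordering can be sketched in a few lines. Both resolvers below are hypothetical stand-ins: a real implementation would query the platform accessibility tree and an OCR or vision model rather than a dict:

```python
# Selector-first element lookup, with vision as the fallback of last resort.

def find_by_accessibility(label: str, tree: dict):
    """Anchor to a stable attribute when the app exposes one."""
    node = tree.get(label)
    return node["center"] if node else None

def find_by_vision(label: str, screenshot):
    """Stubbed-out vision fallback; a real one would run OCR/detection."""
    return None

def locate(label: str, tree: dict, screenshot=None):
    pos = find_by_accessibility(label, tree)
    if pos is None:
        pos = find_by_vision(label, screenshot)
    if pos is None:
        raise LookupError(f"cannot anchor element {label!r}")
    return pos

# Toy accessibility tree: label -> node metadata.
ui_tree = {"Export": {"role": "button", "center": (640, 412)}}
print(locate("Export", ui_tree))  # (640, 412) — no screenshot needed
```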

Design for safe re-runs

Desktop workflows fail halfway through all the time. Good automation needs idempotency checks, partial progress markers, and cleanup steps. Otherwise a retry creates duplicates, submits forms twice, or overwrites files in the wrong state.
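One common pattern for safe re-runs is a progress marker file: completed step names are checkpointed to disk, so a retry skips work that already happened. A minimal sketch, with the steps reduced to no-op callables:

```python
from pathlib import Path
import json
import tempfile

def run_workflow(steps: dict, marker: Path) -> list[str]:
    """Run steps in order, skipping any already recorded in the marker file."""
    done = set(json.loads(marker.read_text())) if marker.exists() else set()
    executed = []
    for name, action in steps.items():
        if name in done:
            continue  # idempotency check: don't submit the form twice
        action()                                     # perform the step
        done.add(name)
        marker.write_text(json.dumps(sorted(done)))  # checkpoint immediately
        executed.append(name)
    return executed

marker = Path(tempfile.mkdtemp()) / "progress.json"
steps = {"open_file": lambda: None, "fill_form": lambda: None}
print(run_workflow(steps, marker))  # ['open_file', 'fill_form']
print(run_workflow(steps, marker))  # [] — the retry skips completed steps
```

A production version would also record partial outputs and run cleanup steps before resuming, but the checkpoint-before-continue shape is the core of it.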

Log everything

Desktop automation is miserable to debug without timestamps, screenshots, action traces, and state assertions. If a workflow fails in production and your only artifact is “the agent got confused,” you don’t have much of a system.
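The bar here is low but specific: one structured record per action, with a timestamp and the result of its state check. A minimal sketch (field names are illustrative, not any vendor's format):

```python
import json
import time

# Structured action trace: one JSON record per step, so a production
# failure leaves evidence instead of "the agent got confused".

def log_step(trace: list, action: str, ok: bool, detail: str = "") -> None:
    trace.append({
        "ts": time.time(),    # when the step ran
        "action": action,     # what the runtime tried to do
        "ok": ok,             # whether its state check passed
        "detail": detail,     # e.g. screenshot path or assertion message
    })

trace: list = []
log_step(trace, "click:Export", ok=True)
log_step(trace, "wait:Save As", ok=False, detail="window never appeared")
print(json.dumps(trace, indent=2))
```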

Keep the scope tight at first

The sweet spot is repetitive clerical work across awkward interfaces. Start there. If a vendor promises broad autonomous desktop reasoning across arbitrary apps, assume the reliability curve is ugly until proven otherwise.

Where this could break down

Simular’s approach is smart, but the ugly parts are still there.

UI drift is a constant tax. So are OS permission changes, modal interruptions, multi-monitor weirdness, and apps that render text in ways OCR struggles with. Long workflows still need recovery logic. Human supervision during the discovery phase sounds reasonable, but it also suggests workflow creation may stay expensive unless success rates improve quickly.

There’s also a product question hanging over this. Who owns the compiled workflow after the agent discovers it? If it’s really code, advanced teams will want to edit it, version it, and port it. If Simular keeps that layer abstract, customers may feel stuck inside another automation silo.

Then there’s cost. Exploration uses model inference, perception, and orchestration. Replay should be much cheaper, but only if the system actually gets the model off the hot path. If “deterministic execution” still falls back to LLM calls all over the place, the economics get ugly fast.

That’s the line worth watching.

The part Simular seems to get right

The strongest idea in this launch is simple: agents become far more useful when they turn into software artifacts.

That framing is better than endless prompt loops pretending to be automation. Engineers can test artifacts, version artifacts, and audit artifacts. A bot that succeeds once, fails twice, and explains itself in fluent nonsense is much harder to trust.

Simular still has to prove that its macOS agent works outside curated workflows, and Windows support will probably be the bigger commercial test anyway. But the direction looks right. Desktop agents need less improvisation and more compilation. A lot of the market still hasn’t caught up to that.
