Databricks thinks AI agents will create most of your databases, and it just put $1B behind that bet
Databricks is closing a fresh $1 billion round at a $100 billion valuation, co-led by Thrive and Insight Partners. The money backs a clear push: Databricks wants to own the database layer for AI agents.
That shows up in two products it launched in June. One is Lakebase, a Postgres-based operational database for AI apps and fast-moving developer work. The other is Agent Bricks, a platform for building enterprise agents that can handle routine business tasks without constantly going off the rails.
The funding number is huge. The more interesting part is the thesis behind it. Databricks CEO Ali Ghodsi says the “user” of a database is shifting from a human developer or analyst to an AI agent that creates, modifies, and throws away databases constantly. The company says the share of new databases created by AI jumped from 30% to 80% in a year.
Maybe that number gets revised. The direction still tracks. Agent-heavy software behaves differently from a normal SaaS app. It fans out work, spins up temporary environments, retries aggressively, and creates infrastructure as a side effect. That changes the database layer fast.
Databricks is going after Postgres now
Databricks built its reputation on analytics, lakehouses, and data infrastructure. Operational databases were somebody else’s market: Oracle, AWS, Azure, Neon, Supabase, and PlanetScale, with Snowflake adjacent to the same fight. Those boundaries held up for a while.
AI has started to break them.
If developers are building agent-backed apps, they usually want a few things in one stack:
- a transactional database for application state
- vector search, or at least vector-adjacent retrieval
- governance and identity controls
- a path back into the warehouse or lakehouse for analytics and training data
That gives Databricks a real opening higher up the application stack. It already has data gravity inside big enterprises. If it can offer a Postgres system that fits cleanly into that existing estate, it has a stronger shot than most late arrivals.
Lakebase is the practical version of that strategy. It’s Postgres-based, which matters. Almost nobody wants to commit enterprise app development to a strange new database model right now. Postgres is the safe bet, the portable one, and the default under a lot of AI app frameworks whether vendors say so or not.
The shot at Supabase is obvious. The bigger target is the roughly $105 billion database market that still assumes a human is provisioning and operating most of the system.
That assumption is aging badly.
Familiar architecture, different workload
The core design Databricks is pushing is separated compute and storage for an operational Postgres system.
Yes, that comes straight out of the cloud analytics playbook. Databricks helped popularize that architecture on the warehouse side. Now it wants the same model in transactional databases for agent workloads.
The pitch is simple:
- keep durable data and logs on cheap object storage
- run Postgres compute in ephemeral workers
- scale compute up and down independently
- stop paying for idle database instances that exist because an agent might come back in 20 minutes
That last point matters. A human team might provision a handful of long-lived databases. An agent system may create thousands of short-lived environments, many idle most of the time. Traditional instance pricing gets expensive in a hurry under that pattern.
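The economics of that pattern are easy to sketch. Here is a toy cost model comparing always-on instances against scale-to-zero compute plus object storage for a fleet of mostly-idle agent databases. Every price and utilization number below is a made-up assumption for illustration, not a Databricks figure.

```python
# Illustrative cost comparison: always-on instances vs scale-to-zero compute
# for many mostly-idle agent databases. All prices here are assumptions.

def monthly_cost_provisioned(n_databases: int, instance_per_hour: float) -> float:
    """Each logical database gets its own always-on instance."""
    hours_per_month = 730
    return n_databases * instance_per_hour * hours_per_month

def monthly_cost_scale_to_zero(n_databases: int, compute_per_hour: float,
                               active_fraction: float, storage_per_db: float) -> float:
    """Pay for compute only while active, plus cheap object storage per database."""
    hours_per_month = 730
    compute = n_databases * active_fraction * compute_per_hour * hours_per_month
    storage = n_databases * storage_per_db
    return compute + storage

# 5,000 agent workspaces, each busy ~2% of the time (assumed numbers)
always_on = monthly_cost_provisioned(5000, instance_per_hour=0.05)
elastic = monthly_cost_scale_to_zero(5000, compute_per_hour=0.05,
                                     active_fraction=0.02, storage_per_db=0.10)
print(f"always-on: ${always_on:,.0f}/mo, scale-to-zero: ${elastic:,.0f}/mo")
```

Under these assumed numbers, the always-on fleet costs roughly 40x the elastic one. The specific ratio is fake; the shape of the gap is the point.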
So yes, this is a database architecture story. It’s also a pricing story.
If Databricks can make “database-per-agent” or “workspace-per-task” financially sane, that changes the market. If it can’t, this stays a neat demo that dies in budget review.
When the database user isn’t human
The database industry spent decades optimizing for people. People open dashboards, run queries, deploy apps, create a few schemas, maybe overprovision production out of caution. Agents behave differently.
They’re bursty. They parallelize everything. They retry when they don’t get what they want. They create resources they never clean up. They also have no sense of blast radius unless the platform imposes one.
That leads to a different set of requirements.
Cheap isolation
You need aggressive isolation because agents make mistakes at machine speed. Sometimes that means schema-per-agent with strong access controls and row-level security. Sometimes it means database-per-agent because the toolchain is untrusted or the task is sensitive.
Both choices cost something. Schema-level isolation is lighter and cheaper. Database-level isolation is cleaner when containment matters.
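The two modes map to concrete Postgres DDL. The sketch below generates the statements for each; the role and schema naming convention is hypothetical, but the statements themselves (schemas, roles, row-level security policies, per-agent databases) are standard Postgres.

```python
# Sketch of the two isolation modes as Postgres DDL. The role/schema naming
# convention is an assumption; the statements are standard Postgres.

def schema_per_agent_ddl(agent_id: str) -> list[str]:
    """Lighter isolation: one schema + role per agent, RLS on shared tables."""
    role, schema = f"agent_{agent_id}", f"ws_{agent_id}"
    return [
        f"CREATE ROLE {role} NOLOGIN;",
        f"CREATE SCHEMA {schema} AUTHORIZATION {role};",
        # Shared tables stay visible only through row-level security policies.
        "ALTER TABLE shared.events ENABLE ROW LEVEL SECURITY;",
        f"CREATE POLICY {role}_events ON shared.events "
        f"FOR SELECT TO {role} USING (agent_id = '{agent_id}');",
    ]

def database_per_agent_ddl(agent_id: str) -> list[str]:
    """Heavier isolation: a whole database per agent, nothing shared."""
    role, db = f"agent_{agent_id}", f"agentdb_{agent_id}"
    return [
        f"CREATE ROLE {role} LOGIN;",
        f"CREATE DATABASE {db} OWNER {role};",
        f"REVOKE CONNECT ON DATABASE {db} FROM PUBLIC;",
    ]

for stmt in schema_per_agent_ddl("7f2a"):
    print(stmt)
```

The schema variant shares one database and leans on RLS for containment; the database variant pays more per agent but gives you a hard boundary to revoke.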
Lifecycle control
Ephemeral databases sound great until you’ve got 40,000 of them still hanging around three weeks later because nobody enforced TTLs. If this category is real, the control plane matters as much as the SQL engine.
A sensible design is boring and strict: agents request a workspace, the platform records metadata and an expiration time, and a reconciler creates or destroys the actual database resources. Don’t hand lifecycle ownership directly to the agent if you care about spend or compliance.
Identity for agents, not just users
A lot of enterprise stacks still treat “service account” as good enough identity for automation. It isn’t. If multiple agents share a principal, your audit log turns into fiction.
Databricks is right to treat non-human principals as a first-class requirement. You need per-agent quotas, per-agent permissions, and audit trails that show which system took which action and under what policy.
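What "first-class non-human principal" means in practice is roughly this: each agent is its own identity with its own quota, and every allow or deny decision lands in an audit log attributed to that identity. The policy shape below is a hypothetical sketch, not any vendor's IAM model.

```python
# Sketch of agent-scoped identity: each agent is its own principal with a
# quota, and every action is audited under that principal. The policy shape
# is an assumption for illustration, not a real IAM model.

from dataclasses import dataclass, field

@dataclass
class AgentPrincipal:
    agent_id: str
    max_databases: int
    created: int = 0

@dataclass
class Audit:
    entries: list[tuple[str, str]] = field(default_factory=list)

    def record(self, agent_id: str, action: str) -> None:
        self.entries.append((agent_id, action))

def create_database(principal: AgentPrincipal, name: str, audit: Audit) -> bool:
    """Enforce the per-agent quota and audit both outcomes."""
    if principal.created >= principal.max_databases:
        audit.record(principal.agent_id, f"DENY create {name} (quota)")
        return False
    principal.created += 1
    audit.record(principal.agent_id, f"ALLOW create {name}")
    return True
```

The key property: if two agents share nothing, the audit log can always answer "which system did this, and was it allowed to?"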
Retry-safe writes
Agents retry constantly. That gets ugly when records duplicate, side effects fire twice, or downstream systems start failing in loops. Operational databases for agent workloads need idempotency support, outbox patterns, and guardrails around DML and DDL.
That isn’t glamorous. It is the part that decides whether enterprise deployments survive contact with production.
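The standard defense is an idempotency key: a retried write with the same key returns the first result instead of repeating the side effect. The in-memory store below is a sketch; in Postgres this is typically a `UNIQUE` constraint on the key column plus `ON CONFLICT` handling.

```python
# Idempotency-key sketch: retried writes with the same key return the first
# result instead of duplicating the side effect. The in-memory store is a
# stand-in for a UNIQUE-constrained key column in Postgres.

processed: dict[str, int] = {}   # idempotency_key -> record id
records: list[str] = []

def insert_order(idempotency_key: str, payload: str) -> int:
    if idempotency_key in processed:          # retry: return the earlier result
        return processed[idempotency_key]
    records.append(payload)                   # the actual side effect, once
    record_id = len(records) - 1
    processed[idempotency_key] = record_id
    return record_id

first = insert_order("agent-42:task-9", "order A")
retry = insert_order("agent-42:task-9", "order A")   # the agent retried the call
assert first == retry and len(records) == 1
```

An agent can now hammer the endpoint as many times as it likes; the write happens once.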
The trade-offs are real
Databricks’ architectural choice makes sense, but it carries the usual serverless Postgres baggage.
Cold starts are still cold starts. If compute scales to zero, some requests will sit through wake-up time. Fine for background agent tasks. Less fine for latency-sensitive user flows.
Tail latency can get messy when hot data drops out of cache and durable storage lives on object storage underneath. Maybe acceptable for a lot of AI workflows. Probably not for checkout.
Metadata management gets harder when you map huge numbers of logical databases or schemas onto shared storage layers. The control plane can become its own scaling problem.
Connection churn is nasty in agent systems. A swarm of tasks can stampede your database gateway unless you use pooling, multiplexing, and backpressure aggressively. Postgres still lives in the real world of finite connections and finite memory.
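The minimum viable version of that backpressure is a hard cap on concurrent database work, so a retry storm queues instead of stampeding the connection limit. The sketch below uses a semaphore with a stubbed-out query; a real deployment would also put a pooler such as pgbouncer in front of Postgres.

```python
# Backpressure sketch: cap concurrent database work with a semaphore so a
# swarm of tasks queues instead of exhausting Postgres connections. The
# "query" is a stub; real code would run against a pooled connection.

import threading

MAX_CONNECTIONS = 5
pool = threading.BoundedSemaphore(MAX_CONNECTIONS)
peak = 0
active = 0
lock = threading.Lock()

def run_query(task_id: int) -> None:
    global peak, active
    with pool:                         # blocks when all slots are taken
        with lock:
            active += 1
            peak = max(peak, active)
        # ... real work against a pooled connection would happen here ...
        with lock:
            active -= 1

# 50 "agent tasks" arrive at once; at most 5 ever touch the database together.
threads = [threading.Thread(target=run_query, args=(i,)) for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"peak concurrency: {peak} (cap {MAX_CONNECTIONS})")
```

The cap turns a thundering herd into a queue, which is exactly the trade you want when the alternative is connection exhaustion.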
The pitch is good. The category still has to prove it can deliver operational predictability, not just elasticity.
Who should worry
A few companies should be paying attention.
Supabase and serverless Postgres vendors now have a competitor with real enterprise distribution and a much deeper governance story. Databricks has a credible shot at large accounts that want Postgres close to their analytics stack and AI tooling.
Vector database vendors should expect more pressure from “good enough” retrieval inside Postgres, especially in enterprise apps where transactional coupling and simpler governance matter more than peak ANN performance. Dedicated vector systems still have a role at larger scale, but the easy deals are up for grabs.
Cloud database incumbents need a cleaner answer for agent-native provisioning, isolation, quotas, and non-human IAM. “We support automation” doesn’t help much if the default pricing model assumes warm, long-lived instances.
If agents really do end up creating most new databases, the operational database market starts to look a lot more like serverless compute than classic managed instances.
What developers and technical leads should take from this
Don’t treat this as only a Databricks funding story. It’s also a signal about where application architecture is moving.
If you’re building agent-heavy systems now, a few design choices matter more than they did a year ago:
- Pick schema-per-agent when you need lower overhead and can tolerate shared infrastructure.
- Pick database-per-agent when isolation matters more than efficiency.
- Put TTL, quotas, and cleanup in a control plane, not in the agent logic.
- Treat agent identity as a first-class IAM problem.
- Assume retry storms will happen and design write paths accordingly.
- Be honest about latency sensitivity before you buy into scale-to-zero economics.
If you’re evaluating Lakebase specifically, the key question is whether Databricks can make Postgres behave well under agent-style workload patterns while keeping governance and cost under control. Plenty of companies can run Postgres. That part isn’t special.
Databricks is betting that the next big database customer won’t be a DBA or even an app team. It’ll be a swarm of software workers asking for databases by the minute. That sounds a little absurd until you look at how AI app stacks already behave. Then it starts to sound early, but plausible.