Artificial intelligence June 20, 2026

Pramaana Labs raises $27M to apply formal verification to AI

Pramaana Labs raises $27M to make AI answers provable, not just plausible

Pramaana Labs has raised a $27 million seed round led by Khosla Ventures to tackle a hard enterprise AI problem: models that sound right when they’re wrong.

The round, announced Wednesday, includes Accel, BoldCap, Nexus Venture Partners, Premji Invest, and Unbound. Pramaana plans to focus on high-stakes areas such as tax preparation, law, drug discovery, and cybersecurity, where a confident hallucination can become a bad filing, a regulatory problem, a missed vulnerability, or worse.

The startup’s technical bet is to combine large language models with formal verification, the mathematical machinery used to prove that software or logic satisfies a specification. Its work centers on Lean, the open source proof assistant and programming language used by mathematicians and computer scientists to verify proofs.

It’s an ambitious plan. It’s also a hard one.

Reliability has outgrown prompt engineering

Most enterprise AI deployments run into the same wall. The demo works. The pilot looks useful. Then someone asks whether the system can be trusted with business decisions, audit trails, compliance exposure, or customer-facing workflows.

That’s where LLMs still struggle.

A language model can summarize a contract, explain a tax rule, or propose a molecule. It can also invent a citation, misapply an exception, or fail silently when a small detail changes the answer. Retrieval-augmented generation helps by grounding answers in documents, but RAG doesn’t prove reasoning. It improves the inputs. It doesn’t guarantee that the output follows the rules.

Pramaana is aiming at that gap.

The company’s system still uses a conventional LLM for natural language interaction and flexible problem solving. It then adds a deterministic verification layer meant to check whether the model’s reasoning satisfies formalized rules. In a tax scenario, the LLM might parse a user’s situation and propose an answer, but that answer would need to pass a rules-based check against a codified version of the relevant tax logic.
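
A minimal sketch of that propose-then-check split, assuming a toy codified rule. The deduction amounts, the `Proposal` type, and the function names are hypothetical and are not drawn from Pramaana’s system or from real tax law.

```python
from dataclasses import dataclass

# Hypothetical codified rule: a flat deduction that depends on filing status.
# The amounts are invented for illustration and are not real tax figures.
STANDARD_DEDUCTION = {"single": 10_000, "joint": 20_000}


@dataclass(frozen=True)
class Proposal:
    filing_status: str
    income: int
    claimed_taxable_income: int  # the number the language model proposed


def taxable_income(filing_status: str, income: int) -> int:
    """Deterministic rule: income minus deduction, never below zero."""
    return max(0, income - STANDARD_DEDUCTION[filing_status])


def verify(p: Proposal) -> bool:
    """Accept the proposal only if it matches the codified rule exactly."""
    return p.claimed_taxable_income == taxable_income(p.filing_status, p.income)


good = Proposal("single", 50_000, 40_000)
bad = Proposal("single", 50_000, 38_000)  # plausible-looking but wrong
print(verify(good), verify(bad))
```

The check never asks whether the proposal reads well. It recomputes the answer from the rule and compares.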

CEO and co-founder Ranjan Rajagopalan described tax law as a good fit for this approach. “It’s like math in the sense that you have a lot of rules that you need to abide by,” he told TechCrunch. “Once you have a codified version of it, the reasoning on top of it starts becoming deterministic.”

The model handles language and messy human input. The verification layer checks the final reasoning with machinery that doesn’t improvise.

Why Lean is the right kind of hard

Formal verification has been around for decades, mostly in places where failure is expensive: avionics, cryptography, compilers, chip design, distributed systems, and security-critical infrastructure. Tools such as Coq, Isabelle/HOL, TLA+, Dafny, and Lean let engineers specify properties and prove that implementations satisfy them.

Lean has gained particular visibility through formal mathematics. It lets users encode definitions, theorems, and proofs in a machine-checkable form. If the proof type-checks, the system has verified that the logical steps hold under the encoded assumptions.
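
A small Lean 4 sketch of what “type-checks” means in practice. The `withinCap` rule is a hypothetical stand-in for a codified threshold, not something taken from Pramaana’s work.

```lean
-- Hypothetical rule: a claimed amount is allowed only up to a cap.
def withinCap (claimed cap : Nat) : Bool := Nat.ble claimed cap

-- Accepted: both sides evaluate to the same value, so `rfl` closes the goal.
example : withinCap 3 5 = true := rfl

-- A general statement, checked for every n rather than for sample inputs.
example (n : Nat) : n + 0 = n := rfl

-- Changing the first example to `withinCap 7 5 = true` would make Lean
-- reject the file, however confidently the claim was phrased.
```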

That’s very different from asking an LLM to “think step by step.”

A proof assistant doesn’t care whether an argument sounds plausible. It checks structure, types, and whether each inference follows from prior definitions and accepted rules. That rigor is exactly why formal methods are attractive for AI reliability.

It’s also why they’re painful.

Someone has to formalize the domain. In software, that means writing specifications. In tax, law, or drug discovery, it means translating messy real-world rules, exceptions, definitions, thresholds, and edge cases into executable logic. That is slow, expert-heavy work. It can pay off, but only if the formal model stays current and maps cleanly enough to reality.

Pramaana appears to understand that. According to TechCrunch’s report, the company plans to build domain-specific Lean-style verification systems overseen by experts. For tax law, it’s working with former IRS commissioner Danny Werfel. Professors from IIT Delhi, IIT Madras, and UC Berkeley are involved in cybersecurity and drug discovery work.

That expert layer is the product.

The Catala precedent matters

Pramaana isn’t starting from pure theory. France’s Catala project, developed with Inria and French public agencies, has already formalized parts of the country’s tax and benefits system into executable code.

Catala treats legal rules as a programming language problem. Statutes and administrative rules contain conditions, exceptions, and computations. If they can be encoded precisely enough, software can execute them consistently and produce traceable results.

Tax is a natural fit because much of it behaves like a decision tree plus arithmetic, even when the prose is painful. Eligibility, thresholds, deductions, credits, phaseouts, filing status, dates, and jurisdiction-specific exceptions can all be represented in formal logic if the scope is controlled.
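
To make “decision tree plus arithmetic” concrete, here is a rough Python sketch of a default rule, an overriding exception, and a phaseout. It is not Catala syntax, and every figure and name in it is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filer:
    income: int
    resident: bool


# All figures below are invented for illustration.
BASE_CREDIT = 2_000
PHASEOUT_START = 100_000
PHASEOUT_RATE_PERCENT = 5  # credit shrinks by 5% of income above the start


def credit(filer: Filer) -> tuple[int, list[str]]:
    """Return the credit and the list of rules that fired, in order."""
    trace = []
    # The exception is checked first and overrides the default rule.
    if not filer.resident:
        trace.append("exception: non-resident, credit is 0")
        return 0, trace
    trace.append(f"default: base credit {BASE_CREDIT}")
    excess = max(0, filer.income - PHASEOUT_START)
    reduction = excess * PHASEOUT_RATE_PERCENT // 100
    if reduction:
        trace.append(f"phaseout: reduce by {reduction}")
    return max(0, BASE_CREDIT - reduction), trace


amount, trace = credit(Filer(income=120_000, resident=True))
print(amount)
print(trace)
```

Returning the trace alongside the number is the point: the same inputs always give the same result, and the result says which rules produced it.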

Law outside those bounded tasks gets harder. Contract review, litigation strategy, and statutory interpretation often involve ambiguity, precedent, intent, and competing readings. Formal systems can help with constrained work, such as checking whether a clause satisfies a policy or whether a filing includes required elements. They won’t settle every legal question.

Drug discovery is harder again. Parts of computational chemistry and bioinformatics can be formalized. Many others depend on probabilistic models, wet-lab validation, incomplete biological knowledge, and noisy data. A verifier can check whether a workflow followed specified constraints. It can’t prove that a molecule will behave safely in humans.

That distinction is important. Formal verification can provide strong guarantees inside the box you define. It says much less about whether you defined the right box.

What developers should watch

For engineering teams, the interesting piece is the architecture: probabilistic front end, deterministic back end.

That pattern is already showing up across AI products. An LLM converts natural language into structured intent. A rules engine, planner, compiler, database, theorem prover, or runtime then performs the actual operation. The model becomes an interface and proposal generator, not the final source of truth.

A practical version might look like this:

  • User asks a natural language question.
  • LLM extracts entities, facts, constraints, and intent.
  • The system maps those into a formal representation.
  • A verifier checks consistency against domain rules.
  • The final answer includes a trace, proof artifact, or executable derivation.
  • If verification fails, the system asks for missing information or rejects the answer.
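
The steps above can be sketched end to end in a few dozen lines. This is an illustration under stated assumptions: `extract_facts` is a toy pattern match standing in for the LLM, the filing thresholds are invented, and all names are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Result:
    status: str                       # "verified", "needs_info", or "rejected"
    must_file: Optional[bool] = None
    trace: list = field(default_factory=list)


def extract_facts(question: str) -> dict:
    """Stand-in for the LLM extraction step, using a toy pattern match."""
    facts = {}
    for name in ("income", "age"):
        match = re.search(rf"{name}\s+(\d+)", question.lower())
        if match:
            facts[name] = int(match.group(1))
    return facts


def filing_threshold(age: int) -> int:
    # Invented figures; not real filing thresholds.
    return 15_000 if age >= 65 else 13_000


def answer(question: str, proposed_must_file: bool) -> Result:
    facts = extract_facts(question)
    missing = [n for n in ("income", "age") if n not in facts]
    if missing:
        # Ask for what is missing instead of guessing.
        return Result("needs_info", trace=["missing facts: " + ", ".join(missing)])
    threshold = filing_threshold(facts["age"])
    derived = facts["income"] >= threshold
    trace = [f"facts={facts}", f"threshold={threshold}", f"must_file={derived}"]
    if proposed_must_file != derived:
        return Result("rejected", trace=trace + ["proposal contradicts rule"])
    return Result("verified", must_file=derived, trace=trace)


print(answer("income 14000 age 70, must I file?", True).status)   # rejected
print(answer("income 14000 age 70, must I file?", False).status)  # verified
print(answer("income 14000, must I file?", False).status)         # needs_info
```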

That’s a healthier design than letting a model produce polished prose and hoping a post-hoc classifier catches the mistakes.

It also raises engineering questions vendors tend to underplay.

Latency can be ugly. Formal proof search and constraint solving can be computationally expensive, especially if the system has to resolve ambiguous inputs or generate proof terms dynamically. Developers will need to know where verification happens, how long it takes, what gets cached, and whether failures are explainable.
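
One common way to contain that cost is to cache verification results, keyed on both the facts and the rule version. The sketch below assumes a cheap comparison as a stand-in for an expensive proof search; the version strings and function names are hypothetical.

```python
from functools import lru_cache

underlying_runs = 0  # counts how often the expensive check really executes


@lru_cache(maxsize=1024)
def verify_cached(rule_version: str, facts: frozenset) -> bool:
    """Stand-in for an expensive proof search or solver call."""
    global underlying_runs
    underlying_runs += 1
    values = dict(facts)
    return values["income"] <= values["threshold"]


def verify(rule_version: str, facts: dict) -> bool:
    # Freeze the facts so they can serve as a cache key.
    return verify_cached(rule_version, frozenset(facts.items()))


facts = {"income": 40_000, "threshold": 50_000}
verify("2026.1", facts)
verify("2026.1", facts)  # served from cache
verify("2026.2", facts)  # new rule version, so the check runs again
print(underlying_runs, verify_cached.cache_info().hits)
```

Including the rule version in the key means a rule update can never be answered from a stale cache entry.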

Coverage is another problem. If only part of the domain has been formalized, the system needs to say so clearly. A verified answer for one category of tax treatment doesn’t imply reliability across the whole tax code. Partial formalization is useful only when the product exposes its boundaries.
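
Exposing the boundary can be as simple as giving the verifier a third verdict. In this sketch the rule registry and category names are hypothetical, and the single rule uses invented figures.

```python
from enum import Enum


class Verdict(Enum):
    VERIFIED = "verified"
    FAILED = "failed"
    NOT_FORMALIZED = "not_formalized"


def check_filing_requirement(facts: dict, claim: bool) -> bool:
    # Invented threshold, for illustration only.
    return claim == (facts["income"] >= 13_000)


# Hypothetical registry: only these categories have codified rules.
RULES = {"filing_requirement": check_filing_requirement}


def verify(category: str, facts: dict, claim) -> Verdict:
    rule = RULES.get(category)
    if rule is None:
        # No codified rule exists, so the system must not imply a guarantee.
        return Verdict.NOT_FORMALIZED
    return Verdict.VERIFIED if rule(facts, claim) else Verdict.FAILED


print(verify("filing_requirement", {"income": 20_000}, True))
print(verify("crypto_staking_income", {"income": 20_000}, True))
```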

Versioning will be painful. Laws change. Security rules change. Scientific assumptions change. Any serious deployment needs auditable rule versions, migration paths, test suites, and provenance for the formal definitions themselves. Otherwise the “verified” layer becomes another source of stale logic.
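
A sketch of what auditable rule versions might look like at the smallest scale: each version carries an effective date and a provenance note, and lookups are by the date the question is about. The thresholds and source labels are invented.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RuleVersion:
    effective: date
    threshold: int
    source: str  # provenance for the formal definition


# Sorted by effective date. Values and sources are hypothetical.
VERSIONS = [
    RuleVersion(date(2024, 1, 1), 13_000, "hypothetical ruleset v1"),
    RuleVersion(date(2025, 1, 1), 14_000, "hypothetical ruleset v2"),
]


def version_for(on: date) -> RuleVersion:
    """Return the rule version in force on the given date."""
    idx = bisect_right([v.effective for v in VERSIONS], on) - 1
    if idx < 0:
        raise LookupError(f"no rule version in force on {on}")
    return VERSIONS[idx]


print(version_for(date(2024, 6, 1)))
print(version_for(date(2025, 3, 1)))
```

Raising on dates before the first version keeps the verifier from silently applying rules that did not exist yet.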

Then there’s the input problem. Formal verification checks reasoning over formalized facts. If the LLM extracts the wrong facts from a document, or the user leaves out a key detail, the verifier may faithfully validate the wrong scenario. Robust systems will need uncertainty handling, data validation, document grounding, and human review paths for high-risk cases.
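
One way to keep shaky inputs away from the verifier is a triage step in front of it. The sketch assumes the extractor reports a confidence score and the source text for each fact; both fields, the 0.9 cutoff, and the route names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    name: str
    value: object
    confidence: float  # assumed extractor-reported score in [0, 1]
    source_span: str   # the document text the value was read from


def triage(facts, required, min_confidence=0.9):
    """Decide whether the facts are fit to hand to the verifier."""
    by_name = {f.name: f for f in facts}
    missing = [n for n in required if n not in by_name]
    if missing:
        return "ask_user", missing
    shaky = [n for n in required if by_name[n].confidence < min_confidence]
    if shaky:
        return "human_review", shaky
    return "verify", []


facts = [
    Fact("income", 52_000, 0.98, "Box 1: 52,000.00"),
    Fact("filing_status", "single", 0.61, "status: s?ngle"),
]
print(triage(facts, required=["income", "filing_status"]))
print(triage(facts, required=["income", "age"]))
```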

Security teams may care early

Cybersecurity is one of the more plausible early markets for this approach, assuming Pramaana keeps the scope narrow.

Security already uses formal methods in protocol verification, access control policies, cryptographic proofs, and static analysis. LLMs are useful for triage and code understanding, but they’re unreliable when asked to reason precisely about exploitability or policy compliance.

A hybrid system could help answer bounded questions: Does this IAM policy allow privilege escalation? Does this protocol state machine violate a safety property? Does a patch preserve a specified invariant? Does generated code satisfy a memory safety constraint?
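
The privilege escalation question, for instance, reduces to graph reachability once the policy is in structured form. The sketch below uses a deliberately simplified, hypothetical model in which an edge means one role may assume another; real IAM policies carry conditions and resource scopes that this ignores.

```python
from collections import deque

# Hypothetical policy model: an edge A -> B means a principal holding
# role A is allowed to assume role B.
CAN_ASSUME = {
    "intern": {"ci-runner"},
    "ci-runner": {"deployer"},
    "deployer": {"admin"},
    "auditor": set(),
}


def escalation_path(start, target, edges):
    """Return a shortest chain of roles from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        # Sorted so the result is deterministic across runs.
        for nxt in sorted(edges.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


print(escalation_path("intern", "admin", CAN_ASSUME))
print(escalation_path("auditor", "admin", CAN_ASSUME))
```

A returned path doubles as the explanation: it names each hop an attacker would take.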

Those are structured verification problems with natural language at the edges.

The hard part is scale. Real codebases are large, dependency-heavy, and full of dynamic behavior. Formalizing enough of a production system to prove useful properties remains expensive. Most organizations won’t fully verify their applications. They might pay for narrow verification around high-value assets, generated code, smart contracts, policy engines, or compliance-critical workflows.

That’s where Pramaana could find traction: hard checks around specific classes of AI-assisted work.

A big seed round doesn’t prove the product

A $27 million seed round is large, even by AI infrastructure standards. It gives Pramaana room to hire specialists, build domain models, and work with experts before trying to sell broadly. It also shows how much investors want tools that make enterprise AI less slippery.

But formal verification doesn’t become easy because an LLM sits next to it.

The bottleneck is likely to be domain encoding, not model access. Building and maintaining a high-quality formal representation of tax law, regulatory logic, or biomedical constraints requires rare talent: domain experts who can think formally, and engineers who can translate expert knowledge into machine-checkable systems without flattening away important nuance.

There’s a product risk too. Buyers may like the reassurance of “verified AI,” but they’ll still ask familiar questions: How much of my workflow is covered? How do I inspect the proof? What happens when rules conflict? Can my own experts modify the logic? Does it integrate with existing systems? Who is liable when the formalization is wrong?

Good answers to those questions will matter more than the elegance of the architecture.

Useful, if the boundaries stay visible

Rajagopalan’s line that “the world’s hardest problems are not unsolvable. They are unformalized” is catchy, and in some domains it’s directionally right. Plenty of high-value work is governed by rules that can be encoded, tested, and executed more consistently than humans apply them today.

But some hard problems are hard because the rules are incomplete, contested, or empirical. Formal verification can reduce one class of AI failure: invalid reasoning against known rules. It won’t eliminate bad inputs, ambiguous laws, outdated assumptions, weak specifications, or overconfident product claims.

That’s still useful. The best version of Pramaana’s approach doesn’t need to solve all of AI reliability. It needs to make certain answers checkable, expose where the checks stop, and refuse to pretend that proof machinery covers what it hasn’t formalized.
