Artificial Intelligence · November 15, 2025

Databricks co-founder says open source is central to US AI strategy

Databricks’ open source warning on AI comes down to who gets to do the research

Andy Konwinski, Databricks co-founder and now co-founder of Laude, made a blunt case this week at the Cerebral Valley AI Summit: if the U.S. wants to stay ahead of China in AI, it needs to lean harder into open source.

The framing is political. The substance is technical. Konwinski’s argument is that too much frontier work in the U.S. now disappears behind API walls and private repos, while Chinese labs keep releasing open models, code, and papers that other teams can actually build on. He goes further than the usual open-versus-closed debate. He says U.S. research output is slipping relative to China, and that the trend threatens the country’s long-term position.

You can push back on the geopolitical framing. The underlying point still holds.

AI progress depends on shared artifacts. Papers matter. Weights matter more. Training code, eval harnesses, data recipes, kernel work, and reproducible baselines matter most. When that stays private, the field narrows. A small number of labs keep moving. Everyone else is stuck with leaks, benchmark chatter, or expensive API access.

That hurts startups. It hurts universities. It also hurts the labs that assume secrecy is enough to keep them ahead.

Why this is landing now

The timing makes sense. Over the past year, Chinese groups like DeepSeek and Alibaba’s Qwen team have become impossible to ignore in open model work. Their releases show up fast in eval suites, inference stacks, finetuning pipelines, and reproduction repos. Developers use them because they’re available, strong, and documented well enough to run without guesswork.

The biggest U.S. labs still lead at the top end, but they’ve become more closed. OpenAI and Anthropic are the obvious examples. Meta still publishes weights, but not often or clearly enough to make “American open AI” feel like a real strategy. At the same time, those companies keep hiring top academic researchers with pay universities can’t touch.

Konwinski described that as the U.S. “eating our seed corn,” which is dramatic but fair. If too much frontier work moves into private labs and stays there, academia turns into a farm system instead of a source of new methods and architectures. You can already see it. Plenty of strong researchers still publish. Fewer can test ideas at meaningful scale without help from industry.

That changes the kind of science the field gets.

Replication is where progress compounds

The strongest part of Konwinski’s case is practical. Open releases speed up replication, and replication is where a lot of the field’s real progress happens.

When a lab ships weights, code, and enough detail to reproduce training and evals, other teams can validate the result in days, sometimes hours. That cuts through one of AI’s worst habits: treating benchmark claims as settled fact before anybody has reproduced them.

We’ve seen this before. BERT spread because people could run it. ResNet changed computer vision because it was teachable and reproducible. The same pattern showed up more recently with LLaMA derivatives. Once the base models circulated, the community moved quickly on instruction tuning, preference optimization, quantization, tool use, and deployment patterns. Closed models can move the market. Open ones move the field.

And a lot of the most valuable engineering work sits below the headline result.

Inference throughput comes from attention kernels, memory layout, scheduler choices, KV cache tricks, and serving architecture. FlashAttention, PagedAttention in vLLM, and optimizations in stacks like TensorRT-LLM are good examples. This is not side work. It decides whether a model is financially usable in production.

Open repos make those gains portable. One good pull request can cut latency or GPU memory use enough to change a deployment plan.
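
To make that concrete, here's a back-of-envelope KV cache calculation. The dimensions below are Llama-2-7B-like (32 layers, 32 KV heads, head dim 128) and purely illustrative; plug in your own model's numbers:

```python
# Back-of-envelope KV cache sizing: why paged KV management matters.
# Dimensions are Llama-2-7B-like and purely illustrative.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    # 2x for the K and V tensors; fp16/bf16 = 2 bytes per element
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

gb = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128,
                    seq_len=4096, batch=16) / 1e9
print(f"KV cache at batch 16, 4k context: ~{gb:.0f} GB")  # ~34 GB
```

At that batch size the cache alone is around 34 GB, before the weights. Reserving it contiguously per request wastes most of it on unused context, which is exactly the gap paged allocation closes.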

The next model breakthrough probably won’t come from one lab

Konwinski tied this to architecture research, and that part is worth paying attention to.

The current generative AI wave still rests on the 2017 Transformer paper, which was public. The next jump probably won’t arrive as one neat replacement. It’ll more likely come from a messy mix of ideas: Mixture-of-Experts routing, state space model hybrids, retrieval-heavy systems, memory mechanisms, speculative decoding, better long-context handling, and training objectives that improve reasoning without sending cost through the roof.
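
As a sketch of one of those ideas, here is a minimal top-k Mixture-of-Experts layer in PyTorch. It illustrates the routing pattern only, not any particular lab's implementation; production MoE layers add load-balancing losses, capacity limits, and far more efficient dispatch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal top-k MoE layer: a learned router sends each token to k
    experts and mixes their outputs by normalized router weight."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                      # x: (tokens, d_model)
        weights, idx = self.router(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

print(TopKMoE()(torch.randn(16, 512)).shape)   # torch.Size([16, 512])
```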

Those areas need lots of ablations, lots of failed runs, and lots of shared baselines. Closed labs can do that internally, but they do it with a narrower group of people and a narrower set of incentives. Open ecosystems spread that work across hundreds of teams, each pushing on a different bottleneck.

That’s usually how an idea turns from an interesting paper into standard practice.

Look at the recent pattern around open Chinese models. A strong base model lands. Within weeks, one team adds a routing trick, another tests long-context behavior on LongBench, somebody ports it to vLLM, and a fourth group publishes a finetuning recipe with LoRA or QLoRA. The speed comes from access. A grad student or startup engineer can fork, test, and publish a credible improvement without asking a platform owner for permission.
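
That last step is often just a short script. A minimal finetuning sketch with Hugging Face's peft library, where the checkpoint name and target modules are examples rather than recommendations:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2.5-7B-Instruct"  # example checkpoint; substitute your own
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")

# LoRA trains small low-rank adapters on the attention projections
# while the base weights stay frozen.
config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # typically well under 1% of total params
```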

Machine learning still works best when more people can touch the thing.

Open weights don’t solve the compute problem

There’s a limit to the rosy version of this story. Open weights help. They don’t change the economics.

Training frontier models still takes huge clusters, expensive interconnects, and teams that know how to keep large jobs from falling apart. If a release includes code but no usable checkpoints, its practical value drops fast. If the model card is thin and the training data stays opaque, reproducibility is partial at best.

There’s also a licensing mess. “Open” is doing a lot of work in AI. Apache-2.0 and MIT are straightforward. Llama-style licenses are not. Some releases are source-available but not open in the usual software sense. Others come with commercial restrictions that become a real problem once a company wants to ship.

Then there’s misuse risk. Open weights widen access for legitimate research and product work. They also widen access for abuse. That means safety tooling has to improve alongside openness: better evals, provenance, checkpoint signing, content filters, watermarking where it’s technically defensible, and release strategies that account for capability thresholds.

Those trade-offs are real. They don’t erase the broader point.

For teams building AI systems now

For builders, the strategic takeaway is straightforward: don’t assume the durable moat is the base model.

That was already thin logic when open models could get close to top-tier closed performance on many tasks. It looks thinner now that strong open weights can sit inside mature serving stacks like vLLM or TGI. If the choice is between renting intelligence from an API forever and running solid open models with your own evals and controls, the second path keeps looking better.
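
The second path is shorter than it used to be. A minimal sketch with vLLM's offline API, where the checkpoint name is illustrative:

```python
from vllm import LLM, SamplingParams

# Serve an open-weight model locally; vLLM's PagedAttention manages the KV cache.
llm = LLM(model="deepseek-ai/deepseek-llm-7b-chat")  # example checkpoint
params = SamplingParams(temperature=0.2, max_tokens=256)

outputs = llm.generate(["Why does KV cache paging improve throughput?"], params)
print(outputs[0].outputs[0].text)
```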

Teams still win on a few things:

  • proprietary data and data quality
  • product integration and workflow fit
  • reliability under load
  • privacy and deployment control
  • domain-specific evaluation
  • inference cost discipline

That has operational consequences. If you’re planning AI infrastructure for 2026, model flexibility matters. Legal clarity does too.

A sane playbook looks something like this:

  • start with open-weight models that are already strong for your task, especially code and multilingual workloads where Qwen and DeepSeek families have been competitive
  • validate the license before anybody prototypes against it
  • use PyTorch 2.x, FSDP, DeepSpeed ZeRO-3, or managed distributed training if you need full finetuning
  • use LoRA, QLoRA, or similar adapter methods when cost matters more than squeezing out the last few points
  • store checkpoints as safetensors, not pickle-based formats
  • sign artifacts with tools like Sigstore or cosign (a minimal sketch of both steps follows this list)
  • treat data provenance and evals as first-class engineering work
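
A minimal sketch of the checkpoint-format and signing items, with illustrative file names; the cosign commands appear as comments because they run from the shell:

```python
import torch
from safetensors.torch import save_file, load_file

# safetensors stores raw tensors only; nothing executes on load,
# unlike pickle-based torch.save checkpoints.
state = {"linear.weight": torch.randn(128, 64), "linear.bias": torch.zeros(128)}
save_file(state, "checkpoint.safetensors")
restored = load_file("checkpoint.safetensors")

# Then sign the artifact out of band, e.g. with Sigstore's cosign CLI:
#   cosign sign-blob --output-signature checkpoint.sig checkpoint.safetensors
```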

Evaluation is usually where mature teams separate themselves. A useful benchmark stack mixes capability and production metrics: MMLU, HumanEval or MBPP for code, TruthfulQA, GSM8K or MATH, long-context tests like Needle-in-a-Haystack, plus latency, cost, cache behavior, hallucination rate, and failure modes under your actual traffic.
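
Standard suites cover the capability half; the production half usually needs a small harness of your own. A minimal sketch, where call_model is a hypothetical stand-in for whatever client or serving stack you use:

```python
import time

def evaluate(call_model, cases):
    """Mix capability (exact match) with production metrics (latency).
    `call_model` is a hypothetical prompt -> answer function."""
    hits, latencies = 0, []
    for prompt, expected in cases:
        t0 = time.perf_counter()
        answer = call_model(prompt)
        latencies.append(time.perf_counter() - t0)
        hits += answer.strip() == expected.strip()
    latencies.sort()
    return {
        "exact_match": hits / len(cases),
        "p50_latency_s": latencies[len(latencies) // 2],
        "p95_latency_s": latencies[int(len(latencies) * 0.95)],
    }
```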

If you’re doing RAG, add targeted retrieval evals. A model can look great on generic benchmarks and still fall apart once retrieval quality drops or context windows get noisy.
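
A targeted retrieval eval can start as simply as recall@k over labeled query-document pairs. A sketch, where retrieve is a hypothetical wrapper around your retriever:

```python
def recall_at_k(retrieve, labeled_queries, k=5):
    """Fraction of queries where a known-relevant doc id shows up in the
    top-k results. `retrieve` is a hypothetical query -> ranked-ids fn."""
    hits = 0
    for query, relevant_ids in labeled_queries:
        hits += any(doc_id in relevant_ids for doc_id in retrieve(query, k))
    return hits / len(labeled_queries)
```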

The policy fight matters less than the research pipeline

There will be a predictable policy debate around all this, and some of it matters. The part developers should care about is simpler: who gets to participate in improving the stack?

If the answer is a few large U.S. labs and whoever they decide to work with, the U.S. weakens its own position over time. A country doesn’t keep a research lead by shrinking the number of people who can inspect, reproduce, and improve the work.

Konwinski’s warning is a little self-interested. He runs a fund and accelerator aimed at AI researchers, so of course he wants a healthier open ecosystem. Still, the diagnosis matches what engineers already see. Open artifacts tend to create more usable progress per dollar than closed claims do.

The best move for the U.S. probably isn’t forcing every frontier model into the public domain. It’s rebuilding the habit of releasing enough of the stack that universities, startups, and independent research groups still matter.

That’s how the transformer spread. The next big idea will probably spread the same way.
