Artificial Intelligence · November 13, 2025

Anthropic's $50 billion data center plan says more about Fluidstack than scale


Anthropic says it will spend $50 billion on U.S. data centers with Fluidstack, with the first facilities in Texas and New York due online in 2026. The number is huge, but the more telling part is the partner and the model behind the deal.

Until now, Anthropic has relied heavily on hyperscalers, especially Amazon and Google. This deal pushes it toward dedicated capacity built around Claude training and inference. That matters because the main constraint in frontier AI still isn't model design. It's compute you can secure, schedule, cool, and pay for without wasting half of it.

Anthropic isn't chasing the biggest headline. It's buying tighter control over hardware, power, network layout, and job scheduling. For a lab that cares deeply about training efficiency, that follows.

Why Fluidstack matters

Fluidstack started as a neocloud upstart. Now it's landing deals that put it in the same conversation as much larger infrastructure companies. It already has a 1 GW AI project in France, access to Google TPUs, and partnerships with companies including Meta and Mistral. Anthropic picking Fluidstack says two things.

First, neoclouds are growing up. There is a market for operators that sit between a hyperscaler and a colocation provider. Labs want custom clusters without turning themselves into full-time data center businesses.

Second, major AI labs want more low-level control than AWS, Azure, or Google Cloud usually expose. Public cloud still makes sense for bursty demand, specialty hardware, global reach, and a large share of enterprise deployments. It's weaker for giant training jobs spread across tightly coupled accelerator pods where every point of utilization counts.

Reserved capacity is now part of strategy, not just purchasing.

What “custom built for Anthropic” probably means

Anthropic hasn't published a hardware bill of materials, but “custom built” narrows the range.

Start with rack density. Frontier training clusters are moving into 80 kW to 150 kW per rack territory, sometimes higher. Air cooling struggles there during sustained heavy training. Expect direct-to-chip liquid cooling, aggressive thermal management, and facilities designed for long stretches of near-max utilization.
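To make the density point concrete, here is a back-of-envelope in Python. Every number is an illustrative assumption, not a figure from the deal.

```python
# Back-of-envelope rack math for a dense training hall.
# All numbers below are illustrative assumptions, not Anthropic figures.

accelerators = 100_000         # assumed cluster size
watts_per_accelerator = 1_200  # assumed draw per device, host share included
rack_power_kw = 120            # assumed liquid-cooled rack rating

it_load_mw = accelerators * watts_per_accelerator / 1e6
racks = accelerators * watts_per_accelerator / (rack_power_kw * 1_000)

print(f"IT load: {it_load_mw:.0f} MW across ~{racks:.0f} racks")
# IT load: 120 MW across ~1000 racks
```

Even at 120 kW a rack, a hundred thousand accelerators means a thousand racks and campus-scale power draw, which is why cooling and siting dominate the design.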

Networking matters just as much. Large-scale training rises or falls on interconnect performance. If the cluster is GPU-based, think InfiniBand NDR or XDR where supply allows, or 800G Ethernet with RoCEv2 plus a lot of tuning around congestion control and packet telemetry. Inside the node, you're looking at NVLink or NVSwitch class fabrics to keep tensor and pipeline parallel workloads moving.

This is where people still get sloppy. Training a large model isn't just adding GPUs. It's synchronization cost, memory sharding, checkpointing, expert routing if you're using MoE, and data movement that can destroy throughput if the topology is bad. Owning or reserving that topology helps.
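To see why, a minimal cost model helps. The classic ring all-reduce moves roughly 2(N-1)/N of the gradient payload per worker, so synchronization time is bandwidth-bound. The parameter count and link rate below are assumptions for illustration, not details of this buildout.

```python
# Minimal sketch: bandwidth-bound ring all-reduce time for one gradient sync.
# Ignores latency, congestion, and compute/communication overlap on purpose.

def ring_allreduce_seconds(payload_bytes: float, n_workers: int,
                           bw_bytes_per_s: float) -> float:
    """Ring all-reduce moves 2*(N-1)/N of the payload per worker."""
    return 2 * (n_workers - 1) / n_workers * payload_bytes / bw_bytes_per_s

grad_bytes = 70e9 * 2   # assumed 70B-parameter model, bf16 gradients
link = 400e9 / 8        # assumed 400 Gb/s per NIC, in bytes per second

t = ring_allreduce_seconds(grad_bytes, n_workers=1024, bw_bytes_per_s=link)
print(f"~{t:.1f} s per full sync if nothing overlaps")  # ~5.6 s
```

Five-plus seconds per synchronization is ruinous unless communication hides behind compute, which is exactly what topology-aware sharding and scheduling buy you.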

Anthropic also likely wants a heterogeneous fleet. It already uses Google TPUs for parts of its work. Fluidstack's TPU access makes that more plausible. So the likely end state is mixed capacity: NVIDIA for some training and inference workloads, TPUs where Anthropic's software stack fits, and maybe room for AMD in inference-heavy roles if the economics work.

That puts pressure on the software layer to stay portable across CUDA, XLA, and maybe ROCm without turning into a maintenance nightmare. That's hard. It's also why most talk about multi-accelerator strategy sounds cheap until someone has to keep kernels, runtimes, and performance aligned across vendors.

The economics are aggressive

The financial logic behind this spend matters. Anthropic has reportedly been targeting something like $70 billion in revenue and $17 billion in cash flow by 2028. Even with a discount on those projections, the company is acting as if demand for Claude-class models and AI agents will stay high for years.

That's the bigger claim. Data centers are slow to build, expensive to operate, and awkward to repurpose if demand fades. You don't commit $50 billion unless you think inference demand will stay brutal and training runs will keep growing or keep coming.
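A hedged back-of-envelope shows what the capex alone implies. Every figure below is an assumption for illustration, not a reported number.

```python
# What $50B of capacity has to earn just to recover capex.
# All inputs are illustrative assumptions.

capex = 50e9
useful_life_years = 5     # assumed depreciation window for accelerators
utilization = 0.80        # assumed fraction of hours doing useful work
accelerators = 400_000    # assumed fleet size bought with that capex

sellable_hours = useful_life_years * 365 * 24 * utilization * accelerators
per_accel_hour = capex / sellable_hours
print(f"~${per_accel_hour:.2f} per accelerator-hour, before power and people")
# ~$3.57 per accelerator-hour
```

Power, staff, networking, and refresh cycles come on top, which is why utilization, not list price, decides whether this pencils out.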

There is risk. AI infrastructure spending already has the feel of an arms race driven by optimistic forecasts. Meta is talking about $600 billion over three years. Stargate's number is $500 billion. Those plans could still leave pockets of overbuild, especially if algorithmic gains start doing more of the work than brute-force scaling.

Anthropic's plan looks narrower and probably more tied to actual workload needs. Still, $50 billion is a giant wager on sustained utilization. Empty racks are expensive. Underused accelerators are worse.

Why Texas and New York

Texas makes sense. It offers flexible power procurement, plenty of room to expand, and access to wind and solar through ERCOT. If you're trying to build large AI campuses quickly, Texas stays high on the list.

New York is the more interesting pick. It gives Anthropic geographic spread, a different grid profile through NYISO, and another option for latency and power planning. Splitting capacity across both regions cuts exposure to a single grid and gives the company more room if local power prices or permitting conditions change.

For AI infrastructure, power is as strategic as chips, maybe more so. Accelerators can be backordered, but power constraints can kill a project outright. That's why modern AI site design now turns on PUE targets near 1.15, liquid cooling loops, switchgear lead times, and long-term power purchase agreements. The GPUs get the attention. The bottleneck is often transformers, chillers, and grid access.
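The PUE figure converts directly into megawatts at the meter. A quick sketch, reusing the assumed IT load from the rack math above:

```python
# PUE arithmetic: facility power for a given IT load. Assumed inputs.

it_load_mw = 120    # assumed IT load from the earlier sketch
pue = 1.15          # the modern target cited above

facility_mw = it_load_mw * pue
print(f"{facility_mw:.0f} MW at the meter, "
      f"{facility_mw - it_load_mw:.0f} MW lost to cooling and distribution")
# 138 MW at the meter, 18 MW lost to cooling and distribution
```

Eighteen megawatts of overhead on one assumed campus is a small power plant's worth of load, which is why PUE targets and power contracts get negotiated before the first GPU ships.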

What this does to the cloud pecking order

The old assumption was that hyperscalers win and everyone else rents. That assumption is weakening.

Frontier labs still need AWS, Google, and probably Azure for different reasons. They need global infrastructure, enterprise sales channels, managed services, and room to burst when dedicated clusters are full. But for core training, the public cloud premium looks less compelling once you're big enough to justify dedicated infrastructure.

Fluidstack and companies like it are taking advantage of that gap. They can move faster on custom cluster design, expose more of the hardware behavior, and avoid forcing labs into a general-purpose cloud model. That's attractive if you care about topology-aware scheduling, low-level networking, custom cooling, and accelerator-specific tuning.

Smaller teams may get squeezed. If the largest labs keep locking up future capacity for accelerators, 800G optics, high-end NICs, and liquid cooling equipment, everyone else gets longer lead times and worse pricing. The stack is stratifying. Frontier labs buy certainty. Everyone else gets leftovers, cloud markup, or both.

What engineers should pay attention to

Most developers won't touch an Anthropic-run cluster. They'll still feel the effects.

If you build or run ML infrastructure, this is another sign that portability across hardware backends is no longer optional. A stack built around one vendor and one topology will age badly. PyTorch 2.x, torch.compile, OpenXLA, Triton, and backend-specific paths matter because mixed fleets are becoming normal at the top end.
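A minimal sketch of what that portability looks like in day-to-day PyTorch, assuming a 2.x install; the model here is a toy stand-in:

```python
# Backend-agnostic PyTorch sketch: pick whatever accelerator is present and
# let torch.compile handle backend-specific codegen. Assumes PyTorch 2.x.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():           # NVIDIA; ROCm builds also report here
        return torch.device("cuda")
    if torch.backends.mps.is_available():   # Apple silicon, handy for local dev
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
).to(device)

compiled = torch.compile(model)   # Inductor lowers to the chosen backend
x = torch.randn(8, 1024, device=device)
y = compiled(x)
```

Device-agnostic code like this is table stakes; the hard part the article points at is keeping custom kernels and performance parity across CUDA, XLA, and ROCm.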

Topology awareness matters too. If your training code scales in a clean benchmark and then falls apart across real rack boundaries, that's your problem. Frameworks like Megatron-LM, DeepSpeed, FSDP, and ZeRO-3 help, but they don't solve placement by themselves. Schedulers need to understand NUMA locality, NVLink groups, pod boundaries, and network congestion. Kubernetes can cover some of this with Topology Manager and device plugins, but plenty of production clusters still treat accelerator placement too casually.
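A toy placement sketch makes the point: keep the chattiest parallelism inside one NVLink island and let data parallelism cross the slower fabric. The group sizes are illustrative assumptions.

```python
# Toy rank placement: tensor parallelism stays inside a node (NVLink island),
# data parallelism crosses the network. Sizes are assumptions for illustration.

GPUS_PER_NODE = 8   # one NVLink/NVSwitch island per node (assumed)
TP_SIZE = 8         # tensor-parallel degree, must fit inside the island
WORLD_SIZE = 64     # total ranks (assumed)

assert GPUS_PER_NODE % TP_SIZE == 0

tp_groups = [list(range(r, r + TP_SIZE)) for r in range(0, WORLD_SIZE, TP_SIZE)]
dp_groups = [list(range(r, WORLD_SIZE, TP_SIZE)) for r in range(TP_SIZE)]

print(tp_groups[0])  # [0..7]: all-to-all heavy traffic stays on NVLink
print(dp_groups[0])  # [0, 8, 16, ...]: gradient all-reduce crosses the fabric
```

Get this inverted, with tensor-parallel groups straddling nodes, and the same code can lose a large fraction of its throughput without a single error in the logs.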

Sparse models and mixture-of-experts get more attractive in this environment. When compute is expensive and reserved, efficiency gets attention. MoE can cut training cost per token, but only if routing stays well behaved and cross-node traffic doesn't explode. Engineers should be watching fabric counters and expert hot spots, not just aggregate token throughput.
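A sketch of the kind of counter worth watching: given per-token routing decisions, how hot is the hottest expert relative to a balanced split? The helper and toy data are illustrative, not from any particular MoE stack.

```python
# Watch expert hot spots, not just aggregate token throughput.
from collections import Counter

def expert_imbalance(assignments: list[int], n_experts: int) -> float:
    """Max expert load over the perfectly balanced load; 1.0 is ideal."""
    counts = Counter(assignments)
    balanced = len(assignments) / n_experts
    return max(counts.get(e, 0) for e in range(n_experts)) / balanced

tokens_to_experts = [0, 0, 0, 1, 2, 0, 3, 0]   # toy routing decisions
print(f"hottest expert at {expert_imbalance(tokens_to_experts, 4):.1f}x "
      "the balanced load")   # 2.5x
```

A ratio drifting upward means one expert's device is saturating while others idle, and the cross-node all-to-all traffic that MoE layers require gets worse at exactly the same time.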

Inference also needs to grow up. Teams serving large models should already be using tools like vLLM, TensorRT-LLM, and similar runtimes that push memory use and scheduling harder. If model demand grows the way Anthropic seems to expect, inference efficiency becomes a board-level issue.
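For orientation, a minimal offline-serving sketch using vLLM's batch API; the model name is only an example and assumes the weights are reachable locally or from the Hub.

```python
# Minimal vLLM offline inference sketch. Model choice is an assumption.
from vllm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize why inference efficiency matters."], params)
print(outputs[0].outputs[0].text)
```

The point is less the five lines than what they hide: paged KV-cache memory and continuous batching are the kind of utilization wins that turn into real money at the scale this article describes.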

Security is another quiet implication. Dedicated AI facilities aimed at enterprise workloads will need stronger tenant isolation, hardware attestation, and tighter data path controls. Features like Intel TDX, AMD SEV-SNP, and the GPU attestation arriving in newer accelerators stop looking optional once regulated customers want fine-tuning or private inference on shared infrastructure.

The blunt read

Anthropic is moving from major cloud customer to serious AI infrastructure operator, even if Fluidstack handles much of the buildout. That's a meaningful shift.

The company is betting that custom capacity will beat generic cloud economics for Claude at scale. That bet is plausible. The harder question is whether enough revenue arrives quickly enough to justify everyone building giant AI campuses at the same time.

For engineers, the practical takeaway is simpler. Expect mixed accelerators, tighter capacity planning, more topology-aware software, and less tolerance for wasteful training jobs. The infrastructure is getting specialized. The software stack has to keep up.
