TCS and TPG’s $2 billion AI data center bet puts India in the compute race
Tata Consultancy Services has secured $1 billion from TPG to fund half of HyperVault, a $2 billion project to build about 1.2 gigawatts of AI-focused data center capacity in India.
The financing matters. So does the location.
India has long been a huge software market and a major source of data while much of the underlying compute sat elsewhere. That imbalance is getting harder to defend. The country produces roughly 20% of the world’s data but accounts for only about 3% of global data center capacity. For large training jobs, or latency-sensitive inference at scale, that gap turns into a cost and reliability problem pretty quickly.
HyperVault is an attempt to close part of it.
Built for dense AI workloads
The project is described as a network of high-density, liquid-cooled facilities designed for AI workloads. That’s the right starting point. AI infrastructure no longer fits neatly into the old enterprise colo model.
A standard rack might draw 5 to 15 kW. AI clusters now often run at 40 to 80 kW per rack, and dense GPU configurations can hit 100 to 150 kW. Legacy air cooling doesn’t hold up well there. You need liquid cooling that can pull heat directly off the chips, and a facility design built around thermal and power constraints from day one.
That likely means some mix of:
- direct-to-chip cold plates on GPUs and CPUs
- rear-door heat exchangers for residual hot spots
- in some halls, possibly single-phase immersion for denser deployments
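To make the thermal constraint concrete, here is a back-of-envelope sketch in Python. It estimates the coolant flow a direct-to-chip loop needs for a single rack, assuming a water-like coolant and a 10 K loop temperature rise; the rack power figures come from the ranges above, everything else is an illustrative assumption, not a HyperVault specification.

```python
# Back-of-envelope: coolant flow needed to remove rack heat with liquid cooling.
# Assumptions (illustrative): water-like coolant with specific heat ~4186 J/(kg*K)
# and density ~1 kg/L, and a 10 K temperature rise across the loop.

SPECIFIC_HEAT_J_PER_KG_K = 4186   # water at ~25 C
DELTA_T_K = 10.0                  # assumed supply-to-return temperature rise

def coolant_flow_l_per_min(rack_kw: float) -> float:
    """Flow required so the loop absorbs rack_kw of heat at the assumed delta-T."""
    watts = rack_kw * 1_000
    kg_per_s = watts / (SPECIFIC_HEAT_J_PER_KG_K * DELTA_T_K)
    return kg_per_s * 60  # ~1 kg of water per litre

for rack_kw in (15, 80, 120):  # legacy rack, typical AI rack, dense GPU rack
    print(f"{rack_kw:>4} kW rack -> ~{coolant_flow_l_per_min(rack_kw):.0f} L/min")
```

At 120 kW that works out to roughly 170 L/min per rack, which is why facility plumbing becomes a first-class design constraint rather than an afterthought.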
If TCS gets the execution right, these won’t resemble standard data centers with a few GPU pods wedged in. The operating model is different. Modular power blocks, liquid loops, fast interconnects, lots of storage, and very little room for sloppy operations.
That’s also where the project starts to look technically serious. AI data centers rise or fall on the messy parts: heat rejection, network jitter, storage throughput, and whether tenants can scale from a few hundred GPUs to a few thousand without strange performance cliffs.
Why India now
Some of the answer is simple. Demand is there.
Microsoft announced a $3 billion India investment plan earlier in 2025. Google outlined $15 billion in Andhra Pradesh. AWS is already on a $12.7 billion investment track through 2030. Hyperscalers, domestic operators, and large enterprises are all seeing the same thing. Indian demand for cloud and AI compute is rising, and data residency rules have made offshore deployment a less comfortable default.
The DPDP Act and sector-specific compliance pressure add to that. For financial services, healthcare, telecom, and public-sector workloads, local training and fine-tuning are easier to justify than shipping regulated datasets abroad and arguing the controls are equivalent.
Timing matters too. GPU supply is still tight. If you can lock in land, power, cooling, and financing now, you’re better placed when customers stop debating AI capacity and start competing for it.
The stack matters as much as the buildings
TCS says HyperVault will work with hyperscalers and AI companies to design, deploy, and operate these clusters. That’s important, because the value here probably won’t come from real estate and electrical gear alone.
Most new Indian capacity over the next five years is expected to be colocation, with hyperscalers reserving a smaller share for dedicated AI builds. So the business model is likely to lean heavily on leased capacity. But AI tenants aren’t just renting power and floor space. They need a system that actually works under load.
That means:
- 400G networking today, with 800G moving into real deployments through 2026
- either InfiniBand NDR or high-end Ethernet with RoCEv2
- low tail latency across large leaf-spine fabrics
- storage that can keep training jobs fed instead of turning checkpoints into a bottleneck
- scheduler and topology awareness across thousands of GPUs
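For a rough sense of scale on the first two items, a quick sketch, assuming the common design point of one 400G NIC per GPU; the cluster size and NIC ratio here are illustrative, not from the announcement.

```python
# Rough fabric arithmetic for a GPU training cluster.
# Assumption (illustrative): one 400G NIC per GPU, a common design point.

GPUS = 4_096
NIC_GBPS = 400  # per-GPU injection bandwidth, Gbit/s

injection_tbps = GPUS * NIC_GBPS / 1_000
print(f"Aggregate injection bandwidth: ~{injection_tbps:.0f} Tbit/s")
# -> ~1638 Tbit/s the leaf-spine fabric must carry without hot spots,
#    before any oversubscription decisions are even made.
```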
For tenants, the real test is operational. Can a 4,000-GPU training run stay up for days without congestion, thermal issues, storage stalls, or odd cross-rack behavior?
That’s a tougher business than traditional colo. It’s also an area where TCS has a plausible opening. The company already has deep enterprise relationships and a services arm that can package migration, MLOps, model ops, and managed infrastructure into one contract. Plenty of operators can lease racks. Far fewer can sell an end-to-end AI deployment without sounding speculative.
The hard limits: power and water
This project is ambitious, and the constraints are obvious.
Power comes first. A 1.2 GW target is serious infrastructure anywhere. These sites will need high-voltage grid interconnects, probably at 132 kV or 220 kV, on-site substations, and modular buildouts in the 40 to 60 MW range. Battery energy storage systems can help with ride-through and smoothing, but backup generation still matters.
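The phasing arithmetic is worth doing explicitly. A sketch, using the block size range above; the PUE figure is an assumption, and whether the 1.2 GW target counts IT load or total facility power changes the grid ask by exactly that factor.

```python
# Rough phasing arithmetic for a 1.2 GW target (assumptions illustrative).
TARGET_MW = 1_200
BLOCK_MW = 50      # midpoint of the 40-60 MW block range above
PUE = 1.3          # assumed; liquid-cooled designs usually beat air-cooled PUEs

print(f"Modular blocks needed: ~{TARGET_MW / BLOCK_MW:.0f}")
# If 1.2 GW is IT capacity, the grid interconnect ask is larger by the PUE factor:
print(f"Grid draw if 1.2 GW is IT load: ~{TARGET_MW * PUE:,.0f} MW")
```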
In India, that often still means diesel. It works, but it’s a weak long-term answer on both cost and emissions. Gas turbines, biofuels, and better storage look appealing in planning documents. They’re harder in procurement.
Water is the second constraint, and probably the more politically sensitive one. Conventional cooling designs can consume huge amounts of water, which becomes a problem fast in cities like Mumbai or Bengaluru. If HyperVault is serious about scale, it will need closed-loop cooling, heavy use of dry coolers, and probably treated or reclaimed wastewater instead of potable supply. Otherwise the backlash is easy to predict.
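To see why, a sketch of the water math, assuming a conventional evaporative design. The ~1.8 L/kWh figure is an often-cited industry average for such designs, not a HyperVault number; closed-loop and dry-cooler designs land far lower.

```python
# Why conventional evaporative cooling is politically hard at this scale.
# Assumption: ~1.8 L/kWh water usage effectiveness, an often-cited industry
# average for evaporative designs (illustrative, not a HyperVault figure).

FACILITY_MW = 100          # one illustrative campus phase
WUE_L_PER_KWH = 1.8

kwh_per_day = FACILITY_MW * 1_000 * 24
litres_per_day = kwh_per_day * WUE_L_PER_KWH
print(f"~{litres_per_day / 1e6:.1f} million litres/day for a {FACILITY_MW} MW phase")
```

Roughly 4.3 million litres a day for a single 100 MW phase is the kind of number that draws scrutiny in any water-stressed city.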
This is where a lot of AI data center announcements get ahead of reality. Capital helps. It doesn’t solve grid access, permits, land assembly, transmission capacity, or local scrutiny.
What developers and platform teams should watch
If you expect to deploy on Indian AI infrastructure over the next year or two, the practical details matter more than the headlines.
Fabric choice is still unsettled
Some tenants will get InfiniBand, others will end up on Ethernet with RoCEv2.
InfiniBand still has the cleaner reputation for large training clusters and collective-heavy workloads. Ethernet is more flexible, fits broader tooling better, and is often easier to defend strategically. But Ethernet for AI needs careful tuning. ECN, PFC, queue management, and congestion control settings can be the difference between acceptable performance and a very expensive mess.
If your stack assumes the network will sort itself out, fix that now.
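What "careful tuning" means is vendor-specific, but the acceptance test is not: measure tail latency under load before committing a large job. A minimal sketch of that check, assuming you already have a harness producing per-message RTT samples; the latency budgets are illustrative placeholders, not a vendor spec.

```python
# Tail-latency acceptance check for a fabric under load.
# Assumes RTT samples (in microseconds) from your own load harness; the
# budgets below are illustrative placeholders, not a vendor spec.

def percentile(samples: list[float], p: float) -> float:
    s = sorted(samples)
    return s[min(len(s) - 1, round(p / 100 * (len(s) - 1)))]

def fabric_ok(rtts_us: list[float], p99_budget_us: float = 50.0,
              p999_budget_us: float = 200.0) -> bool:
    p50, p99, p999 = (percentile(rtts_us, q) for q in (50, 99, 99.9))
    print(f"p50={p50:.1f}us  p99={p99:.1f}us  p99.9={p999:.1f}us")
    # Collectives are gated by the slowest flows, so the tail numbers
    # matter far more than the median.
    return p99 <= p99_budget_us and p999 <= p999_budget_us
```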
Scheduler topology matters at scale
Once jobs spread across 1,000 to 8,000 GPUs, placement logic starts affecting both runtime and cost. Slurm, Kubernetes with Kueue, or vendor schedulers can all do the job, but they need awareness of rack locality, NVLink islands, and failure domains. Bad placement burns money. Eventually somebody in finance notices.
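A minimal sketch of what topology awareness means at the placement layer: prefer allocations that span the fewest racks. The data structures and greedy scoring here are illustrative, not any particular scheduler's API.

```python
# Toy topology-aware placement: choose nodes so a job spans as few racks as
# possible. Illustrative only; real schedulers (Slurm topology plugins, Kueue,
# vendor stacks) do this with much richer models of the fabric.
from collections import defaultdict

def place(job_gpus: int, free: dict[str, int], rack_of: dict[str, str]) -> list[str]:
    """free: node -> free GPU count; rack_of: node -> rack id."""
    racks = defaultdict(list)
    for node, n in free.items():
        racks[rack_of[node]].append((node, n))
    # Greedy: fill the racks with the most free GPUs first, so the job crosses
    # as few rack boundaries (and oversubscribed uplinks) as possible.
    chosen, need = [], job_gpus
    for _, nodes in sorted(racks.items(), key=lambda kv: -sum(n for _, n in kv[1])):
        for node, n in sorted(nodes, key=lambda x: -x[1]):
            if need <= 0:
                return chosen
            chosen.append(node)
            need -= min(n, need)
    return chosen if need <= 0 else []  # empty list = cannot place yet
```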
Storage throughput needs real numbers
AI storage planning often stays too vague. The useful question is whether your pipeline can sustain the feed rate your GPUs actually need.
For high-throughput training, a reasonable target is roughly 2 to 4 GB/s per GPU aggregate, depending on workload shape. That usually points to a hot path built on NVMe-oF with RDMA, plus object storage for datasets and durable checkpoints. If active data lives in object storage and you’re hoping the system hides the latency, there’s a good chance you’re paying premium GPU prices for idle time.
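The arithmetic is worth doing explicitly before signing anything. A sketch using the per-GPU range above; the cluster size, checkpoint size, and write bandwidth are illustrative assumptions.

```python
# Storage feed-rate arithmetic for a training cluster (assumptions illustrative).
GPUS = 4_096
GBPS_PER_GPU = (2, 4)   # sustained read range from above, GB/s per GPU

lo, hi = (GPUS * g / 1_000 for g in GBPS_PER_GPU)
print(f"Sustained read the hot tier must deliver: ~{lo:.0f}-{hi:.0f} TB/s")

# Checkpoints are a separate burst problem. Illustrative: a 2 TB checkpoint
# must drain fast enough that the step loop is not sitting idle.
CKPT_TB, WRITE_GBPS = 2, 500   # assumed aggregate write bandwidth, GB/s
print(f"Checkpoint drain time: ~{CKPT_TB * 1_000 / WRITE_GBPS:.0f} s")
```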
Residency and auditability are part of the architecture
In regulated sectors, keeping data ingestion, feature stores, training, fine-tuning, and inference in-country matters. So does proving lineage, encryption, and access isolation. That affects architecture directly. Your Kafka or Pulsar clusters, metadata plane, key management, and observability stack all need to line up with that boundary.
Less glamorous than model size. Usually more important for getting a deployment approved.
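Even a crude guardrail helps here. A sketch of a residency check that fails fast when any declared component of the stack sits outside the approved boundary; the inventory format and region names are hypothetical placeholders for your own source of truth.

```python
# Crude data-residency guardrail: fail fast if any declared component of the
# ML stack sits outside the approved in-country regions. Inventory format and
# region names are hypothetical placeholders.
ALLOWED_REGIONS = {"in-west-1", "in-south-1"}  # hypothetical India regions

STACK = {  # component -> declared region (hypothetical inventory)
    "kafka-ingest":     "in-west-1",
    "feature-store":    "in-west-1",
    "training-cluster": "in-south-1",
    "kms":              "in-west-1",
    "observability":    "eu-central-1",  # <- would fail the check
}

violations = {c: r for c, r in STACK.items() if r not in ALLOWED_REGIONS}
if violations:
    raise SystemExit(f"Residency violations: {violations}")
```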
TCS is trying to move up the stack
TCS isn’t approaching HyperVault like a passive landlord. It’s trying to occupy a larger slice of the AI infrastructure chain: facilities, operations, enterprise integration, and managed services.
That puts pressure on local colo firms, and it also creates a workable partnership model for hyperscalers. Microsoft, Google, or AWS can still reserve pods or place dedicated clusters while TCS handles some of the integration and customer-facing operational work around them.
That model fits India. Enterprises want local capacity, but many also want someone else to absorb the complexity of stitching together infrastructure, compliance, and ML operations.
If HyperVault stays on schedule and solves the power and water problem, India gets something it has lacked for years: meaningful domestic AI compute at industrial scale. If those basics slip, the project becomes another reminder that AI infrastructure is mostly an energy and systems engineering business, with software layered on top.
What to watch
The harder part is not the headline capacity number. It is whether the economics, supply chain, power availability, and operational reliability hold up once teams try to use this at production scale. Buyers should treat the announcement as a signal of direction, not proof that cost, latency, or availability problems are solved.