Artificial Intelligence · June 23, 2025

H1 2025 chip market: Intel cuts deep, Nvidia gets boxed in, AMD buys for the stack

The first half of 2025 has made the US chip market look a lot less tidy than the AI boom narrative suggested. Intel is cutting deep while trying to restore some internal discipline under Lip-Bu Tan. Nvidia is still the core supplier for AI infrastructure, but US export controls have turned one of its China chips into a real financial problem. AMD keeps filling gaps with targeted acquisitions instead of waiting to build everything itself.

For teams that build ML systems, run platform engineering, or make long-term hardware bets, these moves connect. AI compute is getting more fragmented, more political, and a lot more dependent on software portability than many buyers wanted to admit a year ago.

Intel is trying to buy time

Intel’s changes are messy, but they may matter most over the next few years.

Under new CEO Lip-Bu Tan, the company made four senior leadership hires, including a new Chief Revenue Officer and senior engineering leaders. At the same time, Intel plans to cut 15% to 20% of Intel Foundry staff starting in July. It’s also considering a sale of its networking and edge businesses, which reportedly brought in $5.4 billion in 2024 revenue.

The logic is pretty clear. Intel has too many layers, too many side bets, and too little room for execution misses. Foundry, process technology, packaging, AI accelerators, and x86 all need attention. No company manages that cleanly with a bloated org chart.

The cuts are ugly. They also fit the problem Tan seems to be solving. In chip development, management overhead slows validation loops, complicates sign-off, and creates distance between design teams and the people making schedule decisions. When a product slips, the damage compounds fast.

For developers, the useful question is whether a flatter Intel can ship better tooling and a cleaner accelerator story. If it can, oneAPI and OpenVINO become easier to take seriously in mixed-vendor environments. If it can’t, Intel stays where it’s been for a while: interesting hardware, too much friction.
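If Intel does get there, the on-ramp is already concrete. A minimal OpenVINO sketch (the model file and input shape below are placeholders), where the "AUTO" device plugin picks whatever target is available instead of hard-coding one:

import numpy as np
from openvino.runtime import Core

core = Core()

# "model.xml" is a placeholder for a model converted to OpenVINO IR;
# read_model also accepts ONNX files directly.
model = core.read_model("model.xml")

# "AUTO" lets the runtime choose among the devices it finds (CPU, GPU, ...)
# rather than pinning the deployment to one vendor target.
compiled = core.compile_model(model, "AUTO")

# Placeholder input; match the real model's input shape and dtype.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
result = compiled.infer_new_request({0: x})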

There’s also the foundry angle. If Intel Foundry cuts hard and narrows focus, some customers will read that as discipline. Others will read it as instability and keep defaulting to TSMC or Samsung. Engineering teams won’t make that call alone, but they’ll deal with the fallout if capacity, packaging access, or roadmap confidence shifts underneath them.

Nvidia’s China problem is now a developer problem

Nvidia’s H20 export trouble is the clearest sign yet that hardware policy now reaches directly into engineering decisions.

New US licensing restrictions on the H20, Nvidia’s China-targeted AI chip, forced the company to take a $4.5 billion charge in Q1. It also projected an $8 billion revenue hit for Q2 and left China out of its forecasts. Jensen Huang has been unusually direct about the effect of that kind of policy volatility on supply planning and R&D.

He’s right to complain. This is a systems problem.

For years, most teams treated GPU supply as a budget and availability issue. Now there’s a policy layer sitting on top of the stack. If you operate across regions, or your customers do, hardware selection has legal and logistical constraints that software teams can’t wave away.

The obvious result is more pressure to support multiple backends for training and inference. The harder part is that a lot of companies still overstate how portable their stack really is. Exporting to ONNX and hoping for decent runtime behavior doesn’t cut it when kernels, graph optimizers, memory behavior, quantization support, and operator coverage all vary in production-relevant ways.

A simple ONNX Runtime setup can smooth over some of that:

import numpy as np
import onnxruntime as ort

# Preference order; ONNX Runtime uses whichever of these providers
# the installed build actually supports.
providers = [
    "CUDAExecutionProvider",
    "ROCMExecutionProvider",
    "CPUExecutionProvider",
]

session = ort.InferenceSession("model.onnx", providers=providers)

# Placeholder input; match the real model's input shape and dtype.
input_tensor = np.random.rand(1, 3, 224, 224).astype(np.float32)
inputs = {session.get_inputs()[0].name: input_tensor}
outputs = session.run(None, inputs)

That abstraction helps. It doesn’t solve the hard part.

Anyone who’s benchmarked the same model across CUDA, ROCm, and CPU backends already knows the pattern. Operator support differs. Memory copies appear in annoying places. Precision modes don’t line up neatly. A backend that passes functional tests can still collapse on throughput or tail latency.
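Measuring that variance doesn’t take much. A minimal ONNX Runtime sketch (model path and input shape are placeholders) that times the same artifact on each provider the installed build exposes and reports median and tail latency:

import time
import numpy as np
import onnxruntime as ort

def benchmark(model_path, provider, input_tensor, runs=100):
    # Pin the session to a single execution provider.
    session = ort.InferenceSession(model_path, providers=[provider])
    name = session.get_inputs()[0].name

    # Warm up so lazy initialization doesn't distort the numbers.
    for _ in range(10):
        session.run(None, {name: input_tensor})

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        session.run(None, {name: input_tensor})
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    return {
        "p50_ms": latencies[len(latencies) // 2] * 1000,
        "p99_ms": latencies[int(len(latencies) * 0.99)] * 1000,
    }

# Placeholder input; real runs need representative data and batch sizes.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
for provider in ort.get_available_providers():
    print(provider, benchmark("model.onnx", provider, x))

Functional parity at batch size 1 tells you very little; the interesting divergence usually shows up once batching, quantization, and realistic sequence lengths enter the picture.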

So yes, Nvidia’s export restrictions are a business story. They’re also a warning for teams with lazy infrastructure assumptions. If your AI platform is still wired around CUDA-specific choices in CI, packaging, and runtime orchestration, you’ve got fragility baked in.

AMD is buying the parts that matter

AMD’s M&A activity has been less dramatic and a lot more coherent.

It picked up the team from Untether AI, adding inference-chip expertise. It acquired Brium to improve cross-platform AI software optimization. And it bought Enosemi for silicon photonics R&D aimed at future high-bandwidth interconnects.

That lines up with where the next fight is heading.

First, inference. Training gets the headlines, but enterprise AI is increasingly an inference economics problem. Latency, cost per token, memory footprint, and deployment flexibility matter more than benchmark theater. Untether AI’s talent matters because inference hardware has a different set of priorities from giant training parts. Data movement and locality often matter more than peak compute.
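To make the economics concrete (hypothetical figures, purely for illustration), cost per token falls straight out of hourly hardware cost and sustained throughput, which is why throughput under real batching matters more than peak compute:

# Hypothetical figures for illustration only; substitute measured values.
gpu_cost_per_hour = 4.00              # USD, fully loaded instance cost
sustained_tokens_per_second = 2500    # measured under production batching

tokens_per_hour = sustained_tokens_per_second * 3600
cost_per_million_tokens = gpu_cost_per_hour / tokens_per_hour * 1_000_000

print(f"${cost_per_million_tokens:.2f} per million tokens")  # about $0.44 here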

Second, software. Brium may be the least flashy deal here and the most immediately useful. Cross-platform optimization has gone from nice to have to mandatory. Model stacks are getting less Nvidia-exclusive, partly because of cost, partly because of supply, and partly because buyers are tired of having no fallback. If AMD can improve compiler, runtime, and graph optimization paths across frameworks, that matters more than another round of open-ecosystem messaging.

Third, interconnects. Enosemi points to a longer-term bet on silicon photonics. That matters because the next scaling wall in AI systems isn’t just compute. It’s moving data across packages, boards, and nodes without blowing out power budgets and synchronization overhead.

Claims around photonic-style interconnect improvements point to 2x to 3x gains in distributed training synchronization. That figure depends heavily on workload and system design, so take it as directional. The basic premise holds: electrical links are becoming a bigger bottleneck as chiplet designs spread and clusters get denser. Optical signaling inside or between packages is one of the more credible answers on the table.

Most developers won’t touch silicon photonics directly anytime soon. Cluster architects and systems teams will. Application teams will still feel it in collective ops performance, sharding strategies, and what counts as a normal model-parallel design.

The post-CUDA story is real, but messy

There’s a temptation to say the industry is finally moving past CUDA. That goes too far.

CUDA still dominates. Nvidia still has the strongest mix of hardware, software maturity, and developer mindshare. If supply, geography, and budget weren’t constraints, most serious AI teams would still prefer to deploy on Nvidia.

What has changed is that the market is finally paying for heavy dependence on one vendor stack. That cost shows up in export controls, margin pressure, hardware shortages, and the engineering rework that comes when a second backend gets added late.

For technical teams, the priority shifts are pretty plain:

  • Benchmark on at least two accelerator paths before you need them.
  • Treat operator coverage and quantization behavior as release criteria.
  • Keep model export paths honest. Test the exported artifact, not just the training stack.
  • Build CI around backend variance instead of a single golden GPU box (see the sketch after this list).
  • Know which parts of your stack are actually portable and which depend on vendor-specific kernels.
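One way to make the CI point concrete is to parametrize the same functional check over every provider the installed build exposes. A minimal pytest sketch, assuming an exported model.onnx artifact and a tolerance that fits the model’s precision mode:

import numpy as np
import onnxruntime as ort
import pytest

MODEL_PATH = "model.onnx"  # placeholder for the exported artifact under test

@pytest.mark.parametrize("provider", ort.get_available_providers())
def test_backend_matches_cpu_reference(provider):
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape

    # CPU output serves as the reference; every available backend must
    # stay within a tolerance chosen for the model's precision mode.
    reference = ort.InferenceSession(MODEL_PATH, providers=["CPUExecutionProvider"])
    candidate = ort.InferenceSession(MODEL_PATH, providers=[provider])

    name = reference.get_inputs()[0].name
    expected = reference.run(None, {name: x})[0]
    actual = candidate.run(None, {name: x})[0]

    np.testing.assert_allclose(actual, expected, rtol=1e-3, atol=1e-3)

Run the suite on whatever hardware CI has; the parametrization shrinks or grows with the available providers, so a CPU-only runner still exercises the export path while a GPU runner adds the backend comparison.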

If you’re using lower-level libraries directly, the same rule applies. oneDNN, MIOpen, and other backend-specific optimizations are worth it when they cut real cost or latency. The trade-off is maintenance. Abstractions are nice until they leave performance on the table. Hand-tuned paths are nice until a small infra team has to support three hardware targets.

That’s the balance now.

What technical leaders should do with this

Three responses are worth taking seriously.

First, audit hardware lock-in honestly. Not in slide decks. In runtime tests, deployment manifests, and benchmark data. If your inference service claims backend portability, prove it under load.

Second, separate framework portability from production portability. A model that runs on multiple backends in a lab is not the same as a service that meets latency, observability, and rollback requirements across those backends.

Third, watch interconnect roadmaps, not just accelerator specs. A cluster’s useful life is increasingly shaped by memory bandwidth, packaging, and node-to-node communication. Raw TOPS figures don’t tell you enough.

Intel, Nvidia, and AMD are responding to the same pressure from different directions. Intel is cutting structure and narrowing focus. Nvidia is dealing with a political ceiling on part of its market. AMD is buying missing pieces in software, inference, and interconnect technology.

That doesn’t make hardware choices simpler. It does make one point hard to miss: teams that keep their stack portable, benchmarked, and adaptable will have a much easier time than teams still acting like the AI infrastructure market is stable.
