GV leads Blacksmith's $10M Series A four months after its seed
Blacksmith’s fast follow from GV points to a real bottleneck: CI can’t keep up with AI-heavy software teams
Google Ventures has led another round in Blacksmith just four months after leading the startup’s $3.5 million seed. The new raise is a $10 million Series A, and the timing matters almost as much as the number. Investors usually move this quickly when customers are showing up fast, or when they think a narrow infrastructure opening won’t stay narrow for long.
Blacksmith says it now has more than 700 customers and grew annual recurring revenue from $1 million to $3.5 million in that stretch. Its pitch is pretty blunt by developer tools standards: keep GitHub Actions, change a line in your config, run jobs on Blacksmith’s high-clock bare-metal CPUs, and get builds that finish up to 2x faster at up to 75% lower compute cost.
Those are large claims. They also map cleanly to a problem plenty of engineering teams now have.
AI coding tools are driving up commit volume, PR churn, and automated refactors. That looks great until the CI queue starts backing up and everyone is waiting on tests. Then the slowdown isn’t code generation. It’s validation.
Why this is landing now
For a long time, CI was treated as commodity plumbing. GitHub Actions, CircleCI, Buildkite, GitLab CI. Pick one. The workflow layer got the attention, while the underlying compute was mostly treated as interchangeable.
Blacksmith is betting that assumption is starting to break. Its view is that the next big gain in CI comes from compute choices most platforms have abstracted away. Instead of generic cloud VMs, it runs jobs on bare-metal systems with gaming-grade CPUs, chips tuned for strong single-thread and mixed-thread performance.
That may sound old-school. It also makes sense.
A lot of CI work is still CPU-bound. Compilers, test runners, bundlers, packaging, JVM warmup, Python environment setup, TypeScript builds, Rust compilation. These jobs often don’t scale cleanly across more cores, and they care a lot about clock speed, cache behavior, and consistency from run to run.
Generic cloud VMs are flexible. They're also worse at delivering low jitter. Virtualization overhead, shared tenancy, and noisy neighbors all add variance. For many production workloads, that's fine. For CI, it's a headache. A build that usually takes six minutes but sometimes takes eleven is often worse than one that reliably takes eight.
Plenty of teams miss that. Average speed matters. Predictability matters just as much.
Why the pitch works
Blacksmith isn’t asking teams to throw out GitHub Actions or retrain everyone on a new CI DSL. That alone explains a lot of the traction.
The usual setup is to keep the existing workflow and redirect execution to Blacksmith-managed runners. In practice, that often means changing `runs-on` from something like `ubuntu-latest` to a self-hosted runner label.
A simplified example:
```yaml
jobs:
  build:
    # point the job at Blacksmith-managed runners instead of GitHub-hosted ones
    runs-on: [self-hosted, blacksmith, x86_64-16c]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci && npm test
```
That’s an easier sell than replacing the whole CI platform.
It also lets Blacksmith ride on top of GitHub’s workflow model, secrets handling, and developer familiarity. Migration friction is a big deal in this market. Engineers will accept modest improvements for a one-line change. Most won’t sit through a six-week platform rewrite unless the payoff is overwhelming.
So the wedge is well chosen: execution first, analytics later.
Bare metal still matters
The technical thesis here isn’t exotic. It’s just the sort of thing cloud-heavy infrastructure planning tends to gloss over.
Bare metal gives tighter control over scheduling, hardware configuration, local disk, and runtime isolation. For short-lived, bursty CI jobs, that can turn into practical gains:
- better single-core and turbo-clock performance for compilers and bundlers
- less scheduler interference from neighboring tenants
- steadier wall-clock times across repeated runs
- better economics if hardware utilization stays high
That last point matters. Blacksmith CEO Aditya Jayaprakash told TechCrunch that bare metal gives the company much better control over its economics than relying on hyperscalers. That’s believable. If you own or lock in the hardware and keep it busy with lots of small CI jobs, your cost structure can beat cloud list pricing by a wide margin.
The trade-off is operational pain. Running a bare-metal fleet well is harder than spinning up cloud instances. Provisioning, image hygiene, capacity planning, hardware failures, scheduling. All of that lands on the provider. That’s also why this can work as a startup business. Most teams don’t want that job.
Where the gains are real, and where they probably aren’t
The best fit is fairly obvious.
Blacksmith should do well on:
- CPU-bound builds and test suites
- large monorepos with frequent incremental compiles
- JavaScript and TypeScript projects with heavy bundling
- Rust, Go, Java, Ruby, and C/C++ workloads where compiler or test runtime dominates
- teams with enough CI volume that shaving minutes turns into real developer time and lower spend
The fit is weaker for jobs dominated by network or storage bottlenecks. If your pipeline spends most of its time pulling giant artifacts, waiting on remote registries, or pushing container layers over slow links, faster CPUs won’t fix much. Same for GPU-heavy workflows. Blacksmith hasn’t positioned itself around training or inference, and there’s no reason to pretend this is a general answer for ML infrastructure.
That doesn’t hurt the core case. It keeps the pitch grounded.
AI coding is changing the shape of CI demand
The funding headline is one thing. The underlying shift is more interesting.
AI-assisted development changes the profile of CI demand. Teams using coding agents or aggressive autocomplete tools tend to produce more changes, more often. They also generate a lot of low-confidence churn: refactors, broad edits, generated tests, dependency updates, and quick follow-up fixes. That pushes build volume up and raises the need for stronger validation.
So the bottleneck moves.
For years, developer speed was mostly framed around writing code. Now a lot of teams are limited by review throughput, test throughput, and merge confidence. You can generate ten candidate patches in minutes. You still need to know which ones are safe.
That’s why CI infrastructure looks a lot less boring than it used to.
The next layer is visibility
Blacksmith is also pushing test analytics and broader CI observability. That’s the right move.
Raw speed helps, but only up to a point. Teams eventually want to know which tests are flaky, which jobs dominate the critical path, where caches are missing, and why p95 build times keep creeping up. GitHub Actions exposes some of this, but not enough for large engineering orgs.
If Blacksmith can pair faster execution with useful analysis, it has a stronger moat than bare-metal runners alone. CI data is valuable when it’s surfaced well. It shows where to shard tests, which stages need cache tuning, and whether a monorepo is becoming structurally slow.
That’s also where the field gets tougher. CircleCI, Buildkite, GitLab, and Datadog-adjacent tooling all know this. Speed gets you in the door. Staying power usually comes from insight.
What teams should check before switching
The one-line migration pitch is appealing, but teams should still profile their pipelines before moving anything.
Start with a basic question: is the pipeline actually CPU-bound? Look at p50 and p95 times by stage, including compile, test, packaging, and artifact upload. If the long pole is network I/O or registry pulls, the results may be disappointing.
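One rough way to get those numbers without new tooling is to pull recent run durations from the GitHub API. The sketch below measures whole runs rather than individual stages, and it assumes the main workflow file is named `ci.yml`; the percentile math is deliberately crude.

```yaml
# Hypothetical one-off audit workflow, not Blacksmith tooling.
# Dumps durations for the last 100 successful runs of ci.yml and
# prints rough p50/p95 figures. Adjust the workflow filename to yours.
name: ci-timing-audit
on: workflow_dispatch

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - name: Print p50/p95 run durations (seconds)
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh api "repos/${{ github.repository }}/actions/workflows/ci.yml/runs?status=success&per_page=100" \
            --jq '.workflow_runs[] | ((.updated_at | fromdate) - (.run_started_at | fromdate))' \
            | sort -n \
            | awk '{ a[NR] = $1 } END { print "p50:", a[int(NR*0.50)+1]; print "p95:", a[int(NR*0.95)+1] }'
```

If p95 sits far above p50, variance rather than raw speed may be the first thing worth chasing.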
Then check whether your workflow can even use faster hardware well. Plenty of teams have slow CI because of poor job topology, not weak compute. Missing caches, serialized stages, oversized test suites, and bad concurrency settings can waste any runner.
A few practical checks, with a combined sketch after the list:
- use ephemeral runners where possible
- key dependency caches correctly by lockfile or build graph
- enable test sharding for large suites
- cancel superseded runs on active branches
- prefer short-lived OIDC credentials over long-lived secrets
- lock down egress and keep runner images immutable
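Several of those checks are plain GitHub Actions configuration rather than anything vendor-specific. A minimal sketch, assuming an npm project whose test runner accepts Jest-style shard flags; the runner labels are carried over from the earlier example:

```yaml
# Sketch: cache keying, superseded-run cancellation, and test sharding.
# Assumes an npm project; shard syntax follows Jest's --shard convention.
concurrency:
  group: ci-${{ github.ref }}
  cancel-in-progress: true   # cancel superseded runs on active branches

jobs:
  test:
    runs-on: [self-hosted, blacksmith, x86_64-16c]
    strategy:
      matrix:
        shard: [1, 2, 3, 4]  # split the suite across four parallel jobs
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # key the cache by the lockfile so it invalidates on dependency changes
          key: npm-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
      - run: npm ci
      - run: npx jest --shard=${{ matrix.shard }}/4
```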
Security needs attention too. Bare metal doesn’t automatically make anything safer. Isolation still depends on job sandboxing, machine reset hygiene, and credential handling. Self-hosted-style runners have a long history of turning into quiet security problems when teams get sloppy.
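On the credential point, GitHub Actions can mint short-lived cloud credentials over OIDC instead of storing long-lived keys as secrets. A minimal sketch for AWS, where the role ARN and bucket are placeholders to replace:

```yaml
# Sketch: short-lived OIDC credentials instead of long-lived secrets.
permissions:
  id-token: write   # required for the runner to request an OIDC token
  contents: read

jobs:
  deploy:
    runs-on: [self-hosted, blacksmith, x86_64-16c]
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # placeholder
          aws-region: us-east-1
      - run: aws s3 cp ./dist s3://my-artifacts-bucket --recursive
```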
What GV is betting on
GV’s fast follow suggests two things.
First, Blacksmith has enough commercial traction to look like a real company, not a small optimization layer with a good demo. Seven hundred customers and $3.5 million ARR this early will get investors moving.
Second, CI performance is a serious wedge again. That’s a sensible read of the market. As software teams generate more code with AI tools, more of the cost shifts to proving that code won’t break production.
That puts pressure on every incumbent CI vendor. Performance can’t stay vague. Cost per build can’t either. If a startup can slot underneath GitHub Actions, run jobs faster on high-clock bare metal, and add the visibility teams actually need, the old "CI is good enough" line starts to look flimsy.
For engineering leaders, the takeaway is straightforward: if code generation got faster but merge time still drags, stop staring at the editor. Look at the runners.