Google Cloud’s AI security problem shows why platforms need guardrails
Google’s AI security advice is right. Its cloud billing mess shows how hard this gets
Google Cloud COO Francis de Souza has been telling companies to treat AI security as a platform problem, not a cleanup task. He’s right. The timing is uncomfortable.
Over the past several weeks, The Register has documented cases where Google Cloud developers were hit with five-figure bills after attackers used exposed API keys to call Gemini models. Some developers said they never knowingly enabled Gemini access on those keys. In at least two cases, Google refunded the charges after press coverage. The company also told The Register it doesn’t plan to change its automatic billing tier upgrades, which can raise spending ceilings based on account history.
That matters because Google is warning enterprises, correctly, that AI expands the attack surface across models, prompts, agents, data pipelines, and stale internal repositories. Meanwhile, parts of its own platform show how messy AI security gets when permissions, billing, credential scope, and product defaults shift faster than users can track.
Senior engineers should treat this as a warning and a checklist.
AI security now includes the model surface
De Souza’s strongest point is also the least controversial: AI systems widen the set of things that need protection.
Enterprise security used to center on users, devices, applications, databases, networks, and cloud resources. AI adds several pressure points:
- Model endpoints and inference APIs
- Training and fine-tuning datasets
- Retrieval pipelines and vector stores
- Prompts, system instructions, and tool definitions
- Agents with access to internal systems
- Generated outputs that may expose sensitive data
- API keys and service accounts with newly expanded product access
That last item is where Google’s current problem sits.
In older web apps, an API key exposed in frontend code was already risky, but some services were built around constrained public keys. Google Maps is the familiar example. Developers routinely placed Maps keys in browser-facing code because Google’s own setup guidance historically allowed that pattern, assuming restrictions such as HTTP referrers, API scopes, and quota controls were configured properly.
The risk changes when that same credential can call expensive AI services.
According to The Register, some Google Cloud users had keys originally deployed for Maps that later became capable of accessing Gemini APIs after Google expanded their scope. Attackers found the keys, made unauthorized Gemini calls, and ran up large bills quickly. Rod Danan, CEO of Prentus, said his account was charged $10,138 in roughly 30 minutes. Sydney-based developer Isuru Fonseka reportedly faced about AUD $17,000 in charges despite believing he had a $250 spending cap.
That’s a cloud control-plane problem with an AI price tag.
Billing controls are security controls
Cloud teams often treat billing controls as finance plumbing. With AI workloads, that’s a mistake.
Inference APIs can burn money at machine speed. If a compromised key can submit high-volume requests to a costly model, a spending cap becomes a practical security control. The same goes for quotas, per-service limits, anomaly detection, and automatic shutdown rules.
The reported Google cases are uncomfortable because users believed limits existed, while Google’s automated systems apparently upgraded billing tiers based on account history. Effective ceilings could rise as high as $100,000 without explicit user approval, according to The Register.
Google’s stated rationale is service continuity: the company prioritizes preventing outages over enforcing customers’ stated budget preferences.
There is a real trade-off. Automatic tier increases can protect production systems during legitimate traffic spikes. A retailer during a launch event or a logistics platform during peak routing loads may not want a hard cap to kill service. But AI inference changes the downside: an attacker can cause damage just by spending your money.
For teams running AI APIs in production, the safer default is strict:
- Separate keys by product and environment.
- Use hard quotas where the provider supports them.
- Alert on unusual token volume, request rate, region, model family, and caller identity.
- Avoid shared credentials across unrelated services.
- Treat budget-limit changes as security-relevant events.
- Require explicit approval for tier upgrades on accounts tied to AI services.
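To make the quota and alerting items concrete, here is a minimal sketch of a client-side spend guard. It assumes you can estimate the cost of each request yourself. `SpendGuard` and its parameters are hypothetical names for illustration, not a Google Cloud feature. The demo feeds it the burn rate implied by the reported $10,138 in roughly 30 minutes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SpendGuard:
    """Trips when estimated spend inside a sliding window exceeds a hard limit."""

    window_seconds: float
    max_spend_in_window: float
    _events: deque = field(default_factory=deque)  # (timestamp, cost) pairs
    _window_total: float = 0.0
    tripped: bool = False

    def record(self, timestamp: float, cost: float) -> bool:
        """Record one request's estimated cost; return True if calls may continue."""
        if self.tripped:
            return False
        self._events.append((timestamp, cost))
        self._window_total += cost
        # Drop events that have aged out of the sliding window.
        while self._events and self._events[0][0] <= timestamp - self.window_seconds:
            _, old_cost = self._events.popleft()
            self._window_total -= old_cost
        if self._window_total > self.max_spend_in_window:
            self.tripped = True  # stays tripped until a human resets it
        return not self.tripped


# Burn rate implied by the reported incident: about 5.63 dollars per second.
burn_per_second = 10_138 / (30 * 60)

guard = SpendGuard(window_seconds=60.0, max_spend_in_window=50.0)
tripped_at = None
for second in range(120):
    if not guard.record(float(second), burn_per_second):
        tripped_at = second
        break
```

With a $50-per-minute ceiling, the guard trips within the first ten seconds of that burn rate. A guard like this only covers traffic that passes through your own code, so it complements provider-side quotas rather than replacing them. A stolen key used directly against the API never touches it.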
The operational question is blunt: can the cloud stop an attacker from turning your account into their inference cluster?
Revocation speed matters now
A second issue is more technical and more damning if the findings hold up.
Security firm Aikido found that deleted Google API keys may remain usable for up to 23 minutes while revocation propagates through Google’s infrastructure, according to The Register. Aikido researcher Joseph Leon said request success during that period was inconsistent, but sometimes more than 90% of calls still authenticated. Attackers could use that window to keep calling Gemini and potentially access files or cached conversation data.
Credential revocation is one of those details most developers assume cloud providers solved years ago. At hyperscale, nothing is instant everywhere. Still, 23 minutes is a long time when automated abuse is already underway.
Aikido’s comparison makes the issue harder to dismiss. Leon reported that Google service account API credentials revoke in about five seconds, while Gemini’s newer AQ-prefixed key format takes about a minute. Both operate at Google scale. That suggests the longer delay for older Google API keys is not an unavoidable law of distributed systems.
Propagation delays happen. Large cloud platforms cache authorization decisions for performance and reliability. If every request had to synchronously check a central authority, latency would suffer and outages could cascade. Caching is reasonable engineering.
Revocation semantics still need to match the risk. A public-facing Maps key from 2018 that can now reach Gemini should not behave like a low-risk static token. If deletion doesn’t mean fast denial, the UI and docs should say so plainly. The platform should also provide emergency kill switches that bypass normal cache TTLs.
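The tension between cached authorization and fast revocation can be sketched in a few lines. This is an illustrative toy, not a description of Google's infrastructure: `CachedAuthorizer`, `emergency_revoke`, and the 23-minute TTL are assumptions chosen to mirror the reported numbers.

```python
import time
from typing import Callable, Dict, Set, Tuple


class CachedAuthorizer:
    """Caches allow/deny decisions for a TTL, with a deny list that bypasses the cache."""

    def __init__(
        self,
        lookup: Callable[[str], bool],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._lookup = lookup          # slow call to the source of truth
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[bool, float]] = {}  # key -> (allowed, expiry)
        self._revoked: Set[str] = set()

    def is_allowed(self, key_id: str) -> bool:
        # The kill switch is checked first, before any cached decision.
        if key_id in self._revoked:
            return False
        now = self._clock()
        cached = self._cache.get(key_id)
        if cached is not None and now < cached[1]:
            return cached[0]
        allowed = self._lookup(key_id)
        self._cache[key_id] = (allowed, now + self._ttl)
        return allowed

    def emergency_revoke(self, key_id: str) -> None:
        """Deny immediately, regardless of what is cached."""
        self._revoked.add(key_id)
        self._cache.pop(key_id, None)


# Demo with a fake clock so the behavior is deterministic.
now = [0.0]
valid_keys = {"maps-2018"}
auth = CachedAuthorizer(lambda k: k in valid_keys, ttl_seconds=23 * 60, clock=lambda: now[0])

first = auth.is_allowed("maps-2018")   # looked up and cached as allowed
valid_keys.discard("maps-2018")        # key deleted at the source of truth
now[0] = 60.0
stale = auth.is_allowed("maps-2018")   # still allowed: served from cache
auth.emergency_revoke("maps-2018")
after_kill = auth.is_allowed("maps-2018")
```

In this toy, deleting the key alone leaves it usable until the cached entry expires, while the deny list takes effect on the next request. In a real distributed system the deny list itself has to propagate, so the design question is how small and fast that path can be made.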
Developers should assume revocation is not always immediate, even when the console implies it is. Incident response should include:
- Disable the affected API or project, not just the key, if abuse is active.
- Rotate credentials and remove broad product access.
- Check logs for calls during the post-deletion window.
- Review cached AI artifacts, uploaded files, and conversation history.
- Contact provider support quickly if charges are accumulating.
- Prefer service accounts or newer credential formats where revocation behavior is better documented.
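The log check in that list can be sketched as a simple filter. The log entry shape here (`key_id`, `timestamp`, `status`) is a hypothetical format, not any provider's export schema, and the default window uses the 23-minute figure Aikido reported.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def calls_after_deletion(
    log_entries: List[Dict[str, Any]],
    key_id: str,
    deleted_at: datetime,
    window: timedelta = timedelta(minutes=23),
) -> List[Dict[str, Any]]:
    """Return successful calls made with a key after it was deleted, within the window."""
    window_end = deleted_at + window
    return [
        entry
        for entry in log_entries
        if entry["key_id"] == key_id
        and entry["status"] < 400  # the request authenticated and succeeded
        and deleted_at <= entry["timestamp"] <= window_end
    ]


deleted_at = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
logs = [
    {"key_id": "k1", "timestamp": deleted_at - timedelta(minutes=5), "status": 200},
    {"key_id": "k1", "timestamp": deleted_at + timedelta(minutes=2), "status": 200},
    {"key_id": "k1", "timestamp": deleted_at + timedelta(minutes=10), "status": 403},
    {"key_id": "k2", "timestamp": deleted_at + timedelta(minutes=3), "status": 200},
    {"key_id": "k1", "timestamp": deleted_at + timedelta(minutes=40), "status": 200},
]
suspicious = calls_after_deletion(logs, "k1", deleted_at)
```

Only the call two minutes after deletion is returned here. In a real investigation it is worth widening the window too, since a successful call well past the expected propagation delay would be a more serious finding.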
A deleted key that still works for minutes is a bad surprise. During an AI abuse event, minutes are expensive.
Agents will find forgotten data
De Souza also raised a quieter risk that deserves more attention: internal agents can surface data security teams forgot existed.
A company may have old SharePoint servers, stale file shares, abandoned project buckets, or legacy Confluence spaces with permissive access controls. Before AI agents, those repositories might have been practically invisible. Nobody searched them. Nobody linked to them. The exposure existed, but discovery was low.
Agents change that. Give an enterprise assistant broad search rights and tool access, and it can connect data across systems humans rarely visit. That can be useful. It can also turn neglected access-control decisions into active leakage.
“Connect the chatbot to internal knowledge” is a risky product spec unless the permission model is clean. Retrieval-augmented generation systems often respect source permissions, but only if identity, indexing, ACL synchronization, and result filtering are implemented correctly. Even then, summarization can blur boundaries. A model may not show the original document, but it can still synthesize sensitive facts from it.
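Result filtering is the piece of that chain that is easiest to show in code. The sketch below assumes each indexed chunk carries the groups allowed to read its source document. `Chunk` and `filter_by_acl` are hypothetical names, and a real system would also need to keep those group lists in sync with the source.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_groups: FrozenSet[str]  # groups permitted to read the source document


def filter_by_acl(chunks: List[Chunk], user_groups: FrozenSet[str]) -> List[Chunk]:
    """Keep only chunks the asking user could read at the source.

    Deny by default: a chunk with no recorded groups is dropped, not exposed.
    """
    return [chunk for chunk in chunks if chunk.allowed_groups & user_groups]


retrieved = [
    Chunk("handbook", "Office hours are 9 to 5.", frozenset({"all-staff"})),
    Chunk("salary-bands", "Band 4 starts at ...", frozenset({"hr"})),
    Chunk("legacy-share", "Old project notes.", frozenset()),
]
visible = filter_by_acl(retrieved, frozenset({"all-staff", "eng"}))
```

The filter runs before any text reaches the model's context, which is the point: once a sensitive chunk is in the prompt, summarization can leak it even if the document itself is never shown.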
The security work starts before deployment:
- Inventory old repositories before indexing them.
- Reconcile access groups that haven’t been reviewed in years.
- Decide which systems agents can query, not just which users can ask questions.
- Log tool calls and retrieval results.
- Test with adversarial prompts and overprivileged accounts.
- Set retention rules for prompts, outputs, uploaded files, and retrieved context.
Enterprise AI makes least privilege harder because agents are designed to act across boundaries. Sloppy permissions become visible fast.
Machine-speed defense needs tight limits
De Souza argues that defenders need AI-native, agentic systems because attacks now move too fast for human-led response. He cited a striking number: the average time between initial breach and handoff to the next attack stage has dropped from eight hours to 22 seconds.
If that figure is even directionally right, traditional security operations workflows are in trouble. A ticket queue, a Slack escalation, and a human analyst triaging dashboards can’t match automated credential abuse, lateral movement, and data staging at that pace.
Agentic defense systems can help. They can correlate signals across identity, network, endpoint, SaaS, and cloud logs. They can quarantine workloads, revoke tokens, block egress, open incidents, and generate summaries for humans. Used carefully, they reduce response time and analyst fatigue.
The danger is overdelegation. An autonomous security agent with broad authority can break production, lock out legitimate users, delete evidence, or be manipulated through poisoned inputs. Security automation has always carried that risk. AI increases both the speed and the ambiguity.
A safer model is supervised autonomy with firm guardrails:
- Pre-approved actions for high-confidence cases, such as disabling a known leaked key.
- Human approval for destructive or business-impacting steps.
- Immutable audit logs for every automated action.
- Simulation and dry-run modes.
- Tight scoping by environment and asset class.
- Rollback paths.
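Those guardrails can be expressed as a small policy gate that every automated action passes through. This is a sketch under stated assumptions: `ActionGate`, the action names, and the environment names are all hypothetical, and the in-memory list stands in for a real append-only audit store.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Set, Tuple


class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    DRY_RUN = "dry_run"
    DENIED = "denied"


@dataclass
class ActionGate:
    pre_approved: Set[str]            # actions safe to run without a human
    allowed_environments: Set[str]    # scope the agent is permitted to touch
    dry_run: bool = False
    audit_log: List[Tuple[str, str, Optional[str], str]] = field(default_factory=list)

    def decide(self, action: str, environment: str, approved_by: Optional[str] = None) -> Decision:
        if environment not in self.allowed_environments:
            decision = Decision.DENIED
        elif self.dry_run:
            decision = Decision.DRY_RUN          # log what would happen, do nothing
        elif action in self.pre_approved or approved_by is not None:
            decision = Decision.EXECUTE
        else:
            decision = Decision.NEEDS_APPROVAL   # wait for a human before acting
        # Every decision is recorded, including denials and dry runs.
        self.audit_log.append((action, environment, approved_by, decision.value))
        return decision


gate = ActionGate(pre_approved={"disable_leaked_key"}, allowed_environments={"staging", "prod"})
d1 = gate.decide("disable_leaked_key", "prod")
d2 = gate.decide("delete_bucket", "prod")
d3 = gate.decide("delete_bucket", "prod", approved_by="oncall@example.com")
d4 = gate.decide("disable_leaked_key", "corp-it")
```

Here the leaked-key action runs immediately, the bucket deletion waits for a named approver, and anything outside the agent's scope is refused. Rollback is not modeled; in practice each pre-approved action would need a documented inverse.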
“Human oversight” only works if operators get enough context and time to intervene. If an AI security tool takes ten actions in three seconds and explains them afterward, that’s not oversight in any meaningful engineering sense.
The platform defaults are the hard part
The awkward lesson from Google’s Gemini API billing incidents is that AI security depends heavily on platform defaults. Developers can follow reasonable practices and still get caught by a provider changing service scope, billing behavior, or revocation semantics.
Teams still have to protect keys. Public API keys should be restricted aggressively. Secrets shouldn’t be reused casually. Cloud projects need alerts and quotas. But providers also need to treat AI access as a high-risk capability, especially when enabling it on credentials created for older, cheaper, more constrained services.
For technical leaders, the practical move is to stop assuming cloud consoles reflect the security model you think you have. Verify it.
Check which APIs every key can call. Confirm whether spending caps are hard or advisory. Test revocation behavior. Move sensitive workloads to service accounts with narrow IAM roles. Keep AI services in separate projects or accounts where possible. Watch billing events like security events, because with AI APIs they often are.
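The first of those checks can be automated once you have an inventory of keys and the services each one may call. The sketch below assumes you have already exported that inventory into a plain dictionary. The format, the rule that an empty set means "unrestricted", and the service names are illustrative assumptions rather than output from any Google tool.

```python
from typing import Dict, FrozenSet, List, Set

# Assumed identifier for the AI service to watch; adjust to your provider's naming.
AI_SERVICES: FrozenSet[str] = frozenset({"generativelanguage.googleapis.com"})


def audit_keys(
    inventory: Dict[str, Set[str]],
    ai_services: FrozenSet[str] = AI_SERVICES,
) -> Dict[str, List[str]]:
    """Flag keys that are unrestricted or that mix AI access with other services."""
    findings: Dict[str, List[str]] = {}
    for key_name, allowed in inventory.items():
        issues: List[str] = []
        if not allowed:
            # In this hypothetical format, an empty set means no API restrictions.
            issues.append("unrestricted: can call any enabled API")
        elif allowed & ai_services and allowed - ai_services:
            issues.append("mixes AI and non-AI services on one credential")
        if issues:
            findings[key_name] = issues
    return findings


inventory = {
    "maps-frontend": {"maps-backend.googleapis.com"},
    "legacy-key": set(),
    "shared-key": {"maps-backend.googleapis.com", "generativelanguage.googleapis.com"},
    "gemini-backend": {"generativelanguage.googleapis.com"},
}
findings = audit_keys(inventory)
```

In this example the unrestricted legacy key and the shared Maps-plus-Gemini key are flagged, while the two single-purpose keys pass. Running a check like this on a schedule would also catch the case the article describes, where a key's reach grows after it was deployed.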