Artificial intelligence June 5, 2026

Meta fixes Instagram account hijack flaw in its AI support chatbot

Instagram has fixed a security issue that let attackers hijack accounts by persuading Meta’s AI-powered support assistant to add a new email address and start a password reset flow, according to TechCrunch and public posts from affected users. The wo...

Meta fixes Instagram account hijack flaw in its AI support chatbot

Meta’s AI support bot reportedly helped hackers take over Instagram accounts

Instagram has fixed a security issue that let attackers hijack accounts by persuading Meta’s AI-powered support assistant to add a new email address and start a password reset flow, according to TechCrunch and public posts from affected users.

The worrying part is that the attack didn’t appear to require access to the victim’s existing email inbox. It depended on convincing the support system to trust an attacker-controlled address.

Security researcher Jane Wong said her Instagram account was among those taken over. “The password got changed without my knowledge and I was getting different password reset attempts throughout yesterday,” Wong wrote on X. “Quite concerning.”

Meta spokesperson Andy Stone said Monday that the issue had been fixed. Meta did not immediately respond to TechCrunch’s request for comment, and it’s still unclear how many accounts were improperly accessed.

The reported attack path

A video posted on X showed what appeared to be a step-by-step account takeover. TechCrunch said it verified that the attacker’s public email mailbox, visible in the video, received the verification code shown during the flow.

The sequence reportedly worked like this:

  1. The attacker used a VPN to spoof the target’s presumed location.
  2. They opened a chat with Meta AI Support Assistant.
  3. They asked the assistant to add a new email address to the victim’s Instagram account.
  4. The assistant sent a verification code to the attacker-provided email address.
  5. The attacker gave that code back to the chatbot.
  6. The chatbot surfaced a “Reset Password” button.
  7. The attacker set a new password and took control of the account.

The VPN detail matters. Large consumer platforms use location, device fingerprints, IP reputation, behavioral patterns, login history, and account metadata to score risk. If an attacker can make the request look geographically plausible, some automated defenses may treat the session as less suspicious.

The larger failure is in the support workflow. If the assistant can attach a new email address to an account without proof of control over an existing trusted factor, the account recovery flow has a broken trust boundary.

Basic account security rule: an untrusted channel shouldn’t be able to promote itself into a trusted recovery method.

When a chatbot can change account state

A bad chatbot answer is annoying. A support chatbot with permission to mutate account state is part of the security boundary.

That distinction matters for any company adding AI to customer support. Many deployments start with low-risk retrieval: answer questions, summarize policies, link to help pages. The risk changes sharply when the assistant can perform actions through internal tools, such as:

  • changing an email address
  • resetting a password
  • disabling two-factor authentication
  • issuing refunds
  • escalating privileges
  • modifying billing details
  • accessing private account metadata

At that point, the model is sitting in front of privileged APIs. The chatbot’s natural-language interface becomes another control plane. It needs the same rigor as an admin console or backend service, not the looser guardrails of a FAQ bot.

The Instagram incident appears to sit in that dangerous middle ground: an AI support agent was exposed to users, while also having enough authority to affect account recovery. If the reported flow is accurate, the model didn’t need a sci-fi “jailbreak.” It followed a bad business process.

That’s a common shape for these failures. The AI gets attention because it’s visible, but the deeper problem is usually permissions, state transitions, and identity assurance.

The trust model looks backwards

Account recovery has to answer a messy question: who owns the account when the normal login path fails?

Platforms usually rely on a stack of signals:

  • access to the current email or phone number
  • previously used devices
  • session cookies
  • government ID checks in some cases
  • social graph signals
  • historical login locations
  • two-factor authentication recovery codes
  • support review for high-risk cases

None of these are perfect. Email accounts get compromised. Phones get lost. People travel. Devices break. Attackers know enough OSINT to mimic victims. Good recovery systems still avoid granting trust based only on a newly introduced factor.

The alleged Instagram flow did something risky: it treated proof of access to the attacker’s own email inbox as meaningful evidence. A verification code sent to a new address only proves the requester controls that address. It says nothing about their relationship to the Instagram account.

That’s a classic confused-deputy problem, with an AI assistant acting as the deputy. The system was asked to help recover an account, then apparently executed a step that turned an attacker-supplied identifier into a recovery credential.

For developers, this should feel familiar. It’s the same class of mistake as accepting a user-provided user_id in an API request and trusting it without checking authorization against the current session. The interface is conversational. The bug is old.

AI agents need hard authorization boundaries

Many AI product teams are building “agentic” support systems: models that can look up account state, call tools, make changes, and complete tasks. The pitch is obvious. Human support is expensive and slow. Automation can cut ticket volume and handle common requests instantly.

There’s real value there. But if the AI layer can call privileged tools, prompts can’t be the main security control.

Prompt instructions can shape behavior. They can’t enforce authorization. A model can misunderstand intent, follow the wrong workflow, over-trust user input, or get steered into a risky branch by persuasive language. Even without malicious prompt injection, support conversations are full of ambiguous requests.

A safer architecture is boring and strict:

  • The model can propose an action, but a policy engine decides whether it’s allowed.
  • Sensitive mutations require verified authentication state outside the chat.
  • New recovery factors can’t be added without proving control of an existing trusted factor.
  • Risk scoring happens server-side, not inside the model’s reasoning.
  • Tool calls use scoped permissions, not a broad internal support credential.
  • High-risk flows require step-up verification or human review.
  • Every tool call is logged with the user, session, model output, tool input, and policy decision.

Treat the chatbot as an interface, not an authority.

The model shouldn’t decide that a user “sounds legitimate.” It should collect information, pass it to deterministic systems, and get a yes or no from services built for identity verification and fraud detection.

The VPN detail shows how fragile signals can be

The reported use of a VPN to match the target’s presumed location is another useful warning. Location-based risk signals help, but attackers adapt quickly.

If the attacker knows a victim lives in New York, London, Mumbai, or São Paulo, they can choose an exit node nearby. Residential proxy networks make this harder to detect because traffic may come from consumer IP ranges rather than obvious data center infrastructure.

That doesn’t make geolocation useless. It means location should be one weak signal among many. A request from the “right” city to add a new account email still deserves scrutiny if it comes from an unfamiliar browser, an unusual ASN, a fresh session, or a support flow that bypasses normal login.

For high-value actions, platforms need defense in depth. A plausible IP address should never be enough to lower friction all the way to account takeover.

What engineering teams should take from this

This incident matters for teams wiring large language models into support, billing, DevOps, internal tooling, or admin workflows.

The useful lesson isn’t to avoid AI support bots. It’s to give them constrained capabilities and explicit state machines around dangerous operations.

If your assistant can call tools, audit the tool surface:

  • Can it change credentials or recovery methods?
  • Can it view secrets, tokens, invoices, addresses, or private content?
  • Can it impersonate internal staff workflows?
  • Can it combine harmless actions into a harmful sequence?
  • Does the backend enforce authorization, or does it trust the assistant’s request?
  • Are model decisions reproducible enough for incident response?
  • Can users manipulate tool arguments through natural language?

That last point is easy to underestimate. In traditional web apps, user input usually lands in forms with typed fields and validation. In chat interfaces, the user can mix intent, data, persuasion, fake context, and instructions in one blob of text. The model then turns that into structured tool calls.

That translation layer is powerful. It’s also a new attack surface.

For security reviews, teams should test AI support systems like workflow engines. Don’t stop at prompt red-teaming. Red-team outcomes. Try to get the bot to perform state-changing operations using partial evidence, invented urgency, conflicting identity claims, or attacker-controlled contact details. Then verify that backend policy blocks the action even if the model tries.

Meta fixed this issue, but the pattern will repeat

Meta says the Instagram issue is fixed. That’s good, but the lack of public detail leaves important questions unanswered: whether the flaw was limited to one recovery path, how long it existed, how many users were affected, and what controls now prevent a similar abuse path.

For affected users, the usual advice applies: check account email and phone settings, review active sessions, enable two-factor authentication, and store recovery codes somewhere safe. That won’t protect against every support-side failure, but it raises the cost for attackers.

For platform teams, the harder work sits behind the UI. AI agents shouldn’t be able to turn attacker-supplied contact information into trusted recovery credentials. If they can, the model is carrying risk that belongs in policy code, authentication services, and fraud systems.

Support automation scales. Security failures scale with it.

Keep going from here

Useful next reads and implementation paths

If this topic connects to a real workflow, these links give you the service path, a proof point, and related articles worth reading next.

Relevant service
AI automation services

Design AI workflows with review, permissions, logging, and policy controls.

Related proof
Marketplace fraud detection

How risk scoring helped prioritize suspicious marketplace activity.

Related article
Meta AI’s share feature can publish private chatbot chats to a public feed

Meta built a chatbot app for one-on-one AI conversations, then added a sharing flow that can publish those conversations to a public feed tied to a user’s Instagram identity. TechCrunch surfaced it, and the problem is as bad as it sounds. People are ...

Related article
The security startups from Startup Battlefield that actually track new attack surfaces

TechCrunch’s Startup Battlefield surfaced a useful cluster of security companies this week, and the pattern is clear. The better ones aren’t slapping AI onto old product categories. They’re built around a simpler fact: models, agents, and synthetic m...

Related article
Anthropic embeds engineers at NSA to deploy Mythos for cyber operations

Anthropic has reportedly sent roughly half a dozen engineers to the National Security Agency to help the agency use Mythos, the company’s frontier cybersecurity model. That detail, first reported by the Financial Times and cited by TechCrunch, matter...