On 10 July 2025 security researcher Marco Figueroa published a proof-of-concept showing that a single booby-trapped email can hijack Google Gemini’s “Summarise this email” pane and insert fake warnings that look as if they came from Google itself (https://www.0din.ai/blog/phishing-for-gemini).
The exploit works by hiding extra instructions in white-on-white text or zero-pixel fonts inside the email’s HTML. Those words are invisible to the recipient but still processed by the language model, which treats them as top-priority commands.
Because Gmail passes the raw HTML to Gemini, the hidden block becomes part of the prompt. In Figueroa’s test a harmless newsletter was turned into a counterfeit security alert urging readers to call a bogus support number, demonstrating how the attack can succeed without links, attachments or malicious code.
Google says it has not detected large-scale abuse but is rolling out layered defences that include prompt-injection classifiers, reinforcement prompts telling Gemini to ignore suspicious text, and prominent yellow banners when risky content is blocked (https://security.googleblog.com/2025/06/mitigating-prompt-injection-attacks.html).
Industry analysts note that the incident illustrates a broader challenge for all AI-assisted products: any text fed to a model can double as executable instructions. How effectively vendors blunt that dual use may shape trust in everyday AI tools going forward.
Hi, this is a comment.
To get started with moderating, editing, and deleting comments, please visit the Comments screen in the dashboard.
Commenter avatars come from Gravatar.