maxburkhardt • today at 1:44 AM • 14 replies

Hi, I’m Max from the OpenAI security team. We appreciate the security research here, and it’s unfortunate this one slipped through a crack in our disclosure pipeline. As we’re now aware of this report, we’ve taken immediate steps to protect users against potential attacks in this area by removing the model’s ability to generate Apps Script code, which should eliminate the risk to users of ChatGPT for Google Sheets. We’re taking a close look at how this feature interacts with Google Sheets APIs and re-evaluating our sandboxing approach to make sure this product is as resistant as possible against prompt injection attacks. More broadly, we’ll be doing a re-review of similar functionality in other surfaces to make sure that our defenses are consistent and effective across the board.

Replies

_verandaguy • today at 5:21 PM

It would be good to understand how exactly a frontier lab is approaching "removing the model's ability" to do a thing.

There's an ocean of difference between e.g. preventing the model from routing to something at the firewall level and just updating the prompt (especially given models' historically poor understanding of negative prompts, relatively speaking).

lionkor • today at 9:34 AM

Hi Max, thanks for replying here!

These "defenses", are they "just" long sentences in the prompt begging the AI to not follow through with stuff like this? Or is it more like sub-agents running in sandboxes?

blitzar • today at 5:48 AM

Oops I did it again ...

We're Sorry

jappgar • today at 11:07 AM

Is the disclosure pipeline monitored by chatgpt?

dogleash • today at 5:12 PM

>this one slipped through a crack

Oh, whoopsie!

da_grift_shift • today at 6:19 AM

>We appreciate the security research here

>it’s unfortunate this one slipped through a crack in our disclosure pipeline

>As we’re now aware of this report

This isn't the first time. https://x.com/PhilipTsukerman/status/1988634162773778501 https://x.com/_xpn_/status/1986382527817564437

What very likely happened here is you received good faith security research by email and you forced the researcher to submit through HackerOne or Bugcrowd or whatever, which mandates their compliance with Platform Terms and Disclosure Terms and Codes of Conduct and whatnot.

The SECURITY.md files in your GitHub repos only mention the email address. Can researchers like this one report issues via email and get a response, or not?

    May 08, 2026    PromptArmor discloses to OpenAI via email
    May 08, 2026    OpenAI sends an automated reply, confirming the intended reporting channel
    May 08, 2026    PromptArmor confirms email preference
    May 12, 2026    PromptArmor follows up
    May 18, 2026    PromptArmor follows up

bgro • today at 3:37 PM

How does this slip through the cracks? This is exactly the type of stuff I constantly find at work. Even when I’m trying to actively not find it. I don’t understand how other devs ship a high risk feature then don't test it or think about it in any capacity other than their one happy path.

I keep trying to explain this to devs, but there’s nothing out there except people screaming over me about how great leetcode is or, more recently, how great various AI uses are. It's just completely ignorant, isolated screaming to dismiss people like me who put in the work to fix slop, while the slop steals all the attention, praise, and career advancement, or even gets through the slop hiring process.

This is directly caused by slop leetcode style hiring.

I have no doubt this finding is just the tip of the iceberg.

altmanaltman • today at 7:33 AM

So if it wasn't for Hacker News and you randomly chancing upon it, your users would not have been protected against potential attacks? That's a pretty bad look, especially given that OpenAI ignored their initial disclosure via the channels the company provided.

That doesn't sound like how a one-trillion-dollar company is supposed to operate, does it?

bflesch • today at 9:18 AM

When I reported to you, I received zero reaction. The security@ is a joke, you'll receive an AI word soup.

Enjoy your Ferrari though

user3939382 • today at 5:41 AM

> removing the model’s ability to generate Apps Script code

I use this feature with my agents on a daily basis so hopefully you develop a more surgical approach to security here and restore this

throw7 • today at 4:33 PM

[flagged]

hansmayer • today at 7:02 AM

[dead]
