I'm surprised nobody else has commented on this. This is a very straightforward and useful thing for a small locally runnable model to do.
From a compliance POV it's not enough. For example: "<NAME PERSON ONE> is president of the United States" is still identifiable even though the name has been redacted.
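A minimal sketch of that failure mode (the `redact_names` helper, the fixed name list, and "Jane Doe" are all hypothetical stand-ins; a real filter would use NER rather than a known-names list, but the leak is the same):

```python
import re

def redact_names(text: str, names: list[str]) -> str:
    """Replace each known name with a numbered placeholder."""
    for i, name in enumerate(names, start=1):
        text = re.sub(re.escape(name), f"<NAME PERSON {i}>", text)
    return text

sentence = "Jane Doe is president of the United States"
redacted = redact_names(sentence, ["Jane Doe"])
print(redacted)  # <NAME PERSON 1> is president of the United States

# The name is gone, but the remaining context ("president of the
# United States") matches exactly one person, so under GDPR the
# redacted text is still personal data.
```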
Since you can't be 100% certain that a filter redacts all personal data, you'd have to put measures in place that allow OpenAI to legally process personal data on your behalf anyway. Otherwise you'd technically have a data breach (from a GDPR POV).
And if OpenAI can legally process personal data on your behalf, why bother filtering, if processing without filtering is also compliant?
For the confused: this link must have gotten revived or something; I posted this comment a few days ago. Looks like it's now getting the accolades I claimed it deserves.
Same here; this is an incredibly useful thing to have in the toolkit.
And also something that's dangerous to try to do stochastically.