There are some interesting technical details in this release:
> Privacy Filter is a bidirectional token-classification model with span decoding. It begins from an autoregressive pretrained checkpoint and is then adapted into a token classifier over a fixed taxonomy of privacy labels. Instead of generating text token by token, it labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.
> The released model has 1.5B total parameters with 50M active parameters.
> [To build it] we converted a pretrained language model into a bidirectional token classifier by replacing the language modeling head with a token-classification head and post-training it with a supervised classification objective.
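The constrained span decoding they mention can be illustrated with a small sketch (the labels, scores, and BIO scheme here are all toy assumptions for illustration, not OpenAI's actual code): a Viterbi pass over per-token label scores that forbids invalid transitions like O → I-PERSON, then reads contiguous B/I runs off as coherent spans.

```python
import math

# Toy BIO label taxonomy; the real model uses a fixed privacy-label taxonomy.
LABELS = ["O", "B-PERSON", "I-PERSON", "B-EMAIL", "I-EMAIL"]

def allowed(prev, cur):
    # BIO constraint: I-X may only follow B-X or I-X of the same type.
    if cur.startswith("I-"):
        etype = cur[2:]
        return prev in (f"B-{etype}", f"I-{etype}")
    return True

def viterbi_decode(scores):
    """scores: one {label: log-score} dict per token; returns the best
    label path that never violates a BIO transition constraint."""
    n = len(scores)
    best = [dict() for _ in range(n)]
    back = [dict() for _ in range(n)]
    for lab in LABELS:  # the sequence start behaves like following "O"
        best[0][lab] = scores[0][lab] if allowed("O", lab) else -math.inf
    for t in range(1, n):
        for cur in LABELS:
            cands = [(best[t - 1][p] + scores[t][cur], p)
                     for p in LABELS if allowed(p, cur)]
            best[t][cur], back[t][cur] = max(cands)
    lab = max(best[-1], key=best[-1].get)
    path = [lab]
    for t in range(n - 1, 0, -1):
        lab = back[t][lab]
        path.append(lab)
    return path[::-1]

def spans(tags):
    # Read coherent (start, end, entity_type) spans off a BIO tag sequence.
    out, start = [], None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and (tag == "O" or tag.startswith("B-")):
            out.append((start, i, tags[start][2:]))
            start = None
        if tag.startswith("B-"):
            start = i
    return out

# Toy log-scores for the tokens ["John", "Smith", "emailed"].
EXAMPLE_SCORES = [
    {"O": -2.0, "B-PERSON": -0.1, "I-PERSON": -3.0, "B-EMAIL": -5.0, "I-EMAIL": -5.0},
    {"O": -2.0, "B-PERSON": -3.0, "I-PERSON": -0.1, "B-EMAIL": -5.0, "I-EMAIL": -5.0},
    {"O": -0.1, "B-PERSON": -5.0, "I-PERSON": -5.0, "B-EMAIL": -5.0, "I-EMAIL": -5.0},
]
```

The point of the transition constraint is that a per-token argmax can emit incoherent sequences (an I- tag with no opening B-), whereas the Viterbi pass always yields well-formed spans.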
I'm nowhere near as smart as OpenAI of course, but I did build https://tools.nicklothian.com/webner/index.html, which uses a BERT-based named-entity-recognition model running in your browser to do a subset of PII redaction.
It works pretty well for the use cases I was playing with.
The OpenAI model is small enough that I might enhance my tool to use it.
On a side note, when I click the link it redirects me to a machine-translated version of the OpenAI website with completely botched meaning: the word “redacted” is translated to the false friend “redagować”, which means to edit/refine text, not to anonymize.
It would be nice if their examples weren’t mostly things that are easy to catch with regex, but it’s cool to see it released as an open, local model.
SuperagentLM released on-edge PII redaction models a few years ago, in 20B, 3B, and 200M sizes. They still seem to be available via their legacy API and are well worth comparing against this one. https://docs.superagent.sh/legacy/llms/superagent-lm-redact-...
Curious how this compares to presidio which mixes regex with a model: https://microsoft.github.io/presidio/
I'm surprised nobody else has commented on this. This is a very straightforward and useful thing for a small locally runnable model to do.
Someone has created the reverse of it: https://github.com/chiefautism/privacy-parser
> The model is available today under the Apache 2.0 license on Hugging Face and GitHub.
Bringing the Open back to OpenAI...
50M active parameters is impressively light. Is there a similarly light model on the prompt injection side? Most of the mainstream ones seem heavier.
Can someone explain how I can reconstruct the original entities if there are, for example, multiple person names?
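One common approach (independent of this particular model): give each distinct entity an indexed placeholder like [PERSON_1], [PERSON_2], keep the mapping on your side, and substitute back afterwards. Repeated mentions of the same entity collapse to the same placeholder, so the round trip is reversible. A minimal sketch, assuming the entity spans come from some upstream detector:

```python
import re

def redact(text, detected):
    """detected: non-overlapping (start, end, entity_type) spans
    from some detector (assumed here, not shown)."""
    mapping, counters, out, pos = {}, {}, [], 0
    for start, end, etype in sorted(detected):
        key = (etype, text[start:end])
        if key not in mapping:  # one placeholder per distinct entity
            counters[etype] = counters.get(etype, 0) + 1
            mapping[key] = f"[{etype}_{counters[etype]}]"
        out.append(text[pos:start])
        out.append(mapping[key])
        pos = end
    out.append(text[pos:])
    restore = {placeholder: surface
               for (_, surface), placeholder in mapping.items()}
    return "".join(out), restore

def unredact(text, restore):
    # Substitute placeholders back; unknown placeholders pass through.
    return re.sub(r"\[[A-Z]+_\d+\]",
                  lambda m: restore.get(m.group(0), m.group(0)), text)
```

So "Alice met Bob. Alice left." becomes "[PERSON_1] met [PERSON_2]. [PERSON_1] left.", and `unredact` restores the original from the saved mapping.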
I assume they use this model to be able to train new models with user data.
This is where stochastic approaches start to feel a bit uncomfortable.
Even small mistakes can make something dealing with sensitive data hard to trust. It seems useful as a first pass, but I’d probably still want some deterministic checks or a human in the loop to feel confident using it.
This actually looks useful. But can someone help me understand how you address the non-perfect scores: "Privacy Filter achieves an F1 score of 96% (94.04% precision and 98.04% recall)."
How would you actually use this if it can fail redacting 4% of the data? How do you reliably know which 4% failed?
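For what it's worth, the 96% is just the harmonic mean of the quoted precision and recall, and it's the 98.04% recall (not the F1) that governs leaked PII: roughly 2% of true spans go unredacted, while imperfect precision only causes over-redaction. A quick check of the quoted figures:

```python
precision, recall = 0.9404, 0.9804

# F1 is the harmonic mean of precision and recall (~0.96 here).
f1 = 2 * precision * recall / (precision + recall)

# Recall, not F1, bounds how much PII slips through:
# (1 - recall) of the true spans go unredacted (~2%).
missed_fraction = 1 - recall
```

That still leaves the commenter's real question open, of course: you don't know *which* ~2% was missed, which is why layering in deterministic checks makes sense.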
Where's the GGUF from Unsloth and co.?
Exciting! I took a look through the code and found what appear to be entity types for future releases: this release (the V2 config) supports 8 entity types, but the V4 and V7 taxonomies have more than 20, mostly additional personal-ID types. Given this is a preview release, I imagine those will follow.
Details in my review article here: https://piieraser.ai/blog/openai-privacy-filter. Disclaimer: I also build PII detection systems.
Working on this: https://github.com/KevinXuxuxu/anon_proxy, a sort of anonymization proxy to sit between you and LLM providers. It combines model-based (OpenAI's Privacy Filter) and regex PII detection, and swaps the detected entities back and forth in API requests and responses. With a locally hosted detection model, no PII leaves your local environment. I find it especially useful when working on sensitive documents (legal, tax, immigration, etc.); hope you find it helpful as well :)