vessenes, today at 11:52 AM

My experience with models that score in the high 90s on benchmarks is that the last few percentage points are often arguable or vague, the kind of cases where experts would disagree. You could try it yourself by training an MNIST classifier and seeing which digits your model inevitably cannot guess -- you'll be like "...wait a minute..."
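
If you want to try the "look at what it gets wrong" step, here is a rough sketch of just the inspection part. It assumes you already have true labels and predictions from whatever classifier you trained; the function names and the toy label lists are made up for illustration and are not real MNIST results.

```python
from collections import Counter


def misclassified(y_true, y_pred):
    """Return (index, true_label, predicted_label) for every wrong prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return [(i, t, p) for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]


def confusion_pairs(misses):
    """Count how often each (true, predicted) pair occurs among the misses."""
    return Counter((t, p) for _, t, p in misses)


# Hypothetical labels and predictions standing in for a real test set.
y_true = [7, 2, 1, 0, 4, 9, 5, 9, 4, 9]
y_pred = [7, 2, 1, 0, 9, 9, 5, 4, 9, 9]

misses = misclassified(y_true, y_pred)
for (t, p), n in confusion_pairs(misses).most_common():
    print(f"true {t} predicted as {p}: {n} time(s)")
```

The indices in `misses` are what you would use to pull up the actual images and decide for yourself whether the label or the model is the one that's wrong.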

Anyway, I have no idea what the underlying data here looks like, but I bet it's pretty unusual.

When I was working at my first job out of college, we were given a large contract and told to redact every company name with black Sharpie; it was a basic document prep exercise ahead of a strategy session for a competitor. Standard practice was to share general information but not specifics. Our redaction accuracy on 200 pages of contract was ... not 100%.