Sure, but in two years AI has gone from “impressive tool, but not a replacement for knowledge workers” to “the study where it beats our highest caliber of knowledge workers may have some methodological deficits.” In another two years it’s going to be curtains.
> the study where it beats our highest caliber of knowledge workers may have some methodological deficits
The point is that if the study can't validate the claims being made, then we can't actually extrapolate from those claims. What you're predicting may or may not come true, but the study (which is the topic at hand) isn't useful for supporting the assertion.
> Sure, but in two years AI has gone from “impressive tool, but not a replacement for knowledge workers” to “the study where it beats our highest caliber of knowledge workers may have some methodological deficits.”
With that kind of logic ... anything is possible.
Autopilots have been able to land planes for years (decades?), and yet they still don't land passenger planes at any increased rate.
I'd say if it does have methodological deficits, it should be ignored. Measuring a length with a wet spaghetti can only result in nonsense.
>the study where it beats our highest caliber of knowledge workers may have some methodological deficits.
That isn’t even remotely what this study is looking at.
Assuming it keeps improving at the same rate, which I think we are already seeing not play out. If you compare the first six months when GPT truly hit the mainstream to the previous six months, the improvements are not nearly as evident. That isn’t to say they aren’t noticeable; I could definitely tell it’s improving, but not nearly at the pace it once was.
There’s also the fact that they can’t possibly keep improving frontier models at the same rate (i.e. training investment) when investment starts slowing down. The amount of cash being burned is completely unsustainable and you’re already seeing some pullback.
Your “some methodological deficits” is doing a lot of work.
"the study that claims it beats our highest caliber of knowledge workers has methodological deficits" ftfy
so extrapolating from that, in another two years it will continue to bamboozle
The issue is, it almost always outperforms knowledge workers.
IF the right questions are asked, and IF it is steered and corrected at a few crucial points. IF not, it goes off in the wrong direction really quickly, and that's a problem that has remained mostly unsolved over the last 2 years.
And that can be catastrophic in high-risk environments, like legal, medical, or high-risk software products, where being wrong in the wrong place can mean bankruptcy or even cost a life.
I help run a few marketing websites where I let the CEOs run crazy with Claude cowork. They are making PRs like madmen, but they are not allowed to touch any of the APIs and platforms where there is real user data and sensitive information.