It would have been nice to see some version of “I am very surprised by how far LLMs have come since I wrote the stochastic parrots paper, here is how I have revised my thinking.” But there is nothing like that and the author is just doubling down or trying to correct perceived “misinterpretations” of her work.
Meanwhile you have multiple Fields Medalists (Tao, Gowers) saying they're very impressed by LLMs' mathematical reasoning, something that the stochastic parrots thesis (if it has any empirically-predictive content at all) would predict was impossible. I doubt Tao and Gowers thought much of LLMs a few years ago either. But they changed their minds. Who do you want to listen to?
I think it's time to retire the Stochastic Parrots metaphor. A few years ago a lot of us didn't think LLMs would ever be capable of doing what they can do now. I certainly didn't. But new training methods (RLVR, reinforcement learning with verifiable rewards) changed the game and took LLMs far beyond just reducing cross-entropy on huge corpora of text. And so we changed our opinions. Shame Emily Bender hasn't too.
Sigh.
The Parrots paper:
"Contrary to how it may seem when we observe its output, an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot."
So perhaps this has always been a negative claim, about what language model AI is not.
> stochastic parrots thesis (if it has any empirically-predictive content at all)
Did you read TFA? This is precisely one of the non-questions that she answers.
...did you read TFA?
She says explicitly it's not an empirical hypothesis. It's just a label for how they function. Which hasn't really changed even as they've gotten more useful. I haven't followed the full drama but this post is her saying the term has been frequently misapplied and she's basically distancing herself from some critiques that were misinterpreting her intent.
The appeal to authority is strong here. A stochastic parrot can be a useful tool too.
Gowers, Tao and Lichtman are especially impressed by the funding of math.inc and the AI for Math Fund, a joint venture of Renaissance Philanthropies and XTX Markets.
Renaissance Philanthropies is a front for VC companies.
They never publish the allocated computational resources, the prior art, or any novel algorithm used in the LLMs. For all we know, the accounts known to work on math stunts get 20% of total compute.
In other words, they ignore prior art, do not investigate, and just celebrate when they get a vibe-math result. It isn't science, it's a disgrace.
It's clear from this comment that you did not read the full article. If you did then you'd have seen that the author addresses this criticism you're making here.