stacktrace • last Friday at 8:51 AM • 6 replies

> It's still a big issue that the models will make up plausible sounding but wrong or misleading explanations for things, and verifying their claims ends up taking time. And if it's a topic you don't care about enough, you might just end up misinformed.

Exactly! One important thing LLMs have made me realise deeply is that "no information" is better than false information. The way LLMs produce completely incorrect explanations baffles me. I suppose that's expected, since in the end the model is generating tokens based on its training and it's reasonable that it might hallucinate some things, but knowing this doesn't ease any of my frustration.

IMO if LLM developers need to focus on anything right now, it should be better grounding. Maybe even something like a probability/confidence score would make the experience so much better for many users like me.

Replies

biofox • last Friday at 10:41 AM

I ask for confidence scores in my custom instructions / prompts, and LLMs do surprisingly well at estimating their own knowledge most of the time.
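For anyone who wants to check this on their own prompts, here is a minimal sketch of one common way to measure how well stated confidence matches reality: expected calibration error (ECE). It assumes you have already logged, for each answer, the confidence the model stated (scaled to 0..1) and whether the answer turned out to be correct. The function name, the sample records, and the bin count are all illustrative assumptions, not something from the comment above.

```python
from typing import List, Tuple


def expected_calibration_error(
    records: List[Tuple[float, bool]], n_bins: int = 5
) -> float:
    """Compute ECE for (stated_confidence, was_correct) pairs.

    Confidences are expected in [0, 1]. Answers are grouped into
    equal-width confidence bins; for each bin we compare the mean
    stated confidence with the observed accuracy, and weight the gap
    by the share of answers that fell into that bin.
    """
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, correct in records:
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))

    total = len(records)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical hand-labelled log: (confidence the model stated, was it right?)
sample_log = [(0.95, True), (0.9, True), (0.9, False), (0.6, True), (0.3, False)]
print(f"ECE: {expected_calibration_error(sample_log):.3f}")
```

A value near 0 means the stated confidences track actual accuracy; a large value means the scores are not telling you much. This only works if the correctness labels come from checking answers against a trusted source, which is the time-consuming part the original post complains about.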


robocat • last Friday at 9:34 AM

> wrong or misleading explanations

Exactly the same issue occurs with search.

Unfortunately, not everybody knows to mistrust AI responses, or has the skills to double-check information.


actionfromafar • last Friday at 10:20 AM

I wonder if the only way to fix this with current LLMs would be to generate a lot of synthetic data for a select number of topics where you really don't want it to "go off the rails". That synthetic data would be lots of variations on "I don't know how to do X with Y".
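As a rough sketch of what generating those variations could look like: the snippet below crosses a list of X's and Y's with a handful of phrasings of "I don't know". Everything in it (the templates, the prompt wording, the `make_refusal_examples` name, the placeholder topics) is a hypothetical illustration, and whether fine-tuning on data like this actually reduces hallucination is an open question, not something this code demonstrates.

```python
import itertools
import random
from typing import Dict, List

# Hypothetical phrasings of "I don't know how to do X with Y".
RESPONSE_TEMPLATES = [
    "I don't know how to do {x} with {y}.",
    "I'm not aware of a reliable way to do {x} with {y}.",
    "I don't have enough information about {y} to explain {x}.",
]


def make_refusal_examples(
    xs: List[str], ys: List[str], seed: int = 0
) -> List[Dict[str, str]]:
    """Build prompt/response pairs for every (X, Y, template) combination."""
    examples = []
    for x, y, template in itertools.product(xs, ys, RESPONSE_TEMPLATES):
        examples.append(
            {
                "prompt": f"How do I do {x} with {y}?",
                "response": template.format(x=x, y=y),
            }
        )
    # Shuffle with a fixed seed so the output order is reproducible.
    random.Random(seed).shuffle(examples)
    return examples


# Placeholder topics; in practice these would be the select topics
# where a wrong answer is unacceptable.
samples = make_refusal_examples(["hot reloading", "schema migration"], ["FooLib"])
for row in samples[:3]:
    print(row["prompt"], "->", row["response"])
```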


RHSman2 • last Friday at 3:23 PM

The problem is not the intelligence of the LLM. It is the intelligence, and the desire to make things easy, of the intelligence using it.

XCSme • last Friday at 10:37 AM

But most benchmarks are not about that...

Are there even any "hallucination" public benchmarks?


basisword • last Friday at 3:09 PM

I think the thing even worse than false information is almost-correct information. You do a quick Google search to confirm it's on the right page, but find there's an important misunderstanding. These cases are so much harder to spot, I think, than the blatantly false ones.