
Edman274 · yesterday at 3:26 PM

> Yes you can. The same way Wikipedia (or, way back when, a paper encyclopedia) can be used for research but you have to verify everything with other sources because it is known there are errors and deficiencies in such sources.

I think that if Wikipedia had no recommendations on good sources for their own articles and never banned sources, companies would not be so sanguine about letting people use Wikipedia. There's an entire internal process for evaluating sources, and the expectation when using Wikipedia is that nothing written in an article will be sourced from the Daily Mail or Conservapedia, for example. Also, I do think there are companies that have policies against talking to known liars. Given that Wikipedia bans sources and news agencies ban human sources once they've been shown to be unreliable, I don't think it's insane for such companies or agencies to say that AI shouldn't be used because it's been shown to be unreliable. Obviously there's a balancing act of utility versus accuracy, and Ars has (probably incorrectly) decided that the utility of AI outweighs its inaccuracies.

What is frustrating is that AI cannot be more accurate than the median reporter given a little more time. AI is trained on all digitizable text, including falsehoods and inaccuracies from laypeople. Humans can look up digitizable text using search engines, too. An AI can't follow up on leads or ask anyone questions. There's no world in which synthesizing available data from digitized sources alone ends up more accurate than a human with a search engine and the ability to make a phone call. So allowing LLM use at all is a direct admission that seeking out the "truth" is not an important goal, because LLMs could never actually improve accuracy and could only worsen it through hallucinated, plausible-sounding reporting. It's one thing when companies say they're committed to truth while secretly their most important overriding concern is the bottom line - it's quite another when a company directly says that the bottom line is its most important concern. Imagine the emperor walking through the parade, nude, saying "So what if I am nude? What are you going to do about it?"


Replies

dspillett · yesterday at 5:14 PM

> companies would not be so sanguine about letting people use Wikipedia

Are companies sanguine about using Wikipedia without verification? Maybe some, but they darn well shouldn't be. And I say this as someone who uses Wikipedia for many minor things (though for anything important, I verify elsewhere).

> Also, I do think that there are companies that do have policies against talking to known liars.

No doubt most/all. But such policies will always be caveated with exceptions if the information is properly validated afterwards.

> So allowing LLM use at all is a direct admission that seeking out the "truth" is not an important goal because it could never actually improve accuracy and could only worsen it through hallucinated, probable reporting.

I'm generally anti-LLM, but this is… ad absurdum.

There is a huge difference between lazily accepting what an LLM spews out and using that output, along with other sources, for further research. No good reporter will trust a single source outside of exceptional circumstances, whether that source is a person or an LLM, and what counts as “exceptional circumstances” for trusting specific meat-sourced information won't apply to an LLM-sourced summary.

If you can trust Wikipedia as a starting point, you can trust a good LLM as a starting point. Both are offering a summary of what a bunch of people on the internet have written, neither should be trusted as a reliable source.

> I don't think it's insane to then have such companies or agencies say that AI shouldn't be used because it's been shown to be unreliable

Only if taking an absolutist approach. I would be a little more qualified and say that LLM output should never be used without verification of all details, rather than that it should not be used at all. It may be that this verification makes using LLMs no more efficient than doing the research from other sources in the first place, and I suspect that this is often the case when proper time is given to verifying the output.

The problem is people misunderstanding what an LLM is: a summariser, offering access to a compressed version of its sources. If you are using them as sources rather than as summarisers, then you are using them wrongly. Unfortunately, that means a great many people are using them wrongly…