ChuckMcM • yesterday at 3:37 PM • 7 replies

As Google has been unable to keep spammy crap out of their search index since at least 2006, when we were doing Blekko, I doubt they will have much success fighting this. But it is another good example that "AI" is just glorified search and there is no reasoning or thinking going on behind the covers.

Replies

keeda • yesterday at 4:57 PM

> But it is another good example that "AI" is just glorified search and there is no reasoning or thinking going on behind the covers.

I don't think that follows. This is just LLMs being, for lack of a better word, "gullible." How is it different from a person believing whatever they read on the Internet? People fall for spam and scams all the time; that doesn't mean they are just glorified searches ;-)

It does highlight the problem facing any search engine though. AI-generated spam will be much harder to defend against with traditional, statistical mechanisms. And this is before we get to the existential problem of prompt injection.

Maybe this is where news organizations can win back their proper place in their relationship with Big Tech: by becoming the sources of verified, vetted information that LLMs can trust blindly. Possibly that's what deals like the OpenAI / Atlantic one are about.

K0balt • yesterday at 4:25 PM

Hmm. I don’t think that novel code generation can be accounted for with glorified search.

I can have my agentic system read a few data sheets, then I explain the project requirements and have it design driver specifications, protocols, interfaces, and state machines. Taking those, it develops an implementation plan. Working from that, it writes the skeleton of the application, then fills it in to create a functional system using a novel combination of hardware.

Done correctly, I end up with better, more maintainable, smaller code than I used to get with a small team, at 1/100 the cost and 1/4 the time.

Whatever that is, it more closely resembles reasoning than search.

Unless, of course, you’d also call bare metal C development on novel hardware search, in which case I guess all dev is search?

ge96 • yesterday at 8:24 PM

I did notice this: I had made videos and a Reddit post about vintage lenses, and I was trying to figure out how old one of them was. The LLM would state an age, e.g. "made in the 1940s", and reference my post, which never mentioned the manufacture date.

marginalia_nu • yesterday at 5:20 PM

Google has had ample ability to address this problem; it's really not that hard. The reason it remains such a difficult problem for them to solve is that most of the things that would solve the problem would also decimate their ad revenue.

kristianp • yesterday at 9:06 PM

> "AI" is just glorified search

Google's AI overview seems to be using RAG over their search snippets, summarised by a very fast LLM. I wouldn't call that glorified search.

dlenski • yesterday at 7:24 PM

> "AI" is just glorified search

Even aside from out-and-out spam, one of the extremely frustrating things about Google's AI overviews, compared to traditional search, is that the results are presented as coherent, verging on authoritative, even when they're not.

If you do an "old-fashioned" (udm=14) Google search for, let's say "vendor scsi commands appotech USB NAND flash chip": https://www.google.com/search?q=vendor+scsi+commands+appotec...

… you'll see that there are only a few links, and a lot of them are from people who are trying to reverse-engineer the devices' behavior, and who are uncertain or confused about what they're doing. You get instant feedback that you're looking in a dark corner for something that has little public documentation.

If you remove that `&udm=14` and look at the AI overview, Google gives you a confident-looking reply about available tools and techniques, even though some of what it links to are bit-rotted Russian-language forums and file download sites, and other places that likely won't solve your problem in a straightforward way… because that's all that's available for Google to mine.
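
To flip between the two views without hand-editing the address bar, something like the following Python sketch should work. It relies only on what is described above, namely that `udm=14` in a Google search URL selects the old-fashioned results page. The `search_url` helper and its `web_only` flag are made-up names for illustration, not anything Google provides.

```python
from urllib.parse import urlencode

# Base search endpoint, taken from the link quoted above.
GOOGLE_SEARCH = "https://www.google.com/search"


def search_url(query: str, web_only: bool = True) -> str:
    """Build a Google search URL (hypothetical helper).

    With web_only=True the URL carries udm=14, the "old-fashioned"
    results view; with web_only=False the parameter is left out, so
    the default page (which may show an AI overview) is requested.
    """
    params = {"q": query}
    if web_only:
        params["udm"] = "14"
    # urlencode escapes the query and turns spaces into '+'.
    return f"{GOOGLE_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    q = "vendor scsi commands appotech USB NAND flash chip"
    print(search_url(q))                  # plain link list
    print(search_url(q, web_only=False))  # default results page
```

Opening both printed URLs side by side makes the contrast in presentation easy to see for any query.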

winddude • yesterday at 8:19 PM

Unable? Nah, it's more profitable for the ads business. Yeah.
