madeofpalk • today at 12:15 PM • 8 replies

Is there any evidence or hints that these actually work?

It seems pretty reasonable that any scraper would already have mitigations for things like this as a function of just being on the internet.

Replies

raincole • today at 1:57 PM

It might work against people who just use their Mac mini with OpenClaw to summarize news every morning, but it certainly won't work against Google.

More centralized web ftw.

sd9 • today at 12:18 PM

Even if it did work, I just can't bring myself to care enough. It doesn't feel like anything I could do on my site would make any material difference. I'm tired.

xyzal • today at 3:33 PM

About two years ago, I made up a reference to a nonexistent Python library and put code "using" it in just 5 GitHub repos. Several months later the free ChatGPT picked it up. So IMO it works.

bediger4000 • today at 2:46 PM

The search engine crawlers are sophisticated enough, but Meta's are not. Neither is Anthropic's Claude crawler. Source: personal experience trying garbage generators on Yandex's, Blexbot's, Meta's and Anthropic's crawlers.

I'm completely uncertain that the unsophisticated garbage I generated makes any difference, much less "poisons" the LLMs. A fellow can dream, can't he?

spiderfarmer • today at 1:47 PM

There are hundreds of bots using residential proxies. That is not free. Make them pay.

m00dy • today at 1:33 PM

It won't work, especially on Gemini. Googlebot is very experienced when it comes to crawling. It might work for OpenAI and others, maybe.

nubg • today at 12:37 PM

What kind of mitigations? How would you detect the poison fountain?

phoronixrly • today at 12:56 PM

It does work, on two levels:

1. Simple, cheap, easy-to-detect bots will scrape the poison, and feed links to expensive-to-run browser-based bots that you can't detect in any other way.

2. Once you see a browser visit a bullshit link, you insta-ban it, as you can now see that it is a bot because it has been poisoned with the bullshit data.

My personal preference is using iocaine for this purpose though, in order to protect the entire server as opposed to a single site.
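
For anyone wondering what the two-level idea above might look like in practice, here is a rough sketch. It is not how iocaine works and not anything from a real library; `TrapTracker`, the `/trap/` prefix, and keying bans on a client ID string are all made-up assumptions for illustration. The idea: trap links are only ever embedded in poison pages, so any client requesting one must have been fed by a scraper.

```python
import secrets


class TrapTracker:
    """Hypothetical helper: hands out trap links and bans whoever fetches them."""

    def __init__(self, prefix="/trap/"):
        self.prefix = prefix      # assumed URL prefix, pick anything unused
        self.trap_paths = set()   # links that appear only inside poison pages
        self.banned = set()       # client identifiers, e.g. IP addresses

    def new_trap_link(self):
        """Create a unique link to embed in a poison page (level 1)."""
        path = self.prefix + secrets.token_hex(8)
        self.trap_paths.add(path)
        return path

    def handle_request(self, client_id, path):
        """Return an HTTP-style status code for this request (level 2)."""
        if client_id in self.banned:
            return 403
        if path in self.trap_paths:
            # No human-facing page links here, so the client got this URL
            # from scraped poison. Ban it on first sight.
            self.banned.add(client_id)
            return 403
        return 200


tracker = TrapTracker()
link = tracker.new_trap_link()
print(tracker.handle_request("203.0.113.7", "/index.html"))  # normal page
print(tracker.handle_request("203.0.113.7", link))           # trap visited
print(tracker.handle_request("203.0.113.7", "/index.html"))  # now banned
```

Banning by IP is the weak point here, since bots on residential proxies rotate addresses; a real setup would likely need some other way to identify the client.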
